mikemccand commented on issue #12536:
URL: https://github.com/apache/lucene/issues/12536#issuecomment-170145
> In theory, if the skipper can tell us how many positions it has skipped
that would work. This will require storing more information in the skip data
than the current scheme.
mikemccand commented on PR #12541:
URL: https://github.com/apache/lucene/pull/12541#issuecomment-1710002412
Oh, it looks like `tidy` is angry -- can you run `./gradlew tidy` @Tony-X?
This will re-style your new comment to match the required styling. Thanks!
--
This is an automated message from the Apache Git Service.
mikemccand commented on issue #12513:
URL: https://github.com/apache/lucene/issues/12513#issuecomment-1710036830
I'm poking around trying to understand Tantivy's FST implementation, and
found it was forked originally from [this FST
implementation](https://github.com/BurntSushi/fst) into thi
mikemccand commented on issue #12513:
URL: https://github.com/apache/lucene/issues/12513#issuecomment-1710044325
How the [blog post](https://blog.burntsushi.net/transducers/) models his cat
reminds me of how I [modeled the scoring of a single tennis game as an
FSA](https://blog.mikemccandle
mikemccand commented on issue #12513:
URL: https://github.com/apache/lucene/issues/12513#issuecomment-1710089496
Aha! This is an interesting approach:
```
It is possible to mitigate the onerous memory required by sacrificing
guaranteed minimality of the resulting FST. Namely, on
mikemccand commented on issue #12513:
URL: https://github.com/apache/lucene/issues/12513#issuecomment-1710091688
The next paragraph in the blog is also very interesting!
```
An interesting consequence of using a bounded hash table which only
stores some of the states is that cons
mikemccand opened a new issue, #12543:
URL: https://github.com/apache/lucene/issues/12543
### Description
[Spinoff from [this
comment](https://github.com/apache/lucene/issues/12513#issuecomment-1710091688)
inspired by Tantivy's FST implementation]
The building of an FST is inh
mikemccand commented on issue #12527:
URL: https://github.com/apache/lucene/issues/12527#issuecomment-1710234616
OK I tested the "read into scratch array" approach from [this
comment](https://github.com/apache/lucene/issues/12527#issuecomment-1708857931):
```
mikemccand commented on PR #12535:
URL: https://github.com/apache/lucene/pull/12535#issuecomment-1710238828
Thanks @uschindler and @rmuir -- I will restore the 30s timeout, try to
improve logging on client and server errors (so we can figure out WTF happened
that caused client not to connec
jimczi commented on PR #12529:
URL: https://github.com/apache/lucene/pull/12529#issuecomment-1710241290
I merged with the latest changes in main; the new random vector scorer
integrates nicely with the changes added in
`https://github.com/apache/lucene/pull/12480`. The only difference is that
mikemccand commented on PR #12535:
URL: https://github.com/apache/lucene/pull/12535#issuecomment-1710320639
OK I made the changes!
I also manually tested two failure modes of the clients: taking too long to
initially connect, and throwing some sort of exception after testing the lock
mikemccand commented on PR #12535:
URL: https://github.com/apache/lucene/pull/12535#issuecomment-1710327041
> I did notice one odd thing: on test failure, I seemed to have a leftover
test.lock in the root directory of the checkout, which is very odd. The test
creates a new temp directory an
benwtrent commented on PR #12529:
URL: https://github.com/apache/lucene/pull/12529#issuecomment-1710335794
Well, actually looking at the JFR, I cannot see anything that stands out.
The percentages of compute time are still VERY similar when building index &
querying. I may just be detecting
dweiss commented on issue #12542:
URL: https://github.com/apache/lucene/issues/12542#issuecomment-1710614193
I like it. These options we currently have are not even expert level,
they're God-level...
jimczi commented on PR #12529:
URL: https://github.com/apache/lucene/pull/12529#issuecomment-1710682508
Thanks for running the benchmarks, @benwtrent. I agree that the difference
seems to be in the noise.
Tony-X commented on PR #12541:
URL: https://github.com/apache/lucene/pull/12541#issuecomment-1710736645
@mikemccand sure. Thanks for pointing out the useful `tidy` target :) I got
it fixed.
madrob commented on issue #12542:
URL: https://github.com/apache/lucene/issues/12542#issuecomment-1710800611
What's the impact of having a non-minimal FST? Longer query times? Is that
something that gets dwarfed by having multiple segments anyway? Maybe different
merge policies have differe
elliotzlin commented on PR #1069:
URL: https://github.com/apache/lucene/pull/1069#issuecomment-1711167476
@dsmiley apologies for my delay in getting back to your comment! I don't
have any qualms about refactoring to deter people from using this. I took up
this ticket more so to get involved
18 matches