Tjianke commented on issue #11707:
URL: https://github.com/apache/lucene/issues/11707#issuecomment-1449598942
The Lucene community has a good tradition of incorporating academic results.
Recent studies show many efficient algorithms like [Partitioned
Elias-Fano](http://groups.di.unipi.it/~ott
kaivalnp commented on code in PR #12160:
URL: https://github.com/apache/lucene/pull/12160#discussion_r1121404545
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -73,17 +77,48 @@ public Query rewrite(IndexSearcher indexSearcher) throws
IOExcep
kaivalnp commented on code in PR #12160:
URL: https://github.com/apache/lucene/pull/12160#discussion_r1120780564
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -73,17 +77,48 @@ public Query rewrite(IndexSearcher indexSearcher) throws
IOExcep
dantuzi commented on code in PR #12169:
URL: https://github.com/apache/lucene/pull/12169#discussion_r1121445530
##
lucene/test-framework/src/java/org/apache/lucene/tests/analysis/BaseTokenStreamTestCase.java:
##
@@ -221,6 +223,12 @@ public static void assertTokenStreamContents(
dantuzi commented on PR #12169:
URL: https://github.com/apache/lucene/pull/12169#issuecomment-1449791310
@rmuir we did some tests at both query and index time.
We tried to index some documents using the following CustomAnalyzer which
includes our Word2VecSynonymFilter and we verified the
uschindler commented on PR #12042:
URL: https://github.com/apache/lucene/pull/12042#issuecomment-1449822469
Hi @mbien,
This is why the PR is currently in draft status. We build and test it
already with a local install. It is enough to set an env variable. Lucene
always runs Gradle with J
rmuir commented on PR #12169:
URL: https://github.com/apache/lucene/pull/12169#issuecomment-1449913809
I think you misunderstand the question. What happens to `BoostAttribute` at
index-time? Absolutely nothing.
--
This is an automated message from the Apache Git Service.
To respond to the
rmuir commented on code in PR #12169:
URL: https://github.com/apache/lucene/pull/12169#discussion_r1121546026
##
lucene/test-framework/src/java/org/apache/lucene/tests/analysis/BaseTokenStreamTestCase.java:
##
@@ -221,6 +223,12 @@ public static void assertTokenStreamContents(
rmuir commented on PR #12169:
URL: https://github.com/apache/lucene/pull/12169#issuecomment-1449923913
From what I can tell, this probably shouldn't be an analyzer at all. Seems it
only works at query-time and will simply do the wrong thing at index-time. The
attempted boost manipulation by
gsmiller commented on PR #12156:
URL: https://github.com/apache/lucene/pull/12156#issuecomment-1450145244
Thanks @uschindler.
> I am not fully sure what default rewrite method is best here.
The nice thing is it's easy to control now (bitset rewrite, boolean scoring,
doc values
gsmiller merged PR #12156:
URL: https://github.com/apache/lucene/pull/12156
gsmiller commented on issue #11707:
URL: https://github.com/apache/lucene/issues/11707#issuecomment-1450169614
@Tjianke the [luceneutil](https://github.com/mikemccand/luceneutil)
benchmarks are a great place to start. These power the [nightly
benchmarks](https://home.apache.org/~mikemccand/
gsmiller merged PR #12173:
URL: https://github.com/apache/lucene/pull/12173
gsmiller opened a new issue, #12174:
URL: https://github.com/apache/lucene/issues/12174
### Description
This rewrite method (implemented in
`MultiTermQueryConstantScoreBlendedWrapper`) relies on `DefaultBulkScorer` when
there are more than 16 terms (with 16 or fewer, a `BooleanQuery`
gsmiller opened a new pull request, #12175:
URL: https://github.com/apache/lucene/pull/12175
### Description
Now that `TermInSetQuery` extends `MultiTermQuery` (#12156), we can leverage
other `RewriteMethod`s to change the query execution behavior. Because of this,
we can use `DocVal
Trey314159 commented on PR #12172:
URL: https://github.com/apache/lucene/pull/12172#issuecomment-1450316155
_Good catch!_ I didn't consider that the stemmer might also be of a vintage
to only use the older orthography. I've contacted the Snowball mailing list
(message not yet accepted) to s
benwtrent commented on code in PR #12160:
URL: https://github.com/apache/lucene/pull/12160#discussion_r1121925554
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -73,17 +77,48 @@ public Query rewrite(IndexSearcher indexSearcher) throws
IOExce
gsmiller merged PR #12175:
URL: https://github.com/apache/lucene/pull/12175
rmuir opened a new issue, #12176:
URL: https://github.com/apache/lucene/issues/12176
### Description
TermInSetQuery currently "ping-pong" intersects a sorted list against the
term dictionary.
Instead of a sorted list, it could possibly use a Daciuk-Mihov automaton, which
can be bu
rmuir commented on PR #12172:
URL: https://github.com/apache/lucene/pull/12172#issuecomment-1450458707
I think we can merge this stopword list change anyway. But I think a filter
may be worthwhile as a separate PR?
It has the advantage of making the terms conflate regardless of which
rmuir commented on PR #12172:
URL: https://github.com/apache/lucene/pull/12172#issuecomment-1450468166
> I've contacted the Snowball mailing list (message not yet accepted)
FWIW, I'm subscribed to that list and haven't seen a message in 10 years. I
think they are just using github issues/
rmuir commented on PR #12172:
URL: https://github.com/apache/lucene/pull/12172#issuecomment-1450554452
> I have to admit that it chafes a little to convert everything to the
"wrong" form, but the internal representation is just an internal
representation, I guess, as long as everything is c
kashkambath opened a new issue, #12178:
URL: https://github.com/apache/lucene/issues/12178
### Description
Hi! This is my first time posting a GitHub issue for Apache Lucene. Please
let me know if you need anything further.
https://github.com/apache/lucene/blob/569533bd76a115e
kaivalnp commented on code in PR #12160:
URL: https://github.com/apache/lucene/pull/12160#discussion_r1122648348
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -73,17 +77,48 @@ public Query rewrite(IndexSearcher indexSearcher) throws
IOExcep