vsop-479 commented on PR #12846:
URL: https://github.com/apache/lucene/pull/12846#issuecomment-1833216581
@jpountz I applied copy for all level's empty acc, Please take a look when
you get a chance.
As so far I just performed copy on empty acc that cleared by ``clear``. Do
you mean we co
mikemccand commented on PR #12850:
URL: https://github.com/apache/lucene/pull/12850#issuecomment-1833004513
Thanks for the detailed explanation @ChrisHegarty and @jpountz -- fix looks
good! Sneaky random merge policies...
--
This is an automated message from the Apache Git Service.
To re
gsmiller commented on issue #12243:
URL: https://github.com/apache/lucene/issues/12243#issuecomment-1832911784
+1 to shrinking the API surface of the `TermInSetQuery` ctors and getting
rid of the varargs flavors since they're pure syntactic sugar. Looks like
#12837 is the related PR, so lef
gsmiller commented on PR #12837:
URL: https://github.com/apache/lucene/pull/12837#issuecomment-1832910639
+1 to this simplification. Let's also get rid of `public
TermInSetQuery(RewriteMethod rewriteMethod, String field, BytesRef... terms)`?
--
This is an automated message from the Apache
gsmiller commented on issue #12585:
URL: https://github.com/apache/lucene/issues/12585#issuecomment-1832904037
I think I'd be curious if there are many real-world use-cases out there that
have a non-positive association between each document and facet label? I would
guess it's pretty uncomm
gsmiller commented on issue #12418:
URL: https://github.com/apache/lucene/issues/12418#issuecomment-1832886094
I think I know what's going on here, and I believe #12853 fixes it. In both
seeds mentioned here, I've noticed that the drill-sideways query short-circuits
on at least one segment,
stefanvodita commented on PR #12844:
URL: https://github.com/apache/lucene/pull/12844#issuecomment-1832732598
Thanks for the suggestions @zhaih! I have to think about option 2 a bit more.
If we change `ramBytesUsed` back, performance recovers (Mostly? I'm not sure
how noisy this bench
slow-J opened a new pull request, #12856:
URL: https://github.com/apache/lucene/pull/12856
I was looking at this old issue:
https://github.com/apache/lucene/issues/4390 `Flexible Queryparser has some
Locale violation with lowercasing`, which was already fixed with the work done
in [GH#5271
gsmiller opened a new pull request, #12855:
URL: https://github.com/apache/lucene/pull/12855
This extension hook is no longer used on the main branch, so let's remove
it. Opened a corresponding PR to mark this deprecated on 9x (#12854)
--
This is an automated message from the Apache Git S
gsmiller opened a new pull request, #12854:
URL: https://github.com/apache/lucene/pull/12854
This extension hook is only referenced by the also-deprecated
`DrillSideways#search(DrillDownQuery, Collector)`, so let's mark it deprecated
as well and remove it on main.
--
This is an automated
zhaih commented on PR #12844:
URL: https://github.com/apache/lucene/pull/12844#issuecomment-1832571805
@benwtrent Thanks for running the benchmark. I looked at the profile and I
think we call `ramBytesUsed` after every document is indexed to control the
flush
[here](https://github.com/apac
gsmiller commented on PR #12563:
URL: https://github.com/apache/lucene/pull/12563#issuecomment-1832462625
Closing in favor of #12853 that fixes the underlying root cause and adds
testing
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on
gsmiller closed pull request #12563: IntTaxonomyFacets test case
URL: https://github.com/apache/lucene/pull/12563
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe
gsmiller commented on issue #12558:
URL: https://github.com/apache/lucene/issues/12558#issuecomment-1832460996
I opened a PR here that fixes the root cause of this bug:
https://github.com/apache/lucene/pull/12853
--
This is an automated message from the Apache Git Service.
To respond to t
gsmiller commented on code in PR #12853:
URL: https://github.com/apache/lucene/pull/12853#discussion_r1409693152
##
lucene/CHANGES.txt:
##
@@ -112,6 +112,9 @@ Bug Fixes
* GITHUB#12220: Hunspell: disallow hidden title-case entries from compound
middle/end
+* GITHUB#12558: E
gsmiller opened a new pull request, #12853:
URL: https://github.com/apache/lucene/pull/12853
In GH#12558, @Shradha26 discovered that faceting implementations are
sometimes getting empty MatchingDoc lists, resulting in undesired behavior. We
should consider it a bug for MatchingDocs to not b
ChrisHegarty commented on PR #12850:
URL: https://github.com/apache/lucene/pull/12850#issuecomment-1832391520
> This new failure looks like it's due to the fact that
MockRandomMergePolicy now randomly reverses the doc ID order on merge?
That's it exactly. I confirmed this through loca
stefanvodita commented on code in PR #12844:
URL: https://github.com/apache/lucene/pull/12844#discussion_r1409603364
##
lucene/core/src/java/org/apache/lucene/util/ArrayUtil.java:
##
@@ -330,6 +330,29 @@ public static int[] growExact(int[] array, int newLength) {
return cop
stefanvodita commented on PR #12844:
URL: https://github.com/apache/lucene/pull/12844#issuecomment-1832333168
Thank you for running the benchmarks @benwtrent, you were quicker than I
was. I was worried the modified `ramBytesUsed` would be slow. Maybe we can come
up with a smarter way to do
jpountz commented on PR #12846:
URL: https://github.com/apache/lucene/pull/12846#issuecomment-1832112426
> Sorry, I am still confused about this. If we clear an unEmpty treeset, how
to deal with its competitive entries?
This looks similar to the arraycopy that you perform on the `maxF
jpountz commented on PR #12699:
URL: https://github.com/apache/lucene/pull/12699#issuecomment-1832082403
+1 to not reverting #12631. The fact that our terms index used to not share
prefixes properly was a bug, I'm glad it's fixed even though it may make
performance slightly slower because m
dungba88 commented on PR #12624:
URL: https://github.com/apache/lucene/pull/12624#issuecomment-1832050665
I re-ran the Test2BFST with the new change, it looks much better
```
1> TEST: now verify [fst size=4621076364; nodeCount=2252341486;
arcCount=2264078585]
1> 0...: took 0
benwtrent commented on PR #12844:
URL: https://github.com/apache/lucene/pull/12844#issuecomment-1831889329
Yeah, `ramBytesUsed` changes here are adding significant overhead to
indexing. Here is a cpu profile.
[pr12844-768-10-wall.jfr.zip](https://github.com/apache/lucene/files/13
benwtrent commented on PR #12844:
URL: https://github.com/apache/lucene/pull/12844#issuecomment-1831865170
I ran knnPerfTest in Lucene util against this PR. 100k cohere vectors.
Flushing was depending on memory usage (Do we check the ram usage of
onHeapgraph during indexing to determine whe
ChrisHegarty commented on PR #12850:
URL: https://github.com/apache/lucene/pull/12850#issuecomment-1831806426
My understanding of this part of the test is that it asserts the sort order
of the multi-valued string values in the `value` field. The changes here do not
affect this assertion, th
dungba88 commented on code in PR #12844:
URL: https://github.com/apache/lucene/pull/12844#discussion_r1409183001
##
lucene/core/src/java/org/apache/lucene/util/ArrayUtil.java:
##
@@ -330,6 +330,29 @@ public static int[] growExact(int[] array, int newLength) {
return copy;
dungba88 commented on code in PR #12844:
URL: https://github.com/apache/lucene/pull/12844#discussion_r1409183001
##
lucene/core/src/java/org/apache/lucene/util/ArrayUtil.java:
##
@@ -330,6 +330,29 @@ public static int[] growExact(int[] array, int newLength) {
return copy;
dungba88 commented on code in PR #12844:
URL: https://github.com/apache/lucene/pull/12844#discussion_r1409177467
##
lucene/core/src/java/org/apache/lucene/util/ArrayUtil.java:
##
@@ -330,6 +330,29 @@ public static int[] growExact(int[] array, int newLength) {
return copy;
dungba88 opened a new issue, #12852:
URL: https://github.com/apache/lucene/issues/12852
### Description
See this comment:
https://github.com/apache/lucene/pull/12624#issuecomment-1829633268
> I wonder: are there other places in Lucene that might fall prey to this
performance t
dungba88 commented on code in PR #12847:
URL: https://github.com/apache/lucene/pull/12847#discussion_r1409158917
##
lucene/CHANGES.txt:
##
@@ -176,7 +178,8 @@ API Changes
* GITHUB#12799: Make TaskExecutor constructor public and use TaskExecutor for
concurrent
HNSW graph bui
mikemccand commented on PR #12850:
URL: https://github.com/apache/lucene/pull/12850#issuecomment-1831732238
> The test assertions depend on doc order which may be affected by merging.
Hmm, I don't fully understand the test, but did the sort order truly rely on
`docid` tie-breaking? I
mikemccand merged PR #12847:
URL: https://github.com/apache/lucene/pull/12847
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.
mikemccand commented on code in PR #12847:
URL: https://github.com/apache/lucene/pull/12847#discussion_r1409134920
##
lucene/CHANGES.txt:
##
@@ -176,7 +178,8 @@ API Changes
* GITHUB#12799: Make TaskExecutor constructor public and use TaskExecutor for
concurrent
HNSW graph b
dungba88 commented on code in PR #12847:
URL: https://github.com/apache/lucene/pull/12847#discussion_r1409117086
##
lucene/CHANGES.txt:
##
@@ -176,7 +178,8 @@ API Changes
* GITHUB#12799: Make TaskExecutor constructor public and use TaskExecutor for
concurrent
HNSW graph bui
gf2121 commented on PR #12699:
URL: https://github.com/apache/lucene/pull/12699#issuecomment-1831673998
Thanks @mikemccand for feedback! I'll backport this to 9.9.0 then.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use
mikemccand commented on code in PR #12847:
URL: https://github.com/apache/lucene/pull/12847#discussion_r1408286029
##
lucene/CHANGES.txt:
##
@@ -176,7 +178,8 @@ API Changes
* GITHUB#12799: Make TaskExecutor constructor public and use TaskExecutor for
concurrent
HNSW graph b
ChrisHegarty opened a new pull request, #12851:
URL: https://github.com/apache/lucene/pull/12851
Backport of:
* Fix intermittently failing TestSortedSetFieldSource #12850
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub
mikemccand commented on PR #12847:
URL: https://github.com/apache/lucene/pull/12847#issuecomment-1831667475
> The test failed with just `Error: The operation was canceled.` but I can't
tell why it happened. The same PR in my local branch works:
[dungba88#20](https://github.com/dungba88/luce
mikemccand commented on PR #12699:
URL: https://github.com/apache/lucene/pull/12699#issuecomment-1831665131
I don't think we should revert #12631 -- that one was a nice disk space
improvement, and the `PKLookup` performance is quite noisy -- going up or down
by just as much when we upgrade
ChrisHegarty commented on PR #12850:
URL: https://github.com/apache/lucene/pull/12850#issuecomment-1831661212
The test previously has been seen to fail with:
```
org.junit.ComparisonFailure: expected: but was:
at __randomizedtesting.SeedInfo.seed([45D4C56DF2092811:7D67E193D5FAF
mikemccand closed issue #12180: Add Facets#getSpecificValues (bulk) and bulk
path -> ordinal lookup for taxonomy faceting
URL: https://github.com/apache/lucene/issues/12180
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use
mikemccand commented on issue #12180:
URL: https://github.com/apache/lucene/issues/12180#issuecomment-1831660442
Thanks @ChrisHegarty -- this one is done.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above t
ChrisHegarty opened a new pull request, #12850:
URL: https://github.com/apache/lucene/pull/12850
This commit fixes the intermittently failing TestSortedSetFieldSource.
The test assertions depend on doc order which may be affected by merging.
The fix is to trivially avoid merging for t
mikemccand commented on issue #11138:
URL: https://github.com/apache/lucene/issues/11138#issuecomment-1831658096
Ahh thanks @ChrisHegarty -- no more work for this issue.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use
mikemccand closed issue #11138: configuration items of the alg file are adapted
to the 9.0 branch [LUCENE-10100]
URL: https://github.com/apache/lucene/issues/11138
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL a
Shradha26 commented on issue #12558:
URL: https://github.com/apache/lucene/issues/12558#issuecomment-1831628034
Hey @gsmiller, thanks for trying this out again - were you able to reproduce
this in 9_8 again? I'd found the bug in 9_7 and I don't think it was an issue
in 9_8.
--
This is an
ChrisHegarty commented on issue #11138:
URL: https://github.com/apache/lucene/issues/11138#issuecomment-1831542918
Hi,
I'd like to ask a clarifying question as part of the _9.9.0_ release manager
duties, since this currently-unresolved issue is associated with Milestone
9.9.0.
ChrisHegarty commented on issue #12180:
URL: https://github.com/apache/lucene/issues/12180#issuecomment-1831540117
Hi,
I'd like to ask a clarifying question as part of the _9.9.0_ release manager
duties, since this currently-unresolved issue is associated with _Milestone
9.9.0_.
alessandrobenedetti commented on issue #12313:
URL: https://github.com/apache/lucene/issues/12313#issuecomment-1831534483
Hi @david-sitsky, the multi-valued vectors in Lucene's contribution is now
paused for lack of fundings.
I'll resume it from my side if and when I get some sponsors :)
gf2121 commented on PR #12699:
URL: https://github.com/apache/lucene/pull/12699#issuecomment-1831412787
Hi @mikemccand @jpountz !
The [newest nightly
benchmark](https://home.apache.org/~mikemccand/lucenebench/2023.11.28.18.03.59.html)
shows we get back some speed for
[PKLookUp](http
50 matches
Mail list logo