jpountz commented on PR #13885:
URL: https://github.com/apache/lucene/pull/13885#issuecomment-2416031098
It's almost certainly this change that sped up TermTitleSort on October
12th, since TermTitleSort is the task that creates the most `PostingsEnum`
objects. I pushed an annotation.
--
original-brownbear merged PR #13885:
URL: https://github.com/apache/lucene/pull/13885
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...
original-brownbear commented on PR #13885:
URL: https://github.com/apache/lucene/pull/13885#issuecomment-2408517553
Thanks Adrien!
original-brownbear commented on PR #13885:
URL: https://github.com/apache/lucene/pull/13885#issuecomment-2407850593
OK, this was my fault: I tried this approach, but it failed for
`EverythingEnum`. I now realize, though, that `EverythingEnum` effectively
sets up the `PForUtil` every time anyway. Th
jpountz commented on PR #13885:
URL: https://github.com/apache/lucene/pull/13885#issuecomment-2407713452
I would expect putting something like `if (forUtil == null &&
termState.docFreq >= BLOCK_SIZE) { /* initialize forUtil */ }` in `reset()` to
work (untested).
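
For readers following the thread, the guard suggested above might look roughly like the sketch below. It is not the code from the PR: `ForUtilSketch`, `TermStateSketch`, `DocsEnumSketch`, and the `BLOCK_SIZE` value of 128 are hypothetical stand-ins chosen for illustration, and the real postings enums have many more fields and code paths.

```java
// Hypothetical stand-in for the decoding helper; not the real Lucene class.
final class ForUtilSketch {
  static int allocations = 0;          // counts instances created, for illustration only
  final long[] scratch = new long[64]; // stand-in for the helper's scratch buffer

  ForUtilSketch() {
    allocations++;
  }
}

// Hypothetical stand-in for the per-term state; only docFreq matters here.
final class TermStateSketch {
  final int docFreq;

  TermStateSketch(int docFreq) {
    this.docFreq = docFreq;
  }
}

// Hypothetical enum that allocates its helper in reset() rather than in the
// constructor or in the per-document hot path.
final class DocsEnumSketch {
  static final int BLOCK_SIZE = 128; // assumed block size for this sketch

  private ForUtilSketch forUtil;     // stays null until a term needs block decoding

  DocsEnumSketch reset(TermStateSketch termState) {
    // Allocate at most once per enum, and only when the term has at least
    // one full block of postings to decode.
    if (forUtil == null && termState.docFreq >= BLOCK_SIZE) {
      forUtil = new ForUtilSketch();
    }
    return this;
  }

  boolean hasForUtil() {
    return forUtil != null;
  }
}
```

With this shape, an enum that only ever sees terms with fewer than `BLOCK_SIZE` documents never allocates the helper, and the null check runs once per `reset()` call instead of on every decode.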
original-brownbear commented on PR #13885:
URL: https://github.com/apache/lucene/pull/13885#issuecomment-2407269525
Hmm, I tried and failed miserably to find a way to initialize in
`reset()` that I fully understand :) It's quite hard for me to reason about
all the possible paths here, b
jpountz commented on PR #13885:
URL: https://github.com/apache/lucene/pull/13885#issuecomment-2405769428
If you're looking at this sort of allocation, you may also want to
specialize `BlockDocsEnum` into one class that decodes only doc IDs and another
one that decodes docs and freqs. The form
jpountz commented on PR #13885:
URL: https://github.com/apache/lucene/pull/13885#issuecomment-2405765367
Even if it doesn't show up in benchmarks, it's disappointing to have these
conditions in hot code paths. Could we instead initialize these objects in
`reset()` if `docFreq >= BLOCK_SIZE`?
original-brownbear opened a new pull request, #13885:
URL: https://github.com/apache/lucene/pull/13885
Lazily initialize these fields. They consume a lot of memory and cause a lot
of GC because they are allocated frequently (~7% of all allocations in
luceneutil's wikimedia medium run for me). This does no