prudhvigodithi commented on PR #15446: URL: https://github.com/apache/lucene/pull/15446#issuecomment-3568992048
Before adding tests, I verified this behavior using https://github.com/msfroh/lucene-university (I will check in the code here as well). Notice in the following logs:

- A segment is divided into 5 partitions, each part of a different slice.
- The scorer supplier is called by all partitions for the same segment (`ctx identity: 857068247`).
- All 5 threads get the same supplier (`Called on supplier #1557216666291`, a `SegmentDocIdSetSupplier`), which was created by thread 41 from partition [400000, 800000).
- All partitions share the same cache entry (`supplier identity: 1536099041`, same for all 5).
- BKD traversal happens only once: `[BUILD_START]` on thread 39 and `[BUILD_SKIP]` on the 4 other threads, so only 1 thread builds the DocIdSet and the other 4 threads reuse the cached result.

```
> Task :example.points.IntraSegmentPointRangeTest.main()
=== Intra-Segment Point Range Query Test ===
Step 1: Indexing documents...
Indexing 2000000 documents...
Indexed 500000 documents...
Indexed 1000000 documents...
Indexed 1500000 documents...
Force merging to single segment...
Indexing complete!
Step 2: Opening reader and creating searcher...
Index info:
  Total docs: 2000000
  Number of segments: 1
  Segment 0: 2000000 docs
Creating IndexSearcher with 4 threads
=== Slice Information ===
Number of slices: 5
Slice 0:
  Number of partitions: 1
  Total docs in slice: 400000
  Partition 0:
    Segment: 0
    Doc range: [0, 400000)
    Doc count: 400000
Slice 1:
  Number of partitions: 1
  Total docs in slice: 400000
  Partition 0:
    Segment: 0
    Doc range: [400000, 800000)
    Doc count: 400000
Slice 2:
  Number of partitions: 1
  Total docs in slice: 400000
  Partition 0:
    Segment: 0
    Doc range: [800000, 1200000)
    Doc count: 400000
Slice 3:
  Number of partitions: 1
  Total docs in slice: 400000
  Partition 0:
    Segment: 0
    Doc range: [1200000, 1600000)
    Doc count: 400000
Slice 4:
  Number of partitions: 1
  Total docs in slice: 400000
  Partition 0:
    Segment: 0
    Doc range: [1600000, 2000000)
    Doc count: 400000
Step 3: Executing range query...
Query: value:[0 TO 1499999]
Expected matches: 1500000
Searching (multi-threaded)...
=== Multi-threaded Search Results ===
Total hits: 1500000
Time: 29ms
=== Verification ===
Expected: 1500000
Actual: 1500000
Result: ✓ CORRECT
=== Sample Results (Top 10) ===
Nov 23, 2025 5:12:50 PM org.apache.lucene.internal.vectorization.VectorizationProvider lookup
WARNING: Java vector incubator module is not readable. For optimal vector performance, pass '--add-modules jdk.incubator.vector' to enable Vector API.
[SCORER_SUPPLIER] Called for segment 0 partition [0, 400000) on thread 3 ctx identity: 857068247
[SCORER_SUPPLIER] Called for segment 0 partition [400000, 800000) on thread 41 ctx identity: 857068247
[SCORER_SUPPLIER] Called for segment 0 partition [1200000, 1600000) on thread 39 ctx identity: 857068247
[CACHE_LOOKUP] Before computeIfAbsent, cache size: 0
[CACHE_LOOKUP] Before computeIfAbsent, cache size: 0
[SCORER_SUPPLIER] Called for segment 0 partition [800000, 1200000) on thread 40 ctx identity: 857068247
[CACHE_LOOKUP] Before computeIfAbsent, cache size: 0
[SCORER_SUPPLIER] Called for segment 0 partition [1600000, 2000000) on thread 38 ctx identity: 857068247
[CACHE_LOOKUP] Before computeIfAbsent, cache size: 0
[CACHE_LOOKUP] Before computeIfAbsent, cache size: 0
[CACHE_MISS] CREATING new SegmentDocIdSetSupplier for segment 0 on thread 41
[SUPPLIER_CREATED] SegmentDocIdSetSupplier #1557216666291 for segment 0
[CACHE_RESULT] After computeIfAbsent, cache size: 1, supplier identity: 1536099041
[CACHE_RESULT] After computeIfAbsent, cache size: 1, supplier identity: 1536099041
[CACHE_RESULT] After computeIfAbsent, cache size: 1, supplier identity: 1536099041
[CACHE_RESULT] After computeIfAbsent, cache size: 1, supplier identity: 1536099041
[CACHE_RESULT] After computeIfAbsent, cache size: 1, supplier identity: 1536099041
[GET_OR_BUILD] Called on supplier #1557216666291 for segment 0 on thread 39
[BUILD_CHECK] cachedDocIdSet is null, entering synchronized block
[BUILD_START] Building DocIdSet for segment 0 on thread 39
[GET_OR_BUILD] Called on supplier #1557216666291 for segment 0 on thread 38
[BUILD_CHECK] cachedDocIdSet is null, entering synchronized block
[GET_OR_BUILD] Called on supplier #1557216666291 for segment 0 on thread 3
[BUILD_CHECK] cachedDocIdSet is null, entering synchronized block
[GET_OR_BUILD] Called on supplier #1557216666291 for segment 0 on thread 40
[BUILD_CHECK] cachedDocIdSet is null, entering synchronized block
[GET_OR_BUILD] Called on supplier #1557216666291 for segment 0 on thread 41
[BUILD_CHECK] cachedDocIdSet is null, entering synchronized block
Disconnected from the target VM, address: 'localhost:55600', transport: 'socket'
[BUILD_COMPLETE] Built DocIdSet for segment 0 in 901ms
[BUILD_SKIP] Another thread already built DocIdSet
[BUILD_SKIP] Another thread already built DocIdSet
[BUILD_SKIP] Another thread already built DocIdSet
[BUILD_SKIP] Another thread already built DocIdSet
Doc 0: value=0, score=1.0
Doc 1: value=1, score=1.0
Doc 2: value=2, score=1.0
Doc 3: value=3, score=1.0
Doc 4: value=4, score=1.0
Doc 5: value=5, score=1.0
Doc 6: value=6, score=1.0
Doc 7: value=7, score=1.0
Doc 8: value=8, score=1.0
Doc 9: value=9, score=1.0
=== Cleanup ===
Shutting down executor...
Done!
```

#### Visual Flow

```
LeafReaderContext object (ctx identity: 857068247)
     ↑           ↑            ↑            ↑            ↑
     │           │            │            │            │
 Partition   Partition    Partition    Partition    Partition
 [0,400K)   [400K,800K)  [800K,1.2M)  [1.2M,1.6M)  [1.6M,2M)
 Thread 3    Thread 41    Thread 40    Thread 39    Thread 38
```

```
Thread 41: [CACHE_MISS] Creates supplier ─────────────┐
Thread 39: [CACHE_RESULT] Gets supplier ──┐           │
Thread 38: [CACHE_RESULT] Gets supplier ──┤           │
Thread 3:  [CACHE_RESULT] Gets supplier ──┤           │
Thread 40: [CACHE_RESULT] Gets supplier ──┘           │
                                          │           │
                                          ↓           │
                                 All 5 threads have   │
                                    same supplier     │
                                          │           │
                                          ↓           ↓
Thread 39: [BUILD_START] ← Wins build race, BUILDS
Thread 38: [BUILD_SKIP]  ← Waits, then reuses
Thread 3:  [BUILD_SKIP]  ← Waits, then reuses
Thread 40: [BUILD_SKIP]  ← Waits, then reuses
Thread 41: [BUILD_SKIP]  ← Waits, then reuses (even though it created supplier!)
```
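
The two-step flow in the logs (one shared supplier per segment obtained through `computeIfAbsent`, then a null check followed by a synchronized block so that a single thread builds) can be sketched roughly as below. This is an illustrative stand-in and not the code in this PR: `SharedDocIdSetCacheSketch`, `SegmentSupplier`, `getOrBuild`, and `countBuilds` are hypothetical names, a plain `Object` stands in for the segment key, an `int[]` stands in for the `DocIdSet`, and the use of a `ConcurrentHashMap` with a `volatile` field is an assumption inferred from the log tags.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedDocIdSetCacheSketch {

  /** Hypothetical stand-in for a per-segment supplier; not the PR's class. */
  static final class SegmentSupplier {
    private final AtomicInteger buildCount;
    // volatile so a fully built result is visible to threads that skip the lock
    private volatile int[] cachedDocIds; // assumption: stands in for a DocIdSet

    SegmentSupplier(AtomicInteger buildCount) {
      this.buildCount = buildCount;
    }

    int[] getOrBuild(int maxDoc) {
      int[] result = cachedDocIds;
      if (result == null) {            // first check, without the lock
        synchronized (this) {
          result = cachedDocIds;
          if (result == null) {        // second check: only the first thread builds
            buildCount.incrementAndGet();
            result = new int[maxDoc];  // placeholder for the expensive traversal
            for (int i = 0; i < maxDoc; i++) {
              result[i] = i;
            }
            cachedDocIds = result;
          }
          // otherwise another thread built it while this one waited for the lock
        }
      }
      return result;
    }
  }

  /** Runs one task per partition against one segment; returns the number of builds. */
  public static int countBuilds(int partitions) {
    Object segmentKey = new Object(); // assumption: stands in for the segment identity
    ConcurrentHashMap<Object, SegmentSupplier> cache = new ConcurrentHashMap<>();
    AtomicInteger buildCount = new AtomicInteger();
    ExecutorService executor = Executors.newFixedThreadPool(partitions);
    try {
      List<Future<int[]>> futures = new ArrayList<>();
      for (int p = 0; p < partitions; p++) {
        // every partition resolves the same supplier, then asks it for the result
        futures.add(executor.submit(() ->
            cache.computeIfAbsent(segmentKey, k -> new SegmentSupplier(buildCount))
                .getOrBuild(1000)));
      }
      int[] first = futures.get(0).get();
      for (Future<int[]> f : futures) {
        if (f.get() != first) {
          throw new IllegalStateException("partitions received different results");
        }
      }
      return buildCount.get();
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      executor.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println("builds: " + countBuilds(5));
  }
}
```

In this sketch the thread that creates the supplier is not necessarily the one that builds the result, which matches what the logs show for threads 41 and 39.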
