zacharymorn commented on a change in pull request #240:
URL: https://github.com/apache/lucene/pull/240#discussion_r694404654



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/DrillSideways.java
##########
@@ -520,10 +458,39 @@ private DrillDownQuery getDrillDownQuery(
   }
 
   @SuppressWarnings("unchecked")
-  private <R> ConcurrentDrillSidewaysResult<R> searchSequentially(
-      final DrillDownQuery query, final CollectorManager<?, R> 
hitCollectorManager)
+  private <C extends Collector, R> ConcurrentDrillSidewaysResult<R> 
searchSequentially(
+      final DrillDownQuery query, final CollectorManager<C, R> 
hitCollectorManager)
       throws IOException {
 
+    // This mirrors a similar hack from DrillSideways#search(query, collector).
+    // Without this cache, LRU cache will be used, causing acceptDocs to be 
null during collection

Review comment:
       Thanks @gsmiller for the additional context & a new dedicated ticket. 
Just to provide some additional context below to wrap up the discussion in this 
PR before we cut over.
   
   This issue was discovered during my nightly test run with the following seed 
`./gradlew test --tests TestDrillSideways.testRandom 
-Dtests.seed=69A3EF02D8E3465E -Dtests.nightly=true` on commit 
https://github.com/apache/lucene/pull/240/commits/4af77405fcb914d497d82eb60f0011f93079ec7b
 . The failing stack trace looks something like this:
   
   ```
   org.apache.lucene.facet.TestDrillSideways > test suite's output saved to 
/Users/xichen/IdeaProjects/lucene/lucene/facet/build/test-results/test/outputs/OUTPUT-org.apache.lucene.facet.TestDrillSideways.txt,
 copied below:
      >     java.lang.AssertionError: expected:<56> but was:<59>
      >         at 
__randomizedtesting.SeedInfo.seed([69A3EF02D8E3465E:1BEFCA0D6983F02D]:0)
      >         at org.junit.Assert.fail(Assert.java:89)
      >         at org.junit.Assert.failNotEquals(Assert.java:835)
      >         at org.junit.Assert.assertEquals(Assert.java:647)
      >         at org.junit.Assert.assertEquals(Assert.java:633)
      >         at 
org.apache.lucene.facet.TestDrillSideways.verifyEquals(TestDrillSideways.java:1607)
      >         at 
org.apache.lucene.facet.TestDrillSideways.testRandom(TestDrillSideways.java:1185)
   ```
   
   and the following difference with verbose output
   
   ```
     1>     dim0 topN=2 (vs 2 unique values)
     1>       actual
     1>         0: [e1 9d 80 e1 9d 97 e1 9d 82 e1 9d 92]: 74
     1>         1: [ea a5 aa ea a5 a2 ea a5 a0]: 59
     1>       expected (unsorted)
     1>         0: [ea a5 aa ea a5 a2 ea a5 a0]: 56
     1>         1: [e1 9d 80 e1 9d 97 e1 9d 82 e1 9d 92]: 73
      >     java.lang.AssertionError: expected:<56> but was:<59>
   ```
   
   The failing test complained there are more docs collected than expected.
   
   As I have limited context on drillsideway / drilldown logic, I ended up 
spending quite a few hours on debugging this, and found the issue ultimately 
came from this caching issue. Specifically, here are the main key difference in 
the call chain that contributed to this difference:
   
   #### Expected / existing entry path from `DrillSideways#search(Query, 
Collector)` with the hack
   ```
   DrillSideways#search(Query, Collector) -> IndexSearcher#createWeight -> NOT 
cached weight -> ... -> DrillSidewaysScorer#doUnionScoring with INSTANTIATED 
acceptDocs
   ```
   With this path, some docs will not be collected as `acceptDocs.get(docID) == 
false`. The hit collector also uses the same `acceptDocs` so the number of 
matches between hit collector and drillsideway / drilldown collector are also 
the same.
   
   #### With this change, using the entry path from 
`DrillSideways#searchSequentially(Query, CollectorManager)` and without the 
hack: 
   
   ```
   DrillSideways#search(Query, Collector) -> IndexSearcher#createWeight -> 
cached weight -> ... -> DrillSidewaysScorer#doUnionScoring with NULL acceptDocs 
(NULL was passed in via the LRUQueryCache code path above)
   ```
   
   With this path, some deleted docs will be collected as `acceptDocs == null`. 
 As the hit collector uses an instantiated `acceptDocs`, here drillsideway / 
drilldown collector will also collect more docs compared with hit collector.
   
   Hope this clarifies. 
   
   ----
   
   I think for the purpose of this PR, we may still need to leave this hack in 
since otherwise this may break nightly tests once merged. As it sounds like you 
discovered another issue related to this, and that needs to be addressed as 
part of  https://issues.apache.org/jira/browse/LUCENE-10060 as well, please 
feel free to take on that one independently from this, and I can resolve any 
merge conflict in `DrillSideways` had you merged that change earlier than this 
one.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to