zacharymorn commented on a change in pull request #240:
URL: https://github.com/apache/lucene/pull/240#discussion_r694404654
##########
File path: lucene/facet/src/java/org/apache/lucene/facet/DrillSideways.java
##########
@@ -520,10 +458,39 @@ private DrillDownQuery getDrillDownQuery(
}
@SuppressWarnings("unchecked")
- private <R> ConcurrentDrillSidewaysResult<R> searchSequentially(
- final DrillDownQuery query, final CollectorManager<?, R>
hitCollectorManager)
+ private <C extends Collector, R> ConcurrentDrillSidewaysResult<R>
searchSequentially(
+ final DrillDownQuery query, final CollectorManager<C, R>
hitCollectorManager)
throws IOException {
+ // This mirrors a similar hack from DrillSideways#search(query, collector).
+ // Without this cache, LRU cache will be used, causing acceptDocs to be
null during collection
Review comment:
Thanks @gsmiller for the additional context & a new dedicated ticket.
Just to provide some additional context below to wrap up the discussion in this
PR before we cut over.
This issue was discovered during my nightly test run with the following seed
`./gradlew test --tests TestDrillSideways.testRandom
-Dtests.seed=69A3EF02D8E3465E -Dtests.nightly=true` on commit
https://github.com/apache/lucene/pull/240/commits/4af77405fcb914d497d82eb60f0011f93079ec7b
. The failing stack trace looks something like this:
```
org.apache.lucene.facet.TestDrillSideways > test suite's output saved to
/Users/xichen/IdeaProjects/lucene/lucene/facet/build/test-results/test/outputs/OUTPUT-org.apache.lucene.facet.TestDrillSideways.txt,
copied below:
> java.lang.AssertionError: expected:<56> but was:<59>
> at
__randomizedtesting.SeedInfo.seed([69A3EF02D8E3465E:1BEFCA0D6983F02D]:0)
> at org.junit.Assert.fail(Assert.java:89)
> at org.junit.Assert.failNotEquals(Assert.java:835)
> at org.junit.Assert.assertEquals(Assert.java:647)
> at org.junit.Assert.assertEquals(Assert.java:633)
> at
org.apache.lucene.facet.TestDrillSideways.verifyEquals(TestDrillSideways.java:1607)
> at
org.apache.lucene.facet.TestDrillSideways.testRandom(TestDrillSideways.java:1185)
```
and the following difference with verbose output
```
1> dim0 topN=2 (vs 2 unique values)
1> actual
1> 0: [e1 9d 80 e1 9d 97 e1 9d 82 e1 9d 92]: 74
1> 1: [ea a5 aa ea a5 a2 ea a5 a0]: 59
1> expected (unsorted)
1> 0: [ea a5 aa ea a5 a2 ea a5 a0]: 56
1> 1: [e1 9d 80 e1 9d 97 e1 9d 82 e1 9d 92]: 73
> java.lang.AssertionError: expected:<56> but was:<59>
```
The failing test complained there are more docs collected than expected.
As I have limited context on drillsideway / drilldown logic, I ended up
spending quite a few hours on debugging this, and found the issue ultimately
came from this caching issue. Specifically, here are the main key difference in
the call chain that contributed to this difference:
#### Expected / existing entry path from `DrillSideways#search(Query,
Collector)` with the hack
```
DrillSideways#search(Query, Collector) -> IndexSearcher#createWeight -> NOT
cached weight -> ... -> DrillSidewaysScorer#doUnionScoring with INSTANTIATED
acceptDocs
```
With this path, some docs will not be collected as `acceptDocs.get(docID) ==
false`. The hit collector also uses the same `acceptDocs` so the number of
matches between hit collector and drillsideway / drilldown collector are also
the same.
#### With this change, using the entry path from
`DrillSideways#searchSequentially(Query, CollectorManager)` and without the
hack:
```
DrillSideways#search(Query, Collector) -> IndexSearcher#createWeight ->
cached weight -> ... -> DrillSidewaysScorer#doUnionScoring with NULL acceptDocs
(NULL was passed in via the LRUQueryCache code path above)
```
With this path, some deleted docs will be collected as `acceptDocs == null`.
As the hit collector uses an instantiated `acceptDocs`, here drillsideway /
drilldown collector will also collect more docs compared with hit collector.
Hope this clarifies.
----
I think for the purpose of this PR, we may still need to leave this hack in
since otherwise this may break nightly tests once merged. As it sounds like you
discovered another issue related to this, and that needs to be addressed as
part of https://issues.apache.org/jira/browse/LUCENE-10060 as well, please
feel free to take on that one independently from this, and I can resolve any
merge conflict in `DrillSideways` had you merged that change earlier than this
one.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]