jpountz commented on a change in pull request #418:
URL: https://github.com/apache/lucene/pull/418#discussion_r759341441
##########
File path:
lucene/sandbox/src/java/org/apache/lucene/sandbox/search/CombinedFieldQuery.java
##########
@@ -441,6 +495,292 @@ public boolean isCacheable(LeafReaderContext ctx) {
}
}
+ /** Merge impacts for combined field. */
+ static ImpactsSource mergeImpacts(
+ Map<String, List<ImpactsEnum>> fieldsWithImpactsEnums,
+ Map<String, List<Impacts>> fieldsWithImpacts,
+ Map<String, List<Integer>> fieldTermDocFreq,
+ Map<String, Float> fieldWeights) {
+ return new ImpactsSource() {
+ Impacts leadingImpacts = null;
+
+ class SubIterator {
+ final Iterator<Impact> iterator;
+ int previousFreq;
+ Impact current;
+
+ SubIterator(Iterator<Impact> iterator) {
+ this.iterator = iterator;
+ this.current = iterator.next();
+ }
+
+ void next() {
+ previousFreq = current.freq;
+ if (iterator.hasNext() == false) {
+ current = null;
+ } else {
+ current = iterator.next();
+ }
+ }
+ }
+
+ @Override
+ public Impacts getImpacts() throws IOException {
+ // Use the impacts that have the lower next boundary (doc id in skip entry) as a lead for
+ // each field
+ // They collectively will decide on the number of levels and the block boundaries.
+
+ if (leadingImpacts == null) {
+ float maxWeight = Float.MIN_VALUE;
+ String maxWeightField = "";
+
+ for (Map.Entry<String, Float> fieldWeightEntry : fieldWeights.entrySet()) {
+ String field = fieldWeightEntry.getKey();
+ float weight = fieldWeightEntry.getValue();
+
+ if (maxWeight < weight) {
+ maxWeight = weight;
+ maxWeightField = field;
+ }
+ }
Review comment:
Since field weights do not change over time, could we compute the field
that has the highest weight up front instead of doing it every time `getImpacts`
is called?
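
A minimal sketch of what that could look like, assuming the selection is moved into a small holder that is built once (for example when `mergeImpacts` is called) and then read from `getImpacts`. The class name `MaxWeightField` and its members are hypothetical and not part of the PR; only the `Map<String, Float>` shape of `fieldWeights` is taken from the diff.

```java
import java.util.Map;

/** Hypothetical helper, not part of the PR: selects the highest-weight field once. */
final class MaxWeightField {
  final String field;
  final float weight;

  MaxWeightField(Map<String, Float> fieldWeights) {
    String bestField = null;
    // Start below every possible weight so the first entry always wins.
    float bestWeight = Float.NEGATIVE_INFINITY;
    for (Map.Entry<String, Float> entry : fieldWeights.entrySet()) {
      if (entry.getValue() > bestWeight) {
        bestWeight = entry.getValue();
        bestField = entry.getKey();
      }
    }
    this.field = bestField; // stays null if the map is empty
    this.weight = bestWeight;
  }
}
```

With something like this, the anonymous `ImpactsSource` could capture a single `MaxWeightField` instance and reuse `field` on every call instead of looping over `fieldWeights` again.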
##########
File path:
lucene/sandbox/src/java/org/apache/lucene/sandbox/search/CombinedFieldQuery.java
##########
@@ -402,14 +423,30 @@ public Explanation explain(LeafReaderContext context, int doc) throws IOExceptio
public Scorer scorer(LeafReaderContext context) throws IOException {
List<PostingsEnum> iterators = new ArrayList<>();
List<FieldAndWeight> fields = new ArrayList<>();
+ Map<String, List<ImpactsEnum>> fieldImpactsEnum = new HashMap<>(fieldAndWeights.size());
+ Map<String, List<Integer>> fieldTermDocFreq = new HashMap<>(fieldAndWeights.size());
Review comment:
Do we actually need this list of doc freqs? Wouldn't they always be equal to
`impactsEnum#cost`?
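
If that holds, the parallel `fieldTermDocFreq` map could be dropped and the values read from the enums on demand. The sketch below is only an illustration under that assumption: `Costed` is a hypothetical stand-in for the single `cost()` method used here, so that the example runs without Lucene on the classpath, and whether `cost()` really matches the term's doc freq for these enums would need to be verified.

```java
import java.util.ArrayList;
import java.util.List;

final class DocFreqFromCost {
  /** Hypothetical stand-in for the cost() method of an impacts enum. */
  interface Costed {
    long cost();
  }

  /** Derives per-term doc freqs from the enums instead of storing them separately. */
  static List<Long> docFreqs(List<? extends Costed> enums) {
    List<Long> freqs = new ArrayList<>(enums.size());
    for (Costed e : enums) {
      // Assumption from the review comment: cost() equals the term's doc freq.
      freqs.add(e.cost());
    }
    return freqs;
  }
}
```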