jpountz commented on a change in pull request #418: URL: https://github.com/apache/lucene/pull/418#discussion_r759341441
##########
File path: lucene/sandbox/src/java/org/apache/lucene/sandbox/search/CombinedFieldQuery.java
##########
@@ -441,6 +495,292 @@ public boolean isCacheable(LeafReaderContext ctx) {
     }
   }
+  /** Merge impacts for combined field. */
+  static ImpactsSource mergeImpacts(
+      Map<String, List<ImpactsEnum>> fieldsWithImpactsEnums,
+      Map<String, List<Impacts>> fieldsWithImpacts,
+      Map<String, List<Integer>> fieldTermDocFreq,
+      Map<String, Float> fieldWeights) {
+    return new ImpactsSource() {
+      Impacts leadingImpacts = null;
+
+      class SubIterator {
+        final Iterator<Impact> iterator;
+        int previousFreq;
+        Impact current;
+
+        SubIterator(Iterator<Impact> iterator) {
+          this.iterator = iterator;
+          this.current = iterator.next();
+        }
+
+        void next() {
+          previousFreq = current.freq;
+          if (iterator.hasNext() == false) {
+            current = null;
+          } else {
+            current = iterator.next();
+          }
+        }
+      }
+
+      @Override
+      public Impacts getImpacts() throws IOException {
+        // Use the impacts that have the lower next boundary (doc id in skip entry) as a lead for
+        // each field
+        // They collectively will decide on the number of levels and the block boundaries.
+
+        if (leadingImpacts == null) {
+          float maxWeight = Float.MIN_VALUE;
+          String maxWeightField = "";
+
+          for (Map.Entry<String, Float> fieldWeightEntry : fieldWeights.entrySet()) {
+            String field = fieldWeightEntry.getKey();
+            float weight = fieldWeightEntry.getValue();
+
+            if (maxWeight < weight) {
+              maxWeight = weight;
+              maxWeightField = field;
+            }
+          }

Review comment:
Since field weights do not change over time, could we compute the field that has the highest weight up-front instead of doing it every time `getImpacts` is called?
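A minimal sketch of what computing the field up-front could look like, assuming the weights map is fixed once the source is created. It is not the PR's code: `LeadFieldSketch`, `findMaxWeightField`, and `leadField` are hypothetical names, and the class only stands in for the anonymous `ImpactsSource` so that the example runs without Lucene on the classpath.

```java
import java.util.Map;

/** Hypothetical stand-in for the anonymous ImpactsSource; not Lucene code. */
class LeadFieldSketch {
  // Computed once at construction instead of on every getImpacts() call.
  private final String maxWeightField;

  LeadFieldSketch(Map<String, Float> fieldWeights) {
    this.maxWeightField = findMaxWeightField(fieldWeights);
  }

  /** Returns the field with the highest weight, or null for an empty map. */
  static String findMaxWeightField(Map<String, Float> fieldWeights) {
    String best = null;
    float bestWeight = Float.NEGATIVE_INFINITY;
    for (Map.Entry<String, Float> entry : fieldWeights.entrySet()) {
      // Strict comparison: on ties, the first field in iteration order wins.
      if (best == null || entry.getValue() > bestWeight) {
        best = entry.getKey();
        bestWeight = entry.getValue();
      }
    }
    return best;
  }

  /** Analogue of the lookup inside getImpacts(): just reads the cached value. */
  String leadField() {
    return maxWeightField;
  }
}
```

The sketch seeds the running maximum with `Float.NEGATIVE_INFINITY` rather than `Float.MIN_VALUE`, since the latter is the smallest positive float and not the most negative one; whether that distinction matters here depends on which weights the query accepts.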

##########
File path: lucene/sandbox/src/java/org/apache/lucene/sandbox/search/CombinedFieldQuery.java
##########
@@ -402,14 +423,30 @@ public Explanation explain(LeafReaderContext context, int doc) throws IOExceptio
     public Scorer scorer(LeafReaderContext context) throws IOException {
       List<PostingsEnum> iterators = new ArrayList<>();
       List<FieldAndWeight> fields = new ArrayList<>();
+      Map<String, List<ImpactsEnum>> fieldImpactsEnum = new HashMap<>(fieldAndWeights.size());
+      Map<String, List<Integer>> fieldTermDocFreq = new HashMap<>(fieldAndWeights.size());

Review comment:
Do we actually need this list of doc freqs? Wouldn't they always be equal to `impactsEnum#cost`?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org