jtibshirani commented on a change in pull request #1948:
URL: https://github.com/apache/lucene-solr/pull/1948#discussion_r504899193



##########
File path: lucene/core/src/java/org/apache/lucene/index/OrdinalMap.java
##########
@@ -271,13 +273,26 @@ protected boolean lessThan(TermsEnumIndex a, TermsEnumIndex b) {
       globalOrd++;
     }
 
-    this.firstSegments = firstSegments.build();
-    this.globalOrdDeltas = globalOrdDeltas.build();
+    long ramBytesUsed = BASE_RAM_BYTES_USED + segmentMap.ramBytesUsed();
+    this.valueCount = globalOrd;
+
+    // If the first segment contains all of the global ords, then we can apply a small optimization
+    // and hardcode the first segments and global ord deltas as all zeroes.
+    if (ordDeltaBits.length > 0 && ordDeltaBits[0] == 0L && ordDeltas[0].size() == this.valueCount) {

Review comment:
   > Do we (somewhere, couldn't find it here) pre-sort all segments by the cardinality descending?
   
   We do in fact: the segments are sorted by descending 'weight', which at every
call site is the number of unique terms in the segment. This was added in
[LUCENE-5782](https://issues.apache.org/jira/browse/LUCENE-5782).
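
To make the ordering concrete, here is a minimal sketch in plain Java. It is not the actual `OrdinalMap` code: `SegmentWeightSortSketch` and `sortByWeightDescending` are hypothetical names, and the weights are made-up unique-term counts. It only shows the idea of building a new-to-old segment mapping so that the heaviest segment comes first.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration only; not part of Lucene.
final class SegmentWeightSortSketch {

  /**
   * Returns a "new to old" mapping: element i is the original index of the
   * segment that ends up at position i after sorting by weight, descending.
   */
  static int[] sortByWeightDescending(long[] weights) {
    Integer[] order = new Integer[weights.length];
    for (int i = 0; i < order.length; i++) {
      order[i] = i;
    }
    // Arrays.sort on objects is stable, so equal weights keep their original order.
    Arrays.sort(order, Comparator.comparingLong((Integer i) -> weights[i]).reversed());

    int[] newToOld = new int[order.length];
    for (int i = 0; i < order.length; i++) {
      newToOld[i] = order[i];
    }
    return newToOld;
  }

  public static void main(String[] args) {
    // Assumed weights: segment 1 has the most unique terms.
    long[] weights = {3, 10, 7};
    System.out.println(Arrays.toString(sortByWeightDescending(weights))); // prints [1, 2, 0]
  }
}
```

With this ordering, the segment most likely to contain every global ord lands at index 0, which is why the check in this PR only needs to look at the first entry.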
   
   > Does our PackedLongValues.monotonicBuilder already optimize for the case 
where it is all 0s, for the case where another segment (not the first) has all 
the global values as well?
   
   It does look like it: when constructing the individual `PackedInts.Reader`
instances, we identify the all-zeros case and use the lightweight
`PackedInts.NullReader`. It's great that we optimize that case, but it does mean
this PR doesn't save much additional space.
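
As a rough illustration of that all-zeros shortcut, here is a self-contained sketch in plain Java. It does not use or reproduce Lucene's `PackedInts` classes: `Reader`, `ZeroReader`, `ArrayReader`, and `build` are hypothetical stand-ins, and the memory figures are crude estimates rather than what Lucene reports.

```java
// Hypothetical illustration only; these types are not Lucene classes.
final class ZeroDeltaSketch {

  /** Minimal stand-in for a read-only sequence of longs. */
  interface Reader {
    long get(int index);

    long ramBytesUsed();
  }

  /** Represents a run of zeros without storing any of them. */
  static final class ZeroReader implements Reader {
    private final int size;

    ZeroReader(int size) {
      this.size = size;
    }

    @Override
    public long get(int index) {
      if (index < 0 || index >= size) {
        throw new IndexOutOfBoundsException("index " + index);
      }
      return 0L;
    }

    @Override
    public long ramBytesUsed() {
      return 0L; // crude estimate: no per-value storage
    }
  }

  /** Stores every value explicitly. */
  static final class ArrayReader implements Reader {
    private final long[] values;

    ArrayReader(long[] values) {
      this.values = values;
    }

    @Override
    public long get(int index) {
      return values[index];
    }

    @Override
    public long ramBytesUsed() {
      return (long) Long.BYTES * values.length; // crude estimate: payload only
    }
  }

  /** Picks the lightweight reader when every value is zero. */
  static Reader build(long[] values) {
    for (long v : values) {
      if (v != 0L) {
        return new ArrayReader(values.clone());
      }
    }
    return new ZeroReader(values.length);
  }

  public static void main(String[] args) {
    System.out.println(build(new long[] {0, 0, 0}).getClass().getSimpleName()); // prints ZeroReader
    System.out.println(build(new long[] {0, 4, 0}).getClass().getSimpleName()); // prints ArrayReader
  }
}
```

Because a shortcut of this kind already applies to any segment whose deltas are all zero, hardcoding the first-segment case mostly skips building the structures rather than shrinking them much.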






