gf2121 commented on code in PR #14494:
URL: https://github.com/apache/lucene/pull/14494#discussion_r2074987659


##########
lucene/core/src/java/org/apache/lucene/codecs/lucene103/blocktree/TrieReader.java:
##########
@@ -74,14 +77,39 @@ IndexInput floorData(TrieReader r) throws IOException {
   final RandomAccessInput access;
   final IndexInput input;
   final Node root;
+  final int[] labelMap;
 
-  TrieReader(IndexInput input, long rootFP) throws IOException {
+  static IOSupplier<TrieReader> readerSupplier(DataInput metaIn, IndexInput indexIn)
+      throws IOException {
+    int[] labelMap = TrieReader.labelMap(metaIn);
+    long start = metaIn.readVLong();
+    long rootFP = metaIn.readVLong();
+    long end = metaIn.readVLong();
+    return () -> new TrieReader(indexIn.slice("outputs", start, end - start), rootFP, labelMap);
+  }
+
+  private TrieReader(IndexInput input, long rootFP, int[] labelMap) throws IOException {
     this.access = input.randomAccessSlice(0, input.length());
+    this.labelMap = labelMap;
     this.input = input;
     this.root = new Node();
     load(root, rootFP);
   }
 
+  private static int[] labelMap(DataInput in) throws IOException {
+    int cnt = in.readVInt();
+    if (cnt == 0) {
+      return null;
+    } else {
+      int[] labelMap = new int[TrieBuilder.BYTE_RANGE];

Review Comment:
   For now, we need a sentinel value, such as `-1`, to represent 'this label
does not exist in this trie', so the array cannot simply be replaced by a
`byte[]`.
   
   I personally think 256 * 4 bytes = 1 KB of heap per field is OK. But we can
reduce heap usage at the cost of some lookup overhead, e.g. with a bitset
recording whether each label exists plus a `byte[]` holding the mapped values.
I can make the change if you think it is worth it :)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

