dweiss commented on a change in pull request #460:
URL: https://github.com/apache/lucene/pull/460#discussion_r754613632



##########
File path: lucene/core/src/java/org/apache/lucene/util/fst/FST.java
##########
@@ -1000,6 +1027,98 @@ private void writePresenceBits(
     assert bytePos - dest == numPresenceBytes;
   }
 
+  private long estimateNodeAddress(

Review comment:
       Thanks. I got the intuition right but some parts of this code were 
written while I was... away (including byte reversals during serialization), 
hence the uncertainty. I keep wondering if there is any other way to get those 
deltas... or make the deltas refer to a different placeholder. I recall tricks 
like this done back in assembly days on Amigas; basically you had a two-level 
data structure - a stream of bytes + known-size "placeholders" for compacting. 
   
   data stream: byte1 byte2 byte3 ... byteN byteN+1 ...
   address placeholder: (data stream@N), (data stream@M), ...
   
   Each placeholder is a fixed-size offset - the compacting routine receives 
the full data stream + placeholders so it has the ability to compute deltas 
(from left to right or from right to left) and shift-compact the data stream 
without knowing anything about the other data bytes. This way you sort of 
postpone the delta-offset computation until you know all of the data and the 
upper bound for its size, then reduce.
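
   A minimal sketch of that scheme, not code from this PR and not Lucene's FST serialization. All names here (`PlaceholderCompactor`, `SLOT`, `compact`, `slots`) are hypothetical. It assumes 4-byte big-endian placeholders holding absolute offsets into the uncompacted stream, that every reference points backward (target at or before the placeholder), and that no target falls inside another placeholder. Under those assumptions a single left-to-right pass is enough, because the compacted position of every target is already known when its placeholder is reached. Each placeholder is replaced by a vInt-style delta (7 bits per byte, low bits first).

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

class PlaceholderCompactor {
  // Fixed placeholder width in bytes (an assumption for this sketch).
  static final int SLOT = 4;

  // Writes a 4-byte big-endian int into b at pos.
  static void putInt(byte[] b, int pos, int v) {
    b[pos] = (byte) (v >>> 24);
    b[pos + 1] = (byte) (v >>> 16);
    b[pos + 2] = (byte) (v >>> 8);
    b[pos + 3] = (byte) v;
  }

  // Reads a 4-byte big-endian int from b at pos.
  static int getInt(byte[] b, int pos) {
    return ((b[pos] & 0xFF) << 24)
        | ((b[pos + 1] & 0xFF) << 16)
        | ((b[pos + 2] & 0xFF) << 8)
        | (b[pos + 3] & 0xFF);
  }

  // Variable-length encoding: 7 payload bits per byte, high bit = "more follows".
  static void writeVInt(ByteArrayOutputStream out, int v) {
    while ((v & ~0x7F) != 0) {
      out.write((v & 0x7F) | 0x80);
      v >>>= 7;
    }
    out.write(v);
  }

  /**
   * Compacts the stream without interpreting any non-placeholder bytes.
   *
   * @param data  uncompacted stream
   * @param slots sorted start offsets of the placeholders inside data
   */
  static byte[] compact(byte[] data, int[] slots) {
    ByteArrayOutputStream out = new ByteArrayOutputStream(data.length);
    // saved[i] = bytes removed by compacting the first i placeholders.
    int[] saved = new int[slots.length + 1];
    int copyFrom = 0;
    for (int i = 0; i < slots.length; i++) {
      int slot = slots[i];
      // Copy the opaque data bytes between the previous placeholder and this one.
      out.write(data, copyFrom, slot - copyFrom);
      int target = getInt(data, slot);
      if (target < 0 || target > slot) {
        throw new IllegalArgumentException("forward reference at " + slot);
      }
      // Count the placeholders that start before the target.
      int k = Arrays.binarySearch(slots, 0, i, target);
      if (k < 0) {
        k = -k - 1;
      }
      int newTarget = target - saved[k]; // target position after compaction
      int newSlot = out.size();          // placeholder position after compaction
      writeVInt(out, newSlot - newTarget);
      saved[i + 1] = saved[i] + SLOT - (out.size() - newSlot);
      copyFrom = slot + SLOT;
    }
    // Copy the tail after the last placeholder.
    out.write(data, copyFrom, data.length - copyFrom);
    return out.toByteArray();
  }
}
```

   If references could also point forward, the delta sizes would depend on positions that are not yet final, and the routine would need to iterate from the upper bound until the sizes stop shrinking; the sketch sidesteps that case on purpose.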
   
   I'm sure you gave it some thought too - it's just what popped into my head 
immediately when I saw your code. The requirement for those two extra methods 
on outputs + the need to measure each node (I'd name the method explicitly - 
computeNodeAddress) is a bit worrying... but then - if there is no visible 
slowdown then perhaps I'm just overreacting...




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


