dweiss commented on a change in pull request #460: URL: https://github.com/apache/lucene/pull/460#discussion_r754613632
##########
File path: lucene/core/src/java/org/apache/lucene/util/fst/FST.java
##########
@@ -1000,6 +1027,98 @@ private void writePresenceBits(
     assert bytePos - dest == numPresenceBytes;
   }
 
+  private long estimateNodeAddress(

Review comment:
   Thanks. I got the intuition right but some parts of this code were written while I was... away (including byte reversals during serialization), hence the uncertainty.

   I keep wondering if there is any other way to get those deltas... or make the deltas refer to a different placeholder. I recall tricks like this done back in assembly days on Amigas; basically you had a two-level data structure - a stream of bytes + known-size "placeholders" for compacting.

       data stream:         byte1 byte2 byte3 ... byteN byteN+1 ...
       address placeholder: (data stream@N), (data stream@M), ...

   Each placeholder is a fixed-size offset - the compacting routine receives the full data stream + placeholders so it has the ability to compute deltas (from left to right or from right to left) and shift-compact the data stream without knowing anything about the other data bytes. This way you sort of postpone the delta-offset computation until you know all of the data and the upper bound for its size, then reduce.

   I'm sure you gave it some thought too - it's just what popped in my head immediately when I saw your code. The requirement for those two extra methods on outputs + the need to measure each node (I'd call it explicitly - computeNodeAddress) is a bit worrying... but then - if there is no visible slowdown then perhaps I'm just overreacting...

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
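The placeholder idea from the review comment can be illustrated with a minimal sketch. This is not Lucene's FST code and not the approach taken in the pull request; the class `PlaceholderCompactor`, the 4-byte slot width, and the vInt-style encoding are all assumptions made up for illustration. It also assumes every placeholder points backwards (target offset at or before the placeholder), that targets never fall inside a placeholder, and that deltas stay below 2^28 so they fit in the slot.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

/** Hypothetical sketch of "fixed-size placeholders, compact later"; not Lucene code. */
class PlaceholderCompactor {
  /** Assumed width of every placeholder in the uncompacted stream. */
  static final int SLOT = 4;

  /** Bytes needed to store a non-negative value with 7 payload bits per byte. */
  static int vIntSize(int v) {
    int n = 1;
    while ((v & ~0x7F) != 0) {
      v >>>= 7;
      n++;
    }
    return n;
  }

  static void writeVInt(ByteArrayOutputStream out, int v) {
    while ((v & ~0x7F) != 0) {
      out.write((v & 0x7F) | 0x80); // low 7 bits plus continuation flag
      v >>>= 7;
    }
    out.write(v);
  }

  /** Maps an offset in the uncompacted stream to its offset after compaction. */
  static int remap(int pos, int[] slotPos, int[] size) {
    int saved = 0;
    for (int i = 0; i < slotPos.length && slotPos[i] < pos; i++) {
      saved += SLOT - size[i]; // bytes reclaimed by each earlier placeholder
    }
    return pos - saved;
  }

  /**
   * @param data    uncompacted stream; each placeholder occupies SLOT bytes
   * @param slotPos ascending offsets of the placeholders within data
   * @param target  for each placeholder, the (earlier) offset in data it refers to
   */
  static byte[] compact(byte[] data, int[] slotPos, int[] target) {
    int n = slotPos.length;
    int[] size = new int[n];
    Arrays.fill(size, 1); // start optimistic, grow until every delta fits

    // Sizes only ever grow, and a larger size can only increase other deltas,
    // so this loop reaches a stable layout.
    boolean changed = true;
    while (changed) {
      changed = false;
      for (int i = 0; i < n; i++) {
        int delta = remap(slotPos[i], slotPos, size) - remap(target[i], slotPos, size);
        int need = vIntSize(delta);
        if (need > size[i]) {
          size[i] = need;
          changed = true;
        }
      }
    }

    // Shift-compact: copy the opaque bytes, replace each slot with its encoded delta.
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    int src = 0;
    for (int i = 0; i < n; i++) {
      out.write(data, src, slotPos[i] - src);
      writeVInt(out, remap(slotPos[i], slotPos, size) - remap(target[i], slotPos, size));
      src = slotPos[i] + SLOT;
    }
    out.write(data, src, data.length - src);
    return out.toByteArray();
  }

  public static void main(String[] args) {
    // Layout: 10 11 [slot@2 -> 0] 12 [slot@7 -> 6] 13
    byte[] data = {10, 11, 0, 0, 0, 0, 12, 0, 0, 0, 0, 13};
    byte[] packed = compact(data, new int[] {2, 7}, new int[] {0, 6});
    System.out.println(Arrays.toString(packed));
  }
}
```

The compaction step only needs the placeholder positions and their targets; the remaining bytes are copied without being interpreted, which is the property the comment is after. Whether this would be cheaper than measuring each node up front is not something the sketch can answer.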
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: issues-h...@lucene.apache.org