Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-05 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2006885578 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieReader.java: ## @@ -0,0 +1,228 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-05 Thread via GitHub
mikemccand commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2022767256 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,632 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-05 Thread via GitHub
mikemccand commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2022727361 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-05 Thread via GitHub
mikemccand commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005470873 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-05 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2006940286 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-05 Thread via GitHub
gf2121 commented on PR #14333: URL: https://github.com/apache/lucene/pull/14333#issuecomment-2771814782 I roughly implemented the idea. This is my first time forking a new codec, hopefully I have not made too many mistakes :) A few thoughts during my refactoring: * I thought I on

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-05 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2006876856 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-03 Thread via GitHub
jpountz commented on PR #14333: URL: https://github.com/apache/lucene/pull/14333#issuecomment-2776835446 > I thought I only need to fork a Lucene103BlockTreeTerms to intersect with Lucene101Postings. But it seems challenging based on current API design. I have to fork the new Lucene103Posti

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-03 Thread via GitHub
jpountz commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2027695983 ## lucene/core/src/java/org/apache/lucene/codecs/lucene103/Lucene103PostingsFormat.java: ## @@ -0,0 +1,510 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-01 Thread via GitHub
gf2121 commented on PR #14333: URL: https://github.com/apache/lucene/pull/14333#issuecomment-2770201445 > We should add this format to RandomCodec then, so that it gets included as part of codec randomization. OK, did not see this. I know how to do it then. Thanks Adrien :)

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-01 Thread via GitHub
gf2121 commented on PR #14333: URL: https://github.com/apache/lucene/pull/14333#issuecomment-2770173066 Thank you very much for all these careful, warm and helpful comments! > Are there any major items / blockers? I think I've addressed all of them (hopefully didn't miss any).

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-01 Thread via GitHub
jpountz commented on PR #14333: URL: https://github.com/apache/lucene/pull/14333#issuecomment-2770123261 > Once we think this is ready, we should prolly merge at first as the non-default Codec We should add this format to `RandomCodec` then, so that it gets included as part of codec

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-01 Thread via GitHub
mikemccand commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2022734745 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-30 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2006853334 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-27 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2006889147 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-22 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2006876856 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-21 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2007802395 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2006940286 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2006904572 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/IntersectTermsEnumFrame.java: ## @@ -89,8 +89,6 @@ final class IntersectTermsEnumFrame { final

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2006901371 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2006876856 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005852050 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005865922 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005519750 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005498999 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
mikemccand commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005473348 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
mikemccand commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005466812 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
mikemccand commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005463552 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
mikemccand commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005462386 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
mikemccand commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005458945 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-16 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r1997503104 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-15 Thread via GitHub
gf2121 commented on PR #14333: URL: https://github.com/apache/lucene/pull/14333#issuecomment-2724132677 Thanks for looking :) > I started looking at the code but you would know better: does this new encoding make it easier to know the length of leaf blocks while traversing the terms

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-14 Thread via GitHub
jpountz commented on PR #14333: URL: https://github.com/apache/lucene/pull/14333#issuecomment-2724046501 I started looking at the code but you would know better: does this new encoding make it easier to know the length of leaf blocks while traversing the terms index so that we could prefetc

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-14 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r1994867386 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-13 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r1994852913 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-13 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r1993768037 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-13 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r1993812719 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-13 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r1993802895 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-13 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r1993786103 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-13 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r1993776438 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-13 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r1993758687 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-13 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r1993755452 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-13 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r1993751047 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-13 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r1993741308 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-13 Thread via GitHub
mikemccand commented on PR #14333: URL: https://github.com/apache/lucene/pull/14333#issuecomment-2720657669 This is an exciting change @gf2121! Smaller, simpler, and faster!? I'll try to review soon. -- This is an automated message from the Apache Git Service. To respond to the message,

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-10 Thread via GitHub
jpountz commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r1987259134 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java: ## @@ -857,123 +768,126 @@ public SeekStatus seekCeil(BytesRef target) throw

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-09 Thread via GitHub
gf2121 commented on PR #14333: URL: https://github.com/apache/lucene/pull/14333#issuecomment-2709371309 Thanks for the feedback! > Since you wrote this, I expected tip files to become bigger, but your data suggests the opposite, tip files are getting smaller? Am I reading it correctly?

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-09 Thread via GitHub
jpountz commented on PR #14333: URL: https://github.com/apache/lucene/pull/14333#issuecomment-2709064558 The speedup on PKLookup is exciting! > IMO for tip, performance is more important than storage size, which is usually a very small part of the whole index, and loaded off-heap.

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-07 Thread via GitHub
gf2121 commented on PR #14333: URL: https://github.com/apache/lucene/pull/14333#issuecomment-2706652601 All core tests passed, so I have marked the PR ready for review. I'll fork out a new codec and clean up the code if this idea gets traction (the current code diff is clearer for a review).