Re: [PR] Add support for index sorting with document blocks [lucene]

2024-01-11 Thread via GitHub
s1monw merged PR #12829: URL: https://github.com/apache/lucene/pull/12829 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apac

Re: [PR] Add support for index sorting with document blocks [lucene]

2024-01-09 Thread via GitHub
s1monw commented on PR #12829: URL: https://github.com/apache/lucene/pull/12829#issuecomment-1882919819 @mikemccand I did another pass on it and change the wording. I think I know why you are confused and I tired to adress it. We use the exact same wording in the soft deletes case. I will w

Re: [PR] Add support for index sorting with document blocks [lucene]

2024-01-05 Thread via GitHub
mikemccand commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1442838857 ## lucene/core/src/test/org/apache/lucene/index/TestAddIndexes.java: ## @@ -1937,4 +1937,97 @@ public void setMergeInfo(SegmentCommitInfo info) { targetDir.clo

Re: [PR] Add support for index sorting with document blocks [lucene]

2024-01-04 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1442110320 ## lucene/core/src/java/org/apache/lucene/index/FieldInfos.java: ## @@ -437,6 +488,33 @@ private void verifySoftDeletedFieldName(String fieldName, boolean isSoftDelete

Re: [PR] Add support for index sorting with document blocks [lucene]

2024-01-03 Thread via GitHub
mikemccand commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1440776709 ## lucene/core/src/java/org/apache/lucene/index/FieldInfos.java: ## @@ -437,6 +488,33 @@ private void verifySoftDeletedFieldName(String fieldName, boolean isSoftDe

Re: [PR] Add support for index sorting with document blocks [lucene]

2024-01-03 Thread via GitHub
mikemccand commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1440776102 ## lucene/core/src/java/org/apache/lucene/index/FieldInfos.java: ## @@ -437,6 +488,33 @@ private void verifySoftDeletedFieldName(String fieldName, boolean isSoftDe

Re: [PR] Add support for index sorting with document blocks [lucene]

2024-01-03 Thread via GitHub
mikemccand commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1440668657 ## lucene/core/src/java/org/apache/lucene/index/FieldInfos.java: ## @@ -132,6 +135,13 @@ public FieldInfos(FieldInfo[] infos) { } softDeletesField

Re: [PR] Add support for index sorting with document blocks [lucene]

2024-01-03 Thread via GitHub
mikemccand commented on PR #12829: URL: https://github.com/apache/lucene/pull/12829#issuecomment-1875663308 > [add test and docs to make mike not staring at this forever](https://github.com/apache/lucene/pull/12829/commits/25e6da47f989675d310de5f319ff38088589b4d4) HAHAHA -- This is

Re: [PR] Add support for index sorting with document blocks [lucene]

2024-01-03 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1440343745 ## lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java: ## @@ -231,7 +244,23 @@ long updateDocuments( final int docsInRamBefore = numDocsI

Re: [PR] Add support for index sorting with document blocks [lucene]

2024-01-03 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1440307876 ## lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java: ## @@ -245,10 +274,11 @@ long updateDocuments( onNewDocOnRAM.run();

Re: [PR] Add support for index sorting with document blocks [lucene]

2024-01-02 Thread via GitHub
mikemccand commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1439716624 ## lucene/core/src/java/org/apache/lucene/index/CheckIndex.java: ## @@ -1176,34 +1176,44 @@ public static Status.IndexSortStatus testSort( comparators[i] =

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-27 Thread via GitHub
s1monw commented on PR #12829: URL: https://github.com/apache/lucene/pull/12829#issuecomment-1870447436 @mikemccand @jpountz I think it's ready. I added some more testing to it and removed storing the no. of children in the DV field to make it as low impact as possible. we can still optimiz

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-27 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1437134740 ## lucene/core/src/java/org/apache/lucene/index/FieldInfos.java: ## @@ -188,6 +200,26 @@ public static FieldInfos getMergedFieldInfos(IndexReader reader) { } }

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-24 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1435809808 ## lucene/core/src/java/org/apache/lucene/index/FieldInfos.java: ## @@ -437,6 +486,31 @@ private void verifySoftDeletedFieldName(String fieldName, boolean isSoftDelete

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-23 Thread via GitHub
mikemccand commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1435648884 ## lucene/codecs/src/java/org/apache/lucene/codecs/simpletext/SimpleTextSegmentInfoFormat.java: ## @@ -197,7 +197,13 @@ public SegmentInfo read( sortField[

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-21 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1434195239 ## lucene/core/src/java/org/apache/lucene/index/IndexingChain.java: ## @@ -219,15 +224,41 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState state) throws IOE

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-21 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1434193511 ## lucene/core/src/java/org/apache/lucene/index/CheckIndex.java: ## @@ -1180,33 +1180,43 @@ public static Status.IndexSortStatus testSort( comparators[i] = fie

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-21 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1434189939 ## lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java: ## @@ -262,6 +294,35 @@ long updateDocuments( } } + private Iterable addParent

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-21 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1434180897 ## lucene/core/src/java/org/apache/lucene/codecs/lucene94/Lucene94FieldInfosFormat.java: ## @@ -157,6 +158,8 @@ public FieldInfos read( boolean omitNorms = (

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-21 Thread via GitHub
s1monw commented on PR #12829: URL: https://github.com/apache/lucene/pull/12829#issuecomment-1866428159 > If my understanding is correct, we will require a parent field when using blocks as of 10.0. One concern I have about this is that we currently don't require users to know up-front whet

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-19 Thread via GitHub
s1monw commented on PR #12829: URL: https://github.com/apache/lucene/pull/12829#issuecomment-1862667309 @mikemccand I don't think we have any impact on performance here at all if this feature is not in use. If you look at DWPT there is a final field that decides if there is further examina

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-19 Thread via GitHub
s1monw commented on PR #12829: URL: https://github.com/apache/lucene/pull/12829#issuecomment-1862391122 @mikemccand @jpountz I updated this PR and moved everything to FielInfo / IWC. I think it's ready. the only thing that I'd like to discuss is that we are currently recording the number of

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-14 Thread via GitHub
mikemccand commented on PR #12829: URL: https://github.com/apache/lucene/pull/12829#issuecomment-1855769298 I opened https://github.com/mikemccand/luceneutil/issues/252 to try to measure the performance change of `addDocument` N times vs `addDocuments` once. -- This is an automated messag

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-14 Thread via GitHub
mikemccand commented on PR #12829: URL: https://github.com/apache/lucene/pull/12829#issuecomment-1855755782 One small observation here: one can use the `add/updateDocuments` API today with no intention of using those as doc blocks at search time, purely as an optimization over calling separ

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-14 Thread via GitHub
mikemccand commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1426637555 ## lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestBackwardsCompatibility.java: ## @@ -2164,6 +2166,83 @@ public void testSortedIndex() throws

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-12 Thread via GitHub
s1monw commented on PR #12829: URL: https://github.com/apache/lucene/pull/12829#issuecomment-1852410786 @mikemccand I agree we should not add this to sort but rather tread it the same way we treat the softDeletes field. it's essentially the same thing from an IW perspective. I will go ahead

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-12 Thread via GitHub
msokolov commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1424209861 ## lucene/core/src/java/org/apache/lucene/index/IndexingChain.java: ## @@ -219,15 +222,33 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState state) throws I

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-12 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1423637103 ## lucene/core/src/java/org/apache/lucene/index/IndexingChain.java: ## @@ -219,15 +222,33 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState state) throws IOE

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-12 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1423606084 ## lucene/core/src/java/org/apache/lucene/index/IndexingChain.java: ## @@ -219,15 +222,33 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState state) throws IOE

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-09 Thread via GitHub
msokolov commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1421566628 ## lucene/core/src/java/org/apache/lucene/index/IndexingChain.java: ## @@ -219,15 +222,33 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState state) throws I

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-07 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1419505134 ## lucene/core/src/java/org/apache/lucene/search/Sort.java: ## @@ -59,10 +61,28 @@ public Sort() { * is still a tie after all SortFields are checked, the internal L

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-07 Thread via GitHub
jpountz commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1419268112 ## lucene/core/src/java/org/apache/lucene/index/IndexingChain.java: ## @@ -219,15 +222,33 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState state) throws IO

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-07 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1419216698 ## lucene/core/src/java/org/apache/lucene/index/IndexingChain.java: ## @@ -219,15 +222,33 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState state) throws IOE

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-07 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1419210065 ## lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java: ## @@ -262,6 +290,44 @@ long updateDocuments( } } + private Iterable filterPar

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-07 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1419209272 ## lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestBackwardsCompatibility.java: ## @@ -2164,6 +2166,83 @@ public void testSortedIndex() throws Exce

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-07 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1419203464 ## lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestBackwardsCompatibility.java: ## @@ -2164,6 +2166,83 @@ public void testSortedIndex() throws Exce

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-05 Thread via GitHub
mikemccand commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1415586966 ## lucene/core/src/test/org/apache/lucene/index/TestAddIndexes.java: ## @@ -1678,6 +1678,51 @@ public void testIllegalIndexSortChange2() throws Exception { IO

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-05 Thread via GitHub
mikemccand commented on PR #12829: URL: https://github.com/apache/lucene/pull/12829#issuecomment-1840782006 Thanks @s1monw -- I'll try to review soon. > The only think I am torn on is, if we set the num of children as a value for the DV field then I guess we should have a good usecase

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-05 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1415196168 ## lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java: ## @@ -262,6 +277,73 @@ long updateDocuments( } } + private interface DocValid

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-03 Thread via GitHub
s1monw commented on PR #12829: URL: https://github.com/apache/lucene/pull/12829#issuecomment-1837464516 @mikemccand @jpountz I went down a hybrid way. This change now only requires the user to specify the field name and IW adds it to the last doc in the block if it's configured. I will for

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-12-03 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1413065291 ## lucene/core/src/java/org/apache/lucene/search/Sort.java: ## @@ -59,10 +61,28 @@ public Sort() { * is still a tie after all SortFields are checked, the internal L

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-11-28 Thread via GitHub
msokolov commented on PR #12829: URL: https://github.com/apache/lucene/pull/12829#issuecomment-1830150503 @s1monw that makes sense. I think I was confusing index-time changes and query-time changes. This whole piece of functionality is a little confusing given how loosely coupled these thin

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-11-28 Thread via GitHub
s1monw commented on PR #12829: URL: https://github.com/apache/lucene/pull/12829#issuecomment-1829488114 @mikemccand @jpountz thanks for your ideas. I'd love to flash this out more before we add anything we write to the index. Today we'd only use this for sorting but if that field can be use

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-11-27 Thread via GitHub
msokolov commented on PR #12829: URL: https://github.com/apache/lucene/pull/12829#issuecomment-1828402628 > I don't think we give up any functionality. can you elaborate what functionality you are referring to? I don't think we should have a list of parent fields that IW requires, what woul

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-11-27 Thread via GitHub
mikemccand commented on PR #12829: URL: https://github.com/apache/lucene/pull/12829#issuecomment-1827825175 > using a doc-value field where only parents documents have a value for the field, and the value must be the number of child documents that the parent has This is a neat idea to

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-11-27 Thread via GitHub
mikemccand commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1406124683 ## lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java: ## @@ -262,6 +277,73 @@ long updateDocuments( } } + private interface DocV

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-11-23 Thread via GitHub
s1monw commented on PR #12829: URL: https://github.com/apache/lucene/pull/12829#issuecomment-1824135498 > I'm a little worried about giving up functionality, but I think if we had a list of parent-fields rather than a single parent-field that would cover what we can do today? Maybe one is e

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-11-23 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1403178782 ## lucene/core/src/test/org/apache/lucene/index/TestIndexSorting.java: ## @@ -3173,4 +3173,184 @@ public void testSortDocsAndFreqsAndPositionsAndOffsets() throws IOExc

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-11-23 Thread via GitHub
s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1403174609 ## lucene/core/src/test/org/apache/lucene/index/TestIndexSorting.java: ## @@ -3173,4 +3173,184 @@ public void testSortDocsAndFreqsAndPositionsAndOffsets() throws IOExc

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-11-22 Thread via GitHub
msokolov commented on PR #12829: URL: https://github.com/apache/lucene/pull/12829#issuecomment-1823517625 I like @jpountz's idea to make the value of this field be the number of children. It is simple and makes sense, and is pretty close to having the degree of flexibility that the current

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-11-22 Thread via GitHub
jpountz commented on PR #12829: URL: https://github.com/apache/lucene/pull/12829#issuecomment-1823167735 In general, I like the idea of making block joins more of a first-class citizen. I have been thinking for a long time about changing how blocks are identified from using bitsets to using

Re: [PR] Add support for index sorting with document blocks [lucene]

2023-11-21 Thread via GitHub
msokolov commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1401297534 ## lucene/core/src/test/org/apache/lucene/index/TestIndexSorting.java: ## @@ -3173,4 +3173,184 @@ public void testSortDocsAndFreqsAndPositionsAndOffsets() throws IOE