s1monw merged PR #12829:
URL: https://github.com/apache/lucene/pull/12829
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apac
s1monw commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1882919819
@mikemccand I did another pass on it and change the wording. I think I know
why you are confused and I tired to adress it. We use the exact same wording in
the soft deletes case. I will w
mikemccand commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1442838857
##
lucene/core/src/test/org/apache/lucene/index/TestAddIndexes.java:
##
@@ -1937,4 +1937,97 @@ public void setMergeInfo(SegmentCommitInfo info) {
targetDir.clo
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1442110320
##
lucene/core/src/java/org/apache/lucene/index/FieldInfos.java:
##
@@ -437,6 +488,33 @@ private void verifySoftDeletedFieldName(String fieldName,
boolean isSoftDelete
mikemccand commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1440776709
##
lucene/core/src/java/org/apache/lucene/index/FieldInfos.java:
##
@@ -437,6 +488,33 @@ private void verifySoftDeletedFieldName(String fieldName,
boolean isSoftDe
mikemccand commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1440776102
##
lucene/core/src/java/org/apache/lucene/index/FieldInfos.java:
##
@@ -437,6 +488,33 @@ private void verifySoftDeletedFieldName(String fieldName,
boolean isSoftDe
mikemccand commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1440668657
##
lucene/core/src/java/org/apache/lucene/index/FieldInfos.java:
##
@@ -132,6 +135,13 @@ public FieldInfos(FieldInfo[] infos) {
}
softDeletesField
mikemccand commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1875663308
> [add test and docs to make mike not staring at this
forever](https://github.com/apache/lucene/pull/12829/commits/25e6da47f989675d310de5f319ff38088589b4d4)
HAHAHA
--
This is
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1440343745
##
lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java:
##
@@ -231,7 +244,23 @@ long updateDocuments(
final int docsInRamBefore = numDocsI
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1440307876
##
lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java:
##
@@ -245,10 +274,11 @@ long updateDocuments(
onNewDocOnRAM.run();
mikemccand commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1439716624
##
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##
@@ -1176,34 +1176,44 @@ public static Status.IndexSortStatus testSort(
comparators[i] =
s1monw commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1870447436
@mikemccand @jpountz I think it's ready. I added some more testing to it and
removed storing the no. of children in the DV field to make it as low impact as
possible. we can still optimiz
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1437134740
##
lucene/core/src/java/org/apache/lucene/index/FieldInfos.java:
##
@@ -188,6 +200,26 @@ public static FieldInfos getMergedFieldInfos(IndexReader
reader) {
}
}
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1435809808
##
lucene/core/src/java/org/apache/lucene/index/FieldInfos.java:
##
@@ -437,6 +486,31 @@ private void verifySoftDeletedFieldName(String fieldName,
boolean isSoftDelete
mikemccand commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1435648884
##
lucene/codecs/src/java/org/apache/lucene/codecs/simpletext/SimpleTextSegmentInfoFormat.java:
##
@@ -197,7 +197,13 @@ public SegmentInfo read(
sortField[
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1434195239
##
lucene/core/src/java/org/apache/lucene/index/IndexingChain.java:
##
@@ -219,15 +224,41 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState
state) throws IOE
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1434193511
##
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##
@@ -1180,33 +1180,43 @@ public static Status.IndexSortStatus testSort(
comparators[i] = fie
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1434189939
##
lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java:
##
@@ -262,6 +294,35 @@ long updateDocuments(
}
}
+ private Iterable addParent
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1434180897
##
lucene/core/src/java/org/apache/lucene/codecs/lucene94/Lucene94FieldInfosFormat.java:
##
@@ -157,6 +158,8 @@ public FieldInfos read(
boolean omitNorms = (
s1monw commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1866428159
> If my understanding is correct, we will require a parent field when using
blocks as of 10.0. One concern I have about this is that we currently don't
require users to know up-front whet
s1monw commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1862667309
@mikemccand I don't think we have any impact on performance here at all if
this feature is not in use. If you look at DWPT there is a final field that
decides if there is further examina
s1monw commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1862391122
@mikemccand @jpountz I updated this PR and moved everything to FielInfo /
IWC. I think it's ready. the only thing that I'd like to discuss is that we are
currently recording the number of
mikemccand commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1855769298
I opened https://github.com/mikemccand/luceneutil/issues/252 to try to
measure the performance change of `addDocument` N times vs `addDocuments` once.
--
This is an automated messag
mikemccand commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1855755782
One small observation here: one can use the `add/updateDocuments` API today
with no intention of using those as doc blocks at search time, purely as an
optimization over calling separ
mikemccand commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1426637555
##
lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestBackwardsCompatibility.java:
##
@@ -2164,6 +2166,83 @@ public void testSortedIndex() throws
s1monw commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1852410786
@mikemccand I agree we should not add this to sort but rather tread it the
same way we treat the softDeletes field. it's essentially the same thing from
an IW perspective. I will go ahead
msokolov commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1424209861
##
lucene/core/src/java/org/apache/lucene/index/IndexingChain.java:
##
@@ -219,15 +222,33 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState
state) throws I
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1423637103
##
lucene/core/src/java/org/apache/lucene/index/IndexingChain.java:
##
@@ -219,15 +222,33 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState
state) throws IOE
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1423606084
##
lucene/core/src/java/org/apache/lucene/index/IndexingChain.java:
##
@@ -219,15 +222,33 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState
state) throws IOE
msokolov commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1421566628
##
lucene/core/src/java/org/apache/lucene/index/IndexingChain.java:
##
@@ -219,15 +222,33 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState
state) throws I
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1419505134
##
lucene/core/src/java/org/apache/lucene/search/Sort.java:
##
@@ -59,10 +61,28 @@ public Sort() {
* is still a tie after all SortFields are checked, the internal L
jpountz commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1419268112
##
lucene/core/src/java/org/apache/lucene/index/IndexingChain.java:
##
@@ -219,15 +222,33 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState
state) throws IO
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1419216698
##
lucene/core/src/java/org/apache/lucene/index/IndexingChain.java:
##
@@ -219,15 +222,33 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState
state) throws IOE
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1419210065
##
lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java:
##
@@ -262,6 +290,44 @@ long updateDocuments(
}
}
+ private Iterable filterPar
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1419209272
##
lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestBackwardsCompatibility.java:
##
@@ -2164,6 +2166,83 @@ public void testSortedIndex() throws Exce
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1419203464
##
lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestBackwardsCompatibility.java:
##
@@ -2164,6 +2166,83 @@ public void testSortedIndex() throws Exce
mikemccand commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1415586966
##
lucene/core/src/test/org/apache/lucene/index/TestAddIndexes.java:
##
@@ -1678,6 +1678,51 @@ public void testIllegalIndexSortChange2() throws
Exception {
IO
mikemccand commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1840782006
Thanks @s1monw -- I'll try to review soon.
> The only think I am torn on is, if we set the num of children as a value
for the DV field then I guess we should have a good usecase
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1415196168
##
lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java:
##
@@ -262,6 +277,73 @@ long updateDocuments(
}
}
+ private interface DocValid
s1monw commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1837464516
@mikemccand @jpountz I went down a hybrid way. This change now only requires
the user to specify the field name and IW adds it to the last doc in the block
if it's configured. I will for
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1413065291
##
lucene/core/src/java/org/apache/lucene/search/Sort.java:
##
@@ -59,10 +61,28 @@ public Sort() {
* is still a tie after all SortFields are checked, the internal L
msokolov commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1830150503
@s1monw that makes sense. I think I was confusing index-time changes and
query-time changes. This whole piece of functionality is a little confusing
given how loosely coupled these thin
s1monw commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1829488114
@mikemccand @jpountz thanks for your ideas. I'd love to flash this out more
before we add anything we write to the index. Today we'd only use this for
sorting but if that field can be use
msokolov commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1828402628
> I don't think we give up any functionality. can you elaborate what
functionality you are referring to? I don't think we should have a list of
parent fields that IW requires, what woul
mikemccand commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1827825175
> using a doc-value field where only parents documents have a value for the
field, and the value must be the number of child documents that the parent has
This is a neat idea to
mikemccand commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1406124683
##
lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java:
##
@@ -262,6 +277,73 @@ long updateDocuments(
}
}
+ private interface DocV
s1monw commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1824135498
> I'm a little worried about giving up functionality, but I think if we had
a list of parent-fields rather than a single parent-field that would cover what
we can do today? Maybe one is e
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1403178782
##
lucene/core/src/test/org/apache/lucene/index/TestIndexSorting.java:
##
@@ -3173,4 +3173,184 @@ public void
testSortDocsAndFreqsAndPositionsAndOffsets() throws IOExc
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1403174609
##
lucene/core/src/test/org/apache/lucene/index/TestIndexSorting.java:
##
@@ -3173,4 +3173,184 @@ public void
testSortDocsAndFreqsAndPositionsAndOffsets() throws IOExc
msokolov commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1823517625
I like @jpountz's idea to make the value of this field be the number of
children. It is simple and makes sense, and is pretty close to having the
degree of flexibility that the current
jpountz commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1823167735
In general, I like the idea of making block joins more of a first-class
citizen. I have been thinking for a long time about changing how blocks are
identified from using bitsets to using
msokolov commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1401297534
##
lucene/core/src/test/org/apache/lucene/index/TestIndexSorting.java:
##
@@ -3173,4 +3173,184 @@ public void
testSortDocsAndFreqsAndPositionsAndOffsets() throws IOE
52 matches
Mail list logo