benwtrent merged PR #13641:
URL: https://github.com/apache/lucene/pull/13641
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.a
benwtrent commented on code in PR #13641:
URL: https://github.com/apache/lucene/pull/13641#discussion_r1755503855
##
lucene/codecs/src/test/org/apache/lucene/codecs/simpletext/TestSimpleTextKnnVectorsFormat.java:
##
@@ -41,4 +41,14 @@ public void testRandomBytes() throws Excepti
jpountz commented on code in PR #13641:
URL: https://github.com/apache/lucene/pull/13641#discussion_r1755048948
##
lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsWriter.java:
##
@@ -212,14 +213,34 @@ private static void validateFieldEncoding(FieldInfo
fieldInfo, Vector
benwtrent commented on PR #13641:
URL: https://github.com/apache/lucene/pull/13641#issuecomment-2344221805
@bugmakerr I don't want to make that change as the
PerFieldDocValuesFormat & PerFieldPostingsFormat field readers will return
`null` if the field is missing. I want to at least try
bugmakerr commented on PR #13641:
URL: https://github.com/apache/lucene/pull/13641#issuecomment-2344078213
@benwtrent should we do the same check for
`PerFieldKnnVectorsFormat.FieldsReader`? If the target reader is null, throw an
IAE and ensure that we do not return null.
--
This is
benwtrent commented on code in PR #13641:
URL: https://github.com/apache/lucene/pull/13641#discussion_r1754984085
##
lucene/test-framework/src/java/org/apache/lucene/tests/index/BaseKnnVectorsFormatTestCase.java:
##
@@ -230,6 +236,120 @@ public void testIllegalDimChangeTwoWriter
benwtrent commented on PR #13641:
URL: https://github.com/apache/lucene/pull/13641#issuecomment-2344006469
OK, I pushed up a fix for @msokolov's suggestion and your reminder @jpountz
. I added a protection around our access to the KnnVectorReaders during merge.
Additionally, I kept my new t
benwtrent commented on PR #13641:
URL: https://github.com/apache/lucene/pull/13641#issuecomment-2343701781
> Fine with me, I will work on fixing the merging logic then unless you plan
on working on it.
Ah dang, yeah, I completely spaced on that since this PR is so old. I will
jpountz commented on PR #13641:
URL: https://github.com/apache/lucene/pull/13641#issuecomment-2343692465
Fine with me, I will work on fixing the merging logic then unless you plan
on working on it.
--
This is an automated message from the Apache Git Service.
To respond to the message, ple
benwtrent commented on PR #13641:
URL: https://github.com/apache/lucene/pull/13641#issuecomment-2343677555
Unless there are rejections, I am going to move forward with merging this.
Throwing an IAE seems way better than throwing an NPE.
If we want to change this behavior to be more l
jpountz commented on PR #13641:
URL: https://github.com/apache/lucene/pull/13641#issuecomment-2343648325
> Separately, it looks like we call the KnnVectorsReader with the merged
field info, which is not legal, we should call it with the field info of the
segment.
Apologies, I was wro
benwtrent commented on code in PR #13641:
URL: https://github.com/apache/lucene/pull/13641#discussion_r1754379198
##
lucene/test-framework/src/java/org/apache/lucene/tests/index/BaseKnnVectorsFormatTestCase.java:
##
@@ -230,6 +236,120 @@ public void testIllegalDimChangeTwoWriter
jpountz commented on code in PR #13641:
URL: https://github.com/apache/lucene/pull/13641#discussion_r1753859763
##
lucene/test-framework/src/java/org/apache/lucene/tests/index/BaseKnnVectorsFormatTestCase.java:
##
@@ -230,6 +236,120 @@ public void testIllegalDimChangeTwoWriters(
github-actions[bot] commented on PR #13641:
URL: https://github.com/apache/lucene/pull/13641#issuecomment-2308587265
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
benwtrent commented on PR #13641:
URL: https://github.com/apache/lucene/pull/13641#issuecomment-2282231690
@msokolov maybe?
Looking at the code, this has never worked this way. Field entry existence
was just assumed to be handled by the perfield codec.
I tried digging into the
msokolov commented on PR #13641:
URL: https://github.com/apache/lucene/pull/13641#issuecomment-2282193063
> It is possible to inappropriately use the knn formats and attempt to merge
segments with mismatched field names.
I don't understand this - what's inappropriate about it? I guess
benwtrent opened a new pull request, #13641:
URL: https://github.com/apache/lucene/pull/13641
It is possible to inappropriately use the knn formats and attempt to merge
segments with mismatched field names.
None of the formats actually check for `null`, they just all assume that the
17 matches
Mail list logo