mayya-sharipova commented on code in PR #12436:
URL: https://github.com/apache/lucene/pull/12436#discussion_r1275409313


##########
lucene/core/src/java/org/apache/lucene/index/IndexingChain.java:
##########
@@ -621,6 +621,12 @@ private void initializeFieldInfo(PerField pf) throws IOException {
       final Sort indexSort = indexWriterConfig.getIndexSort();
       validateIndexSortDVType(indexSort, pf.fieldName, s.docValuesType);
     }
+    if (s.vectorDimension != 0) {
+      validateMaxVectorDimension(
+          pf.fieldName,
+          s.vectorDimension,
+          indexWriterConfig.getCodec().knnVectorsFormat().getMaxDimensions());
+    }

Review Comment:
   @jpountz Thank you for the additional feedback.
   
   > I worry that this adds a hashtable lookup on a hot code path. Maybe it's not that bad for vectors, which are slow to index anyway, but I'd rather avoid it.
   
   This is not really a hot code path. We call
   `getCodec().knnVectorsFormat().getMaxDimensions()` in `initializeFieldInfo`,
   which runs only once per new field per segment.
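
   To illustrate the once-per-field point, here is a minimal sketch. It is a hypothetical model, not Lucene's `IndexingChain`: the class `OncePerFieldSketch`, the `lookUpMaxDimensions` helper, and the `limitLookups` counter are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of one segment's per-field state; not Lucene code.
final class OncePerFieldSketch {
  private final Map<String, Integer> dimsByField = new HashMap<>();
  private final int maxDimensions;
  int limitLookups = 0; // counts how often the (assumed costly) limit is consulted

  OncePerFieldSketch(int maxDimensions) {
    this.maxDimensions = maxDimensions;
  }

  // Stand-in for getCodec().knnVectorsFormat().getMaxDimensions().
  private int lookUpMaxDimensions() {
    limitLookups++;
    return maxDimensions;
  }

  void indexVector(String field, float[] vector) {
    Integer known = dimsByField.get(field);
    if (known == null) {
      // First time this field is seen in the segment: validate once, then remember.
      if (vector.length > lookUpMaxDimensions()) {
        throw new IllegalArgumentException(
            "Field [" + field + "] has " + vector.length + " dimensions, above the limit");
      }
      dimsByField.put(field, vector.length);
    } else if (known != vector.length) {
      throw new IllegalArgumentException("Inconsistent dimensions for field [" + field + "]");
    }
    // Subsequent documents for a known field never consult the limit again.
  }

  public static void main(String[] args) {
    OncePerFieldSketch segment = new OncePerFieldSketch(1024);
    for (int doc = 0; doc < 1000; doc++) {
      segment.indexVector("vec", new float[128]);
    }
    System.out.println("limit lookups: " + segment.limitLookups);
  }
}
```

   In this model the limit is consulted once per new field, however many documents follow.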
   
   > What about making the codec responsible for checking the limit?
   
   Thanks for the suggestion. I experimented with this idea and ran into the
   following difficulty:
   - We need to create a new `FieldInfo` before passing it to
     `KnnFieldVectorsWriter<?> addField(FieldInfo fieldInfo)`.
   - We create it with `FieldInfo fi = fieldInfos.add(`, which adds it to the
     global fieldInfos. This means that if the `FieldInfo` has an invalid number
     of dimensions, it is stored that way in the global fieldInfos, and we can't
     change it afterwards (for example, when a second document arrives with a
     valid number of dimensions).
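
   The ordering problem can be sketched with a small model. Everything here is hypothetical (`GlobalFieldRegistrySketch` and its methods are assumptions for illustration, not Lucene's `FieldInfos` API); it only shows why validating after the global registration leaves a bad schema behind.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a global, write-once field schema registry.
final class GlobalFieldRegistrySketch {
  private final Map<String, Integer> dimsByField = new HashMap<>();

  // Registers the dimension on first use; later calls must match what is stored.
  void add(String field, int dims) {
    Integer existing = dimsByField.putIfAbsent(field, dims);
    if (existing != null && existing != dims) {
      throw new IllegalArgumentException(
          "Field [" + field + "] already registered with " + existing + " dimensions");
    }
  }

  static void validate(String field, int dims, int maxDims) {
    if (dims > maxDims) {
      throw new IllegalArgumentException(
          "Field [" + field + "] has " + dims + " dimensions; max is " + maxDims);
    }
  }

  // Models a codec-side check: the field is already registered when validation runs.
  void registerThenValidate(String field, int dims, int maxDims) {
    add(field, dims);
    validate(field, dims, maxDims);
  }

  // Models the check in the PR: nothing is registered unless validation passes.
  void validateThenRegister(String field, int dims, int maxDims) {
    validate(field, dims, maxDims);
    add(field, dims);
  }

  boolean contains(String field) {
    return dimsByField.containsKey(field);
  }

  public static void main(String[] args) {
    GlobalFieldRegistrySketch late = new GlobalFieldRegistrySketch();
    try {
      late.registerThenValidate("vec", 5000, 1024);
    } catch (IllegalArgumentException e) {
      // The document is rejected, but the oversized schema has already been stored.
    }
    System.out.println("late check left field registered: " + late.contains("vec"));

    GlobalFieldRegistrySketch early = new GlobalFieldRegistrySketch();
    try {
      early.validateThenRegister("vec", 5000, 1024);
    } catch (IllegalArgumentException e) {
      // Rejected before anything was stored.
    }
    early.add("vec", 128); // a later, valid document can still define the field
    System.out.println("early check left field registered: " + early.contains("vec"));
  }
}
```

   With the late check, a subsequent `add("vec", 128)` in this model fails because the registry already holds 5000 dimensions for the field.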



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

