benchaplin commented on code in PR #13984:
URL: https://github.com/apache/lucene/pull/13984#discussion_r1838595160


##########
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##########
@@ -2746,6 +2769,84 @@ public static Status.VectorValuesStatus testVectors(
     return status;
   }
 
+  private static HnswGraph getHnswGraph(CodecReader reader) throws IOException {
+    KnnVectorsReader vectorsReader = reader.getVectorReader();
+    if (vectorsReader instanceof PerFieldKnnVectorsFormat.FieldsReader) {
+      vectorsReader = ((PerFieldKnnVectorsFormat.FieldsReader) vectorsReader).getFieldReader("knn");

Review Comment:
   Thanks, I didn't quite understand fields when I wrote this, but I think I get it now. I've done what you suggested (the same approach `testVectors` uses): the code now iterates over `FieldInfos` and performs the check only for fields where it applies.
   
   Because we might now parse several HNSW graphs, I've restructured the status 
object to support per-graph data. Successful output will now look like:
   
   ```
       test: open reader.........OK [took 0.010 sec]
       test: check integrity.....OK [took 2.216 sec]
       test: check live docs.....OK [took 0.000 sec]
       test: field infos.........OK [2 fields] [took 0.000 sec]
       test: field norms.........OK [0 fields] [took 0.000 sec]
       test: terms, freq, prox...    test: stored fields.......OK [1500000 total field count; avg 1.0 fields per doc] [took 0.390 sec]
       test: term vectors........OK [0 total term vector count; avg 0.0 term/freq vector fields per doc] [took 0.000 sec]
       test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET; 0 SKIPPING INDEX] [took 0.000 sec]
       test: points..............OK [0 fields, 0 points] [took 0.000 sec]
       test: vectors.............OK [1 fields, 1500000 vectors] [took 0.496 sec]
       test: hnsw graphs.........OK [2 fields: (field name: knn1, levels: 4, total nodes: 1547684), (field name: knn2, levels: 4, total nodes: 1547684)] [took 0.979 sec]
   ```
   
   `testVectors` doesn't report per-field data; it just sums vectors over all fields. I could do the same here, but reporting each graph separately felt more complete.
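
For anyone skimming, here is a rough, self-contained sketch of the shape described above: visit every field, skip the ones the check does not apply to, and keep one status entry per graph. It is not the code in this PR. `Field`, `GraphStatus`, `collect`, and `summarize` are hypothetical stand-ins invented for illustration, and the sketch deliberately avoids the Lucene API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

class HnswStatusSketch {
  /** Hypothetical stand-in for what would be read per field; not a Lucene type. */
  public record Field(String name, boolean hasHnswGraph, int levels, long totalNodes) {}

  /** Hypothetical per-graph entry held by the status object. */
  public record GraphStatus(String fieldName, int levels, long totalNodes) {}

  /** Visits every field and records an entry only where an HNSW graph applies. */
  public static List<GraphStatus> collect(List<Field> fields) {
    List<GraphStatus> graphs = new ArrayList<>();
    for (Field field : fields) {
      if (!field.hasHnswGraph()) {
        continue; // the check does not apply to this field
      }
      graphs.add(new GraphStatus(field.name(), field.levels(), field.totalNodes()));
    }
    return graphs;
  }

  /** Formats the per-graph entries in the bracketed style of the output above. */
  public static String summarize(List<GraphStatus> graphs) {
    String body =
        graphs.stream()
            .map(
                g ->
                    String.format(
                        Locale.ROOT,
                        "(field name: %s, levels: %d, total nodes: %d)",
                        g.fieldName(),
                        g.levels(),
                        g.totalNodes()))
            .collect(Collectors.joining(", "));
    return String.format(Locale.ROOT, "[%d fields: %s]", graphs.size(), body);
  }

  public static void main(String[] args) {
    List<Field> fields =
        List.of(
            new Field("knn1", true, 4, 1547684L),
            new Field("id", false, 0, 0L), // skipped: no graph for this field
            new Field("knn2", true, 4, 1547684L));
    System.out.println(summarize(collect(fields)));
  }
}
```

Summing over all fields, as `testVectors` does, would collapse the list into a single count; keeping one entry per field is what makes the per-graph line possible.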



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

