kaivalnp commented on PR #14178:
URL: https://github.com/apache/lucene/pull/14178#issuecomment-2622538569

   > Since Faiss uses multithreading by default, we cannot compare with Lucene
   
   Ah, nice catch: the number of threads used by the two may be different.
   
   I'm not sure how many threads were used by Faiss above, but the number of threads used by Lucene is specified [here](https://github.com/mikemccand/luceneutil/blob/9764dffb3e00fc37a9edb4a55381010d4c60c26c/src/python/knnPerfTest.py#L65-L66) (I didn't change these).
   
   I set `OMP_NUM_THREADS=4` (from the [link](https://github.com/facebookresearch/faiss/wiki/Threads-and-asynchronous-calls#internal-threading) you sent) to keep the number of threads the same in both:
   
   Lucene:
   ```
   recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  index s  index docs/s  force merge s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
    0.811         1.439  200000   100      50       32        200         no    51.98       3847.63           0.00             1           237.75        228.882       228.882
   ```
   
   Faiss:
   ```
   recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  index s  index docs/s  force merge s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
    0.810         1.110  200000   100      50       32        200         no    15.92      12565.18          41.44             1           511.21        228.882       228.882
   ```
   
   Faiss indexing is no longer 10x faster, but it is still ~3x faster (15.92 s vs. 51.98 s in the `index s` column).
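
As a quick sanity check on the ~3x figure, the ratios can be recomputed from the two result rows. This is only a sketch: the dictionary keys are made-up labels for the table columns, and the `index s` comparison leaves out the `force merge s` column (0.00 s for Lucene, 41.44 s for Faiss).

```python
# Figures copied from the Lucene and Faiss result rows in this comment.
# The dictionary keys are illustrative labels, not luceneutil identifiers.
lucene = {"latency_ms": 1.439, "index_s": 51.98, "docs_per_s": 3847.63}
faiss = {"latency_ms": 1.110, "index_s": 15.92, "docs_per_s": 12565.18}

# How many times faster Faiss indexed (the "index s" column only;
# the "force merge s" column is not included here).
index_speedup = lucene["index_s"] / faiss["index_s"]

# Same comparison expressed through indexing throughput.
throughput_ratio = faiss["docs_per_s"] / lucene["docs_per_s"]

# Query latency ratio (Lucene latency divided by Faiss latency).
latency_ratio = lucene["latency_ms"] / faiss["latency_ms"]

print(f"index time speedup:   {index_speedup:.2f}x")
print(f"throughput ratio:     {throughput_ratio:.2f}x")
print(f"query latency ratio:  {latency_ratio:.2f}x")
```

Both indexing ratios come out a little above 3x, while the query latency gap is much smaller, at similar recall (0.811 vs. 0.810).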
   
   > Does that mean that you indexed with `-numIndexThreads=1` for the Lucene run?
   
   This was [set to `8`](https://github.com/mikemccand/luceneutil/blob/9764dffb3e00fc37a9edb4a55381010d4c60c26c/src/python/knnPerfTest.py#L183) for both runs (I didn't change the default value).
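
For anyone reproducing this, pinning the OpenMP thread count for a benchmark child process could look roughly like the following. This is a sketch under assumptions: `run_with_omp_threads` is a hypothetical helper that is not part of luceneutil, and the child command here is a stand-in that only echoes the variable rather than running the real benchmark.

```python
import os
import subprocess
import sys


def run_with_omp_threads(cmd, num_threads):
    """Run cmd with OMP_NUM_THREADS set in the child's environment.

    Hypothetical helper for illustration; OpenMP-based libraries such as
    Faiss read this variable to decide how many threads to use.
    """
    env = dict(os.environ, OMP_NUM_THREADS=str(num_threads))
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


if __name__ == "__main__":
    # Stand-in child process: prints the value it sees for OMP_NUM_THREADS.
    echo_cmd = [sys.executable, "-c", "import os; print(os.environ['OMP_NUM_THREADS'])"]
    result = run_with_omp_threads(echo_cmd, 4)
    print(result.stdout.strip())
```

Setting the variable in the child's environment, rather than in the parent process, keeps the setting scoped to the one run being measured.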


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

