Re: [I] Can we remove `compress` option for quantized KNN vector indexing? [lucene]

via GitHub Thu, 12 Sep 2024 04:27:38 -0700


mikemccand commented on issue #13768:
URL: https://github.com/apache/lucene/issues/13768#issuecomment-2346033320


   Well, I ran [`knnPerfTest.py`](https://github.com/mikemccand/luceneutil/blob/f4a07ed8de36c47aacb6033a3709e236bc42aca4/src/python/knnPerfTest.py) on my Linux dev box (x86-64 Raptor Lake i9-13900K).  This CPU has a crazy number of flags, but NOT AVX-512:
   
   ```
   flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx 
pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl 
xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 
monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 
x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 
3dnowprefetch cpuid_fault ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow 
flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms 
invpcid rdseed adx smap clflushopt clwb intel_pt sha_ni xsaveopt xsavec xgetbv1 
xsaves split_lock_detect user_shstk avx_vnni dtherm ida arat pln pts hwp 
hwp_notify hwp_act_window hwp_epp hwp_pkg_req hfi vnmi umip pku ospke waitpkg 
gfni vaes vpclmulqdq tme rdpid movdiri movdir64b fsrm md_clear serialize 
pconfig arch_lbr ibt flush_l1d arch_capabilities
   ```
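
A quick way to double-check a `flags` line like the one above for AVX-512 support is to scan it for tokens starting with `avx512` (Linux reports these features as `avx512f`, `avx512dq`, and so on). This is only a minimal sketch, not part of `knnPerfTest.py`; the helper name `avx512_flags` and the shortened sample line are made up for illustration:

```python
def avx512_flags(flags_line: str) -> list[str]:
    """Return every AVX-512 feature flag found in a /proc/cpuinfo 'flags' line."""
    # Drop the "flags :" label if present, then split on whitespace.
    _, _, rest = flags_line.partition(":")
    tokens = (rest if rest else flags_line).split()
    # Linux names AVX-512 features avx512f, avx512dq, avx512bw, ...
    return [t for t in tokens if t.startswith("avx512")]


# Shortened, illustrative sample; paste the real line from /proc/cpuinfo instead.
sample = "flags : fpu sse sse2 avx f16c fma avx2 avx_vnni gfni vaes"
print(avx512_flags(sample))  # an empty list means no AVX-512 flags were found
```

Note that `avx_vnni` does not count: it is the 256-bit VNNI extension, not an AVX-512 feature.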
   
   This is with Panama enabled (`INFO: Java vector incubator API enabled; uses 
preferredBitSize=256; FMA enabled`).  I'll try with Panama disabled next.
   
   Results:
   
   ```
   recall  latency (ms)     nDoc  topK  fanout  maxConn  beamWidth  quantized  index s  force merge s  num segments  index size (MB)
    0.319         0.167  1500000    10       6       32         50     4 bits    86.01          64.49             1          5013.19
    0.326         0.187  1500000    10       6       32         50     4 bits    89.03          74.99             1          5562.51
   ```
   
   Unfortunately the output doesn't state it, but the first row is `compress=True` and the second is `compress=False`.  Indeed, latency (search time) got faster (187 usec -> 167 usec) with `compress=True`, and there is quite a reduction (~10%) in index size (5562.51 MB -> 5013.19 MB).  In this run, indexing and force merge times were also a bit lower with `compress=True` (86.01 s vs 89.03 s, and 64.49 s vs 74.99 s).
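
The relative differences between the two rows can be recomputed directly from the table. A small sketch, with the values copied from the rows above (first row taken as `compress=True`, second as `compress=False`); the `pct_change` helper and the dictionary keys are illustrative names, not anything `knnPerfTest.py` provides:

```python
# Values copied from the results table above.
compressed = {      # first row, compress=True
    "recall": 0.319,
    "latency_ms": 0.167,
    "index_s": 86.01,
    "force_merge_s": 64.49,
    "index_size_mb": 5013.19,
}
uncompressed = {    # second row, compress=False
    "recall": 0.326,
    "latency_ms": 0.187,
    "index_s": 89.03,
    "force_merge_s": 74.99,
    "index_size_mb": 5562.51,
}


def pct_change(new: float, old: float) -> float:
    """Relative change from old to new, in percent (negative means smaller)."""
    return (new - old) / old * 100.0


# Print how each metric moves when going from compress=False to compress=True.
for key in compressed:
    delta = pct_change(compressed[key], uncompressed[key])
    print(f"{key:>14}: {delta:+.1f}%")
```

This compares a single run per setting, so small differences may be within run-to-run noise.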


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

