msokolov commented on issue #11830:
URL: https://github.com/apache/lucene/issues/11830#issuecomment-1320183235

   Hey, this looks great! Awesome to see the storage gains with no loss in
   query time.
   
   On Thu, Nov 17, 2022 at 2:25 PM Benjamin Trent ***@***.***>
   wrote:
   
   > I changed the PR to move towards delta encoding & vint. Even with storing
   > the memory offsets within vex, the storage improvements are much better
   > than PackedInts.
   >
   > Table with some numbers around the size improvements for different data
   > sets & parameters:
   > packed_vex_mb_size  vex_mb_size  packed_index_build_time  index_build_time  params                            dataset              percent_reduction
   > 79.9                161.6        767                      784               {'M': 16, 'efConstruction': 100}  glove-100-angular    50.55693069
   > 108.4               464.1        1138                     1225              {'M': 48, 'efConstruction': 100}  glove-100-angular    76.64296488
   > 2.3                 8.2          36                       36                {'M': 16, 'efConstruction': 100}  mnist-784-euclidean  71.95121951
   > 2.4                 23.5         36                       36                {'M': 48, 'efConstruction': 100}  mnist-784-euclidean  89.78723404
   > 66.1                392.2        501                      572               {'M': 48, 'efConstruction': 100}  sift-128-euclidean   83.1463539
   > 59.7                136.6        449                      516               {'M': 16, 'efConstruction': 100}  sift-128-euclidean   56.29575403
   >
   > For the curious, here are the QPS numbers (higher is better) for packed
   > (delta & vint) vs baseline:
   > Glove [image]:
   > <https://user-images.githubusercontent.com/4357155/202539450-415f1622-cf6f-4cc6-8de5-e714b47cc8a6.png>
   >
   > MNIST [image]:
   > <https://user-images.githubusercontent.com/4357155/202539516-235485b1-9b01-497f-81af-ce2d7475ae74.png>
   >
   > SIFT [image]:
   > <https://user-images.githubusercontent.com/4357155/202539592-d3c387e2-60e2-4956-8e92-b5b9361588bb.png>
   >
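
For readers unfamiliar with the "delta encoding & vint" approach mentioned in
the quoted comment, here is a minimal Python sketch of the general idea. It is
not the code from the PR and does not reproduce Lucene's on-disk layout: the
function names (write_vint, read_vint, encode_neighbors, decode_neighbors) are
hypothetical, and it assumes each node's neighbor list can be stored in
ascending order so that the gaps between consecutive ids are small.

```python
def write_vint(value, out):
    """Append a non-negative int to `out` as a variable-length integer.

    Seven data bits per byte, low-order group first; the high bit is set on
    every byte except the last one.
    """
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)


def read_vint(buf, pos):
    """Read one vint starting at `pos`; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if (b & 0x80) == 0:
            return result, pos
        shift += 7


def encode_neighbors(neighbors):
    """Sort the ids, then store the count followed by the gaps as vints."""
    out = bytearray()
    ordered = sorted(neighbors)
    write_vint(len(ordered), out)
    previous = 0
    for node in ordered:
        write_vint(node - previous, out)  # small gaps need fewer bytes
        previous = node
    return bytes(out)


def decode_neighbors(buf):
    """Inverse of encode_neighbors: rebuild ids by summing the gaps."""
    count, pos = read_vint(buf, 0)
    neighbors = []
    previous = 0
    for _ in range(count):
        delta, pos = read_vint(buf, pos)
        previous += delta
        neighbors.append(previous)
    return neighbors


# Hypothetical neighbor ids, chosen only for illustration.
example = [1000200, 1000000, 1000010, 1000003]
encoded = encode_neighbors(example)
assert decode_neighbors(encoded) == sorted(example)
```

Because only the first id is large and the remaining values are small gaps,
most entries fit in one or two bytes, whereas a fixed-width scheme spends the
same number of bits on every entry. Variable-length records also mean a reader
needs some way to find where each node's list starts, which is presumably why
the quoted comment mentions storing offsets.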
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org