jmazanec15 commented on PR #12050:
URL: https://github.com/apache/lucene/pull/12050#issuecomment-1397643952

   Per [this 
discussion](https://github.com/apache/lucene/pull/12050#discussion_r1061034056),
 I refactored OnHeapHnswGraph to use a TreeMap to represent the graph structure 
for levels greater than 0. I ran performance tests with the same setup as 
https://github.com/apache/lucene/issues/11354#issuecomment-1239961308, and the 
results did not show a significant difference in indexing time between my 
previous implementation, the implementation using the map, and the current 
implementation with no merge optimization. Additionally, the results did not 
show a difference in merge time between my previous implementation and the 
implementation using the map.
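
As a rough illustration of the idea (not the actual `OnHeapHnswGraph` code), the sketch below keeps level 0 in a dense list indexed by node id and keeps each higher level in a `TreeMap` keyed by node id. The class name `LevelGraphSketch` and its methods are hypothetical and exist only for this example. The point it tries to show is that nodes can be added to an upper level in any order while iteration stays sorted, without shifting or copying a sorted array.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical sketch only; this is not Lucene's OnHeapHnswGraph. */
final class LevelGraphSketch {
  // Level 0 contains every node, so a dense list indexed by node id is enough.
  private final List<List<Integer>> level0 = new ArrayList<>();
  // Upper levels contain a sparse subset of nodes; entry i holds level i + 1.
  // A TreeMap keeps the node ids of a level sorted regardless of insertion order.
  private final List<TreeMap<Integer, List<Integer>>> upperLevels = new ArrayList<>();

  /** Registers a node on the given level with an empty neighbor list. */
  void addNode(int level, int node) {
    if (level == 0) {
      while (level0.size() <= node) {
        level0.add(new ArrayList<>());
      }
      return;
    }
    while (upperLevels.size() < level) {
      upperLevels.add(new TreeMap<>());
    }
    upperLevels.get(level - 1).putIfAbsent(node, new ArrayList<>());
  }

  /** Returns the mutable neighbor list of a node on a level. */
  List<Integer> neighbors(int level, int node) {
    return level == 0 ? level0.get(node) : upperLevels.get(level - 1).get(node);
  }

  /** Returns the node ids present on a level, in ascending order. */
  List<Integer> nodesOnLevel(int level) {
    if (level == 0) {
      List<Integer> all = new ArrayList<>();
      for (int i = 0; i < level0.size(); i++) {
        all.add(i);
      }
      return all;
    }
    return new ArrayList<>(upperLevels.get(level - 1).keySet());
  }

  public static void main(String[] args) {
    LevelGraphSketch graph = new LevelGraphSketch();
    // Insert upper-level nodes out of order, as can happen when a graph is
    // initialized from an existing segment's graph during a merge.
    graph.addNode(1, 42);
    graph.addNode(1, 7);
    graph.addNode(1, 19);
    System.out.println(graph.nodesOnLevel(1)); // prints [7, 19, 42]
  }
}
```

The trade-off in this sketch is an O(log n) lookup on upper levels instead of direct array indexing, in exchange for cheap out-of-order inserts.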
   
   Here are the results:
   
   ###  Segment Size 10K
   
   
   Exper. | Total indexing time (s) | Total time to merge numeric vectors (ms) | Recall
   -- | -- | -- | --
   Control-1 | 189s | 697280 | 0.979
   Control-2 | 190s | 722042 | 0.979
   Control-3 | 191s | 713402 | 0.979
   Test-array 1 | 190s | 683966 | 0.98
   Test-array 2 | 187s | 683584 | 0.98
   Test-array 3 | 190s | 702458 | 0.98
   Test-map 1 | 189s | 723582 | 0.98
   Test-map 2 | 187s | 658196 | 0.98
   Test-map 3 | 190s | 667777 | 0.98
   
   ###  Segment Size 100K
   
   Exper. | Total indexing time (s) | Total time to merge numeric vectors (ms) | Recall
   -- | -- | -- | --
   Control-1 | 366s | 675361 | 0.981
   Control-2 | 370s | 695974 | 0.981
   Control-3 | 367s | 684418 | 0.981
   Test-array 1 | 368s | 651814 | 0.981
   Test-array 2 | 368s | 654862 | 0.981
   Test-array 3 | 368s | 656062 | 0.981
   Test-map 1  | 364s | 637257 | 0.981
   Test-map 2  | 370s | 628755 | 0.981
   Test-map 3 | 366s | 647569 | 0.981
   
   ###  Segment Size 500K
   
   Exper. | Total indexing time (s) | Total time to merge numeric vectors (ms) | Recall
   -- | -- | -- | --
   Control-1 | 633s | 655538 | 0.98
   Control-2 | 631s | 664622 | 0.98
   Control-3 | 627s | 635919 | 0.98
   Test-array 1 | 639s | 376139 | 0.98
   Test-array 2 | 636s | 378071 | 0.98
   Test-array 3 | 638s | 352633 | 0.98
   Test-map 1  | 645s | 373572 | 0.98
   Test-map 2  | 635s | 374309 | 0.98
   Test-map 3 | 633s | 381212 | 0.98
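
To make the 500K comparison easier to read, the three runs per configuration can be averaged. The snippet below is only a convenience sketch over the merge times listed in the table above; the class name `MergeTimeSummary` is made up for this example.

```java
/** Averages the 500K merge times (ms) copied from the table above. */
final class MergeTimeSummary {
  /** Arithmetic mean of the given values. */
  static double mean(long... values) {
    long sum = 0;
    for (long v : values) {
      sum += v;
    }
    return (double) sum / values.length;
  }

  public static void main(String[] args) {
    double control = mean(655538, 664622, 635919);
    double array = mean(376139, 378071, 352633);
    double map = mean(373572, 374309, 381212);
    System.out.printf("control=%.0f array=%.0f map=%.0f%n", control, array, map);
    // Relative gap between the two optimized variants.
    System.out.printf("map vs array: %.1f%%%n", 100.0 * (map - array) / array);
  }
}
```

The means come out to roughly 652,026 ms for the control, 368,948 ms for the array variant, and 376,364 ms for the map variant, so the two optimized variants are within about 2% of each other at this segment size.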
   
   Given that the results do not show a significant difference between the 
array and map implementations, I switched to using the TreeMap to avoid 
multiple large array copies.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
