[ https://issues.apache.org/jira/browse/LUCENE-10527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Julie Tibshirani updated LUCENE-10527:
--------------------------------------
    Description: 

Recently I was rereading the HNSW paper ([https://arxiv.org/pdf/1603.09320.pdf]) and noticed that they suggest using a different maxConn for the upper layers vs. the bottom one (which contains the full neighborhood graph). Specifically, they suggest using maxConn=M for upper layers and maxConn=2*M for the bottom. This differs from what we do, which is to use maxConn=M for all layers.

I tried updating our logic using a hacky patch and noticed an improvement in latency at higher recall values (which is consistent with the paper's observation):

*Results on glove-100-angular*
Parameters: M=32, efConstruction=100

!image-2022-04-20-14-53-58-484.png|width=400,height=367!

As we'd expect, indexing becomes a bit slower:
{code:java}
Baseline:  Indexed 1183514 documents in 733s
Candidate: Indexed 1183514 documents in 948s{code}
When we benchmarked Lucene HNSW against hnswlib in LUCENE-9937, we noticed a big difference in recall for the same settings of M and efConstruction. (Even adding graph layers in LUCENE-10054 didn't really affect recall.) With this change, the recall is now very similar:

*Results on glove-100-angular*
Parameters: M=32, efConstruction=100
{code:java}
k     Approach                                              Recall   QPS
10    luceneknn dim=100 {'M': 32, 'efConstruction': 100}    0.563    4410.499
50    luceneknn dim=100 {'M': 32, 'efConstruction': 100}    0.798    1956.280
100   luceneknn dim=100 {'M': 32, 'efConstruction': 100}    0.862    1209.734
500   luceneknn dim=100 {'M': 32, 'efConstruction': 100}    0.958    341.428
800   luceneknn dim=100 {'M': 32, 'efConstruction': 100}    0.974    230.396
1000  luceneknn dim=100 {'M': 32, 'efConstruction': 100}    0.980    188.757
10    hnswlib ({'M': 32, 'efConstruction': 100})            0.552    16745.433
50    hnswlib ({'M': 32, 'efConstruction': 100})            0.794    5738.468
100   hnswlib ({'M': 32, 'efConstruction': 100})            0.860    3336.386
500   hnswlib ({'M': 32, 'efConstruction': 100})            0.956    832.982
800   hnswlib ({'M': 32, 'efConstruction': 100})            0.973    541.097
1000  hnswlib ({'M': 32, 'efConstruction': 100})            0.979    442.163
{code}
I think it'd be nice to update maxConn so that we faithfully implement the paper's algorithm. This is probably least surprising for users, and I don't see a strong reason to take a different approach from the paper. Let me know what you think!
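To make the proposed per-layer limit concrete, here is a minimal sketch of the rule the paper describes. The class and method names are hypothetical, not Lucene's actual graph-builder API: upper layers cap each node at M neighbors, while the bottom layer (level 0) allows up to 2*M.
{code:java}
// Hypothetical sketch, not Lucene's actual API: the paper's per-layer
// connection limit. Upper layers keep at most M neighbors per node; the
// bottom layer (level 0), which holds the full neighborhood graph, keeps
// up to 2*M.
public class MaxConnPerLayer {

  /** Neighbor limit for a given HNSW level, given the graph parameter M. */
  static int maxConnOnLevel(int level, int m) {
    return level == 0 ? 2 * m : m;
  }

  public static void main(String[] args) {
    int m = 32; // same M as in the benchmarks above
    for (int level = 3; level >= 0; level--) {
      System.out.println("level " + level + " -> maxConn " + maxConnOnLevel(level, m));
    }
  }
}{code}
With M=32 this prints maxConn=32 for levels 3 through 1 and maxConn=64 for level 0, matching the parameter setting used in the benchmarks above.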
> Use bigger maxConn for last layer in HNSW
> -----------------------------------------
>
>                 Key: LUCENE-10527
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10527
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Julie Tibshirani
>            Assignee: Mayya Sharipova
>            Priority: Minor
>         Attachments: image-2022-04-20-14-53-58-484.png
>
>          Time Spent: 4h 40m
>  Remaining Estimate: 0h