[ 
https://issues.apache.org/jira/browse/LUCENE-10054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17401172#comment-17401172
 ] 

Mayya Sharipova edited comment on LUCENE-10054 at 8/19/21, 11:42 AM:
---------------------------------------------------------------------

Proposed .vem index file structure:

 
{code:java}
+-------------+--------+----------+----------+----------+---------+------+-------+
| FieldNumber | SimFun | VDOffset | VDLength | VIOffset | VILength| dims | docIds
+-------------+--------+----------+----------+----------+---------+------+-------+-

+-------------+-----------+-----+-------------+--------+
| LevelsCount | SizeLevel0| ... | SizeLevelmax| docIds 
+-------------+-----------+-----+-------------+--------+

---+------------+-----+--------------+
ep | NodesLevel1| ... | NodesLevelmax
---+------------+-----+--------------+

--------------------+-----+----------------------+
 graphOffsetsLevel0 | ... | graphOffsetsLevelmax |
--------------------+-----+----------------------+
{code}
LevelsCount - number of levels

SizeLevel0, ..., SizeLevelmax - number of nodes on each level

ep - entry point of the graph on the top level as a node ordinal 

NodesLevel1, ..., NodesLevelmax - list of nodes on each level from 1 to max; it
is not necessary to store nodes on level 0 as this level contains all nodes.

graphOffsetsLevel0, ..., graphOffsetsLevelmax - graph offsets for corresponding
levels from 0 to max
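
To make the proposed order of the hierarchy fields concrete, here is a minimal Java sketch that writes and reads them in the order LevelsCount, SizeLevel0..SizeLevelmax, ep, NodesLevel1..NodesLevelmax, graphOffsetsLevel0..graphOffsetsLevelmax. It is only an illustration and not Lucene code: the class name VemLevelsSketch, the use of java.io data streams, 4-byte ints for counts and ordinals, and one 8-byte graph offset per node on each level are all assumptions. The header fields (FieldNumber through dims) and docIds are left out. The lookup helper also assumes that each NodesLevel list is sorted by node ordinal.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

/** Hypothetical sketch of the hierarchy part of one .vem entry; not Lucene code. */
final class VemLevelsSketch {
  final int[] sizes;           // SizeLevel0 .. SizeLevelmax
  final int entryPoint;        // ep: node ordinal of the entry point on the top level
  final int[][] nodesByLevel;  // NodesLevel1 .. NodesLevelmax; index 0 is unused (null)
  final long[][] graphOffsets; // assumption: one offset per node on each level

  VemLevelsSketch(int[] sizes, int entryPoint, int[][] nodesByLevel, long[][] graphOffsets) {
    this.sizes = sizes;
    this.entryPoint = entryPoint;
    this.nodesByLevel = nodesByLevel;
    this.graphOffsets = graphOffsets;
  }

  /** Serializes the fields in the order shown in the diagram. */
  byte[] toBytes() {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(sizes.length);                 // LevelsCount
      for (int size : sizes) out.writeInt(size);  // SizeLevel0 .. SizeLevelmax
      out.writeInt(entryPoint);                   // ep
      for (int level = 1; level < sizes.length; level++) {
        // Level 0 is skipped: it contains every node, so no list is needed.
        for (int node : nodesByLevel[level]) out.writeInt(node);
      }
      for (long[] offsets : graphOffsets) {       // graphOffsetsLevel0 .. Levelmax
        for (long offset : offsets) out.writeLong(offset);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  /** Reads the fields back; the sizes tell the reader how long each list is. */
  static VemLevelsSketch fromBytes(byte[] data) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      int[] sizes = new int[in.readInt()];
      for (int level = 0; level < sizes.length; level++) sizes[level] = in.readInt();
      int entryPoint = in.readInt();
      int[][] nodes = new int[sizes.length][];
      for (int level = 1; level < sizes.length; level++) {
        nodes[level] = new int[sizes[level]];
        for (int i = 0; i < sizes[level]; i++) nodes[level][i] = in.readInt();
      }
      long[][] offsets = new long[sizes.length][];
      for (int level = 0; level < sizes.length; level++) {
        offsets[level] = new long[sizes[level]];
        for (int i = 0; i < sizes[level]; i++) offsets[level][i] = in.readLong();
      }
      return new VemLevelsSketch(sizes, entryPoint, nodes, offsets);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Position of a node within a level, or -1 if the node is not on that level. */
  int indexOnLevel(int level, int node) {
    if (level == 0) return (node >= 0 && node < sizes[0]) ? node : -1;
    int index = Arrays.binarySearch(nodesByLevel[level], node); // assumes sorted lists
    return index >= 0 ? index : -1;
  }

  public static void main(String[] args) {
    VemLevelsSketch meta = new VemLevelsSketch(
        new int[] {5, 2, 1}, 3,
        new int[][] {null, {1, 3}, {3}},
        new long[][] {{0, 10, 20, 30, 40}, {50, 60}, {70}});
    VemLevelsSketch copy = fromBytes(meta.toBytes());
    System.out.println(Arrays.toString(copy.sizes) + " ep=" + copy.entryPoint);
  }
}
```

Writing the level sizes before the node lists and offsets means a reader can size every array up front, and a position returned by indexOnLevel can be used to index graphOffsets for that level.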

 

 


was (Author: mayya):
Proposed .vem index file structure:

 
{code:java}
+-------------+--------+----------+----------+----------+---------+------+-------+
| FieldNumber | SimFun | VDOffset | VDLength | VIOffset | VILength| dims | docIds
+-------------+--------+----------+----------+----------+---------+------+-------+-

---+-------------+-----------+-----+-------------+------------+-----+--------------+
ep | LevelsCount | SizeLevel0| ... | SizeLevelmax| NodesLevel1| ... | NodesLevelmax
---+-------------+-----------+-----+-------------+------------+-----+--------------+

--------------------+-----+----------------------+
 graphOffsetsLevel0 | ... | graphOffsetsLevelmax |
--------------------+-----+----------------------+
{code}
 

ep - entry point of the graph on the top level as a node ordinal 

LevelsCount - number of levels

SizeLevel0, ..., SizeLevelmax - number of nodes of each level

NodesLevel1, ..., NodesLevelmax - list of nodes on each level from 1 to max; it
is not necessary to store nodes on level 0 as this level contains all nodes.

graphOffsetsLevel0, ..., graphOffsetsLevelmax - graph offsets for corresponding
levels from 0 to max

 

 

> Handle hierarchy in HNSW graph
> ------------------------------
>
>                 Key: LUCENE-10054
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10054
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Mayya Sharipova
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently the HNSW graph is represented as a single-layer graph.
> We would like to extend it to handle hierarchy as per the
> [discussion|https://issues.apache.org/jira/browse/LUCENE-9004?focusedCommentId=17393216&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17393216].
>  
>  
> TODO tasks:
> - add multiple layers in the HnswGraph class
> - modify the format in Lucene90HnswVectorsWriter and
>   Lucene90HnswVectorsReader to handle multiple layers
> - modify graph construction and search algorithm to handle hierarchy
> - run benchmarks



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

