[
https://issues.apache.org/jira/browse/LUCENE-10054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17401172#comment-17401172
]
Mayya Sharipova edited comment on LUCENE-10054 at 8/19/21, 11:44 AM:
---------------------------------------------------------------------
Proposed .vem index file structure:
{code:java}
+-------------+--------+----------+----------+----------+---------+-------
| FieldNumber | SimFun | VDOffset | VDLength | VIOffset | VILength| dims
+-------------+--------+----------+----------+----------+---------+-------
+-------------+-----------+-----+-------------+--------+
| LevelsCount | SizeLevel0| ... | SizeLevelmax| docIds
+-------------+-----------+-----+-------------+--------+
---+------------+-----+--------------+
ep | NodesLevel1| ... | NodesLevelmax
---+------------+-----+--------------+
--------------------+-----+----------------------+
graphOffsetsLevel0 | ... | graphOffsetsLevelmax |
--------------------+-----+----------------------+
{code}
LevelsCount - number of levels in the graph
SizeLevel0, ..., SizeLevelmax - number of nodes on each level
ep - entry point of the graph on the top level as a node ordinal
NodesLevel1, ..., NodesLevelmax - list of nodes on each level from 1 to max; it
is not necessary to store nodes on level 0 as this level contains all nodes.
graphOffsetsLevel0, ..., graphOffsetsLevelmax - graph offsets for the
corresponding levels from 0 to max
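To make the proposed layout concrete, here is a minimal Java sketch of writing and reading the level-related part of one field's entry (LevelsCount through graphOffsetsLevelmax). It is only an illustration under stated assumptions, not the actual Lucene90HnswVectorsWriter/Reader code: the class and method names are hypothetical, it uses java.io streams rather than Lucene's IndexOutput/IndexInput, it assumes fixed-width ints and longs, it assumes docIds has SizeLevel0 entries, and it assumes each level stores one graph offset per node on that level. The leading FieldNumber ... dims header is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

/** Hypothetical sketch of the level metadata in the proposed .vem layout. */
final class HnswLevelsMetaSketch {

  /** Decoded level metadata; every name here is illustrative. */
  static final class LevelsMeta {
    final int[] sizes;           // SizeLevel0 .. SizeLevelmax
    final int[] docIds;          // assumed to hold SizeLevel0 entries
    final int entryPoint;        // ep: node ordinal on the top level
    final int[][] nodesByLevel;  // index 0 stays empty: level 0 holds every node
    final long[][] graphOffsets; // assumed: one offset per node on each level

    LevelsMeta(int[] sizes, int[] docIds, int entryPoint,
               int[][] nodesByLevel, long[][] graphOffsets) {
      this.sizes = sizes;
      this.docIds = docIds;
      this.entryPoint = entryPoint;
      this.nodesByLevel = nodesByLevel;
      this.graphOffsets = graphOffsets;
    }
  }

  /** Serializes the fields in the order shown in the diagram. */
  static byte[] write(LevelsMeta meta) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      int levels = meta.sizes.length;
      out.writeInt(levels);                            // LevelsCount
      for (int size : meta.sizes) out.writeInt(size);  // SizeLevel0..max
      for (int docId : meta.docIds) out.writeInt(docId);
      out.writeInt(meta.entryPoint);                   // ep
      for (int level = 1; level < levels; level++) {   // level 0 is implicit
        for (int node : meta.nodesByLevel[level]) out.writeInt(node);
      }
      for (int level = 0; level < levels; level++) {
        for (long offset : meta.graphOffsets[level]) out.writeLong(offset);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  /** Reads the fields back; the level sizes drive every later array length. */
  static LevelsMeta read(byte[] data) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      int levels = in.readInt();
      int[] sizes = new int[levels];
      for (int i = 0; i < levels; i++) sizes[i] = in.readInt();
      int[] docIds = new int[sizes[0]];                // assumes levels >= 1
      for (int i = 0; i < docIds.length; i++) docIds[i] = in.readInt();
      int entryPoint = in.readInt();
      int[][] nodes = new int[levels][];
      nodes[0] = new int[0];                           // not stored on disk
      for (int level = 1; level < levels; level++) {
        nodes[level] = new int[sizes[level]];
        for (int i = 0; i < sizes[level]; i++) nodes[level][i] = in.readInt();
      }
      long[][] offsets = new long[levels][];
      for (int level = 0; level < levels; level++) {
        offsets[level] = new long[sizes[level]];
        for (int i = 0; i < sizes[level]; i++) offsets[level][i] = in.readLong();
      }
      return new LevelsMeta(sizes, docIds, entryPoint, nodes, offsets);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Two levels: three nodes on level 0, and node 2 alone on level 1.
    LevelsMeta meta = new LevelsMeta(
        new int[] {3, 1}, new int[] {10, 20, 30}, 2,
        new int[][] {{}, {2}}, new long[][] {{0L, 16L, 32L}, {48L}});
    LevelsMeta back = read(write(meta));
    System.out.println("sizes=" + Arrays.toString(back.sizes)
        + " ep=" + back.entryPoint
        + " level1=" + Arrays.toString(back.nodesByLevel[1]));
  }
}
```

Because the level sizes come first, a reader can size every later array without extra length prefixes, which is presumably why SizeLevel0, ..., SizeLevelmax precede the node lists and offsets in the diagram.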
was (Author: mayya):
Proposed .vem index file structure:
{code:java}
+-------------+--------+----------+----------+----------+---------+------+-------+
| FieldNumber | SimFun | VDOffset | VDLength | VIOffset | VILength| dims | docIds
+-------------+--------+----------+----------+----------+---------+------+-------+-
+-------------+-----------+-----+-------------+--------+
| LevelsCount | SizeLevel0| ... | SizeLevelmax| docIds
+-------------+-----------+-----+-------------+--------+
---+------------+-----+--------------+
ep | NodesLevel1| ... | NodesLevelmax
---+------------+-----+--------------+
--------------------+-----+----------------------+
graphOffsetsLevel0 | ... | graphOffsetsLevelmax |
--------------------+-----+----------------------+
{code}
LevelsCount - number of levels
SizeLevel0, ..., SizeLevelmax - number of nodes of each level
ep - entry point of the graph on the top level as a node ordinal
NodesLevel1, ..., NodesLevelmax - list of nodes on each level from 1 to max; it
is not necessary to store nodes on level 0 as this level contains all nodes.
graphOffsetsLevel0, ..., graphOffsetsLevelmax - graph offsets for corresponding
levels from 0 to max
> Handle hierarchy in HNSW graph
> ------------------------------
>
> Key: LUCENE-10054
> URL: https://issues.apache.org/jira/browse/LUCENE-10054
> Project: Lucene - Core
> Issue Type: Task
> Reporter: Mayya Sharipova
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Currently the HNSW graph is represented as a single-layer graph.
> We would like to extend it to handle hierarchy as per
> [discussion|https://issues.apache.org/jira/browse/LUCENE-9004?focusedCommentId=17393216&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17393216].
>
>
> TODO tasks:
> - add multiple layers in the HnswGraph class
> - modify the format in Lucene90HnswVectorsWriter and
> Lucene90HnswVectorsReader to handle multiple layers
> - modify graph construction and search algorithm to handle hierarchy
> - run benchmarks
--
This message was sent by Atlassian Jira
(v8.3.4#803005)