[ 
https://issues.apache.org/jira/browse/SOLR-12890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16948963#comment-16948963
 ] 

Matt Davis edited comment on SOLR-12890 at 10/10/19 9:52 PM:
-------------------------------------------------------------

A few years ago I did an implementation in LuMongo (now Zulia.io), which is 
Lucene based.  It used SuperBit to create a field for each bit and then used a 
minimum-should-match query based on the similarity (or higher) requested.  For 
an index of 30 million docs with 300-dimensional word vectors projected into 
1000 bits, it was taking about 30 seconds, so I figured there was probably a 
better way, but I will note this here.  There were a lot of other fields in the 
index as well.
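
A minimal sketch of the bit-signature idea described above, not the 
LuMongo/Zulia code.  It assumes plain random hyperplanes (sign random 
projection) instead of the orthogonalized batches SuperBit uses.  The class 
and method names are hypothetical.  Each vector gets one bit per hyperplane, 
and the requested cosine similarity is turned into the expected number of 
agreeing bits.  That number is the kind of value one would pass as the 
minimum-should-match count over the per-bit fields.

```java
import java.util.Random;

// Hypothetical illustration class; nothing here comes from Zulia or Solr.
class SignatureSketch {

    // Draws `bits` random Gaussian hyperplanes of dimension `dim`.
    static double[][] randomHyperplanes(int bits, int dim, long seed) {
        Random rnd = new Random(seed);
        double[][] planes = new double[bits][dim];
        for (double[] plane : planes) {
            for (int j = 0; j < dim; j++) {
                plane[j] = rnd.nextGaussian();
            }
        }
        return planes;
    }

    // One bit per hyperplane: true if the vector is on its non-negative side.
    static boolean[] signature(double[] vector, double[][] planes) {
        boolean[] sig = new boolean[planes.length];
        for (int i = 0; i < planes.length; i++) {
            double dot = 0;
            for (int j = 0; j < vector.length; j++) {
                dot += planes[i][j] * vector[j];
            }
            sig[i] = dot >= 0;
        }
        return sig;
    }

    // Expected number of agreeing bits for a given cosine similarity, using
    // P(bit agrees) = 1 - angle / pi for random hyperplanes.  This is a mean,
    // so a real query would probably subtract some slack from it.
    static int minShouldMatch(double cosine, int bits) {
        double clamped = Math.max(-1.0, Math.min(1.0, cosine));
        double angle = Math.acos(clamped);
        return (int) Math.floor(bits * (1.0 - angle / Math.PI));
    }

    // Counts positions where two signatures agree.
    static int matchingBits(boolean[] a, boolean[] b) {
        int n = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] == b[i]) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        double[][] planes = randomHyperplanes(1000, 4, 7L);
        double[] a = {1.0, 0.5, -0.2, 0.3};
        double[] b = {0.9, 0.6, -0.1, 0.4};
        int agree = matchingBits(signature(a, planes), signature(b, planes));
        System.out.println("agreeing bits: " + agree);
        System.out.println("threshold for cosine 0.9: " + minShouldMatch(0.9, 1000));
    }
}
```

With one indexed field per bit, every query becomes a boolean query with as 
many optional clauses as there are bits, which may be part of why the 
1000-bit setup above was slow.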

[https://github.com/zuliaio/zuliasearch/blob/master/zulia-server/src/main/java/io/zulia/server/index/ZuliaIndex.java#L347]

[https://github.com/lumongo/lumongo/issues/116]

 


was (Author: mdavis95):
A few years ago I did an implementation in LuMongo (now Zulia.io), which is 
Lucene based.  It used SuperBit to create a field for each bit and then used a 
minimum-should-match query based on the similarity (or higher) requested.  For 
an index of 30 million docs with 300-dimensional word vectors projected into 
1000 bits, it was taking about 30 seconds, so I figured there was probably a 
better way, but I will note this here.  There were a lot of other fields in the 
index as well.

[https://github.com/zuliaio/zuliasearch/blob/master/zulia-server/src/main/java/io/zulia/server/index/ZuliaIndex.java#L347]


> Vector Search in Solr (Umbrella Issue)
> --------------------------------------
>
>                 Key: SOLR-12890
>                 URL: https://issues.apache.org/jira/browse/SOLR-12890
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: mosh
>            Priority: Major
>
> We have recently come across a need to index documents containing vectors 
> using Solr, and have even worked on a small POC. We used a URP to calculate 
> the LSH (we chose the SuperBit algorithm, but the code is designed so that 
> the algorithm can easily be changed), and stored the vector in either sparse 
> or dense form in a binary field.
> Perhaps the addition of an LSH URP, in conjunction with a query parser that 
> uses the same properties to calculate the LSH (or maybe ktree, or some other 
> algorithm altogether), should be considered as a Solr feature?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
