[
https://issues.apache.org/jira/browse/LUCENE-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17401661#comment-17401661
]
Michael Sokolov edited comment on LUCENE-10057 at 8/19/21, 12:24 PM:
---------------------------------------------------------------------
Oh, did my stab at this not work? I was unable to reproduce so I wasn't sure
... Thank you for hacking at it, [~dweiss]. Your patches LGTM. I don't think I
understand where the issue was coming from or why this fixed it, though. UPDATE
- I read the thread on the mailing list that explains we have a fix for
unmapping mmapped files in Directory/IndexInput, and using those classes lets
the demo benefit from it. Thanks for looking into it, [~uschindler].
Re: source data for the vectors ... I'm not sure what you mean there; these are
a small sample of the (from our perspective precomputed) embeddings downloaded
from https://nlp.stanford.edu/projects/glove/ (there is something about it in
the package-info.java). Originally they were produced by training on a large
corpus of text (I think these are from a collection of 6B Twitter and other
texts).
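
For anyone unfamiliar with the source data, here is a minimal sketch of reading
one entry, assuming the plain-text layout of the published GloVe downloads (a
token followed by whitespace-separated float components on each line). The
class name GloveLine and its parse method are hypothetical and are not part of
the Lucene demo or of KnnVectorDict.

```java
import java.util.Arrays;

public class GloveLine {
    /** One parsed entry: a token and its embedding vector. */
    public static final class Entry {
        public final String word;
        public final float[] vector;

        Entry(String word, float[] vector) {
            this.word = word;
            this.vector = vector;
        }
    }

    /**
     * Parses a single line of the assumed form "token v1 v2 ... vn".
     * The dimension is inferred from the number of values on the line.
     */
    public static Entry parse(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length < 2) {
            throw new IllegalArgumentException(
                "expected a token followed by at least one value: " + line);
        }
        float[] vector = new float[parts.length - 1];
        for (int i = 1; i < parts.length; i++) {
            vector[i - 1] = Float.parseFloat(parts[i]);
        }
        return new Entry(parts[0], vector);
    }

    public static void main(String[] args) {
        // Made-up three-dimensional sample line, not real GloVe data.
        Entry e = parse("the 0.5 -1.25 3.0");
        System.out.println(e.word + " -> " + Arrays.toString(e.vector));
    }
}
```

A real loader would also check that every line has the same dimension; that
check is left out here to keep the sketch short.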
> Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict
> ----------------------------------------------------------------------
>
> Key: LUCENE-10057
> URL: https://issues.apache.org/jira/browse/LUCENE-10057
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Dawid Weiss
> Priority: Major
> Attachments: LUCENE-10057.patch
>
>