You may already know this, but be careful. Embeddings are useful, but
people often think of them as detecting synonyms when they really just
encode contexts. For example, antonyms and words with similar functions
often come out as similar.
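
To make the antonym point concrete, here is a minimal sketch using plain
co-occurrence counts as a stand-in for a learned embedding. The corpus,
window size, and function names are all made up for illustration; word2vec
and GloVe are more sophisticated, but they learn from the same kind of
context signal.

```python
import math
from collections import Counter, defaultdict

# Toy corpus, invented for illustration: "hot" and "cold" are antonyms
# but show up in nearly identical contexts.
CORPUS = [
    "the soup is hot today",
    "the soup is cold today",
    "the water is hot today",
    "the water is cold today",
    "the dog chased the ball",
    "the cat chased the ball",
]


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


vectors = context_vectors(CORPUS)
hot_cold = cosine(vectors["hot"], vectors["cold"])
hot_dog = cosine(vectors["hot"], vectors["dog"])
print(f"hot/cold: {hot_cold:.2f}  hot/dog: {hot_dog:.2f}")
```

On this toy data the antonym pair scores far higher than the unrelated
pair, simply because "hot" and "cold" share their neighbours.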

There are also issues with terms that occur sparsely (you don't get enough
contexts to get a good embedding) and with terms that occur very commonly
(they tend to clump together despite different meanings).
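
One rough way to spot both problem groups before trusting the vectors is to
look at raw frequencies. This is only a sketch: the function name and both
thresholds (min_count, max_doc_ratio) are hypothetical and would need
tuning for a real corpus.

```python
from collections import Counter


def flag_unreliable_terms(docs, min_count=5, max_doc_ratio=0.5):
    """Return (too_rare, too_common) sets of terms.

    too_rare:   terms seen fewer than `min_count` times overall, so they
                have few contexts to learn from.
    too_common: terms present in more than `max_doc_ratio` of documents.
    Both thresholds are illustrative assumptions, not recommended values.
    """
    term_counts = Counter()  # total occurrences of each term
    doc_counts = Counter()   # number of documents containing each term
    for doc in docs:
        tokens = doc.lower().split()
        term_counts.update(tokens)
        doc_counts.update(set(tokens))

    n_docs = len(docs)
    too_rare = {t for t, c in term_counts.items() if c < min_count}
    too_common = {
        t for t, c in doc_counts.items()
        if n_docs and c / n_docs > max_doc_ratio
    }
    return too_rare, too_common


docs = ["the solr index", "the lucene index", "the rare zebra", "the query"]
too_rare, too_common = flag_unreliable_terms(
    docs, min_count=2, max_doc_ratio=0.6
)
print(sorted(too_rare), sorted(too_common))
```

Terms in either set are candidates for special handling (for example,
falling back to plain keyword matching) rather than relying on their
embeddings.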

This covers an older form of embedding, but the lessons still apply:
https://opensourceconnections.com/blog/2016/03/29/semantic-search-with-latent-semantic-analysis/

I'd also recommend my talk at Activate, which spends a lot of time on
building/customizing embeddings for your use case:
https://docs.google.com/presentation/d/1-nPKX5VYUR7uue5IL0tm7M2YH0agb0aRO1y9sMKl1Hs/edit#slide=id.g3abdd68a3e_0_192

-Doug

On Tue, Oct 30, 2018 at 5:37 PM Benedict Holland <
benedict.m.holl...@gmail.com> wrote:

> Oh very cool. I will have to look into this more. This is something up and
> coming I take it?
>
> Thanks,
> ~Ben
>
> On Tue, Oct 30, 2018 at 4:36 PM Alexandre Rafalovitch <arafa...@gmail.com>
> wrote:
>
> > Simon Hughes' presentation at the just-finished Activate may be relevant:
> >
> > https://www.slideshare.net/SimonHughes13/vectors-in-search-towards-more-semantic-matching
> > The video will be available in a couple of weeks, I am guessing on the
> > LucidWorks channel.
> >
> > Related repos:
> > *) https://github.com/DiceTechJobs/VectorsInSearch
> > *) https://github.com/DiceTechJobs/ConceptualSearch (older)
> > *) https://github.com/kojisekig/word2vec-lucene - something else quite old
> >
> > These are just keyword matches on your question. I am sure others may
> > have some more real details.
> >
> > Regards,
> >    Alex.
> > On Tue, 30 Oct 2018 at 16:09, Benedict Holland
> > <benedict.m.holl...@gmail.com> wrote:
> > >
> > > Hello all,
> > >
> > > We came up with a fascinating question. We actually have word2vec,
> > > doc2vec, and GloVe results for our corpora. Is it possible to use these
> > > datasets within the search engine? If so, could you please point me to
> > > documentation on how to get Solr to use them?
> > >
> > > Thank you so much,
> > > ~Ben
> >
>
-- 
CTO, OpenSource Connections
Author, Relevant Search
http://o19s.com/doug
