dantuzi opened a new pull request, #12169:
URL: https://github.com/apache/lucene/pull/12169

   To expand queries or documents with synonyms in Apache Lucene, you need a 
predefined file listing the terms that share the same meaning.
   A list of basic synonyms for a given language is not always easy to find 
and, even when one exists, it doesn't necessarily match your domain.
   For example, in operating-system articles the term "daemon" is not a 
synonym of "devil"; it is closer to the term "process".
   
   Word2Vec is a two-layer neural network that takes a text corpus as input 
and outputs a vector representation for each word in its vocabulary.
   Words with similar meanings are mapped to vectors that lie close to each 
other.
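
   To make "close to each other" concrete, here is a minimal sketch that measures closeness with cosine similarity, a common choice for word vectors. The class name and the three-dimensional vectors are made up for illustration; they do not come from a trained model or from this pull request.

```java
public final class CosineSimilarityExample {

  /** Cosine similarity between two vectors of the same dimension. */
  public static double cosine(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("dimension mismatch");
    }
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      normA += a[i] * a[i];
      normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
  }

  public static void main(String[] args) {
    // Toy vectors (assumed values, not taken from a real Word2Vec model).
    float[] daemon = {0.9f, 0.1f, 0.0f};
    float[] process = {0.8f, 0.2f, 0.1f};
    float[] devil = {0.0f, 0.1f, 0.9f};

    // With these toy values, "daemon" scores far higher against "process"
    // than against "devil".
    System.out.println("daemon~process: " + cosine(daemon, process));
    System.out.println("daemon~devil:   " + cosine(daemon, devil));
  }
}
```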
   
   This contribution integrates this technique into the text analysis 
pipeline. It automatically generates synonyms on the fly from a Word2Vec model 
built with the DL4J library.
   Please see our presentation at the Berlin Buzzwords conference: 
https://pretalx.com/bbuzz22/talk/UYZAUX/
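
   As a rough illustration of generating synonyms on the fly from word vectors, the sketch below looks up a term's vector, scores every other term by cosine similarity, and keeps the best matches above a threshold. `ToySynonymProvider`, its method names, and the in-memory map are hypothetical; they are not the classes added by this pull request, and a real integration would load the vectors from the trained model rather than hard-code them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, for illustration only; not part of this pull request. */
public final class ToySynonymProvider {

  private final Map<String, float[]> vectors;

  public ToySynonymProvider(Map<String, float[]> vectors) {
    this.vectors = vectors;
  }

  /** Returns up to maxResults terms whose similarity to term is at least minSimilarity. */
  public List<String> synonymsOf(String term, int maxResults, double minSimilarity) {
    float[] query = vectors.get(term);
    if (query == null) {
      return List.of(); // unknown term: no synonyms
    }
    List<Map.Entry<String, Double>> scored = new ArrayList<>();
    for (Map.Entry<String, float[]> e : vectors.entrySet()) {
      if (e.getKey().equals(term)) {
        continue; // a term is not its own synonym
      }
      double similarity = cosine(query, e.getValue());
      if (similarity >= minSimilarity) {
        scored.add(Map.entry(e.getKey(), similarity));
      }
    }
    // Most similar terms first.
    scored.sort(Map.Entry.<String, Double>comparingByValue().reversed());
    List<String> result = new ArrayList<>();
    for (int i = 0; i < Math.min(maxResults, scored.size()); i++) {
      result.add(scored.get(i).getKey());
    }
    return result;
  }

  private static double cosine(float[] a, float[] b) {
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      normA += a[i] * a[i];
      normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
  }

  public static void main(String[] args) {
    // Toy vectors (assumed values, not taken from a real Word2Vec model).
    ToySynonymProvider provider = new ToySynonymProvider(Map.of(
        "daemon", new float[] {0.9f, 0.1f, 0.0f},
        "process", new float[] {0.8f, 0.2f, 0.1f},
        "devil", new float[] {0.0f, 0.1f, 0.9f}));
    System.out.println(provider.synonymsOf("daemon", 5, 0.8));
  }
}
```

   The similarity threshold and the maximum number of results are the two knobs in this sketch: a higher threshold gives fewer but closer synonyms.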
   
   #### For the reviewer:
   Almost all of this contribution consists of new code.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

