Re: ngrams with position

Alessandro Benedetti Wed, 09 Mar 2016 15:03:31 -0800

If you store positions for your tokens (which is the default unless you
omit them), the relative position of each token is already in the index.
I have linked a blog post of mine [1] that describes the Lucene internals
in a bit more detail.
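
To make the point about stored positions concrete, here is a minimal Python
sketch of a positional index over padded trigrams. It is only a toy model
under stated assumptions, not Lucene's actual postings format: the helper
names (padded_ngrams, build_index, in_order) are made up for illustration,
and it assumes the analysis chain gives consecutive grams consecutive
positions, which is worth checking on the Solr analysis screen for your own
field type.

```python
from collections import defaultdict


def padded_ngrams(word, n=3, pad="_"):
    """Return (position, gram) pairs, padding n-1 chars in front and 1 behind."""
    padded = pad * (n - 1) + word.lower() + pad
    return [(i, padded[i:i + n]) for i in range(len(padded) - n + 1)]


def build_index(docs):
    """Toy positional index: gram -> doc_id -> list of positions."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, word in docs.items():
        for pos, gram in padded_ngrams(word):
            index[gram][doc_id].append(pos)
    return index


def in_order(index, doc_id, grams):
    """True if the grams occur at consecutive positions in the given doc."""
    if not grams:
        return False

    def positions(gram):
        # .get() avoids creating empty entries in the defaultdict.
        return index.get(gram, {}).get(doc_id, [])

    return any(
        all(start + k in positions(g) for k, g in enumerate(grams))
        for start in positions(grams[0])
    )


index = build_index({1: "Amsterdam"})
print(padded_ngrams("Amsterdam"))
print(in_order(index, 1, ["ams", "mst", "ste"]))
print(in_order(index, 1, ["ste", "ams"]))
```

The position lives next to the gram in the index rather than inside the
token text, so an order-sensitive check does not need a suffix such as
"ams2".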


Apart from that, can you explain the problem you are trying to solve?
What is the high-level user experience?
What kind of search, autocompletion or relevancy tuning are you trying to
achieve?
We can probably help better if we start from the problem :)

Cheers

[1]
http://alexbenedetti.blogspot.co.uk/2015/07/exploring-solr-internals-lucene.html

On 9 March 2016 at 15:02, elisabeth benoit <elisaelisael...@gmail.com>
wrote:

> Hello Alessandro,
>
> You may be right. What would you use to keep relative order between, for
> instance, grams
>
> __a
> _am
> ams
> mst
> ste
> ter
> erd
> rda
> dam
> am_
>
> of amsterdam? pf2 and pf3? That's all I can think of. Please let me know
> if you have more insights.
>
> Best regards,
> Elisabeth
>
> 2016-03-08 17:46 GMT+01:00 Alessandro Benedetti <abenede...@apache.org>:
>
> > Elisabeth,
> > out of curiosity, could we know what you are trying to solve with that
> > complex way of tokenisation?
> > Solr is really good at storing positions along with tokens, so I am
> > curious to know why you are mixing things up.
> >
> > Cheers
> >
> > On 8 March 2016 at 10:08, elisabeth benoit <elisaelisael...@gmail.com>
> > wrote:
> >
> > > Thanks for your answer Emir,
> > >
> > > I'll check that out.
> > >
> > > Best regards,
> > > Elisabeth
> > >
> > > 2016-03-08 10:24 GMT+01:00 Emir Arnautovic <emir.arnauto...@sematext.com>:
> > >
> > > > Hi Elisabeth,
> > > > I don't think there is such a token filter, so you would have to
> > > > create your own token filter that takes a token and emits ngram
> > > > tokens of a specific length.
> > > > It should not be too hard to create such a filter - you can take a
> > > > look at how the ngram filter is coded - yours should be simpler than
> > > > that.
> > > >
> > > > Regards,
> > > > Emir
> > > >
> > > >
> > > > On 08.03.2016 08:52, elisabeth benoit wrote:
> > > >
> > > >> Hello,
> > > >>
> > > > >> I'm using Solr 4.10.1. I'd like to index words as ngrams of fixed
> > > > >> length with a position at the end.
> > > > >>
> > > > >> For instance, with fixed length 3, Amsterdam would be something like:
> > > >>
> > > >>
> > > >> a0 (two spaces added at beginning)
> > > >> am1
> > > >> ams2
> > > >> mst3
> > > >> ste4
> > > >> ter5
> > > >> erd6
> > > >> rda7
> > > >> dam8
> > > >> am9 (one more space in the end)
> > > >>
> > > >> The number at the end being the position.
> > > >>
> > > >> Does anyone have a clue how to achieve this?
> > > >>
> > > >> Best regards,
> > > >> Elisabeth
> > > >>
> > > >>
> > > > --
> > > > > Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> > > > > Solr & Elasticsearch Support * http://sematext.com/
> > > >
> > > >
> > >
> >
> >
> >
> > --
> > --------------------------
> >
> > Benedetti Alessandro
> > Visiting card : http://about.me/alessandro_benedetti
> >
> > "Tyger, tyger burning bright
> > In the forests of the night,
> > What immortal hand or eye
> > Could frame thy fearful symmetry?"
> >
> > William Blake - Songs of Experience -1794 England
> >
>



