Re: How to create concatenated token

Aman Tandon Wed, 17 Jun 2015 23:28:26 -0700

Hi Erick,

In the issue you forwarded to me, they want to make a single token from all
the tokens received from the token stream, but in my case I want to keep the
original tokens and create an extra new token that is the concatenation of
all of them.
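
To make the intended output concrete, here is a minimal sketch in plain Java
with no Lucene dependency. The names ConcatTokenSketch, Token and
withConcatenated are hypothetical and only model the behaviour: every token
passes through unchanged, and one extra token joining all the terms is added
at the position of the last token. Skipping the extra token for a
single-token input is my own assumption. In a real Lucene TokenFilter the
same idea would presumably mean buffering the terms inside incrementToken()
and emitting the final token with a position increment of 0, but that
mapping is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

final class ConcatTokenSketch {

    /** A term plus its position in the stream (hypothetical model type, not a Lucene class). */
    static final class Token {
        final String term;
        final int position;

        Token(String term, int position) {
            this.term = term;
            this.position = position;
        }

        @Override
        public String toString() {
            return term + "@" + position;
        }
    }

    /**
     * Returns the input tokens unchanged, followed by one extra token that
     * joins all terms and shares the position of the last input token.
     */
    static List<Token> withConcatenated(List<Token> input) {
        List<Token> output = new ArrayList<>(input);
        // Assumption: with zero or one token there is nothing useful to add.
        if (input.size() < 2) {
            return output;
        }
        StringBuilder joined = new StringBuilder();
        for (Token t : input) {
            joined.append(t.term);
        }
        int lastPosition = input.get(input.size() - 1).position;
        output.add(new Token(joined.toString(), lastPosition));
        return output;
    }

    public static void main(String[] args) {
        List<Token> stemmed = new ArrayList<>();
        stemmed.add(new Token("solr", 1));
        stemmed.add(new Token("train", 2));
        // prints [solr@1, train@2, solrtrain@2]
        System.out.println(withConcatenated(stemmed));
    }
}
```

For the "Solr training" example this gives "solr" at position 1, and both
"train" and "solrtrain" at position 2, which is the layout shown in the
Porter / Concatenated table quoted below.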



> [...] concatenating small numbers of tokens which, I'd guess, is the case
> here. I mean do you really want to concatenate 50 tokens?

We are applying it to the *title field* of products, so the maximum length
would be about 10 tokens, I guess, and even that only in rare cases.

With Regards
Aman Tandon

On Wed, Jun 17, 2015 at 7:16 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> If you used the JIRA I linked, vote for it, add any improvements etc.
> Anyone can attach a patch to a JIRA, you just have to create a login.
>
> That said, this may be too rare a use-case to deal with. I just thought
> of shingling, which I should have suggested before; that will work for
> concatenating small numbers of tokens which, I'd guess, is the case
> here. I mean do you really want to concatenate 50 tokens?
>
> Best,
> Erick
>
> On Wed, Jun 17, 2015 at 12:07 AM, Aman Tandon <amantandon...@gmail.com>
> wrote:
> > Dear Erick,
> >
> > e.g. Solr training
> >> *Porter:-*                  "solr"  "train"
> >>   Position                     1         2
> >> *Concatenated :-*   "solr"  "train"
> >>                                            "solrtrain"
> >>    Position                     1          2
> >
> >
> > I have implemented the filter as per my requirement. Thank you so much for
> > your help and guidance. How can I contribute it to Solr?
> >
> > With Regards
> > Aman Tandon
> >
> > On Wed, Jun 17, 2015 at 10:14 AM, Aman Tandon <amantandon...@gmail.com>
> > wrote:
> >
> >> Hi Erick,
> >>
> >> Thank you so much, it will be helpful for me to learn how to save the
> >> state of a token. I had no idea how to save the state of previous tokens;
> >> because of this it was difficult to generate a concatenated token at the
> >> end.
> >>
> >> So is there anything I should read to learn more about it?
> >>
> >> With Regards
> >> Aman Tandon
> >>
> >> On Wed, Jun 17, 2015 at 9:20 AM, Erick Erickson <erickerick...@gmail.com>
> >> wrote:
> >>
> >>> I really question the premise, but have a look at:
> >>> https://issues.apache.org/jira/browse/SOLR-7193
> >>>
> >>> Note that this is not committed and I haven't reviewed
> >>> it so I don't have anything to say about that. And you'd
> >>> have to implement it as a custom Filter.
> >>>
> >>> Best,
> >>> Erick
> >>>
> >>> On Tue, Jun 16, 2015 at 5:55 PM, Aman Tandon <amantandon...@gmail.com>
> >>> wrote:
> >>> > Hi,
> >>> >
> >>> > Any suggestions on how I could achieve this behaviour?
> >>> >
> >>> > With Regards
> >>> > Aman Tandon
> >>> >
> >>> > On Tue, Jun 16, 2015 at 8:15 PM, Aman Tandon <amantandon...@gmail.com>
> >>> > wrote:
> >>> >
> >>> >> e.g. Intent for solr training: fq=id: 234, 456, 545 title("solr training")
> >>> >>
> >>> >>
> >>> >> typo error
> >>> >> e.g. Intent for solr training: fq=id:(234 456 545) title:("solr training")
> >>> >>
> >>> >> With Regards
> >>> >> Aman Tandon
> >>> >>
> >>> >> On Tue, Jun 16, 2015 at 8:13 PM, Aman Tandon <amantandon...@gmail.com>
> >>> >> wrote:
> >>> >>
> >>> >>> We have some business logic to search the user query in "user intent"
> >>> >>> or "finding the exact matching products".
> >>> >>>
> >>> >>> e.g. Intent for solr training: fq=id: 234, 456, 545 title("solr training")
> >>> >>>
> >>> >>> As we can see, it is a phrase query, so it will take more time than a
> >>> >>> single stemmed-token query. There are also phrase queries of 5-7
> >>> >>> words, so we want to reduce the search time by implementing this
> >>> >>> feature.
> >>> >>>
> >>> >>> With Regards
> >>> >>> Aman Tandon
> >>> >>>
> >>> >>> On Tue, Jun 16, 2015 at 6:42 PM, Alessandro Benedetti <
> >>> >>> benedetti.ale...@gmail.com> wrote:
> >>> >>>
> >>> >>>> Can I ask you why you need to concatenate the tokens? Maybe we can
> >>> >>>> find a better solution to concat all the tokens in one single big
> >>> >>>> token.
> >>> >>>> I find it difficult to understand the reasons behind tokenising,
> >>> >>>> token filtering and then un-tokenizing again :)
> >>> >>>> It would be great if you could explain a little better what you
> >>> >>>> would like to do!
> >>> >>>>
> >>> >>>>
> >>> >>>> Cheers
> >>> >>>>
> >>> >>>> 2015-06-16 13:26 GMT+01:00 Aman Tandon <amantandon...@gmail.com>:
> >>> >>>>
> >>> >>>> > Hi,
> >>> >>>> >
> >>> >>>> > I have a requirement to create a concatenated token of all the
> >>> >>>> > tokens produced by the last item of my analyzer chain.
> >>> >>>> >
> >>> >>>> > *Suppose my analyzer chain is:*
> >>> >>>> >
> >>> >>>> > <tokenizer class="solr.WhitespaceTokenizerFactory" />
> >>> >>>> > <filter class="solr.WordDelimiterFilterFactory" catenateAll="1"
> >>> >>>> >         splitOnNumerics="1" preserveOriginal="1"/>
> >>> >>>> > <filter class="solr.EdgeNGramFilterFactory" minGramSize="2"
> >>> >>>> >         maxGramSize="15" side="front" />
> >>> >>>> > <filter class="solr.PorterStemmerFilterFactory"/>
> >>> >>>> >
> >>> >>>> > I want to create a concatenated-token plugin that adds a
> >>> >>>> > concatenated token along with the last token.
> >>> >>>> >
> >>> >>>> > e.g. Solr training
> >>> >>>> >
> >>> >>>> > *Porter:-*                  "solr"  "train"
> >>> >>>> >   Position                     1         2
> >>> >>>> >
> >>> >>>> > *Concatenated :-*   "solr"  "train"
> >>> >>>> >                                            "solrtrain"
> >>> >>>> >    Position                     1          2
> >>> >>>> >
> >>> >>>> > Please help me out. How can I create a custom filter for this
> >>> >>>> > requirement?
> >>> >>>> >
> >>> >>>> > With Regards
> >>> >>>> > Aman Tandon
> >>> >>>> >
> >>> >>>>
> >>> >>>>
> >>> >>>>
> >>> >>>> --
> >>> >>>> --------------------------
> >>> >>>>
> >>> >>>> Benedetti Alessandro
> >>> >>>> Visiting card : http://about.me/alessandro_benedetti
> >>> >>>>
> >>> >>>> "Tyger, tyger burning bright
> >>> >>>> In the forests of the night,
> >>> >>>> What immortal hand or eye
> >>> >>>> Could frame thy fearful symmetry?"
> >>> >>>>
> >>> >>>> William Blake - Songs of Experience -1794 England
> >>> >>>>
> >>> >>>
> >>> >>>
> >>> >>
> >>>
> >>
> >>
>
