Hi Erick,

In the issue you forwarded to me, they want to make a single token out of all the tokens received from the token stream. In my case I want to keep the original tokens unchanged and create one extra token that is the concatenation of all of them.
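A rough sketch of the behaviour described above, in plain Java rather than against the Lucene TokenFilter API. The names ConcatTokenSketch, Token and withConcatenated are made up for illustration, and giving the extra token the same position as the last input token is an assumption, not something settled in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of an analyzed token stream; NOT the Lucene TokenFilter API.
class ConcatTokenSketch {

    // Hypothetical token type: term text plus its position in the stream.
    static final class Token {
        final String term;
        final int position;

        Token(String term, int position) {
            this.term = term;
            this.position = position;
        }

        @Override
        public String toString() {
            return term + "@" + position;
        }
    }

    // Returns the input tokens unchanged, followed by one extra token whose
    // term is the concatenation of all input terms.
    // Assumption: the extra token shares the position of the last input token.
    static List<Token> withConcatenated(List<Token> input) {
        List<Token> out = new ArrayList<>(input);
        if (input.size() < 2) {
            return out; // nothing to concatenate for zero or one token
        }
        StringBuilder joined = new StringBuilder();
        for (Token t : input) {
            joined.append(t.term);
        }
        int lastPosition = input.get(input.size() - 1).position;
        out.add(new Token(joined.toString(), lastPosition));
        return out;
    }

    public static void main(String[] args) {
        List<Token> stemmed = new ArrayList<>();
        stemmed.add(new Token("solr", 1));
        stemmed.add(new Token("train", 2));
        System.out.println(withConcatenated(stemmed));
    }
}
```

For the "Solr training" example, the stemmed tokens "solr" and "train" pass through as they are and "solrtrain" is appended. A real filter would have to buffer the terms while the stream is consumed and emit the extra token once the input is exhausted.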
> I'd guess, is the case
> here. I mean do you really want to concatenate 50 tokens?

We are applying it to the *title field* of products, so the maximum length would be about 10 tokens, I guess, and even that only in rare cases.

With Regards
Aman Tandon

On Wed, Jun 17, 2015 at 7:16 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> If you used the JIRA I linked, vote for it, add any improvements etc.
> Anyone can attach a patch to a JIRA, you just have to create a login.
>
> That said, this may be too rare a use-case to deal with. I just thought
> of shingling which I should have suggested before that will work for
> concatenating small numbers of tokens which, I'd guess, is the case
> here. I mean do you really want to concatenate 50 tokens?
>
> Best,
> Erick
>
> On Wed, Jun 17, 2015 at 12:07 AM, Aman Tandon <amantandon...@gmail.com>
> wrote:
>
> > Dear Erick,
> >
> > e.g. Solr training
> >> *Porter:-*         "solr"  "train"
> >> Position             1       2
> >> *Concatenated :-*  "solr"  "train"
> >>                    "solrtrain"
> >> Position             1       2
> >
> >
> > I did implemented the filter as per my requirement. Thank you so much for
> > your help and guidance. So how could I contribute it to the solr.
> >
> > With Regards
> > Aman Tandon
> >
> > On Wed, Jun 17, 2015 at 10:14 AM, Aman Tandon <amantandon...@gmail.com>
> > wrote:
> >
> >> Hi Erick,
> >>
> >> Thank you so much, it will be helpful for me to learn how to save the
> >> state of token. I has no idea of how to save state of previous tokens due
> >> to this it was difficult to generate a concatenated token in the last.
> >>
> >> So is there anything should I read to learn more about it.
> >>
> >> With Regards
> >> Aman Tandon
> >>
> >> On Wed, Jun 17, 2015 at 9:20 AM, Erick Erickson <erickerick...@gmail.com>
> >> wrote:
> >>
> >>> I really question the premise, but have a look at:
> >>> https://issues.apache.org/jira/browse/SOLR-7193
> >>>
> >>> Note that this is not committed and I haven't reviewed
> >>> it so I don't have anything to say about that. And you'd
> >>> have to implement it as a custom Filter.
> >>>
> >>> Best,
> >>> Erick
> >>>
> >>> On Tue, Jun 16, 2015 at 5:55 PM, Aman Tandon <amantandon...@gmail.com>
> >>> wrote:
> >>>
> >>> > Hi,
> >>> >
> >>> > Any guesses, how could I achieve this behaviour.
> >>> >
> >>> > With Regards
> >>> > Aman Tandon
> >>> >
> >>> > On Tue, Jun 16, 2015 at 8:15 PM, Aman Tandon <amantandon...@gmail.com>
> >>> > wrote:
> >>> >
> >>> >> e.g. Intent for solr training: fq=id: 234, 456, 545 title("solr training")
> >>> >>
> >>> >>
> >>> >> typo error
> >>> >> e.g. Intent for solr training: fq=id:(234 456 545) title:("solr training")
> >>> >>
> >>> >> With Regards
> >>> >> Aman Tandon
> >>> >>
> >>> >> On Tue, Jun 16, 2015 at 8:13 PM, Aman Tandon <amantandon...@gmail.com>
> >>> >> wrote:
> >>> >>
> >>> >>> We has some business logic to search the user query in "user intent" or
> >>> >>> "finding the exact matching products".
> >>> >>>
> >>> >>> e.g. Intent for solr training: fq=id: 234, 456, 545 title("solr training")
> >>> >>>
> >>> >>> As we can see it is phrase query so it will took more time than the
> >>> >>> single stemmed token query. There are also 5-7 words phrase query. So we
> >>> >>> want to reduce the search time by implementing this feature.
> >>> >>>
> >>> >>> With Regards
> >>> >>> Aman Tandon
> >>> >>>
> >>> >>> On Tue, Jun 16, 2015 at 6:42 PM, Alessandro Benedetti <
> >>> >>> benedetti.ale...@gmail.com> wrote:
> >>> >>>
> >>> >>>> Can I ask you why you need to concatenate the tokens ? Maybe we can find a
> >>> >>>> better solution to concat all the tokens in one single big token .
> >>> >>>> I find it difficult to understand the reasons behind tokenising, token
> >>> >>>> filtering and then un-tokenizing again :)
> >>> >>>> It would be great if you explain a little bit better what you would like to
> >>> >>>> do !
> >>> >>>>
> >>> >>>>
> >>> >>>> Cheers
> >>> >>>>
> >>> >>>> 2015-06-16 13:26 GMT+01:00 Aman Tandon <amantandon...@gmail.com>:
> >>> >>>>
> >>> >>>> > Hi,
> >>> >>>> >
> >>> >>>> > I have a requirement to create the concatenated token of all the tokens
> >>> >>>> > created from the last item of my analyzer chain.
> >>> >>>> >
> >>> >>>> > *Suppose my analyzer chain is :*
> >>> >>>> >
> >>> >>>> > <tokenizer class="solr.WhitespaceTokenizerFactory" />
> >>> >>>> > <filter class="solr.WordDelimiterFilterFactory" catenateAll="1" splitOnNumerics="1" preserveOriginal="1"/>
> >>> >>>> > <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front" />
> >>> >>>> > <filter class="solr.PorterStemmerFilterFactory"/>
> >>> >>>> >
> >>> >>>> > I want to create a concatenated token plugin to add at concatenated
> >>> >>>> > token along with the last token.
> >>> >>>> >
> >>> >>>> > e.g. Solr training
> >>> >>>> >
> >>> >>>> > *Porter:-*         "solr"  "train"
> >>> >>>> > Position             1       2
> >>> >>>> >
> >>> >>>> > *Concatenated :-*  "solr"  "train"
> >>> >>>> >                    "solrtrain"
> >>> >>>> > Position             1       2
> >>> >>>> >
> >>> >>>> > Please help me out. How to create custom filter for this requirement.
> >>> >>>> >
> >>> >>>> > With Regards
> >>> >>>> > Aman Tandon
> >>> >>>> >
> >>> >>>>
> >>> >>>>
> >>> >>>>
> >>> >>>> --
> >>> >>>> --------------------------
> >>> >>>>
> >>> >>>> Benedetti Alessandro
> >>> >>>> Visiting card : http://about.me/alessandro_benedetti
> >>> >>>>
> >>> >>>> "Tyger, tyger burning bright
> >>> >>>> In the forests of the night,
> >>> >>>> What immortal hand or eye
> >>> >>>> Could frame thy fearful symmetry?"
> >>> >>>>
> >>> >>>> William Blake - Songs of Experience -1794 England