Re: Tokenising on Each Letter

2010-08-24 Thread Erick Erickson
> Nikolas, thanks a lot for that, I've just given it a quick test and it definitely seems to work for the examples I gave.
>
> Thanks again,
>
> Scott
>
> From: Nikolas Tautenhahn [via Lucene]
> Sent: Monday, August 23, 2010 3:14 PM
> To: Scottie
> Subject: Re: Tokenising on Each Letter
>
> Hi Scotti

Re: Tokenising on Each Letter

2010-08-23 Thread Scottie
Nikolas, thanks a lot for that, I've just given it a quick test and it definitely seems to work for the examples I gave.

Thanks again,

Scott

From: Nikolas Tautenhahn [via Lucene]
Sent: Monday, August 23, 2010 3:14 PM
To: Scottie
Subject: Re: Tokenising on Each Letter

Re: Tokenising on Each Letter

2010-08-23 Thread Nikolas Tautenhahn
Hi Scottie,

> Could you elaborate about n-gram for me, based on my schema?

Just a quick reply. The quoted schema shows a field type with `positionIncrementGap="100"` and a filter configured with `generateNumberParts="0" catenateWords="1" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1" splitOnNumerics="0
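Nikolas's n-gram suggestion can be sketched as a Solr field type. This is a hedged illustration, not the schema actually posted in the thread: the field type name, tokenizer choice, and gram sizes are assumptions.

```xml
<!-- Hypothetical fieldType illustrating per-letter prefix matching via edge
     n-grams. Name and gram sizes are assumptions, not from the thread. -->
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Emits every prefix of each token, e.g. "ads12" ->
         "a", "ad", "ads", "ads1", "ads12" -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <!-- Query side stays un-grammed so a partial query matches the stored grams -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Applying the n-gram filter only at index time is the usual pattern: the index stores every prefix, and the query term is matched against them directly.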

Re: Tokenising on Each Letter

2010-08-23 Thread Scottie
Probably a good idea to post the relevant information! I guess I thought it would be a really obvious answer, but it seems it's a bit more complex ;) It seems you may be correct about the catenat

Re: Tokenising on Each Letter

2010-08-22 Thread Erick Erickson
I suspect (though I can't say for sure, since you didn't include your schema definition, both the field type and the actual field definition) that your problem stems from WordDelimiterFilterFactory options. The default in the schema usually has catenateAll="0", in which case you have the tokens "ads" and "12" but not "ads12".
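Erick's point about catenateAll can be illustrated with a sketch of the filter configuration. This is an assumed example, not the poster's actual schema; the surrounding attribute values are guesses.

```xml
<!-- With catenateAll="0" (the common default), "ads12" is split into "ads"
     and "12" only. Setting catenateAll="1" additionally emits the joined
     token "ads12", so a query for the whole term can match. -->
<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1"
        generateNumberParts="1"
        catenateWords="0"
        catenateNumbers="0"
        catenateAll="1"
        splitOnCaseChange="1"/>
```

Note that changing index-time analysis like this requires re-indexing before the new tokens appear in the index.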