Well, I get the same results in 1.4 and 3.6. The only difference is I didn't put <http://cbrsrvmtr04:8983/solr/WISE/admin/file/?file=schema.xml> in.
In both cases the 12 is missing from the query analysis but is present in the index analysis, because catenateNumbers is 1 in one case and 0 in the other. So I'm guessing there's something else going on that you're overlooking, but I don't have any good clue....

Best
Erick

On Wed, Nov 28, 2012 at 4:34 AM, Frederico Azeiteiro <frederico.azeite...@cision.com> wrote:

> I just reloaded both indexes to make sure that all definitions are loaded.
> In the Analysis tool I can see differences, even though the fields are
> defined the same way:
>
> Query Analyser for 3.6.1
> org.apache.solr.analysis.WordDelimiterFilterFactory
> {protected=protwords.txt, splitOnCaseChange=1, generateNumberParts=0,
> catenateWords=0, luceneMatchVersion=LUCENE_36, generateWordParts=1,
> catenateAll=0, catenateNumbers=0}
> term text: GAMES
>
> Query Analyser for 1.4.0
> org.apache.solr.analysis.WordDelimiterFilterFactory
> {protected=protwords.txt, splitOnCaseChange=1, generateNumberParts=0,
> catenateWords=0, generateWordParts=1, catenateAll=0, catenateNumbers=0}
> term text: GAMES | 12
>
> The "12" is lost at query time in 3.6.1.
> The only difference I can see in the field definition is
> "luceneMatchVersion=LUCENE_36"... Could it cause this issue?
>
> Thank you.
> Frederico
>
> -----Original message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Tuesday, 27 November 2012 12:26
> To: solr-user@lucene.apache.org
> Subject: Re: Search differences between solr 1.4.0 and 3.6.1
>
> Using the definition you provided, I don't get the same output. Are you
> sure you are doing what you think? The generateNumberParts=0 keeps the '12'
> from making it through the filter in 1.4 and 3.6 so I suspect you're not
> quite doing something the same way in both.
>
> Perhaps looking at index tokenization in one and query in the other?
>
> Best
> Erick
>
> On Mon, Nov 26, 2012 at 9:06 AM, Frederico Azeiteiro <frederico.azeite...@cision.com> wrote:
>
> > Hi,
> >
> > While updating our SOLR to 3.6.1 I noticed some differences in results
> > when using search strings with letters+numbers.
> >
> > For a text field defined as:
> >
> > <analyzer type="index">
> > <http://cbrsrvmtr04:8983/solr/WISE/admin/file/?file=schema.xml>
> >   <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >   <charFilter class="solr.MappingCharFilterFactory"
> >     mapping="mapping-ISOLatin1Accent.txt"/>
> >   <filter class="solr.WordDelimiterFilterFactory"
> >     protected="protwords.txt" splitOnCaseChange="1" catenateAll="0"
> >     catenateNumbers="1" catenateWords="1" generateNumberParts="0"
> >     generateWordParts="1" stemEnglishPossessive="0"/>
> > </analyzer>
> >
> > <analyzer type="query">
> > <http://cbrsrvmtr04:8983/solr/WISE/admin/file/?file=schema.xml>
> >   <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >   <filter class="solr.SynonymFilterFactory" ignoreCase="true"
> >     expand="true" synonyms="synonyms.txt"/>
> >   <filter class="solr.WordDelimiterFilterFactory"
> >     protected="protwords.txt" splitOnCaseChange="1" catenateAll="0"
> >     catenateNumbers="0" catenateWords="0" generateNumberParts="0"
> >     generateWordParts="1"/>
> > </analyzer>
> >
> > Searching for the string GAMES12 returns a lot of results on 3.6.1 that
> > are not returned on 1.4.0.
> >
> > It looks like WordDelimiterFilterFactory is acting differently in
> > 3.6.1: the numeric part of the keyword is being ignored and the search
> > is performed using only GAMES.
> >
> > Analysis returns for 1.4.0:
> >
> > org.apache.solr.analysis.WordDelimiterFilterFactory
> > {protected=protwords.txt, splitOnCaseChange=1, generateNumberParts=0,
> > catenateWords=0, generateWordParts=1, catenateAll=0,
> > catenateNumbers=0}
> >
> > term position      1      2
> > term text          GAMES  12
> > term type          word   word
> > source start,end   0,5    5,7
> > payload
> >
> > And for 3.6.1:
> >
> > org.apache.solr.analysis.WordDelimiterFilterFactory
> > {protected=protwords.txt, splitOnCaseChange=1, generateNumberParts=0,
> > catenateWords=0, luceneMatchVersion=LUCENE_36, generateWordParts=1,
> > catenateAll=0, catenateNumbers=0}
> >
> > position        1
> > term text       GAMES
> > startOffset     0
> > endOffset       5
> > type            word
> > positionLength  1
> >
> > Is this something that can be modified/fixed to return the same results?
> >
> > Thank you.
> >
> > Regards,
> > Frederico
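
For readers following the thread, here is a rough sketch of the behaviour Erick describes: with generateNumberParts=0 the "12" in GAMES12 is dropped unless catenateNumbers=1 puts it back. This is a toy model only, not Lucene's WordDelimiterFilter code. The function name `word_delimiter` and its parameters are hypothetical and merely mirror the factory attribute names. It ignores splitOnCaseChange, protected words, positions and offsets, and it makes no claim about why the 1.4.0 analysis page showed the "12".

```python
import re


def word_delimiter(token, generate_word_parts=True,
                   generate_number_parts=True, catenate_numbers=False):
    """Toy model: split a token on letter/digit boundaries and apply flags.

    Hypothetical helper for illustration; not the Lucene implementation.
    """
    # Runs of ASCII letters and runs of digits; anything else is a delimiter.
    parts = re.findall(r"[A-Za-z]+|[0-9]+", token)
    out = []
    number_run = []  # consecutive number parts awaiting catenation

    def flush():
        if catenate_numbers and number_run:
            # Skip a single part that was already emitted on its own.
            if not (generate_number_parts and len(number_run) == 1):
                out.append("".join(number_run))
        number_run.clear()

    for part in parts:
        if part.isdigit():
            if generate_number_parts:
                out.append(part)
            number_run.append(part)
        else:
            flush()
            if generate_word_parts:
                out.append(part)
    flush()
    return out


# Query analyzer settings from the thread: number part dropped.
print(word_delimiter("GAMES12", generate_number_parts=False,
                     catenate_numbers=False))
# Index analyzer settings from the thread: catenation keeps the number.
print(word_delimiter("GAMES12", generate_number_parts=False,
                     catenate_numbers=True))
```

Under this simplified model, the query-side settings yield only GAMES while the index-side settings yield GAMES and 12, which matches what Erick reports seeing on both versions.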