If you want test.pdf to be treated as the phrase "test pdf", as Solr 1.4 did, setting autoGeneratePhraseQueries="true" on the text_sen fieldType might work.
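
For illustration, the change would look roughly like the fragment below. This is only a sketch against the 3.6 schema quoted further down: the analyzer chain is elided and should stay exactly as you have it now. As far as I recall, this attribute defaults to false in newer schema versions, which may be why 3.6 no longer builds the phrase query, but please verify that against your own schema version.

```xml
<!-- schema.xml (Solr 3.6): sketch only, not a complete field type -->
<fieldType name="text_sen" class="solr.TextField"
           autoGeneratePhraseQueries="true">
  <analyzer>
    <!-- keep your existing charFilter, tokenizer, and filters here unchanged -->
  </analyzer>
</fieldType>
```

After reloading Solr, the parsedquery in the debug output for test.pdf should show whether fcontent_tsn_is:"test pdf" is generated again. If fname_tbg_is needs the same behavior, the text_bigram type would presumably need the same attribute.
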
Regards,
Shinichiro Abe

On 2012/05/17, at 10:39, Katsuyoshi NOGUCHI wrote:

> OK, I understand how those words are tokenized by different tokenizer
> factories.
> My question is how I can have Solr analyze and search for "test" AND
> "pdf".
> As Solr 1.4 gives the result of "test" AND "pdf", I want Solr 3.6 to do the same.
> (Solr 3.6 gives the result of "test" OR "pdf".)
>
> Any idea?
>
> 2012/5/17 Jack Krupansky <j...@basetechnology.com>
>
>> The query may be the same, but your analyzers are radically different.
>>
>> Just a hunch, but maybe GosenTokenizerFactory is treating the "." as a
>> space. In 1.4 you were using SenTokenizerFactory. Or maybe
>> GosenBasicFormFilterFactory is treating the "." as a space. In any case, my
>> hunch is that "test.pdf" gets to WDF as two separate tokens, which is the
>> query that is generated on 3.6.
>>
>> To debug, remove the filters starting with WDF and see if the "." was
>> still there before WDF was invoked. No need to reindex, just reload Solr
>> and look at the parsed query for test.pdf .
>>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: Katsuyoshi NOGUCHI
>> Sent: Wednesday, May 16, 2012 6:03 AM
>> To: solr-user@lucene.apache.org
>> Subject: Dismax query results vary on Solr1.4 and 3.6.
>>
>> Hi, guys! I need some advice.
>>
>> When sending the same dismax query to Solr 1.4 and 3.6,
>> query results of search words analyzed by WordDelimiterFilterFactory are
>> different as below:
>>
>> [Search Word]
>> test.pdf
>>
>> [Result]
>> Solr1.4: Search results are analyzed by "test" AND "pdf"
>> Solr3.6: Search results are analyzed by "test" OR "pdf"
>>
>> In Solr3.6, how can I receive the same result of "test" AND "pdf" as in
>> Solr 1.4?
>>
>> [Japanese Analyzer]
>> Solr1.4 -> Sen
>> Solr3.6 -> lucene-gosen
>>
>>
>> Here are some examples of debug results in solrAdmin:
>> /*solrAdmin debug result-1.4*/
>> <lst name="debug">
>>   <str name="rawquerystring">test.pdf</str>
>>   <str name="querystring">test.pdf</str>
>>   <str name="parsedquery">
>>     +DisjunctionMaxQuery((fcontent_tsn_is:"test pdf" | fname_tbg_is:"test pdf")) ()
>>   </str>
>>   <str name="parsedquery_toString">
>>     +(fcontent_tsn_is:"test pdf" | fname_tbg_is:"test pdf") ()
>>   </str>
>>   …
>>   <str name="QParser">DisMaxQParser</str>
>>   …
>> </lst>
>>
>> /*solrAdmin debug result-3.6*/
>> <lst name="debug">
>>   <str name="rawquerystring">test.pdf</str>
>>   <str name="querystring">test.pdf</str>
>>   <str name="parsedquery">
>>     +DisjunctionMaxQuery(((fcontent_tsn_is:test fcontent_tsn_is:pdf) | (fname_tbg_is:test fname_tbg_is:pdf)))
>>   </str>
>>   <str name="parsedquery_toString">
>>     +((fcontent_tsn_is:test fcontent_tsn_is:pdf) | (fname_tbg_is:test fname_tbg_is:pdf))
>>   </str>
>>   ...
>>   <str name="QParser">ExtendedDismaxQParser</str>
>>   …
>> </lst>
>>
>>
>> The following are the request handlers used in Solr1.4/3.6:
>>
>> /*solrconfig.xml-1.4*/
>> <requestHandler name="dismax" class="solr.SearchHandler" >
>>   <lst name="defaults">
>>     <str name="defType">dismax</str>
>>     <str name="echoParams">explicit</str>
>>     <str name="q.alt">*:*</str>
>>     <str name="qf">fcontent_tsn_is^1.0 fname_tbg_is^1.0 </str>
>>   </lst>
>> </requestHandler>
>>
>> /*solrconfig.xml-3.6*/
>> <requestHandler name="dismax" class="solr.SearchHandler" >
>>   <lst name="defaults">
>>     <str name="defType">edismax</str>
>>     <str name="echoParams">explicit</str>
>>     <str name="q.alt">*:*</str>
>>     <str name="qf">content_tsn_is^1.0 name_tbg_is^1.0</str>
>>   </lst>
>> </requestHandler>
>>
>>
>> The following are the schemas used in Solr1.4/3.6:
>> /*schema.xml-1.4*/
>> <fieldType name="text_sen" class="solr.TextField">
>>   <analyzer>
>>     <tokenizer class="solrbook.analysis.SenTokenizerFactory" />
>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>             words="stopwords.txt" enablePositionIncrements="true" />
>>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>>             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
>>             catenateAll="0" splitOnCaseChange="0"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <filter class="solr.TrimFilterFactory" />
>>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>>             tokenizerFactory="solrbook.analysis.SenTokenizerFactory"
>>             ignoreCase="true"
>>             expand="true"/>
>>   </analyzer>
>> </fieldType>
>>
>> <fields>
>>   <dynamicField name="*_tsn_is" type="text_sen" indexed="true"
>>                 stored="true" compressed="false" termVectors="true" termPositions="true"
>>                 termOffsets="true" />
>>   <dynamicField name="*_tbg_is" type="text_bigram" indexed="true"
>>                 stored="true" compressed="false" termVectors="true" termPositions="true"
>>                 termOffsets="true" />
>> </fields>
>>
>> <solrQueryParser defaultOperator="AND"/>
>>
>> /*schema.xml-3.6*/
>> <fieldType name="text_sen" class="solr.TextField">
>>   <analyzer>
>>     <charFilter class="solr.MappingCharFilterFactory"
>>                 mapping="ja-mapping.txt"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <tokenizer class="solr.GosenTokenizerFactory"/>
>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>             words="stopwords.txt" enablePositionIncrements="true" />
>>     <filter class="solr.GosenBasicFormFilterFactory" />
>>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>>             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
>>             catenateAll="0" splitOnCaseChange="0"/>
>>     <filter class="solr.TrimFilterFactory" />
>>   </analyzer>
>> </fieldType>
>>
>> <fields>
>>   <dynamicField name="*_tsn_is" type="text_sen" indexed="true"
>>                 stored="true" compressed="false" termVectors="true" termPositions="true"
>>                 termOffsets="true" />
>>   <dynamicField name="*_tbg_is" type="text_bigram" indexed="true"
>>                 stored="true" compressed="false" termVectors="true" termPositions="true"
>>                 termOffsets="true" />
>> </fields>
>>
>> <solrQueryParser defaultOperator="AND"/>
>>
>>
>> Regards.