Try the Solr Admin Analyzer page to see how Solr is indexing that text. I
suspect that the ShingleFilter is generating extra terms with positions so
that PhraseQuery no longer sees the two terms from your quoted phrase as
being adjacent.
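If the Admin UI is not convenient, the same analysis can usually be requested over HTTP. The following is a minimal sketch that only builds the request URL. It assumes Solr's field analysis handler is registered at /analysis/field in solrconfig.xml, and that the server runs at http://localhost:8983/solr; the core name "intradesk" and field type "text" are taken from the message below, and the helper name build_analysis_url is hypothetical.

```python
from urllib.parse import urlencode


def build_analysis_url(base_url, core, field_type, index_text, query_text):
    """Build a URL for Solr's field analysis handler.

    Assumes the handler is mapped to /analysis/field for the given core.
    """
    params = {
        "analysis.fieldtype": field_type,    # analyze with this fieldType's chains
        "analysis.fieldvalue": index_text,   # text run through the index analyzer
        "analysis.query": query_text,        # text run through the query analyzer
        "analysis.showmatch": "true",        # ask Solr to flag matching tokens
        "wt": "json",
    }
    return f"{base_url}/{core}/analysis/field?{urlencode(params)}"


# Host and port are assumptions; adjust them to the actual installation.
url = build_analysis_url(
    "http://localhost:8983/solr",
    "intradesk",
    "text",
    "abcdefg12345 xyz 678910",
    "abcdefg12345 678910",
)
print(url)
```

Opening the printed URL in a browser should list the tokens and their positions after each analysis stage, which is the same information the Analyzer page displays.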
Your second query is simply generating a Boolean AND or OR query for the
individual terms without regard to their relative position.
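To illustrate the difference between the two query styles, here is a toy model in Python. It is not Lucene's implementation: it only splits on whitespace and records token positions, and the function names are made up for this sketch. It shows why a phrase match depends on adjacent positions while a Boolean match only needs the terms to be present somewhere in the field.

```python
def index_positions(text):
    """Map each lowercased whitespace token to the positions where it occurs."""
    positions = {}
    for pos, token in enumerate(text.lower().split()):
        positions.setdefault(token, []).append(pos)
    return positions


def matches_all_terms(doc_positions, terms):
    """Boolean AND: every term occurs somewhere, position is ignored."""
    return all(term in doc_positions for term in terms)


def matches_phrase(doc_positions, terms):
    """Phrase: the terms occur at consecutive positions, in order."""
    if not terms or not matches_all_terms(doc_positions, terms):
        return False
    return any(
        all(start + offset in doc_positions[term]
            for offset, term in enumerate(terms))
        for start in doc_positions[terms[0]]
    )


docs = ["abcdefg12345 678910", "abcdefg12345 xyz 678910"]
terms = ["abcdefg12345", "678910"]

for doc in docs:
    doc_positions = index_positions(doc)
    print(doc,
          "| phrase:", matches_phrase(doc_positions, terms),
          "| all terms:", matches_all_terms(doc_positions, terms))
```

In this toy model the phrase matches only the first string, while the all-terms check matches both. Anything in a real analysis chain that changes the recorded positions of the two terms would likewise stop the phrase from matching.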
-- Jack Krupansky
-----Original Message-----
From: Arkadi Colson
Sent: Tuesday, December 11, 2012 10:36 AM
To: solr >> "solr-user@lucene.apache.org"
Subject: Searching for phrase
Hi
My schema looks like this:
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords_en.txt,stopwords_du.txt" enablePositionIncrements="true"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="3"
            outputUnigrams="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!--<filter class="solr.SynonymFilterFactory"
            synonyms="synonyms.txt" ignoreCase="true" expand="true"/>-->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords_en.txt,stopwords_du.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
I inserted these 2 strings into Solr:
abcdefg12345 678910
abcdefg12345 xyz 678910
When searching for "abcdefg12345 678910" with quotes I got no results.
Without quotes both strings are found.
SolrObject Object
(
    [responseHeader] => SolrObject Object
        (
            [status] => 0
            [QTime] => 38
            [params] => SolrObject Object
                (
                    [sort] => score desc
                    [indent] => on
                    [collection] => intradesk
                    [wt] => xml
                    [version] => 2.2
                    [rows] => 5
                    [debugQuery] => true
                    [fl] => id,smsc_module,smsc_modulekey,smsc_userid,smsc_ssid,smsc_description,smsc_content,smsc_courseid,smsc_lastdate,score,metadata_stream_size,metadata_stream_source_info,metadata_stream_name,metadata_stream_content_type,last_modified,author,title,subject
                    [start] => 0
                    [q] => (smsc_content:\"abcdefg12345 678910\" || smsc_description:\"abcdefg12345 678910\") && (smsc_lastdate:[2012-11-11T09:59:51Z TO 2013-12-11T09:48:51Z]) && (smsc_ssid:929)
                )
        )
    [response] => SolrObject Object
        (
            [numFound] => 0
            [start] => 0
            [docs] =>
        )
    [debug] => SolrObject Object
        (
            [rawquerystring] => (smsc_content:\"abcdefg12345 678910\" || smsc_description:\"abcdefg12345 678910\") && (smsc_lastdate:[2012-11-11T09:59:51Z TO 2013-12-11T09:48:51Z]) && (smsc_ssid:929)
            [querystring] => (smsc_content:\"abcdefg12345 678910\" || smsc_description:\"abcdefg12345 678910\") && (smsc_lastdate:[2012-11-11T09:59:51Z TO 2013-12-11T09:48:51Z]) && (smsc_ssid:929)
            [parsedquery] => +(smsc_content:"abcdefg12345 smsc_content:678910" smsc_description:"abcdefg12345 smsc_content:678910") +smsc_lastdate:[1352627991000 TO 1386755331000] +smsc_ssid:929
            [parsedquery_toString] => +(smsc_content:"abcdefg12345 smsc_content:678910" smsc_description:"abcdefg12345 smsc_content:678910") +smsc_lastdate:[1352627991000 TO 1386755331000] +smsc_ssid:`#8;#0;#0;#7;!
            [QParser] => LuceneQParser
            [explain] => SolrObject Object
                (
                )
        )
)
Does anybody have an idea what's wrong?
--
Kind regards
Arkadi Colson
Smartbit bvba . Hoogstraat 13 . 3670 Meeuwen
T +32 11 64 08 80 . F +32 11 64 08 81