Hello,
I think you have to issue a phrase query in this case. Otherwise each
"token" is searched independently in the merchant field, because the query
parser splits the query on whitespace before the field's analyzer sees it.
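
To illustrate the mismatch, here is a toy Python sketch. It is not Solr
code: keyword_tokenize and split_query are made-up helpers that only mimic
what I assume KeywordTokenizerFactory + LowerCaseFilterFactory and the
whitespace splitting do, and the synonym and possessive filters are left out.

```python
# Toy model (not Solr): why whitespace-split terms miss a keyword-tokenized field.

def keyword_tokenize(value: str) -> list[str]:
    """Mimic KeywordTokenizer + LowerCaseFilter: the whole value is ONE lowercase token."""
    return [value.lower()]

def split_query(q: str) -> list[str]:
    """Mimic the query parser splitting an unquoted query on whitespace."""
    return q.split()

# What ends up in the index for the merchant field.
indexed = set(keyword_tokenize("Jones New York"))

# Unquoted query: each whitespace-separated term is analyzed and matched on its own.
unquoted_terms = [tok for term in split_query("jones new york")
                  for tok in keyword_tokenize(term)]
unquoted_hits = [tok for tok in unquoted_terms if tok in indexed]

# Phrase query: the whole quoted string reaches the analyzer in one piece.
phrase_terms = keyword_tokenize("Jones New York")
phrase_hits = [tok for tok in phrase_terms if tok in indexed]

print(unquoted_hits)  # []
print(phrase_hits)    # ['jones new york']
```

None of "jones", "new" or "york" equals the single indexed token, which is
why the merchant field never matches the unquoted query.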

Compare the debug output of your unquoted query with the debug output you
get when you search for "Jones New York" (with the quotes): with the phrase
query you should get the match you expected.
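
If it helps, the two requests can be built like this. It is only a sketch:
the host, port and /select path are assumptions about your setup, and
build_url is a made-up helper. The q and debugQuery parameters are the ones
you are already using.

```python
from urllib.parse import urlencode

# Assumed location of the Solr select handler; adjust to your installation.
SOLR_SELECT = "http://localhost:8983/solr/select"

def build_url(q: str) -> str:
    """Build a select URL with debugQuery enabled so the parsed query is returned."""
    params = {"q": q, "debugQuery": "true"}
    return SOLR_SELECT + "?" + urlencode(params)

# Unquoted: the parser splits this into three independent terms.
unquoted_url = build_url("jones new york")

# Quoted: the double quotes are part of q, so the parser sees one phrase.
phrase_url = build_url('"Jones New York"')

print(unquoted_url)
print(phrase_url)
```

The only difference between the two is the pair of URL-encoded double quotes
(%22) around the merchant name in the second request.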

Hope this helps,

--
Tanguy

2012/6/11 Vijay Ramachandran <vijay...@gmail.com>

> Hello. I'm trying to understand the behaviour of edismax in solr 3.4 when
> it comes to searching fields similar to "string" types, i.e., untokenized.
> My document is data about products available in various stores. One of the
> fields in my schema is the name of the merchant, and I would like to match
> only the entire name in the merchant field to cut out false positives. For
> example, I want "The Gap" to match in merchant, but not "gap".
>
> To do this, I configured the field as such:
>
>    <fieldType name="text_full_match" class="solr.TextField"
>               positionIncrementGap="100">
>      <analyzer type="index">
>        <tokenizer class="solr.KeywordTokenizerFactory"/>
>        <filter class="solr.LowerCaseFilterFactory"/>
>        <filter class="solr.EnglishPossessiveFilterFactory"/>
>        <filter class="solr.SynonymFilterFactory"
>                synonyms="names-synonyms.txt" ignoreCase="true" expand="true"/>
>      </analyzer>
>    </fieldType>
>
> All the other fields are product descriptors such as category, product
> name, etc., which I store as "text_en" field from the example schemas.
>
> I have a merchant in the data called "Jones New York". If my query is
> simply the 3 words, i.e., "q=jones+new+york", the merchant field doesn't
> match. The debugQuery output shows that the query is split into words, like this:
> <str name="parsedquery">+((DisjunctionMaxQuery((summary:jones^2.0 |
> title:jones^3.0 | merchant:jones^3.0 | cats4match:jones)~0.1)
> DisjunctionMaxQuery((merchant:new^3.0)~0.1)
> DisjunctionMaxQuery((summary:york^2.0 | title:york^3.0 | merchant:york^3.0
> | cats4match:york)~0.1))~1) DisjunctionMaxQuery((summary:"jones ?
> york"~3^5.0 | title:"jones ? york"~3^10.0 | cats4match:"jones ?
> york"~3^5.0)~0.1) ()</str>
>
> My edismax is configured like this:
>  <requestHandler name="edismax" class="solr.SearchHandler" default="true">
>    <lst name="defaults">
>     <str name="defType">edismax</str>
>     <str name="echoParams">explicit</str>
>     <float name="tie">0.1</float>
>     <str name="fl">
>       dealid,category,subcategory,merchant, merchant_id, title
>     </str>
>     <str name="mm">1</str>
>     <str name="qf">
>       cats4match^1.0 merchant^3.0 title^3.0 summary^2.0
>     </str>
>     <str name="pf">
>       cats4match^5.0 merchant^10.0 title^10.0 summary^5.0
>     </str>
>     <int name="ps">3</int>
>     <str name="pf2">
>       cats4match^5.0 merchant^10.0 title^10.0 summary^5.0
>       title_phrases^10.0 summary_phrases^5.0
>     </str>
>     <str name="pf3">
>       cats4match^5.0 merchant^10.0 title^10.0 summary^5.0
>       title_phrases^10.0 summary_phrases^5.0
>     </str>
>     <int name="qs">3</int>
>     <str name="q.alt">*:*</str>
>    </lst>
>  </requestHandler>
>
>
> What gives? Can I query a string-type field together with other tokenized
> fields? Or am I missing the point entirely, and need to do this some other
> way?
>
> thanks in advance for your help.
> Vijay
>
