Hi,
This is my first post. I have been working with Lucene for about 4 weeks
and Solr for just about 10 days. We are going to convert our site search
over to Solr as soon as we figure out some of the nuances.
While testing the synonyms feature to decide how we could best use it, I
searched for iPod (I know it is the stock example, but we actually sell
them). I was shocked when the top results were nothing close to an iPod.
Looking closer, I could see that the description of the first result
contained the word iPod exactly once. With debug on, that is confirmed
(this is the first result):
<str name="id=502999430,internal_docid=6247">
152529.23 = (MATCH) fieldWeight(search_text:ipod in 6247), product of:
  1.0 = tf(termFreq(search_text:ipod)=1)
  3.7238584 = idf(docFreq=522)
  40960.0 = fieldNorm(field=search_text, doc=6247)
</str>
Here is the explainOther output for an actual iPod SKU (in the same search):
<str name="otherQuery">id:650085488</str>
<lst name="explainOther">
<str name="id=650085488,internal_docid=6985">
1.0473351 = (MATCH) fieldWeight(search_text:ipod in 6985), product of:
  3.0 = tf(termFreq(search_text:ipod)=9)
  3.7238584 = idf(docFreq=522)
  0.09375 = fieldNorm(field=search_text, doc=6985)
</str>
The term frequency is higher for the actual iPod, yet it scores far
lower. The only other difference is 'fieldNorm', which I do not
understand in the context of relevancy. Does this have to do with
omitNorms in some way?
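
To show where the gap comes from, here is a minimal sketch that
reproduces both scores from the three factors in the explain output. It
assumes tf is the square root of termFreq, which matches the values
shown (termFreq=9 gives tf=3.0, termFreq=1 gives tf=1.0). The function
name field_weight is mine, not a Lucene or Solr API.

```python
import math


def field_weight(term_freq, idf, field_norm):
    """Multiply the three factors listed in the explain output."""
    # Assumption: tf = sqrt(termFreq), consistent with 9 -> 3.0 and 1 -> 1.0.
    tf = math.sqrt(term_freq)
    return tf * idf * field_norm


IDF = 3.7238584  # idf(docFreq=522), copied from the debug output

top_result = field_weight(1, IDF, 40960.0)  # doc 6247, the unexpected first hit
ipod_sku = field_weight(9, IDF, 0.09375)    # doc 6985, the actual iPod SKU

print("top result score:", top_result)
print("iPod SKU score:  ", ipod_sku)
# The two fieldNorm values differ by several orders of magnitude,
# which swamps the 3x advantage the iPod SKU gets from tf.
print("fieldNorm ratio: ", 40960.0 / 0.09375)
```

Both products agree with the scores Solr reports, up to float rounding,
so the ranking is driven entirely by fieldNorm.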
On a related note, I also tried the dismax query with the following line
in it:
<str name="qf">search_text^0.5 brand^10.0 keywords^5.0 title^20.0
sub_title^1.5 model^2.0 attribute^1.1</str>
As an experiment I boosted the title heavily, since that is where the
term iPod occurs most often. It had no effect; in fact, the boost did not
appear to be working at all. The title field was not being used, only
search_text, even though I have title indexed.
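
For anyone who wants to reproduce this, a request along these lines
should exercise the same boosts. This is only a sketch: the host, port,
and path are placeholders, and it assumes the handler is registered as
"dismax" and that qf is passed as a request parameter rather than set in
the handler defaults.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical endpoint; substitute the real Solr host and path.
BASE_URL = "http://localhost:8983/solr/select"

params = {
    "q": "ipod",
    "qt": "dismax",  # assumption: the handler is registered under this name
    "qf": ("search_text^0.5 brand^10.0 keywords^5.0 title^20.0 "
           "sub_title^1.5 model^2.0 attribute^1.1"),
    "debugQuery": "on",               # ask for the score explanations
    "explainOther": "id:650085488",   # also explain the actual iPod SKU
}

# urlencode percent-escapes the '^' boost markers and the spaces.
query_string = urlencode(params)
url = BASE_URL + "?" + query_string
print(url)
```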
Here are the relevant parts of the schema:
<field name="id" type="string" indexed="true" stored="true" required="true" />
<field name="brand" type="string" indexed="true" stored="true" />
<field name="model" type="string" indexed="true" stored="true" />
<field name="manufacturer_model" type="string" indexed="true" stored="true" />
<field name="keywords" type="string" indexed="true" stored="false" />
<field name="title" type="string" indexed="true" stored="true" />
<field name="sub_title" type="string" indexed="true" stored="true" />
<field name="attribute" type="string" indexed="true" stored="true" multiValued="true" />
<field name="type" type="string" indexed="true" stored="true" />
<field name="description_category" type="string" indexed="true" stored="true" />
<field name="description" type="string" indexed="true" stored="true" />
<field name="brand_id" type="string" indexed="false" stored="true" />
<field name="code" type="string" indexed="false" stored="true" />
<field name="color" type="string" indexed="true" stored="true" />
<field name="description_category_id" type="string" indexed="false" stored="true" />
<field name="display_price" type="sfloat" indexed="false" stored="true" />
<field name="line_item_price" type="sfloat" indexed="true" stored="true" />
<field name="main_category" type="string" indexed="true" stored="true" />
<field name="main_category_id" type="string" indexed="false" stored="true" />
<field name="regular_price" type="sfloat" indexed="false" stored="true" />
<field name="sku" type="string" indexed="true" stored="true" />
<field name="type_id" type="string" indexed="false" stored="true" />
<field name="upc" type="string" indexed="true" stored="true" />
<field name="size" type="string" indexed="true" stored="true" />
<field name="search_text" type="text" indexed="true" stored="false" multiValued="true" termVectors="true"/>
<defaultSearchField>search_text</defaultSearchField>
<copyField source="brand" dest="search_text"/>
<copyField source="model" dest="search_text"/>
<copyField source="manufacturer_model" dest="search_text"/>
<copyField source="keywords" dest="search_text"/>
<copyField source="title" dest="search_text"/>
<copyField source="sub_title" dest="search_text"/>
<copyField source="attribute" dest="search_text"/>
<copyField source="description_category" dest="search_text"/>
<copyField source="type" dest="search_text"/>
<copyField source="description" dest="search_text"/>
<copyField source="main_category" dest="search_text"/>
<copyField source="sku" dest="search_text"/>
<copyField source="upc" dest="search_text"/>
Thanks to all who are willing to take a look at this and help.
----------------------------------------------------
Tim Christensen
Director Media & Technology
Vann's Inc.
406-203-4656
[EMAIL PROTECTED]
http://www.vanns.com