RE: dismax debugging hyphens dashes

Markus Jelsma Sat, 07 Aug 2010 16:52:55 -0700

Well, that smells like a WordDelimiterFilterFactory [1]. As your debug output 
shows, it splits the value into three separate tokens. This means that (at 
least) the strings 'abc', '12' and 'def' are in your index and can be found. 
The abc12 value is not present as a token of its own. If you want to query for 
substrings, you can try NGramFilterFactory [2]. It's not really documented on 
the wiki, but searching will help [3].
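
To make the splitting concrete, here is a rough Python sketch of the idea. It is 
not Solr's implementation: the function names are made up for illustration, the 
splitter only breaks on non-alphanumerics and letter/digit transitions, 
lowercasing is assumed to happen somewhere in the analyzer chain, and the n-gram 
sizes are arbitrary.

```python
import re


def word_delimiter_split(text):
    """Hypothetical stand-in for word-delimiter style splitting.

    Breaks on anything that is not a letter or digit, and on
    letter/digit transitions. Lowercasing is an assumption about
    the rest of the analyzer chain, not part of the splitting itself.
    """
    return [part.lower() for part in re.findall(r"[A-Za-z]+|[0-9]+", text)]


def ngrams(token, min_n, max_n):
    """Return every substring of token with length min_n..max_n."""
    return [
        token[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(token) - n + 1)
    ]


tokens = word_delimiter_split("ABC12-def")
print(tokens)             # ['abc', '12', 'def']
print("abc12" in tokens)  # False: no such token was produced

# With n-grams over the unsplit value, 'abc12' becomes a token of its own.
grams = ngrams("abc12def", 3, 5)
print("abc12" in grams)   # True
```

The trade-off with n-grams is index size: every substring in the configured 
length range becomes a term.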


 

[1]: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory

[2]: http://search.lucidimagination.com/search/document/CDRG_ch05_5.5.6
[3]: http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
-----Original message-----
From: j <jta...@gmail.com>
Sent: Sat 07-08-2010 19:18
To: solr-user@lucene.apache.org; 
Subject: dismax debugging hyphens dashes

How does one debug the index vs. the dismax query parser?

I have a Solr instance with 1 document whose title is "ABC12-def". I
am using dismax. While "abc", "12", and "def" do match, "abc12" and
"def" do not. Here is the parsedquery_toString, which I'm having
trouble understanding:

+(id:abc12^3.0 | title:"(abc12 abc) 12"^1.5) (id:abc12^3.0 |
title:"(abc12 abc) 12"^1.5)

Does anyone have advice for getting this to work?
