These query parsing results don't match the config you've posted. Double-check the type of the "name" field, and make sure you have restarted Solr since changing schema.xml.
-Yonik

On Tue, Oct 28, 2008 at 11:25 AM, Stephen Weiss <[EMAIL PROTECTED]> wrote:
> Thanks for the reply. I've been looking at the debug page... and I really
> don't see any clues there (maybe I don't know how to read it).
>
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
>
> <lst name="responseHeader">
>  <int name="status">0</int>
>  <int name="QTime">1</int>
>  <lst name="params">
>   <str name="wt">standard</str>
>   <str name="rows">10</str>
>   <str name="start">0</str>
>   <str name="explainOther"/>
>   <str name="hl.fl"/>
>   <str name="indent">on</str>
>   <str name="q">name:(stm 0810 m_*)</str>
>   <str name="fl">*,score</str>
>   <str name="qt">standard</str>
>   <str name="debugQuery">on</str>
>   <str name="version">2.2</str>
>  </lst>
> </lst>
> <result name="response" numFound="0" start="0" maxScore="0.0"/>
> <lst name="debug">
>  <str name="rawquerystring">name:(stm 0810 m_*)</str>
>  <str name="querystring">name:(stm 0810 m_*)</str>
>  <str name="parsedquery">+name:stm +name:0810 +name:m_*</str>
>  <str name="parsedquery_toString">+name:stm +name:0810 +name:m_*</str>
>  <lst name="explain"/>
> </lst>
> </response>
>
> I mean, as far as I can tell, that seems right. I think I'm missing
> something here.
>
> The wiki page is awesome though, thank you. The catenateAll option does
> seem to do what I think it did... but should I perhaps just remove any kind
> of filter or analyzer on this field? It's really not a big deal if someone
> has to get the dashes and underscores exactly right - it's a worse problem
> if they do get them right, but it still doesn't work (usually they copy and
> paste these from an e-mail or something). Just in general, it's never
> really critical for someone to search by parts of the filename - except for
> searching with wildcard (that is, stm0810m_* and the like), and it would be
> a lot easier if they didn't have to put spaces where letters change to
> numbers & vice versa.
>
> Thanks again for your input.
>
> --
> Steve
>
> On Oct 28, 2008, at 10:49 AM, Feak, Todd wrote:
>
>> You may want to take a very close look at what the WordDelimiterFilter
>> is doing. I believe the underscore is dropped entirely during indexing
>> AND searching as it's not alphanumeric.
>>
>> Wiki doco here
>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters?highlight=(tokenizer)#head-1c9b83870ca7890cd73b193cefed83c283339089
>>
>> The admin analysis page and query debug will help a lot to see what's
>> going on.
>>
>> -Todd
>>
>> -----Original Message-----
>> From: Stephen Weiss [mailto:[EMAIL PROTECTED]
>> Sent: Monday, October 27, 2008 10:32 PM
>> To: solr-user@lucene.apache.org
>> Subject: Question about textTight
>>
>> Hi,
>>
>> So I've been using the textTight field to hold filenames, and I've run
>> into a weird problem. Basically, people want to search by part of a
>> filename (say, the filename is stm0810m_ws_001ftws and they want to
>> find everything starting with stm0810m_, i.e. stm0810m_*). I'm hoping
>> someone might have done this before (I bet someone has).
>>
>> Lots of things work - you can search for stm0810m_ws_001ftws and get a
>> result, or (stm 0810 m*), or various other combinations. What does
>> not work is searching for (stm0810m_*) or (stm 0810 m_*) or anything
>> like that - a problem, because often they don't want things with ma_
>> or mx_, but just m_. It's almost like underscores just break
>> everything; escaping them does nothing.
>>
>> Here's the field definition (it should be what came with my solr):
>>
>>    <fieldType name="textTight" class="solr.TextField"
>>               positionIncrementGap="100" >
>>      <analyzer>
>>        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>        <filter class="solr.SynonymFilterFactory"
>>                synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
>>        <filter class="solr.StopFilterFactory" ignoreCase="true"
>>                words="stopwords.txt"/>
>>        <filter class="solr.WordDelimiterFilterFactory"
>>                generateWordParts="0" generateNumberParts="0" catenateWords="1"
>>                catenateNumbers="1" catenateAll="0"/>
>>        <filter class="solr.LowerCaseFilterFactory"/>
>>        <filter class="solr.EnglishPorterFilterFactory"
>>                protected="protwords.txt"/>
>>        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>      </analyzer>
>>    </fieldType>
>>
>> and usage:
>>
>>    <field name="name" type="textTight"
>>           indexed="true" stored="true" omitNorms="true"
>>    />
>>
>> Now, I thought textTight would be good because it's the one best
>> suited for SKU's, but I guess I'm wrong. What should I be using for
>> this? Would changing any of these "generateWordParts" or
>> "catenateAll" options help? I can't seem to find any documentation so
>> I'm really not sure what it would do, but reindexing this whole thing
>> will take quite some time so I'd rather know what will actually work
>> before I just start changing things.
>>
>> Thanks so much for any insight!
>>
>> --
>> Steve
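
For what it's worth, one way the underscore problem described above might be sidestepped is to keep each filename as a single lowercased token, so that a prefix query such as stm0810m_* is compared against the whole filename rather than against the pieces WordDelimiterFilter emits. This is only a sketch under assumptions, not something confirmed anywhere in this thread: the type name "filenameExact" and the field name "name_exact" are made up for illustration, and it assumes wildcard terms are not passed through the analyzer, so the prefix would have to be typed in lowercase.

```xml
<!-- Hypothetical type (name is an assumption): the whole filename is kept as one lowercased token -->
<fieldType name="filenameExact" class="solr.TextField" omitNorms="true">
  <analyzer>
    <!-- KeywordTokenizer emits the entire input as a single token, underscores included -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Lowercase at index time so lowercase prefixes can match -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical companion field (name is an assumption), filled from "name" at index time -->
<field name="name_exact" type="filenameExact" indexed="true" stored="false"/>
<copyField source="name" dest="name_exact"/>
```

With something like this in place, a query of the form name_exact:stm0810m_* would be the one to try, while the existing "name" field stays available for the searches that already work. A schema change of this kind would still need a Solr restart and a reindex before it shows any effect.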