: way.  In particular, I'm doing phrase searching into a corpus of
: descriptions, such as "I need help with a foo" where I have a bunch of "foo:
: a foo is a subset of a bar often used to create briznatzes", etc.
: 
: With Sphinx, I could convert "I need help with a foo" into "*need* *help*
: *with* *foo*" and get pretty nice matches. With Solr, my understanding is
: that you can only do wildcard matches on the suffix. In addition, stemming
: only happens on non-wildcard terms. So, my first thought would be to convert
: "I need help with a foo" into "need need* help help* with with* foo foo*".

First off, we need to make sure we have all our terminology in sync -- I'm 
not very familiar with Sphinx, so I'm not sure what vernacular is used 
there to describe various things. In Solr/Lucene you have options 
regarding how you want text to be "analyzed" when it's indexed -- this 
analysis is what converts an arbitrary stream of characters into the 
"Terms" that get indexed.  At query time, it's very easy to match on 
terms, boolean combinations of terms, and sequential phrases of terms 
-- you only need wildcard-type functionality if you want to provide a 
wildcard expression that could match more than one individual term.
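
To make the "analysis produces Terms" idea concrete, here is a toy sketch 
in Python. It is not Solr or Lucene code: the analyze(), build_index() and 
match_all() helpers are hypothetical names, the second document is made up, 
and the analysis (split on whitespace, lowercase, trim punctuation) is just 
one assumed configuration.

```python
def analyze(text):
    """Toy analyzer: split on whitespace, lowercase, trim edge punctuation."""
    terms = []
    for raw in text.split():
        term = raw.lower().strip(".,:;!?\"'")
        if term:
            terms.append(term)
    return terms


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def match_all(index, query):
    """Boolean AND over the analyzed query terms; no wildcards involved."""
    postings = [index.get(term, set()) for term in analyze(query)]
    return set.intersection(*postings) if postings else set()


docs = {
    1: "foo: a foo is a subset of a bar often used to create briznatzes",
    2: "baz: a baz is unrelated to anything above",  # made-up second document
}
index = build_index(docs)

print(index["foo"])                  # documents containing the term "foo"
print(match_all(index, "foo bar"))   # documents containing both terms
```

Because the query goes through the same analysis as the documents, a plain 
term like "foo" finds the document directly; nothing resembling *foo* is 
needed.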

In your specific example, if you just configured a basic whitespace 
tokenizer when you indexed your documents (ie: "foo: a foo is a subset of 
a bar often used to create briznatzes") then at query time any of the 
individual words ("foo", "bar", etc...) would match that document.  
Likewise, a phrase query like "need help with foo" would match that text 
if you defined some stop words (like "need" and "with") and specified a 
small amount of slop on your phrase queries.
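
As a rough illustration of how stop words and slop interact, here is 
another toy sketch in Python. The stop word list is an assumption chosen 
for this example, sloppy_phrase_match() is a hypothetical helper, and the 
slop check is a simplified approximation (it only measures how far the 
terms drift from their relative query positions), not Lucene's actual 
algorithm.

```python
from itertools import product

# Assumed stop word list, for illustration only.
STOP_WORDS = {"i", "need", "help", "with", "a", "is", "of"}


def analyze(text):
    """Return (position, term) pairs; stop words are dropped but keep their slot."""
    out = []
    for pos, raw in enumerate(text.split()):
        term = raw.lower().strip(".,:;!?\"'")
        if term and term not in STOP_WORDS:
            out.append((pos, term))
    return out


def sloppy_phrase_match(doc_text, query_text, slop=0):
    """True if the query terms occur in the doc within `slop` positions of
    their relative positions in the query (simplified approximation)."""
    doc_positions = {}
    for pos, term in analyze(doc_text):
        doc_positions.setdefault(term, []).append(pos)

    query = analyze(query_text)
    if not query:
        return False

    # For each query term, list the shifts between doc and query positions.
    shifts = []
    for q_pos, term in query:
        if term not in doc_positions:
            return False
        shifts.append([d_pos - q_pos for d_pos in doc_positions[term]])

    # A match exists if some choice of occurrences has nearly equal shifts.
    return any(max(c) - min(c) <= slop for c in product(*shifts))


doc = "foo: a foo is a subset of a bar often used to create briznatzes"

print(sloppy_phrase_match(doc, "I need help with a foo"))   # only "foo" survives
print(sloppy_phrase_match(doc, "foo subset", slop=0))
print(sloppy_phrase_match(doc, "foo subset", slop=2))
```

With this assumed stop list, "I need help with a foo" reduces to the single 
term "foo", so it matches without any wildcards. A query such as 
"foo subset" only matches once the slop is large enough to cover the stop 
words sitting between the two terms in the document.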


The point is: there are a lot of different ways to use Solr, and the 
terminology you are used to with Sphinx may not map exactly to some of the 
terminology you'll see in the Solr docs/configs -- so please feel free to 
ask.

-Hoss
