Hi Shawn,

Thanks for your help - I'm still finding my way in the weeds of SOLR.

Combining everything into one query is what I'd prefer because as you said, one 
would think that with everything in the same query, the score would organize 
everything nicely.

>>Assuming you're using the default relevancy sort
Yes

>> does the order of your search results change dramatically from one version 
>> to the other?  If it does, is the order generally better from a relevance 
>> standpoint, or generally worse?  If you are specifying an explicit sort, 
>> then the scores will likely be ignored.

Here's what we do - we have a list of policies with names (among other things, 
but I'll just use names for an example).

We search for several business names to see if we have policies in common with 
the names so that we don’t have too much risk with them.

So let's say I'm doing a search against three business names:

Bob's carpentry
Consolidated carpentry of the Greater North West
Carpentry Land

q=(IDX_CompanyName:bob's AND carpentry) OR (IDX_CompanyName: consolidated AND 
carpentry AND of AND the AND Greater AND North AND West) OR (IDX_CompanyName: 
Carpentry AND Land)
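
For reference, here is a minimal sketch of how a query like the one above 
could be assembled programmatically. One thing worth checking: in the standard 
Lucene/Solr query parser, a field prefix such as IDX_CompanyName: applies only 
to the term right after it, so the remaining words fall back to the default 
field unless they are grouped as field:(w1 AND w2). The helper names 
(escape_term, name_clause, build_query) are hypothetical and only for 
illustration; this is not necessarily how our real code builds the query.

```python
import re

# Characters/operators with special meaning in the classic Lucene/Solr
# query parser; each match is prefixed with a backslash.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/]|&&|\|\|)')


def escape_term(term):
    """Backslash-escape query-parser special characters in one term."""
    return _SPECIAL.sub(r'\\\1', term)


def name_clause(field, name):
    """Build field:(w1 AND w2 ...) so every word is matched against field."""
    words = [escape_term(w) for w in name.split()]
    return f"{field}:({' AND '.join(words)})"


def build_query(field, names):
    """OR together one grouped clause per business name."""
    return " OR ".join(name_clause(field, n) for n in names)


if __name__ == "__main__":
    names = ["Bob's carpentry", "Carpentry Land"]
    print(build_query("IDX_CompanyName", names))
```

The grouping only changes which field each word is searched in; it does not 
by itself change how the longer name scores relative to the shorter ones.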

Searching for 750 rows returns hits that are all focused on Consolidated 
(seemingly because the number of words pushes the SOLR score into a higher 
range for all Consolidated results, as mentioned in my previous email). 
Searching for all 3 names at the same time doesn’t ensure that all 3 companies 
will be in the results, even though there are results for all 3 when the 
queries are run separately. If I boost maxrows to 4000, I see a few bob's 
carpentry hits, but most are still Consolidated.

So the way we had addressed it was to run 3 separate SOLR queries, combine the 
results, and sort them by descending score. It wasn’t perfect, but it worked, 
and it helped me reduce the number of results we hand off to a scoring engine 
that applies 3 algorithms (Monge-Elkan, Jaro-Winkler, and SmithWindowed 
Affine) to further hone the results, which can take LOTS of time if there are 
a lot of results, so 
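
To make the combine-and-sort step concrete, here is a rough sketch of the idea 
in Python. It assumes each per-name query has already been run and returned a 
list of documents as dicts with an "id" key and a "score" key; those field 
names, the merge_by_score function, and the optional per_query cap are all 
assumptions for illustration, not our actual code. Scores from separate 
queries are not strictly comparable, which is probably part of why this 
wasn't perfect.

```python
def merge_by_score(result_lists, id_field="id", per_query=None, limit=None):
    """Merge per-query result lists into one list sorted by descending score.

    result_lists: one list of dicts per query, each dict holding at least
                  id_field and "score" (assumed field names).
    per_query:    optional cap on how many top hits each query contributes,
                  so one long business name cannot crowd out the others.
    limit:        optional cap on the size of the merged list.
    """
    best = {}
    for docs in result_lists:
        ranked = sorted(docs, key=lambda d: d["score"], reverse=True)
        if per_query is not None:
            ranked = ranked[:per_query]
        for doc in ranked:
            key = doc[id_field]
            # Keep only the highest-scoring copy of a duplicated document.
            if key not in best or doc["score"] > best[key]["score"]:
                best[key] = doc
    merged = sorted(best.values(), key=lambda d: d["score"], reverse=True)
    return merged if limit is None else merged[:limit]


if __name__ == "__main__":
    bobs = [{"id": "1", "score": 2.0}, {"id": "2", "score": 1.0}]
    land = [{"id": "2", "score": 3.0}, {"id": "3", "score": 0.5}]
    for doc in merge_by_score([bobs, land]):
        print(doc["id"], doc["score"])
```

Capping each query before merging is one way to keep the list handed to the 
scoring engine small while still representing every business name.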


What I am describing is also why it's strongly recommended that you never try 
to convert scores to percentages:

https://wiki.apache.org/lucene-java/ScoresAsPercentages

Thanks,
Shawn
