Thank you for the explanation.
To close the loop, I was able to track the problem down to the Lucene Query
parser on 5.2.1 which returned +body:"123 234 345 456" for a query string
123456.
It turned out that it is possible to get the same behavior by turning on split
on white-space and auto Generate
I am not familiar with the Lucene method for creating an analyzer. Perhaps it
was already running just the analysis phase. But here is what NGram
would do to the string '123456' with just trigrams:
123
234
345
456
So, if you only apply it on the index side, and your query is '2345' -
there is no such token i
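As a rough illustration of the trigram list above (a plain-Python sketch, not
the Lucene implementation; the function name ngrams is made up for this
example), the grams can be generated and checked like this:

```python
def ngrams(text, min_size=3, max_size=3):
    """Return all character n-grams of text with lengths in [min_size, max_size]."""
    grams = []
    for size in range(min_size, max_size + 1):
        for start in range(len(text) - size + 1):
            grams.append(text[start:start + size])
    return grams


# Tokens that would be indexed for '123456' with trigrams only.
index_tokens = ngrams("123456")
print(index_tokens)

# An un-grammed 4-character query term is not among the indexed tokens.
print("2345" in index_tokens)
```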
> 1) if you want face to match interface, you need max value to be at least
4.
Can you please explain this a bit more? I am not following this one. Values
are set to 3,3 and Solr already matches interface and interfaces when
searched for face. In addition to that Solr matches the trigrams of face
Two things:
1) if you want face to match interface, you need max value to be at least 4.
2) You probably have the factory applied symmetrically, or on the query analyzer. You
probably want it on the index analyzer side only. Otherwise you are trying to
match any 3-letter query substring against your index.
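A small sketch of point 1, assuming the grams are produced on the index side
only and the query term is kept whole (plain Python, not Solr code; the helper
names are hypothetical):

```python
def char_ngrams(text, min_size, max_size):
    """Set of character n-grams of text with lengths in [min_size, max_size]."""
    return {
        text[i:i + n]
        for n in range(min_size, max_size + 1)
        for i in range(len(text) - n + 1)
    }


def matches(query, indexed_word, min_size, max_size):
    # Index-side grams only: the query term is compared as a single token.
    return query in char_ngrams(indexed_word, min_size, max_size)


# With min=3, max=3 there is no 4-character token, so 'face' cannot match.
print(matches("face", "interface", 3, 3))

# With max raised to 4, 'face' is one of the indexed grams of 'interface'.
print(matches("face", "interface", 3, 4))
```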
Admin U
It is correct that a search string causes the following query to be generated:
+(field:fac field:ace)
Hence the results... However, I fail to see how (fac OR ace) is a relevant
query. Shouldn't it be
+field:fac +field:ace
instead?
What is the suggested way to change this behaviour?
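To make the difference between the two parsed queries concrete, here is a
plain-Python sketch (the three sample words are invented for illustration, and
this models only set membership, not Solr scoring or positions):

```python
def trigrams(text):
    """Set of character trigrams of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


# Hypothetical indexed values, each stored as its set of trigrams.
docs = ["interface", "facade", "palace"]
index = {doc: trigrams(doc) for doc in docs}

# Query-side grams for 'face': 'fac' and 'ace'.
query_grams = trigrams("face")

# +(field:fac field:ace): any one gram is enough.
or_hits = [d for d in docs if index[d] & query_grams]

# +field:fac +field:ace: every gram must be present.
and_hits = [d for d in docs if query_grams <= index[d]]

print(or_hits)
print(and_hits)
```

Note that requiring all grams still does not require them to be adjacent, so
a value containing 'fac' and 'ace' in separate places would also pass the
second form.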
On Mon, Jul 2
Take a look at two things:
1> the admin/analysis page. This is probably mostly a sanity check to
ensure you're seeing what you expect.
2> add debug=query to the query and look at the parsed query. My bet
is that the grams are being OR'd together
and your search term is effectively
fac OR ace
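For step 2, a request URL with debug=query can be built as in the sketch
below. The host, port, collection name and field name are placeholders, not
taken from this thread; only the debug=query parameter comes from the advice
above.

```python
from urllib.parse import urlencode

# Placeholder endpoint: adjust host, port and collection to your setup.
base_url = "http://localhost:8983/solr/mycollection/select"

# 'field' is a placeholder field name; debug=query asks for the parsed query.
params = {"q": "field:face", "debug": "query"}

url = base_url + "?" + urlencode(params)
print(url)
```

The parsed query should then appear in the debug section of the response,
where you can check whether the grams were combined with OR.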