Would you mind explaining how omitNorms has any effect on the IDF problem
I described earlier?
I agree with your second sentence. I had to use the NGramTokenFilter to
accommodate partial matches.
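
For reference, a minimal sketch of why norms and IDF are separate factors, assuming the classic Lucene TF-IDF formulas in simplified form (query norm, coord and boosts left out). The function names and the omit_norms flag are illustrative only, not Solr or Lucene API:

```python
import math

def idf(doc_freq, num_docs):
    # Classic-style inverse document frequency: depends only on index-wide counts.
    return 1.0 + math.log(num_docs / (doc_freq + 1))

def length_norm(num_terms, omit_norms=False):
    # Length normalization: with norms omitted, every field length scores as 1.0.
    return 1.0 if omit_norms else 1.0 / math.sqrt(num_terms)

def score(term_freq, doc_freq, num_docs, num_terms, omit_norms=False):
    # Simplified single-term score: tf * idf^2 * norm.
    return (math.sqrt(term_freq)
            * idf(doc_freq, num_docs) ** 2
            * length_norm(num_terms, omit_norms))
```

In this sketch, toggling omit_norms changes only the length_norm factor. The idf term is computed from document counts either way, so extra n-gram tokens still influence it.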
On 10/05/2009 12:11 PM, Avlesh Singh wrote:
Using synonyms might be a better solution because the use of
EdgeNGramTokenizerFactory has the potential of creating a large number of
tokens, which will artificially increase the number of tokens in the index,
which in turn will affect the IDF score.
Well, I don't see why someone would need length-based normalization on
such matches. I have always set omitNorms on fields that use this filter.
Yes, synonyms might be an answer when you have a limited number of such
words (phrases) and their possible combinations.
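
A minimal sketch of the synonym idea, assuming a small hand-maintained table. The table contents and the normalize_query helper are hypothetical, and this only handles whole-query variants, which is simpler than what a Solr synonym filter does:

```python
# Hypothetical synonym table: each query variant maps to the term as it is indexed.
SYNONYMS = {
    "borderland": "borderlands",
    "border land": "borderlands",
    "dragon fly": "dragonfly",
}

def normalize_query(query):
    # Lowercase and collapse whitespace, then swap a known variant for its indexed form.
    key = " ".join(query.lower().split())
    return SYNONYMS.get(key, key)
```

Every variant has to be listed by hand, which is why this only scales to a limited set of product names.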
Cheers
Avlesh
On Mon, Oct 5, 2009 at 10:32 PM, Christian Zambrano <czamb...@gmail.com> wrote:
Using synonyms might be a better solution because the use of
EdgeNGramTokenizerFactory has the potential of creating a large number of
tokens, which will artificially increase the number of tokens in the index,
which in turn will affect the IDF score.
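
A rough sketch of that effect, using three made-up titles and a hypothetical edge_ngrams helper with a minimum gram size of 3 (both are assumptions for illustration, not taken from the index in question):

```python
from collections import Counter

def edge_ngrams(term, min_gram=3):
    # All prefixes of the term that are at least min_gram characters long.
    return [term[:n] for n in range(min_gram, len(term) + 1)]

def doc_freqs(titles, use_ngrams):
    # Count, for each indexed token, how many titles contain it.
    freqs = Counter()
    for title in titles:
        tokens = set()
        for word in title.lower().split():
            tokens.update(edge_ngrams(word) if use_ngrams else [word])
        freqs.update(tokens)
    return freqs

titles = ["Borderlands", "Border Patrol", "Bordeaux Guide"]  # made-up titles
plain = doc_freqs(titles, use_ngrams=False)
grams = doc_freqs(titles, use_ngrams=True)
```

Here grams holds many more distinct tokens than plain, and a short prefix such as "bor" is found in every title, so its document frequency (and therefore its IDF) differs from that of the full word.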
A query for "borderland" should have returned results though. It is
difficult to troubleshoot why it didn't without knowing what query you used,
and what kind of analysis is taking place.
Have you tried using the analysis page in the admin section to see what
tokens get generated for 'Borderlands'?
Christian
On 10/05/2009 11:01 AM, Avlesh Singh wrote:
We have indexed a product database and have come across some search terms
where zero results are returned. There are products in the index with
'Borderlands xxx xxx', 'Dragonfly xx xxx' in the title. Searches for
'Borderland', 'Border Land' and 'Dragon Fly' all return zero results.
"Borderland" should have worked for a regular text field. For all other
desired matches you can use EdgeNGramTokenizerFactory.
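
A minimal sketch of what edge n-grams look like for a single term. The edge_ngrams function and its min_gram and max_gram values are illustrative assumptions, not the factory's actual defaults:

```python
def edge_ngrams(term, min_gram=3, max_gram=15):
    # Prefixes of the lowercased term, from min_gram up to max_gram characters.
    term = term.lower()
    upper = min(max_gram, len(term))
    return [term[:n] for n in range(min_gram, upper + 1)]

prefixes = edge_ngrams("Borderlands")
```

With these settings, prefixes contains both "border" and "borderland", so those queries would find the indexed term. It does not contain "land", since that is not a prefix of the word.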
Cheers
Avlesh
On Mon, Oct 5, 2009 at 7:51 PM, Andrew McCombe <eupe...@gmail.com> wrote:
Hi
I am hoping someone can point me in the right direction with regards to
indexing words that are concatenated together to make other words or
product names.
We have indexed a product database and have come across some search terms
where zero results are returned. There are products in the index with
'Borderlands xxx xxx', 'Dragonfly xx xxx' in the title. Searches for
'Borderland', 'Border Land' and 'Dragon Fly' all return zero results.
Where do I look to resolve this? The product name field is indexed using
a text field type.
Thanks in advance
Andrew