Zambrano, I was too quick to respond to your idf explanation. I definitely
did not mean that "idf" and "length-norms" are the same thing.
Andrew, this is how I would have done it:
First, I would create a field called "prefix_text", as below, in my
schema.xml
FYI, if you don't want to turn off norms entirely, try this option in
Lucene 2.9's DefaultSimilarity:
public void setDiscountOverlaps(boolean v)
Determines whether overlap tokens (Tokens with 0 position increment)
are ignored when computing norm. By default this is false, meaning
overlap tokens are
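
As a rough illustration of what the setting quoted above changes, here is a minimal plain-Java sketch of the length-norm arithmetic. It does not call Lucene: the class and method names are hypothetical, the token counts are made up, and it leaves out the field boost and the lossy one-byte encoding that Lucene applies to stored norms. It only assumes the 1/sqrt(numTerms) shape used by DefaultSimilarity.

```java
// Sketch of the norm arithmetic only; it does not call Lucene.
final class OverlapNormSketch {

    // Same shape as DefaultSimilarity.lengthNorm: 1 / sqrt(number of terms).
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // fieldLength counts every token in the field;
    // numOverlap counts the tokens whose position increment is 0.
    static float norm(int fieldLength, int numOverlap, boolean discountOverlaps) {
        int numTerms = discountOverlaps ? fieldLength - numOverlap : fieldLength;
        return lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        // Hypothetical field: 4 real tokens plus 12 overlap tokens
        // (for example, synonyms injected at the same position).
        int fieldLength = 16;
        int numOverlap = 12;
        System.out.println(norm(fieldLength, numOverlap, false)); // 1/sqrt(16) = 0.25
        System.out.println(norm(fieldLength, numOverlap, true));  // 1/sqrt(4)  = 0.5
    }
}
```

With overlaps discounted, the injected tokens no longer make the field look longer, so the field is not penalised for them.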
Would you mind explaining how omitNorms has any effect on the IDF problem
I described earlier?
I agree with your second sentence. I had to use the NGramTokenFilter to
accommodate partial matches.
On 10/05/2009 12:11 PM, Avlesh Singh wrote:
>
> Using synonyms might be a better solution because the use of
> EdgeNGramTokenizerFactory has the potential of creating a large number of
> tokens, which will artificially increase the number of tokens in the index
> which in turn will affect the IDF score.
>
Well, I don't see a reason as to why s
Using synonyms might be a better solution because the use of
EdgeNGramTokenizerFactory has the potential of creating a large number
of tokens, which will artificially increase the number of tokens in the
index which in turn will affect the IDF score.
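
To make the concern above concrete, here is a small plain-Java sketch of how many tokens front-edge n-gramming produces for one word, alongside the IDF formula used by Lucene's DefaultSimilarity. It does not use the Solr EdgeNGramTokenizerFactory itself; the class name, the gram sizes (3 to 10) and the document counts are assumptions chosen only for illustration. Note that the formula takes document frequency as its input, not the raw number of tokens in the index.

```java
import java.util.ArrayList;
import java.util.List;

final class EdgeNGramSketch {

    // Front-edge n-grams of a term, lengths minGram..maxGram (capped at the term length).
    static List<String> edgeNGrams(String term, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, term.length());
        for (int len = minGram; len <= longest; len++) {
            grams.add(term.substring(0, len));
        }
        return grams;
    }

    // Same formula as DefaultSimilarity.idf: ln(numDocs / (docFreq + 1)) + 1.
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    public static void main(String[] args) {
        // One indexed word becomes several tokens, including "borderland".
        List<String> grams = edgeNGrams("borderlands", 3, 10);
        System.out.println(grams.size() + " tokens: " + grams);

        // Hypothetical counts: a short prefix occurs in far more documents
        // than a full word, so it receives a lower IDF.
        System.out.println(idf(500, 10000)); // short, common prefix
        System.out.println(idf(5, 10000));   // rare full term
    }
}
```

With these assumed gram sizes the single word "borderlands" yields eight tokens, one of which is "borderland", which is why prefix grams allow the partial match at the cost of a larger term dictionary.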
A query for "borderland" should have returned
>
> We have indexed a product database and have come across some search terms
> where zero results are returned. There are products in the index with
> 'Borderlands xxx xxx', 'Dragonfly xx xxx' in the title. Searches for
> 'Borderland' or 'Border Land' and 'Dragon Fly' return zero results
> resp
Hi
I am hoping someone can point me in the right direction with regard to
indexing words that are concatenated together to make other words or product
names.
We have indexed a product database and have come across some search terms
where zero results are returned. There are products in the index