16.0 = fieldLength
You can clearly see the final TF norm being 1, despite the term frequency and
length. Please correct my wrongs :)
Markus
-Original message-
> From:Tom Burton-West
> Sent: Thursday 3rd April 2014 20:18
> To: solr-user@lucene.apache.org
> Subject: Re: tf and very short text fields
Hi Markus and Wunder,
I'm missing the original context, but I don't think BM25 will solve this
particular problem.
The k1 parameter sets how quickly the contribution of tf to the score falls
off with increasing tf. It would be helpful for making sure really long
documents don't get too high a
On 4/3/14 7:46 AM, Michael Sokolov wrote:
On 4/1/14 2:32 PM, Walter Underwood wrote:
And here is another peculiarity of short text fields.
The movie "New York, New York" should not be twice as relevant for the query "new
york". Is there a way to use a binary term frequency rather than a count?
wunder
--
Walter Underwood
wun...@wunderw
Thanks! We'll try that out and report back. I keep forgetting that I want to
try BM25, so this is a good excuse.
wunder
On Apr 1, 2014, at 12:30 PM, Markus Jelsma wrote:
Also, if I remember correctly, setting k1 to zero for BM25 automatically omits
norms in the calculation, so that's easy to play with without reindexing.
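
To make the k1 point concrete, here is a minimal Python sketch of the tf part of the textbook BM25 formula. It is not Lucene's actual code (Lucene stores field lengths in a lossy encoded norm), and the function name `bm25_tf_norm` and the average field length of 8.0 are made up for illustration. With k1 at zero the k1 * length_norm term disappears, so the result is 1 for any matching term, whatever the frequency or field length.

```python
def bm25_tf_norm(freq, field_length, avg_field_length, k1=1.2, b=0.75):
    """TF component of the textbook BM25 formula (not Lucene's exact code)."""
    # Length normalization: equals 1.0 when the field has average length.
    length_norm = 1.0 - b + b * (field_length / avg_field_length)
    # Saturating tf: approaches k1 + 1 as freq grows.
    return freq * (k1 + 1.0) / (freq + k1 * length_norm)


if __name__ == "__main__":
    # avg_field_length=8.0 is a hypothetical value, chosen only for the demo.
    for freq in (1, 2, 4):
        default = bm25_tf_norm(freq, field_length=16.0, avg_field_length=8.0)
        flat = bm25_tf_norm(freq, field_length=16.0, avg_field_length=8.0, k1=0.0)
        print(freq, round(default, 3), flat)
```

With the default k1 the first number grows with freq but stays below k1 + 1; with k1=0.0 the last number is always 1.0.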
Markus Jelsma wrote:
Yes, override TFIDFSimilarity and emit 1f in tf(). You can also use BM25 with
k1 set to zero in your schema.
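
As a rough illustration of what emitting 1f in tf() changes, here is a small Python sketch (not Lucene code). It assumes the default tf is the square root of the term frequency, which is what Lucene's DefaultSimilarity uses as far as I recall. The names `term_counts`, `classic_tf` and `binary_tf` are hypothetical helpers, and the tokenizer is a crude stand-in for a real analyzer.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count word tokens (crude stand-in for an analyzer)."""
    return Counter(re.findall(r"\w+", text.lower()))


def classic_tf(freq):
    """Assumed default tf: square root of the term frequency."""
    return math.sqrt(freq)


def binary_tf(freq):
    """Binary tf: 1 if the term occurs at all, otherwise 0."""
    return 1.0 if freq > 0 else 0.0


if __name__ == "__main__":
    for title in ("New York", "New York, New York"):
        counts = term_counts(title)
        for term in ("new", "york"):
            freq = counts[term]
            print(title, term, freq, classic_tf(freq), binary_tf(freq))
```

With the binary version both titles give the same tf for "new" and "york", so the repeated title no longer gets a boost from repetition alone.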
Walter Underwood wrote:
And here is another peculiarity of short text fields.
The movie "New York, New York" should not be twice as relevant for the query
"new york". Is there a way