Re: Field length and scoring

Erick Erickson Fri, 23 Mar 2012 12:58:27 -0700

Erik:

The field length is, I believe, based on _tokens_, not characters.
Both of your examples are exactly one token long, so the scores are
probably identical.


Also, the field length is encoded in a single byte (as I remember), so
precision is lost. It's quite possible that, even if the lengths of
these fields were 3 and 4 instead of both being 1, the value stored for
the length norms would be the same number.
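
To make that concrete, here is a rough Python sketch. It assumes the
Lucene 3.x defaults as I understand them: a length norm of
1/sqrt(number of tokens), squeezed into one byte using the scheme in
SmallFloat.floatToByte315. The Python function names are my own and are
not part of any Lucene or Solr API.

```python
import math
import struct


def float_to_byte315(f: float) -> int:
    """Port of the logic in Lucene's SmallFloat.floatToByte315 (assumed encoding)."""
    # Raw IEEE-754 single-precision bits, read as a signed 32-bit int.
    bits = struct.unpack(">i", struct.pack(">f", f))[0]
    smallfloat = bits >> (24 - 3)
    if smallfloat <= ((63 - 15) << 3):
        # Zero or negative maps to 0; tiny positive values map to 1.
        return 0 if bits <= 0 else 1
    if smallfloat >= ((63 - 15) << 3) + 0x100:
        # Overflow: clamp to the largest byte value.
        return 255
    return smallfloat - ((63 - 15) << 3)


def byte315_to_float(b: int) -> float:
    """Inverse of float_to_byte315: expand the stored byte back to a float."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << (24 - 3)) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def stored_length_norm(num_tokens: int) -> float:
    """Length norm as it would be read back from the index (no boosts)."""
    return byte315_to_float(float_to_byte315(1.0 / math.sqrt(num_tokens)))


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(n, stored_length_norm(n))
```

Under those assumptions, 3 and 4 tokens both come back as 0.5, while
2 tokens come back as 0.625, so a one-token difference in length can
vanish entirely once the norm is stored.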

HTH
Erick

On Fri, Mar 23, 2012 at 2:40 PM, Erik Fäßler <erik.faess...@uni-jena.de> wrote:
> Hello there,
>
> I have a fairly basic question: my Solr is behaving in a way I don't quite
> understand.
>
> The setup is simple: I have a field "suggestionText" in which single strings 
> are indexed. Schema:
>
>  <field name="suggestionText" type="prefixNGram" indexed="true" stored="true"/>
>
> Since I want this field to serve for a suggestion search, the input string is
> analyzed by an EdgeNGramFilter.
>
> Let's have a look at two cases:
>
> case1: Input string was 'il2'
> case2: Input string was 'il24'
>
> As I can see from the Solr-admin-analysis-page, case1 is analysed as
>
> i
> il
> il2
>
> and case2 as
>
> i
> il
> il2
> il24
>
> As you would expect. The point now is: when I search for 'il2', I would expect
> case1 to have a higher score than case2. I thought so because I did not
> omit norms, and so I assumed the shorter field would get a (slightly)
> higher score. However, the scores in both cases are identical, and so it
> happens that 'il24' is suggested before 'il2'.
>
> Perhaps I misunderstood the norms or the notion of "field length". I
> would be grateful if you could help me out here and advise me on how to
> achieve the desired behavior.
>
> Thanks and best regards,
>
>        Erik
