Re: Bad fieldNorm when using morphologic synonyms

2013-12-26 Thread Isaac Hebsh
Attached a patch to the JIRA issue. Reviews are welcome. On Thu, Dec 19, 2013 at 7:24 PM, Isaac Hebsh wrote: > Roman, do you have any results? > > created SOLR-5561 > > Robert, if I'm wrong, you are welcome to close that issue. > > > On Mon, Dec 9, 2013 at 10:50 PM, Isaac Hebsh wrote: > >> You

Re: Bad fieldNorm when using morphologic synonyms

2013-12-19 Thread Isaac Hebsh
Roman, do you have any results? created SOLR-5561 Robert, if I'm wrong, you are welcome to close that issue. On Mon, Dec 9, 2013 at 10:50 PM, Isaac Hebsh wrote: > You can see the norm value, in the "explain" text, when setting > debugQuery=true. > If the same item gets different norm before/a

Re: Bad fieldNorm when using morphologic synonyms

2013-12-09 Thread Isaac Hebsh
You can see the norm value in the "explain" text when setting debugQuery=true. If the same item gets a different norm before and after, that's it. Note that this configuration is in schema.xml (not solrconfig.xml...) On Monday, December 9, 2013, Roman Chyla wrote: > Isaac, is there an easy way to re
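
A minimal sketch of how such a debug request could be built, for comparing the fieldNorm shown in the "explain" output before and after reindexing. The host, core name, field name and query below are hypothetical placeholders, not values from this thread; only the debugQuery=true parameter comes from the message above.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; replace host and core with your own.
base = "http://localhost:8983/solr/collection1/select"

params = {
    "q": "text:example",      # hypothetical field and term
    "fl": "id,score",
    "debugQuery": "true",     # adds the "explain" section, which includes fieldNorm
    "wt": "json",
}

url = base + "?" + urlencode(params)
print(url)
```

Running the same request against the old and the new index and comparing the fieldNorm values for one document is the check described above.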

Re: Bad fieldNorm when using morphologic synonyms

2013-12-09 Thread Roman Chyla
Isaac, is there an easy way to recognize this problem? We also index synonym tokens in the same position (like you do, and I'm sure that our positions are set correctly). I could test whether the default similarity factory in solrconfig.xml had any effect (before/after reindexing). --roman On Mo

Re: Bad fieldNorm when using morphologic synonyms

2013-12-09 Thread Isaac Hebsh
Hi Robert and Manuel. The DefaultSimilarity indeed sets discountOverlaps to true by default. BUT, the *factory*, aka DefaultSimilarityFactory, when called by IndexSchema (the getSimilarity method), explicitly sets this value to the value of its corresponding class member. This class member is initi

Re: Bad fieldNorm when using morphologic synonyms

2013-12-09 Thread Robert Muir
No, it's turned on by default in the default similarity. As I said, all that is necessary is to fix your analyzer to emit the proper position increments. On Mon, Dec 9, 2013 at 12:27 PM, Manuel Le Normand wrote: > In order to set discountOverlaps to true you must have added the > to the schema.x

Re: Bad fieldNorm when using morphologic synonyms

2013-12-09 Thread Manuel Le Normand
In order to set discountOverlaps to true you must have added the to the schema.xml, which is commented out by default! As by default this param is false, the above situation is expected with correct positioning, as said. In order to fix the field norms you'd have to reindex with the similarity c

Re: Bad fieldNorm when using morphologic synonyms

2013-12-08 Thread Robert Muir
It's accurate, you are wrong. Please, look at setDiscountOverlaps in your similarity. This is really easy to understand. On Sun, Dec 8, 2013 at 7:23 AM, Manuel Le Normand wrote: > Robert, your last reply is not accurate. > It's true that the field norms and termVectors are independent. But this >

Re: Bad fieldNorm when using morphologic synonyms

2013-12-08 Thread Manuel Le Normand
Robert, your last reply is not accurate. It's true that the field norms and termVectors are independent. But this issue of higher norms for this case is expected with well-assigned positions. The LengthNorm is assigned as FieldInvertState.length, which is the count of incrementToken and not num of po
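
A minimal sketch of the length-norm arithmetic being argued about, written in plain Python rather than against Lucene itself. It assumes the classic 1/sqrt(numTerms) formula, and that with discountOverlaps enabled the tokens emitted with a position increment of zero are left out of the count. The function name and the list-of-increments input are illustrative only; index-time boost and the lossy one-byte norm encoding are ignored.

```python
import math

def length_norm(position_increments, discount_overlaps):
    """Approximate length norm for one field value.

    position_increments: one entry per emitted token (hypothetical input format).
    """
    length = len(position_increments)                              # every emitted token
    overlaps = sum(1 for inc in position_increments if inc == 0)   # stacked tokens
    num_terms = length - overlaps if discount_overlaps else length
    return 1.0 / math.sqrt(num_terms)

# Four words, each followed by its stem stacked at the same position.
increments = [1, 0, 1, 0, 1, 0, 1, 0]

with_discount = length_norm(increments, True)      # counts 4 terms
without_discount = length_norm(increments, False)  # counts 8 terms
print(with_discount, without_discount)
```

With correct position increments, the only difference between the two results is whether the stacked stems are counted, which is why the effective discountOverlaps setting decides the outcome.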

Re: Bad fieldNorm when using morphologic synonyms

2013-12-06 Thread Robert Muir
Termvectors have nothing to do with any of this. Please, fix your analyzer first. If you want to add a synonym, it should have a position increment of zero. I bet exact phrase queries aren't working correctly either. On Fri, Dec 6, 2013 at 12:50 AM, Isaac Hebsh wrote: > 1) positions look all right
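
To illustrate what "position increment of zero" means for a stacked synonym or stem, here is a small sketch that turns (term, increment) pairs into absolute positions. The token stream and the helper name are made up for illustration; the code only mirrors the convention that each token's position is the previous position plus its increment.

```python
def positions(tokens):
    """Map (term, position_increment) pairs to (term, absolute_position) pairs."""
    pos = -1  # the first token with increment 1 lands on position 0
    out = []
    for term, inc in tokens:
        pos += inc
        out.append((term, pos))
    return out

# Hypothetical analyzer output: each original word followed by its stem.
stream = [("running", 1), ("run", 0), ("shoes", 1), ("shoe", 0)]
result = positions(stream)
print(result)
```

If the stems were emitted with an increment of 1 instead, every word after the first would shift to a later position, which is what would break exact phrase queries.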

Re: Bad fieldNorm when using morphologic synonyms

2013-12-06 Thread Isaac Hebsh
1) Positions look all right (to me). 2) fieldNorm is determined by the size of the termVector, isn't it? The termVector size isn't affected by the positions. On Fri, Dec 6, 2013 at 10:46 AM, Robert Muir wrote: > Your analyzer needs to set positionIncrement correctly: sounds like it's > broken.

Re: Bad fieldNorm when using morphologic synonyms

2013-12-06 Thread Robert Muir
Your analyzer needs to set positionIncrement correctly: sounds like it's broken. On Thu, Dec 5, 2013 at 1:53 PM, Isaac Hebsh wrote: > Hi, > we implemented a morphologic analyzer, which stems words at index time. > For some reason, we index both the original word and the stem (on the same > positi

Re: Bad fieldNorm when using morphologic synonyms

2013-12-05 Thread Isaac Hebsh
The field is our main textual field. In the standard case, length normalization does significant work together with tf-idf, and we don't want to lose it. Removing duplicates won't help here, because the terms are not duplicates. One term is stemmed, and the other is not. On Fri, Dec 6, 2013 at 9:48 AM, Ahme

Re: Bad fieldNorm when using morphologic synonyms

2013-12-05 Thread Ahmet Arslan
Hi Isaac, Did you consider omitting norms completely for that field? omitNorms="true" Are you using solr.RemoveDuplicatesTokenFilterFactory? On Thursday, December 5, 2013 8:55 PM, Isaac Hebsh wrote: Hi, we implemented a morphologic analyzer, which stems words at index time. For some reason