I tried to play a little with the tools you suggested. However, I am probably missing something, because the term frequency is not what I expected. My itemid field is defined (in schema.xml) as:

<field name="itemid" type="string" indexed="true" stored="true" multiValued="true"/>

I was expecting that, after indexing the XML from my first mail via post.sh, the term frequency of itemid 1000 would be 3 in the first doc and 1 in the second. Instead, I get that result only if I change my settings to:

<field name="itemid" type="text_ws" indexed="true" stored="true" multiValued="true"/>
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

and modify the XML I use to populate the index as follows:

<doc>
  <id>1</id>
  <authorid>11</authorid>
  <authorid>9</authorid>
  <itemid>1000 1000 1000</itemid>
  <itemid>5000</itemid>
</doc>
<doc>
  <id>2</id>
  <authorid>3</authorid>
  <itemid>1000</itemid>
</doc>

Is there a way to get termFrequency=3 for doc1 while keeping my initial settings (itemid as string and just one value per itemid tag)?

Best,
Flavio

On Wed, Jun 26, 2013 at 12:38 PM, Upayavira <u...@odoko.co.uk> wrote:
> I mentioned two features, [explain] and termfreq(field, 'value').
> Neither of these require anything special, as they are using stuff
> central to Lucene's scoring mechanisms. I think you can turn off the
> storage of term frequencies, obviously that would spoil things, but
> that's certainly not on my default.
>
> I typed the syntax below from memory, so I might not have got it exactly
> right.
>
> Upayavira
>
> On Wed, Jun 26, 2013, at 10:22 AM, Flavio Pompermaier wrote:
> > So, in order to achieve that feature I have to declare my fields
> > (authorid and itemid) with termVectors="true" termPositions="true"
> > termOffsets="false"?
> > Should it be enough?
> >
> >
> > On Wed, Jun 26, 2013 at 10:42 AM, Upayavira <u...@odoko.co.uk> wrote:
> >
> > > Add fl=[explain],* to your query, and review the output in the new
> > > field. It will tell you how the score was calculated. Look at the TF or
> > > termfreq values, as this is the number of times the term appears.
> > >
> > > Also, you could add this to your fl= param: count:termfreq(authorid,
> > > '1000') which would give you a new field telling you how many times the
> > > term 1000 appears in the authorid field for each document.
> > >
> > > Upayavira
> > >
> > > On Wed, Jun 26, 2013, at 09:34 AM, Flavio Pompermaier wrote:
> > > > Hi to everybody,
> > > > I have some multiValued (single-token) fields, for example authorid and
> > > > itemid, and I'd like to know whether it is possible to find out how
> > > > many times a match was found in a document for a given field, and
> > > > whether the score is higher when multiple matches are found. For
> > > > example, my docs are:
> > > >
> > > > <doc>
> > > >   <id>1</id>
> > > >   <authorid>11</authorid>
> > > >   <authorid>9</authorid>
> > > >   <itemid>1000</itemid>
> > > >   <itemid>1000</itemid>
> > > >   <itemid>1000</itemid>
> > > >   <itemid>5000</itemid>
> > > > </doc>
> > > > <doc>
> > > >   <id>2</id>
> > > >   <authorid>3</authorid>
> > > >   <itemid>1000</itemid>
> > > > </doc>
> > > >
> > > > Would the first document have a higher score than the second if I
> > > > search for itemid=1000? Is it possible to know how many times the
> > > > match was found (3 for doc1 and 1 for doc2)?
> > > >
> > > > Otherwise, how could I achieve that result?
> > > >
> > > > Best,
> > > > Flavio
> > > > --
> > > >
> > > > Flavio Pompermaier
> > > > *Development Department
> > > > *_______________________________________________
> > > > *OKKAM**Srl **- www.okkam.it*
> > > >
> > > > *Phone:* +(39) 0461 283 702
> > > > *Fax:* + (39) 0461 186 6433
> > > > *Email:* f.pomperma...@okkam.it
> > > > *Headquarters:* Trento (Italy), fraz. Villazzano, Salita dei Molini 2
> > > > *Registered office:* Trento (Italy), via Segantini 23
> > > >
> > > > Confidentially notice. This e-mail transmission may contain legally
> > > > privileged and/or confidential information. Please do not read it if
> > > > you are not the intended recipient(S).
> > > > Any use, distribution, reproduction or
> > > > disclosure by any other person is strictly prohibited. If you have
> > > > received this e-mail in error, please notify the sender and destroy
> > > > the original transmission and its attachments without reading or
> > > > saving it in any manner.
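
For anyone reproducing this, a request along the lines Upayavira suggested could be built roughly as follows. This is only a sketch: the host, port and core name (collection1) are assumptions that do not come from this thread, build_termfreq_query is a made-up helper name, and the fl syntax is the one quoted above, which was typed from memory and may need adjusting.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed endpoint: host, port and core name are placeholders, not from the thread.
BASE_URL = "http://localhost:8983/solr/collection1/select"


def build_termfreq_query(field, value, base_url=BASE_URL):
    """Return a select URL asking for the score explanation and the
    per-document term frequency of `value` in `field`."""
    params = {
        "q": f"{field}:{value}",
        # [explain] adds the scoring breakdown; count:termfreq(...) adds a
        # pseudo-field named "count" holding the term frequency.
        "fl": f"id,score,[explain],count:termfreq({field},'{value}')",
        "wt": "json",
    }
    return f"{base_url}?{urlencode(params)}"


url = build_termfreq_query("itemid", "1000")
decoded = parse_qs(urlsplit(url).query)  # decode again to check the encoding
print(url)
print(decoded["fl"][0])
```

The "count" value returned for each document is what should be compared against the expected 3 (doc1) and 1 (doc2).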