Re: how does Solr/Lucene index multi-value fields

2011-05-31 Thread Ian Holsman
Thanks Erick. Sadly, in my use-case I don't think that would work. I'll go back to storing them at the story level, and hitting a DB to get related stories, I think.
--I
On May 31, 2011, at 12:27 PM, Erick Erickson wrote:
> Hmmm, I may have misled you. Re-reading my text it
> wasn't very well writ

Re: how does Solr/Lucene index multi-value fields

2011-05-31 Thread Erick Erickson
Hmmm, I may have misled you. Re-reading my text, it wasn't very well written. TF/IDF calculations are, indeed, per-field. I was trying to say that there was no difference between storing all the data for an individual field as a single long string of text in a single-valued field or as several
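
The per-field point above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Lucene's or Solr's actual code: the whitespace tokenizer, the helper names `field_stats` and `toy_score`, and the simplified score (square root of term frequency divided by square root of field length) are all hypothetical stand-ins chosen for illustration. It shows that term counts and field length come out the same whether the values are supplied separately or joined into one string.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical analyzer: lowercase and split on whitespace.
    return text.lower().split()


def field_stats(values):
    # Pool the tokens of every value of one field in one document,
    # then return (term counts, total number of tokens in the field).
    tokens = [tok for value in values for tok in tokenize(value)]
    return Counter(tokens), len(tokens)


def toy_score(term, values):
    # Simplified stand-in for a TF-and-length-norm score (assumption):
    # sqrt(term frequency) / sqrt(number of tokens in the field).
    counts, length = field_stats(values)
    if length == 0:
        return 0.0
    return math.sqrt(counts[term]) / math.sqrt(length)


stories = ["solr indexes fields", "lucene scores fields per field"]

multi_valued = stories                # several values in one field
single_valued = [" ".join(stories)]   # one long string in one field

# Same term counts and same field length either way.
assert field_stats(multi_valued) == field_stats(single_valued)

print(toy_score("fields", multi_valued))
print(toy_score("fields", single_valued))
```

One difference the toy model ignores: Solr can insert a position gap between the values of a multi-valued field (the `positionIncrementGap` setting), which affects phrase and proximity matching across value boundaries but not the term counts.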

Re: how does Solr/Lucene index multi-value fields

2011-05-31 Thread Jonathan Rochkind
On 5/31/2011 12:16 PM, Ian Holsman wrote:
> We have a collection of related stories. When a user searches for something, we might not want to display the story that is most relevant (according to Solr), but one chosen according to other home-grown rules. By combining all the possibilities in one SolrDocument,

Re: how does Solr/Lucene index multi-value fields

2011-05-31 Thread Ian Holsman
On May 31, 2011, at 12:11 PM, Erick Erickson wrote:
> Can you explain the use-case a bit more here? Especially the post-query
> processing and how you expect the multiple documents to help here.
>
We have a collection of related stories. When a user searches for something, we might not want to

Re: how does Solr/Lucene index multi-value fields

2011-05-31 Thread Erick Erickson
Can you explain the use-case a bit more here? Especially the post-query processing and how you expect the multiple documents to help here. But TF/IDF is calculated over all the values in the field. There's really no difference between a multi-valued field and storing all the data in a single field

how does Solr/Lucene index multi-value fields

2011-05-31 Thread Ian Holsman
Hi. I want to store a list of documents (say, each being 30-60k of text) in a single SolrDocument, to speed up post-retrieval querying. In order to do this, I need to know whether Lucene calculates the TF/IDF score over the entire field, or whether it treats each value in the list as a separate field.