Yes, stored fields are placed verbatim for every doc. But I wonder at the utility of trying to share stored information. The stored info is put in certain files in the index, see: http://lucene.apache.org/java/3_0_2/fileformats.html#file-names
and the files that store data are pretty much irrelevant to searching; the data in them is only referenced when assembling the document for return. So by adding this complexity you'll be saving a bit on file transfers when replicating your index, but not much else.

Is it worth it? If so, why?

Best
Erick

On Mon, Oct 17, 2011 at 11:07 AM, lee carroll <lee.a.carr...@googlemail.com> wrote:
> Just as a follow up
>
> it looks like stored fields are stored verbatim for every doc.
>
> hotel index and store dest attributes
> index size: 131M
> number of records 49147
>
> hotel index only dest attributes
>
> index size: 111m
> number of records 49147
>
>
> ~400 chars(bytes) of destination data * 49147 (number of hotel docs) = ~19m
>
> basically everything is being stored
>
> No difference in time to index (very rough and not scientific :-) )
>
> So it does seem an ok strategy to denormalise docs with index fields
> but normalise with stored fields ?
> Or have i missed some problems with this ?
>
> cheers lee c
>
>
>
> On 16 October 2011 11:54, lee carroll <lee.a.carr...@googlemail.com> wrote:
>> Hi Chris thanks for the response
>>
>>> It's an inverted index, so *terms* exist once (per segment) and those terms
>>> "point" to the documents -- so having the same terms (in the same fields)
>>> for multiple types of documents in one index is going to take up less
>>> overall space than having distinct collections for each type of document.
>>
>> I'm not asking about the indexed terms but rather the stored values.
>> By having two doc types are we gaining anything by "storing"
>> attributes only for that doc type
>>
>> cheers lee c
>>
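
For anyone who wants to re-check the size arithmetic quoted in the thread, here is a rough sketch in Python. It only uses the figures lee reported (49147 docs, about 400 bytes of destination data per doc, index sizes of 131M and 111M); the function and constant names are made up for illustration, and it assumes "M" means megabytes.

```python
# Figures quoted in the thread. The 400-byte average is a rough estimate,
# and treating "M" as decimal megabytes is an assumption.
DOCS = 49_147
DEST_BYTES_PER_DOC = 400
SIZE_WITH_STORED_MB = 131   # dest attributes indexed and stored
SIZE_INDEX_ONLY_MB = 111    # dest attributes indexed only


def estimated_stored_mb(docs: int, bytes_per_doc: int) -> float:
    """Space needed if every doc keeps its own verbatim copy of the stored data."""
    return docs * bytes_per_doc / 1_000_000


estimate = estimated_stored_mb(DOCS, DEST_BYTES_PER_DOC)
observed = SIZE_WITH_STORED_MB - SIZE_INDEX_ONLY_MB

print(f"estimated stored data: {estimate:.1f} MB")
print(f"observed size difference: {observed} MB")
```

The estimate and the observed difference come out within a megabyte or so of each other, which is consistent with the stored values being written once per document rather than shared.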