> > when you store a raw (non-tokenized, non-indexed) "text" value with a
> > document (which almost everyone does). Try to store 1,000,000 documents
> > with a 1000-byte non-tokenized field: you will need 1GB just for this
> > array.
>
> Nope. You shouldn't even need 1GB of buffer space for that.
> The size specified is for all things that the indexing process needs
> to temporarily keep in memory... stored fields are normally
> immediately written to disk.
>
> -Yonik
> http://www.lucidimagination.com
Ok, thanks for the clarification! What about term vectors, and what about a
non-trivial schema with 10 tokenized fields? Will the buffer need 10 arrays
(up to 2048MB each) for that? My understanding is probably very naive...

-Fuad
http://www.linkedin.com/in/liferay
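(Editor's note: the buffer Yonik describes is Lucene's indexing RAM buffer, which in Solr is set via `ramBufferSizeMB` in solrconfig.xml. A minimal sketch, assuming a Solr 4+ style config layout; the value 100 is illustrative:)

```xml
<!-- solrconfig.xml: caps the memory used to buffer pending index
     structures (postings, term vectors, etc.) before a flush to disk.
     Stored field values are written out almost immediately, so they
     do not accumulate against this limit. -->
<indexConfig>
  <ramBufferSizeMB>100</ramBufferSizeMB>
</indexConfig>
```

The limit applies to the indexing buffer as a whole, not per field, so adding tokenized fields changes how quickly the buffer fills, not how many buffers exist.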