I think you could approximate this with some empirical measurements, i.e. index 1,000 'typical' documents and see what the resulting index size is. Of course, you may need to adjust this number upwards if there is a lot of variability in document size.
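To make that kind of measurement concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names, the `headroom` parameter and the idea of pointing it at the on-disk index directory of a test core are all hypothetical, and nothing here is a Solr API. It sums the file sizes under an index directory, divides by the number of sample documents, and extrapolates.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under `path`.

    `path` is assumed to be the on-disk index directory of a test core
    that holds only the sample documents (hypothetical setup).
    """
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def estimate_index_bytes(sample_index_bytes, sample_docs, total_docs, headroom=1.0):
    """Extrapolate a full index size from a sample index.

    `headroom` is an assumed safety factor (> 1.0) to pad the estimate
    when document sizes vary a lot.
    """
    if sample_docs <= 0:
        raise ValueError("sample_docs must be positive")
    per_doc = sample_index_bytes / sample_docs  # average index bytes per document
    return per_doc * total_docs * headroom


def docs_per_shard(target_shard_bytes, per_doc_bytes):
    """How many documents fit in a shard of the given target size."""
    if per_doc_bytes <= 0:
        raise ValueError("per_doc_bytes must be positive")
    return int(target_shard_bytes // per_doc_bytes)
```

For example, after indexing 1,000 sample documents and committing, `estimate_index_bytes(dir_size_bytes(index_dir), 1000, total_docs, headroom=1.2)` gives a padded estimate for the whole corpus, where `index_dir` and the 1.2 factor are placeholders to adjust. Note that this yields an average per document, not the exact footprint of any single document.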
When I built the search engine that ran Feedster, I noticed there was a 1:1 correlation between the size of the source documents and the size of the index produced: 1M documents came to 1GB of source text, which in turn produced 1GB of index. That was useful to me in determining the number of documents to put in each shard (1M) as documents were crawled and indexed.

François

On Apr 19, 2011, at 8:28 AM, Erick Erickson wrote:

> There's no way I know of to do this.
>
> Why is this important to you? Because I'm not
> sure what actionable information this gives you.
> The number will vary based on whether the fields
> are stored or not. And storing the fields has
> very little effect on search memory requirements.
>
> What are you hoping to do with that information?
> Maybe we can suggest a better approach if you
> state the higher-level problem...
>
> Best
> Erick
>
>
> On Tue, Apr 19, 2011 at 7:49 AM, rahul <asharud...@gmail.com> wrote:
>> Hi,
>>
>> Is there a way to find out Solr indexing size for a particular document. I
>> am using Solrj to index the documents.
>>
>> Assume, I am indexing multiple fields like title, description, content, and
>> few integer fields in schema.xml, then once I index the content, is there a
>> way to identify the index size for the particular document during indexing
>> or after indexing..??
>>
>> Because, most of the common words are excluded from StopWords.txt using
>> StopFilterFactory. I just want to calculate the actual index size of the
>> particular document. Is there any way in current Solr ??
>>
>> thanks,
>>
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Solr-indexing-size-for-a-particular-document-tp2838416p2838416.html
>> Sent from the Solr - User mailing list archive at Nabble.com.