Re: Index size - to determine storage

2014-01-14 Thread Sumit Arora
Hi Amit, This Excel sheet will help you estimate the index size: size-estimator-lucene-solr.xls <http://lucene.472066.n3.nabble.com/file/n4111365/size-estimator-lucene-solr.xls> - Sumit Arora -- View this message in context: http://lucene.472066.n3.nabble.com/Index-s

Re: Index size - to determine storage

2014-01-09 Thread Alexandre Rafalovitch
Try running the PDF through standalone Tika and see what comes back. That's the size of the input. It will usually be quite a small proportion of the PDF size, possibly down to metadata only and no text if your PDF does not include a text layer. Then, it depends on your storing and indexing options, your tokeni
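
One way to act on the Tika suggestion is to compare the size of the extracted text with the size of the original PDF. The following is a minimal sketch, assuming the text has already been extracted to a file (for example with the Tika app's `--text` option). The function name and the file paths are hypothetical and not part of the original message.

```python
import os


def extracted_text_ratio(pdf_path, text_path):
    """Return (pdf_bytes, text_bytes, ratio) for a PDF and its extracted text.

    Assumption: text_path holds the plain text that Tika produced for
    pdf_path. ratio is text_bytes / pdf_bytes, or 0.0 for an empty PDF file.
    """
    pdf_bytes = os.path.getsize(pdf_path)
    text_bytes = os.path.getsize(text_path)
    ratio = text_bytes / pdf_bytes if pdf_bytes else 0.0
    return pdf_bytes, text_bytes, ratio
```

A ratio near zero suggests a scanned PDF without a text layer, in which case only metadata would reach the index.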

Re: Index size - to determine storage

2014-01-09 Thread Michael Della Bitta
Hi Amit, It really boils down to how much of that 100 KB is actually text, and how you analyze and store the text. Meaning, it's really hard for us to say. You're probably going to need to experiment to figure out what the storage needs for your use case are. Michael Della Bitta Applications Deve
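
The experiment suggested here could look like this: index a representative sample, measure the index directory on disk, and extrapolate to the full document count. This is only a rough sketch. The function names are hypothetical, and the linear extrapolation is an assumption, since index size also depends on analysis, stored fields, and segment merging.

```python
import os


def directory_size_bytes(path):
    """Sum the sizes of all files under path (e.g. an index data directory)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def estimate_index_bytes(sample_index_bytes, sample_docs, total_docs):
    """Linearly extrapolate index size from a sample (a rough assumption)."""
    if sample_docs <= 0:
        raise ValueError("sample_docs must be positive")
    return sample_index_bytes / sample_docs * total_docs
```

Repeating the measurement with a few different sample sizes would show how far the per-document cost drifts from a straight line.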

Index size - to determine storage

2014-01-09 Thread Amit Jha
Hi, I would like to know: if I index a file, e.g. a PDF of 100 KB, what would the size of the index be? What factors should be considered to determine the disk size? Rgds AJ