Hi Amit,
This Excel sheet will help you estimate the index size.
size-estimator-lucene-solr.xls
<http://lucene.472066.n3.nabble.com/file/n4111365/size-estimator-lucene-solr.xls>
-
Sumit Arora
Try running the PDF through standalone Tika and see what comes back. That's the
size of the input. It will usually be quite a small proportion of the PDF size,
possibly down to metadata only and no text, if your PDF does not include a
text layer.
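To make the Tika check above concrete, here is a minimal sketch in Python. It assumes `java` is on your PATH and that you have downloaded a tika-app jar. The jar name `tika-app.jar` and the helper names are placeholders, not anything from this thread. It shells out to the Tika app with `--text` and compares the size of the extracted text to the size of the PDF.

```python
import subprocess


def extract_text_with_tika(pdf_path, tika_jar="tika-app.jar"):
    """Run the standalone Tika app and return the extracted plain text as bytes.

    Assumption: `tika_jar` is the path to a tika-app jar you downloaded
    yourself; the default name here is only a placeholder.
    """
    result = subprocess.run(
        ["java", "-jar", tika_jar, "--text", str(pdf_path)],
        capture_output=True,
        check=True,  # raise if Tika exits with an error
    )
    return result.stdout


def text_proportion(pdf_size_bytes, text_size_bytes):
    """Fraction of the PDF's size that is extractable text."""
    if pdf_size_bytes <= 0:
        raise ValueError("pdf_size_bytes must be positive")
    return text_size_bytes / pdf_size_bytes
```

For a 100 KB PDF you would call `extract_text_with_tika("file.pdf")`, take `len()` of the result, and pass both sizes to `text_proportion`. A result of 0 (or close to it) suggests the PDF has no text layer.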
Then, it depends on your storing and indexing options, your tokeni
Hi Amit,
It really boils down to how much of that 100 KB is actually text, and how
you analyze and store the text. Meaning, it's really hard for us to say.
You're probably going to need to experiment to figure out what the storage
needs for your use case are.
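One rough way to run that experiment: index a representative batch of documents, then measure what ended up on disk. The sketch below is only an illustration in Python. The function names are made up for this example, and you need to supply the index directory for your own setup and the number of documents you indexed.

```python
import os


def index_size_bytes(index_dir):
    """Total size in bytes of all files under index_dir."""
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def bytes_per_document(index_dir, num_docs):
    """Average on-disk index bytes per indexed document."""
    if num_docs <= 0:
        raise ValueError("num_docs must be positive")
    return index_size_bytes(index_dir) / num_docs
```

Measure after a commit, and ideally repeat with a few different batch sizes, since segment merges can change the on-disk size over time.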
Michael Della Bitta
Applications Deve
Hi,
I would like to know: if I index a file, e.g. a PDF of 100 KB, what would be the
size of the index? What factors should be considered to determine the disk size?
Rgds
AJ