On 6/21/2019 10:32 AM, Matheo Software Info wrote:
> My question is very simple: I would like to know if Solr can process
> around 30 TB of data (PDF, text, Word, etc.)?
> What is the best way to index this much data? Several servers? Several
> shards? Something else?
Sure, Solr can do that. Whether you have enough resources or expertise
available to accomplish it is an entirely different question.
Handling that much data is likely going to require a LOT of expensive
hardware. The index will almost certainly need to be sharded. Knowing
exactly what numbers are involved is impossible with the information
available ... and even with more information, it will most likely
require experimentation with your actual data to find an optimal solution.
https://lucidworks.com/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
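If you do end up running SolrCloud, the shard count is something you choose when the collection is created. Here is a minimal sketch of building a Collections API CREATE request. It assumes a node at localhost:8983, and the collection name, shard count, and replication factor are placeholders for illustration, not recommendations. The right numbers have to come from testing with your actual data.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor):
    """Build a Collections API CREATE URL. Nothing is sent to Solr here."""
    params = {
        "action": "CREATE",
        "name": name,                             # hypothetical collection name
        "numShards": num_shards,                  # placeholder, find by experiment
        "replicationFactor": replication_factor,  # placeholder
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Assumed local SolrCloud node; adjust host and port for a real cluster.
url = create_collection_url("http://localhost:8983/solr", "documents", 8, 2)
print(url)
```

Requesting that URL against a running SolrCloud cluster would create the collection. The sketch only prints the URL so it can be checked first.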
Thanks,
Shawn