Hello Shawn,

Good news that Solr can do that.

I know that with 30 TB of data, hardware will be the first thing to sort out.
As for expertise, that is the real problem for me.

First, I think I will run several tests to see how Solr works with
non-XML documents (I only have experience with XML documents).

Thanks,
Bruno

On 6/21/2019 10:32 AM, Matheo Software Info wrote:
> My question is very simple. I would like to know if Solr can process
> around 30 TB of data (PDF, text, Word, etc.)?
>
> What is the best way to index this much data? Several servers?
> Several shards? Something else?

Sure, Solr can do that.  Whether you have enough resources or expertise
available to accomplish it is an entirely different question.

Handling that much data is likely going to require a LOT of expensive
hardware.  The index will almost certainly need to be sharded.  Knowing
exactly what numbers are involved is impossible with the information
available ... and even with more information, it will most likely require
experimentation with your actual data to find an optimal solution.

https://lucidworks.com/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

Thanks,
Shawn

