Hello,
 
Replies inline.

----- Original Message -----
> From: mustafozbek <mustafoz...@gmail.com>
> 
> I have been an Apache Solr user for about a year. I have used Solr for simple
> search tools, but now I want to use Solr with 5TB of data. I assume that the 5TB
> of data will become 7TB once Solr indexes it with the filters that I use. I will
> then add nearly 50MB of data per hour to the same index.
> 1-    Is there any problem with using a single Solr server with 5TB of data
> (without shards)?
>    a-    Can the Solr server answer queries in an acceptable time?

Not likely, unless the diversity of queries is very small, the OS can keep the 
relevant parts of the index cached, and the Solr caches get hit a lot.

>    b-    What is the expected time for committing 50MB of data to a 7TB index?

Depends on settings like ramBufferSizeMB and on how you add the data (e.g. via 
DIH, via SolrJ, via CSV import...)

>    c-    Is there an upper limit for index size?

Yes, Lucene doc IDs limit the number of documents a single index can hold, but 
you will hit hardware limits before you hit that limit.
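
To put a rough number on the doc ID limit: Lucene doc IDs are 32-bit signed 
ints, so one index (one shard) tops out at roughly 2.1 billion documents. Here 
is a quick sketch of the check. The average document sizes are made-up figures 
for illustration only, and the exact cap in a given Lucene version may be 
slightly below the ceiling used here.

```python
# Rough check of a document count against Lucene's doc ID ceiling.
# Doc IDs are 32-bit signed ints; the exact per-index cap depends on the
# Lucene version, so treat this constant as an approximation.
LUCENE_DOC_ID_CEILING = 2**31 - 1


def estimated_doc_count(raw_bytes: int, avg_doc_bytes: int) -> int:
    """Estimate the number of documents from raw data size and average doc size."""
    return raw_bytes // avg_doc_bytes


def fits_in_one_index(doc_count: int) -> bool:
    """True if the doc count stays at or under the approximate ceiling."""
    return doc_count <= LUCENE_DOC_ID_CEILING


raw_bytes = 5 * 10**12  # 5TB of raw data, as in the question

# Hypothetical average document sizes (assumptions, not from the question).
for avg_doc_bytes in (10_000, 1_000):
    docs = estimated_doc_count(raw_bytes, avg_doc_bytes)
    print(avg_doc_bytes, docs, fits_in_one_index(docs))
```

With large documents you stay far below the ceiling, which is why the hardware 
usually gives out first.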

> 2-    What do you suggest?
>    a-    How many shards should I use?

Depends primarily on the number of servers available and their capacity.
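
As a back-of-the-envelope illustration of that, here is a small sketch that 
derives a minimum shard count from the total index size and the amount of index 
each server can comfortably hold. The per-server capacities are hypothetical 
numbers. You would have to measure your own, ideally based on how much of the 
index the OS can keep cached.

```python
def shards_needed(index_bytes: int, per_server_bytes: int) -> int:
    """Minimum number of shards so that no shard exceeds per_server_bytes.

    Uses integer ceiling division to avoid floating-point rounding.
    """
    if per_server_bytes <= 0:
        raise ValueError("per_server_bytes must be positive")
    return -(-index_bytes // per_server_bytes)


index_bytes = 7 * 10**12  # the 7TB index estimated in the question

# Hypothetical per-server capacities (assumptions for illustration only).
print(shards_needed(index_bytes, 500 * 10**9))  # 500GB per server
print(shards_needed(index_bytes, 3 * 10**12))   # 3TB per server
```

This only gives a lower bound based on size. Query load and replication needs 
can push the real number higher.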

>    b-    Should I use Solr cores?

Sounds like you should really start by using SolrCloud.

>    c-    What commit frequency do you suggest? (Is 1 hour OK?)

Depends on how often you want to see new data show up in search results.  Some 
people need that to be immediate, or within 1 second or 1 hour, while others 
are OK with 24h.
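
One way to express such a freshness requirement is the commitWithin parameter 
on update requests, given in milliseconds. Below is a minimal sketch that only 
builds the update URL and sends nothing. The host and core name are 
placeholders, and it assumes your Solr version accepts commitWithin as a 
request parameter.

```python
from urllib.parse import urlencode


def update_url(base_url: str, commit_within_ms: int) -> str:
    """Build a Solr update URL asking for a commit within the given time.

    base_url is a hypothetical Solr core URL; nothing is sent over the network.
    """
    query = urlencode({"commitWithin": commit_within_ms})
    return f"{base_url.rstrip('/')}/update?{query}"


ONE_HOUR_MS = 60 * 60 * 1000

# Placeholder host and core name (assumptions, not from the original question).
print(update_url("http://localhost:8983/solr/mycore", ONE_HOUR_MS))
```

Picking 1 hour here matches the interval asked about above. A shorter value 
trades more frequent commits for fresher results.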


> 3-    Are there any test results for this kind of large data?


Nothing official, but it's been done.  For example, we've done large-scale 
stuff like this with Solr for our clients at Sematext, but we can't publish 
technical details.

> There is no 5TB of data available yet; I just want to estimate what the
> result will be.
> Note: You can assume that hardware resources are not a problem.


Otis
----
Performance Monitoring SaaS for Solr - 
http://sematext.com/spm/solr-performance-monitoring/index.html
