Re: Largest number of indexed documents used by Solr

2018-04-05 Thread Joe Obernberger
50 billion per day?  Wow!  How large are these documents? We have a cluster with one large collection that contains 2.4 billion documents spread across 40 machines using HDFS for the index.  We store our data inside of HBase, and in order to re-index data we pull from HBase and index with solr

Re: Largest number of indexed documents used by Solr

2018-04-05 Thread Kelly, Frank
For us we have ~ 350M documents stored using r3.xlarge nodes with 8GB Heap and about 31GB of RAM We are using Solr 5.3.1 in a SolrCloud setup (3 collections, each with 3 shards and 3 replicas). For us lots of RAM memory is not as important as CPU (as the EBS disk we run on top of is quite fast a

Re: Largest number of indexed documents used by Solr

2018-04-04 Thread 苗海泉
When we have 49 shards per collection, there are more than 600 collections. Solr will have serious performance problems. I don't know how to deal with them. My advice to you is to minimize the number of collections. Our environment is 49 solr server nodes, each with 32cpu/128g, and the data volume

Re: Largest number of indexed documents used by Solr

2018-04-03 Thread Yago Riveiro
Hi, In my company we are running a 12 node cluster with 10 (american) Billion documents 12 shards / 2 replicas. We do mainly faceting queries with a very reasonable performance. 36 million documents it's not an issue, you can handle that volume of documents with 2 nodes with SSDs and 32G of ra

Re: Largest number of indexed documents used by Solr

2018-04-03 Thread Walter Underwood
We have a 24 million document index. Our documents are a bit smaller than yours, homework problems. The Hathi Trust probably has the record. They haven’t updated their blog for a while, but they were at 11 million books and billions of pages in 2014. https://www.hathitrust.org/blogslarge-scale-

Re: Largest number of indexed documents used by Solr

2018-04-03 Thread Abhi Basu
We have tested Solr 4.10 with 200 million docs with avg doc size of 250 KB. No issues with performance when using 3 shards / 2 replicas. On Tue, Apr 3, 2018 at 8:12 PM, Steven White wrote: > Hi everyone, > > I'm about to start a project that requires indexing 36 million records > using Solr 7.