Hello all,

I'm struggling to get this going with SolrCloud.
Requirement in brief:
- Ingest about 4M used-car listings a day and track all unique cars for changes.
- 4M automated searches a day (during ingestion, to check whether a doc already exists in the index based on the values of 4-5 key fields, or whether it is new or an updated version - a representative lookup query is pasted below the questions).
- Of the 4M, about 3M are updates to existing docs (one for every non-key value change).
- About 1M inserts a day (I'm assuming this many new listings come in every day).
- Daily bulk CSV exports of the last 24 hours of inserts/updates, for various snapshots of the data, to various clients.

My current deployment:
i) I'm using Solr 4.8 and have set up a SolrCloud on 6 dedicated machines - 24 cores + 96 GB RAM each.
ii) There are over 190M docs in the SolrCloud at the moment (across all replicas it consumes about 2340 GB of disk, which implies each doc is roughly 5-8 KB in size).
iii) The docs are split into 36 shards with 3 replicas per shard (108 Solr Jetty processes in all across the 6 servers, i.e. about 18 Jetty JVMs running on each host).
iv) There are 60 fields per doc and all fields are stored at the moment :( (the backend is only Solr at the moment).
v) The current shard/routing key is a combination of car year, make and some other car-level attributes that help classify the cars.
vi) We are mostly using the default Solr config as of now - no heavy caching, as the search is pretty random in nature.
vii) Autocommit is on, with maxDocs = 1.

Current throughput & issues:
With the above deployment the daily throughput averages only about 1.5M (inserts + updates), falling way short of what is required. Search is slow - some queries take about 15 seconds to return - and since every insert depends on at least one search, that degrades the write throughput too. (This is not a Solr issue, but the app demands it.)

Questions:
1. Autocommit with maxDocs = 1 - is that a goof-up, and could it be slowing down indexing? It's a requirement that all docs are available as soon as they are indexed. (The relevant solrconfig.xml snippet, and the alternative I'm considering, are pasted below the questions.)
2. Would I have been better served by deploying a single Jetty/Solr instance per server with multiple cores running inside it? The servers start to swap after a couple of days of Solr uptime - right now we reboot the entire cluster every 4 days.
3. The routing key is not balancing the docs effectively across the available shards - a few shards have only about 2M docs while others have over 11M. Should I split the larger shards? I do not have more nodes/hardware to allocate to this deployment, so in that case would splitting the large shards still give better read/write throughput? (Example Collections API calls are pasted below the questions.)
4. To stay within the current hardware, would it help to remove one replica from each shard? But that would mean that when just one node of a shard goes down, only one live node is left, and it would not serve write requests.
5. Also, is there a way to control where the replicas of a split shard go? Is there a pattern/rule that Solr follows when it creates replicas for split shards?
6. I read somewhere that creating a core costs the OS one thread and a file handle. Since a core represents an index in its entirety, would it not be allocated the configured number of write threads (the default being 8)?
7. The ZooKeeper cluster is deployed on the same boxes as the Solr instances - would separating the ZK cluster out onto its own machines help?

Sorry for the long thread - I thought of asking these all at once rather than posting separate ones.
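For context on the ingestion flow: every incoming listing first triggers a lookup along these lines before we decide whether it is an insert or an update (the collection name, field names and values here are made up for illustration - the real key is a combination of 4-5 fields):

    # does a doc with these key-field values already exist?
    curl 'http://solr-host:8983/solr/cars/select?q=*:*&fq=vin:1HGCM82633A004352&fq=source_id:42&fl=id&rows=1'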
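For question 1, this is roughly what the update handler section of my solrconfig.xml looks like today, plus the alternative I'm considering (the intervals below are just guesses on my part, not something I've tested):

    <!-- current: a hard commit after every single document -->
    <autoCommit>
      <maxDocs>1</maxDocs>
    </autoCommit>

    <!-- considering: infrequent hard commits that do not open a new searcher,
         plus frequent soft commits for near-real-time visibility -->
    <autoCommit>
      <maxTime>60000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>
    <autoSoftCommit>
      <maxTime>1000</maxTime>
    </autoSoftCommit>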
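For questions 3-5, these are the kinds of Collections API calls I understand I would have to run (the collection, shard and replica names below are placeholders, not my real ones):

    # split one of the oversized shards in two
    curl 'http://solr-host:8983/solr/admin/collections?action=SPLITSHARD&collection=cars&shard=shard7'

    # drop one replica of a shard to reduce the number of cores per box
    curl 'http://solr-host:8983/solr/admin/collections?action=DELETEREPLICA&collection=cars&shard=shard7&replica=core_node42'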
Thanks,
Anand