It's not a typo; I was wrong. With ZooKeeper, 2 out of 3 nodes still count as a majority.
It's not the desirable configuration, but it is tolerable.
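
The majority math is simple enough; a minimal sketch (plain Python, just to spell out the rule):

    # ZooKeeper needs a strict majority of the ensemble to keep a quorum.
    for n in (3, 5, 7):
        quorum = n // 2 + 1
        print(f"ensemble of {n}: quorum = {quorum}, tolerated failures = {n - quorum}")
    # ensemble of 3: quorum = 2, tolerated failures = 1
    # ensemble of 5: quorum = 3, tolerated failures = 2
    # ensemble of 7: quorum = 4, tolerated failures = 3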

  

Thanks Erick.

  

--

/Yago Riveiro

> On Jan 21 2016, at 4:15 am, Erick Erickson <erickerick...@gmail.com>
wrote:  

>

> bq: 3 are to risky, you lost one you lost quorum

>

> Typo? You need to lose two.....

>

> On Wed, Jan 20, 2016 at 6:25 AM, Yago Riveiro <yago.rive...@gmail.com>
wrote:  
> Our ZooKeeper cluster is an ensemble of 5 machines, which is a good starting
> point; 3 is too risky (you lose one, you lose quorum) and with 7 the sync cost
> increases.
>  
>  
>  
> The ZK cluster is on machines with no other I/O load and rotational HDDs (we
> don't use SSDs to gain I/O performance; ZooKeeper is optimized for spinning
> disks).
>  
>  
>  
> The ZK cluster behaves without problems. The first deployment of ZK was on the
> same machines as the Solr cluster (with the ZK log on its own HDD) and that
> didn't work very well; the CPU and network I/O from the Solr cluster was too
> much.
>  
>  
>  
> About schema modifications.  
>  
> Modifying the schema to add new fields is relatively simple with the new API.
> In the past all the work was manual: uploading the schema to ZK and reloading
> all collections (indexing must be disabled or timeouts and funny errors
> happen).
>
> With the new Schema API this is more user friendly. I still stop indexing and
> reload the collections anyway (I don't know if that's necessary nowadays).
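>
> A minimal sketch of what an add-field call to the Schema API looks like (host,
> collection and field names here are just placeholders):
>
>     import json
>     import urllib.request
>
>     # Add a new field through the Schema API (managed schema); the updated
>     # schema is stored in ZK and picked up by the collection's replicas.
>     payload = {
>         "add-field": {
>             "name": "new_field_s",
>             "type": "string",
>             "stored": True,
>             "docValues": True,
>         }
>     }
>     req = urllib.request.Request(
>         "http://localhost:8983/solr/my_collection/schema",
>         data=json.dumps(payload).encode("utf-8"),
>         headers={"Content-Type": "application/json"},
>     )
>     print(urllib.request.urlopen(req).read().decode("utf-8"))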
>  
> About Indexing data.  
>  
>  
>  
> We have a self-made data importer; it's not Java and it doesn't perform batch
> indexing (with 500 collections, buffering data and building the batches is
> expensive and complicates error handling).
>  
>  
>  
> We use regular HTTP POST in JSON. Our throughput is about 1000 docs/s without
> any type of optimization. Sometimes we have issues with replication: the
> replica can't keep pace with leader insertion and a full sync is requested.
> This is bad because syncing the replica again implies a lot of I/O wait and
> CPU, and with 100G replicas it takes an hour or more (normally when this
> happens, we disable indexing to release I/O and CPU and not kill the node with
> a load of 50 or 60).
>
> In this department my advice is "keep it simple": in the end it is an HTTP POST
> to a node of the cluster.
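>
> A minimal sketch of that kind of POST (host, collection and document fields are
> placeholders, not our real schema):
>
>     import json
>     import urllib.request
>
>     # Index a small batch of documents with a plain HTTP POST of JSON to the
>     # update handler of one node; SolrCloud routes the docs to the right shards.
>     docs = [
>         {"id": "1", "title_s": "first document"},
>         {"id": "2", "title_s": "second document"},
>     ]
>     req = urllib.request.Request(
>         "http://localhost:8983/solr/my_collection/update?commit=true",
>         data=json.dumps(docs).encode("utf-8"),
>         headers={"Content-Type": "application/json"},
>     )
>     print(urllib.request.urlopen(req).read().decode("utf-8"))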
>  
>  
>  
> --
>  
> /Yago Riveiro  
>  
>> On Jan 20 2016, at 1:39 pm, Troy Edwards
<tedwards415...@gmail.com>  
> wrote:  
>  
>>  
>  
>> Thank you for sharing your experiences/ideas.  
>  
>>  
>  
>> Yago since you have 8 billion documents over 500 collections, can you
share  
> what/how you do index maintenance (e.g. add field)? And how are you
loading  
> data into the index? Any experiences around how Zookeeper ensemble
behaves  
> with so many collections?  
>  
>>  
>  
>> Best,  
>  
>>  
>  
>>  
> On Tue, Jan 19, 2016 at 6:05 PM, Yago Riveiro
<yago.rive...@gmail.com>  
> wrote:  
>  
>>  
>  
>> > What I can say is:  
> >  
> >  
> > * SSD (crucial for performance if the index doesn't fit in memory, and it
> > will not fit).
> > * Divide and conquer; for that volume of docs you will need more than 6
> > nodes.
> > * DocValues, to not stress the Java heap.
> > * Will you aggregate data? If yes, what is your max cardinality? This
> > question is the most important one for sizing the memory needs correctly.
> > * Latency is important too: what threshold is acceptable before you
> > consider a query slow?
> > At my company we are running a 12-terabyte (2 replicas) Solr cluster with 8
> > billion documents spread over 500 collections. For this we have about 12
> > machines with SSDs and 32G of RAM each (~24G for the heap).
> >
> > We don't have a strict need for speed: a 30-second query to aggregate 100
> > million documents with 1M unique keys is fast enough for us. Normally the
> > aggregation performance decreases as the number of unique keys increases;
> > with a low unique-key factor, queries take less than 2 seconds if the data
> > is in the OS cache.
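> >
> > As a rough illustration of that kind of aggregation, a minimal sketch of a
> > terms facet with the JSON Facet API (host, collection and field names are
> > placeholders, and the exact query shape is an assumption, not our real one):
> >
> >     import json
> >     import urllib.request
> >
> >     # Aggregate by a high-cardinality field: top buckets plus a unique count.
> >     query = {
> >         "query": "*:*",
> >         "limit": 0,
> >         "facet": {
> >             "by_key": {"type": "terms", "field": "key_s", "limit": 10},
> >             "distinct_keys": "unique(key_s)",
> >         },
> >     }
> >     req = urllib.request.Request(
> >         "http://localhost:8983/solr/my_collection/query",
> >         data=json.dumps(query).encode("utf-8"),
> >         headers={"Content-Type": "application/json"},
> >     )
> >     print(urllib.request.urlopen(req).read().decode("utf-8"))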
> >  
> > Personal recommendations:  
> >  
> > * Sharding is important and smart sharding is crucial; you don't want to
> > run queries on data that is not interesting (this slows down queries when
> > the dataset is big).
> > * If you want to measure speed, do it with about 1 billion documents to
> > simulate something real (real for a 10-billion-document world).
> > * Index with re-indexing in mind: with 10 billion docs, re-indexing the
> > data takes months. This is important if you use features of Solr that are
> > not standard. In my case I configured DocValues with the disk format (not a
> > standard feature in 4.x) and at some point this format was deprecated.
> > Upgrading Solr to 5.x was an epic 3-month battle to do it without full
> > downtime.
> > * Solr is like your girlfriend: it will demand love and care, and plenty of
> > space to fully recover replicas that at some point are out of sync, which
> > happens a lot when restarting nodes (this is annoying with 100G replicas).
> > Don't underestimate this point; free space can save your life.
> >  
> > --
> >  
> > /Yago Riveiro  
> >  
> > > On Jan 19 2016, at 11:26 pm, Shawn Heisey  
> <apa...@elyograg.org>  
> > wrote:  
> >  
> > >  
> >  
> > > On 1/19/2016 1:30 PM, Troy Edwards wrote:  
> > > We are currently "beta testing" a SolrCloud with 2
nodes and 2  
> shards  
> > with  
> > > 2 replicas each. The number of documents is about
125000.  
> > >  
> > > We now want to scale this to about 10 billion
documents.  
> > >  
> > > What are the steps to prototyping, hardware
estimation and  
> stress  
> > testing?  
> >  
> > >  
> >  
> > > There is no general information available for sizing,
because there  
> are  
> > too many factors that will affect the answers. Some of the
important  
> > information that you need will be impossible to predict until
you  
> > actually build it and subject it to a real query load.  
> >  
> > >  
> >  
> > > https://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
> >  
> > >  
> >  
> > > With an index of 10 billion documents, you may not be
able to  
> precisely  
> > predict performance and hardware requirements from a small-scale  
> > prototype. You'll likely need to build a full-scale system on a
small  
> > testbed, look for bottlenecks, ask for advice, and plan on a
larger  
> > system for production.  
> >  
> > >  
> >  
> > > The hard limit for documents on a single shard is
slightly less than  
> > Java's Integer.MAX_VALUE -- just over two billion. Because
deleted  
> > documents count against this max, about one billion documents
per shard  
> > is the absolute max that should be loaded in practice.  
> >  
> > >  
> >  
> > > BUT, if you actually try to put one billion documents
in a single  
> > server, performance will likely be awful. A more reasonable
limit per  
> > machine is 100 million ... but even this is quite large. You
might need  
> > smaller shards, or you might be able to get good performance
with larger  
> > shards. It all depends on things that you may not even know yet.  
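> >
> > As a rough back-of-the-envelope sketch using only the numbers above (not a
> > sizing recommendation):
> >
> >     # How many shards would 10 billion documents need?
> >     total_docs = 10_000_000_000
> >     hard_limit_per_shard = 2_147_483_647  # the Integer.MAX_VALUE ceiling
> >     practical_per_shard = 100_000_000     # the "more reasonable" figure above
> >     print(total_docs / hard_limit_per_shard)    # ~4.7 shards at the absolute ceiling
> >     print(total_docs // practical_per_shard)    # 100 shards at 100M docs each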
> >  
> > >  
> >  
> > > Memory is always a strong driver for Solr performance,
and I am  
> speaking  
> > specifically of OS disk cache -- memory that has not been
allocated by  
> > any program. With 10 billion documents, your total index size
will  
> > likely be hundreds of gigabytes, and might even reach terabyte
scale.  
> > Good performance with indexes this large will require a lot of
total  
> > memory, which probably means that you will need a lot of servers
and  
> > many shards. SSD storage is strongly recommended.  
> >  
> > >  
> >  
> > > For extreme scaling on Solr, especially if the query
rate will be  
> high,  
> > it is recommended to only have one shard replica per server.  
> >  
> > >  
> >  
> > > I have just added an "extreme scaling" section to the
following wiki  
> > page, but it's mostly a placeholder right now. I would like to
have a  
> > discussion with people who operate very large indexes so I can
put real  
> > usable information in this section. I'm on IRC quite frequently
in the  
> > #solr channel.  
> >  
> > >  
> >  
> > > https://wiki.apache.org/solr/SolrPerformanceProblems  
> >  
> > >  
> >  
> > > Thanks,  
> > Shawn  
> >  
> >  
>
