Thx!

-----Original Message-----
From: Michael Della Bitta [mailto:michael.della.bi...@appinions.com]
Sent: March 20, 2013 20:42
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

> 2. As far as I know the better SolrJ interface to index with SolrCloud
> is CloudSolrServer, not ConcurrentUpdateSolrServer. If you have many
> instances of CloudSolrServer and you correctly balance them with a
> Round Robin or something similar you'll get better performance in
> SolrCloud scenarios. At least that is what I've read in the
> documentation, and I also asked Mark Miller some months ago when I
> started dealing with Solr 4.0-BETA.

I was told otherwise during Solr Boot Camp.

Michael Della Bitta

------------------------------------------------
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game


On Wed, Mar 20, 2013 at 5:14 AM, Luis Cappa Banda <luisca...@gmail.com> wrote:
> Thank you for answering. Some notes:
>
> 1. The Java engine I've developed that wraps SolrJ 4.1 with some
> business logic only executes search queries, not index/update
> operations, so the problem is not related to concurrent updates or
> anything similar.
>
> 2. As far as I know the better SolrJ interface to index with SolrCloud
> is CloudSolrServer, not ConcurrentUpdateSolrServer. If you have many
> instances of CloudSolrServer and you correctly balance them with a
> Round Robin or something similar you'll get better performance in
> SolrCloud scenarios. At least that is what I've read in the
> documentation, and I also asked Mark Miller some months ago when I
> started dealing with Solr 4.0-BETA.
>
> 3. I'm almost convinced that the problem is related to one of:
>
> - Zookeeper ensemble configuration.
> - Zookeeper version (3.4.5) not being the one Solr 4.1 expects.
> - SolrJ Zookeeper driver.
>
> In short, all my architecture works perfectly with search operations.
> I've also got another NRT Indexer module that uses CloudSolrServer
> and works perfectly. But after two or three days, something happens
> with the Zookeeper - CloudSolrServer connection, and it tries to
> update the cluster status forever with no success. Only after
> restarting Zookeeper + the SolrCloud leader & replica shards is the
> problem solved.
>
>
> 2013/3/19 Michael Della Bitta <michael.della.bi...@appinions.com>
>
>> Don't use CloudSolrServer for writes. Instead, use
>> ConcurrentUpdateSolrServer, something like:
>>
>> SolrServer solrServer = new ConcurrentUpdateSolrServer(solrUrl, 100, 4);
>>
>> The 100 is the queue size, i.e. roughly how many docs are buffered
>> before they are sent. The higher this is, the better performance is
>> (to a point, don't set that to 50k or anything).
>>
>> The 4 corresponds to the number of threads that will be sending batches.
>>
>> Note that this class doesn't report errors, so if you want to see
>> exceptions when bad things happen, you'll have to override the
>> handleError(Throwable ex) method.
>>
>> Here's the javadoc for the class:
>>
>> http://lucene.apache.org/solr/4_2_0/solr-solrj/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrServer.html
>>
>> It'd be best if you can use a load balancer in front of your Solr
>> Cloud and use that as the solrUrl parameter.
>>
>> ***Either way, though, Mark is right in that you need to diagnose why
>> you're only able to do a few documents per second first.*** Adding
>> more threads at this point is probably not going to help.
>>
>> Michael Della Bitta
>>
>> ------------------------------------------------
>> Appinions
>> 18 East 41st Street, 2nd Floor
>> New York, NY 10017-6271
>>
>> www.appinions.com
>>
>> Where Influence Isn’t a Game
>>
>>
>> On Tue, Mar 19, 2013 at 3:57 PM, Luis Cappa Banda <luisca...@gmail.com>
>> wrote:
>> > Can anyone help me? Each response may save a little kitten from a
>> > horrible and dramatic death somewhere in the world :-P
>> > On 15/03/2013 21:06, "Jack Park" <jackp...@topicquests.org> wrote:
>> >
>> >> Is there a document that tells how to create multiple threads?
>> >> Search returns many hits which orbit this idea, but I haven't
>> >> spotted one which tells how.
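
On Jack's question about how to create multiple threads: what follows is a minimal sketch using only the JDK's ExecutorService, not code from anyone in this thread. BatchIndexer, indexInParallel and the sender callback are hypothetical names. In a real client the callback would wrap a SolrJ call such as solrServer.add(batch); that call is left out on purpose so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Sketch only: BatchIndexer and the sender callback are hypothetical.
// A real sender would call something like solrServer.add(batch).
class BatchIndexer {

    /** Splits docs into batches and hands each batch to a fixed pool of threads. */
    static <T> int indexInParallel(List<T> docs, int batchSize, int threads,
                                   Consumer<List<T>> sender) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger sent = new AtomicInteger();
        for (int start = 0; start < docs.size(); start += batchSize) {
            // Copy the slice so each task owns its own batch.
            List<T> batch = new ArrayList<>(
                    docs.subList(start, Math.min(start + batchSize, docs.size())));
            pool.submit(() -> {
                sender.accept(batch);          // if this throws, the batch is not counted
                sent.addAndGet(batch.size());
            });
        }
        pool.shutdown();                       // no new tasks; queued ones still run
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return sent.get();                     // number of docs handed over successfully
    }
}
```

A call such as BatchIndexer.indexInParallel(docs, 100, 4, batch -> send(batch)), with send being whatever the application uses to post a batch, would mirror the 100 and 4 from the ConcurrentUpdateSolrServer example. ConcurrentUpdateSolrServer already runs its own sender threads, so a pool like this mostly matters for clients that do not.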
>> >>
>> >> Thanks
>> >> Jack
>> >>
>> >> On Fri, Mar 15, 2013 at 1:01 PM, Mark Miller <markrmil...@gmail.com>
>> >> wrote:
>> >> > You def have to use multiple threads with it for it to be fast,
>> >> > but 3 or 4 docs a second still sounds absurdly slow.
>> >> >
>> >> > - Mark
>> >> >
>> >> > On Mar 15, 2013, at 2:58 PM, Luis Cappa Banda <luisca...@gmail.com>
>> >> > wrote:
>> >> >
>> >> >> And up! :-)
>> >> >>
>> >> >> I've been wondering if using CloudSolrServer has something to do
>> >> >> with this. Does it perform badly when a CloudSolrServer singleton
>> >> >> receives multiple queries? Is it recommended to have a list of
>> >> >> CloudSolrServer instances and select one of them with a Round
>> >> >> Robin criterion?
>> >> >>
>> >> >>
>> >> >> 2013/3/14 Luis Cappa Banda <luisca...@gmail.com>
>> >> >>
>> >> >>> Hello!
>> >> >>>
>> >> >>> Thanks a lot, Erick! I've attached some stack traces taken while
>> >> >>> the 'engine' was running normally.
>> >> >>>
>> >> >>> Cheers,
>> >> >>>
>> >> >>> - Luis Cappa
>> >> >>>
>> >> >>>
>> >> >>> 2013/3/13 Erick Erickson <erickerick...@gmail.com>
>> >> >>>
>> >> >>>> Stack traces..
>> >> >>>>
>> >> >>>> First,
>> >> >>>> jps -l
>> >> >>>>
>> >> >>>> that will give you the process IDs of your running Java
>> >> >>>> processes. Then:
>> >> >>>>
>> >> >>>> jstack <pid from above>
>> >> >>>>
>> >> >>>> Usually I pipe the output from jstack into a text file...
>> >> >>>>
>> >> >>>> Best
>> >> >>>> Erick
>> >> >>>>
>> >> >>>>
>> >> >>>> On Wed, Mar 13, 2013 at 1:48 PM, Luis Cappa Banda
>> >> >>>> <luisca...@gmail.com> wrote:
>> >> >>>>
>> >> >>>>> Uhm, how can I do that... 'cleanly'? I know that with JConsole
>> >> >>>>> it's possible to output these traces, but with a .war
>> >> >>>>> application built on top of Spring I don't know how to do that.
>> >> >>>>> In any case, here is my CloudSolrServer wrapper that is used by
>> >> >>>>> other classes. There is no synchronized method or block in it:
>> >> >>>>>
>> >> >>>>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> >> >>>>>
>> >> >>>>> public class BinaryLBHttpSolrServer extends LBHttpSolrServer {
>> >> >>>>>
>> >> >>>>>     private static final long serialVersionUID = 3905956120804659445L;
>> >> >>>>>
>> >> >>>>>     public BinaryLBHttpSolrServer(String[] endpoints) throws MalformedURLException {
>> >> >>>>>         super(endpoints);
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     // Every per-node server uses the binary request format.
>> >> >>>>>     @Override
>> >> >>>>>     protected HttpSolrServer makeServer(String server) throws MalformedURLException {
>> >> >>>>>         HttpSolrServer solrServer = super.makeServer(server);
>> >> >>>>>         solrServer.setRequestWriter(new BinaryRequestWriter());
>> >> >>>>>         return solrServer;
>> >> >>>>>     }
>> >> >>>>> }
>> >> >>>>>
>> >> >>>>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> >> >>>>>
>> >> >>>>> public class CloudSolrHttpServerImpl implements CloudSolrHttpServer {
>> >> >>>>>
>> >> >>>>>     private CloudSolrServer cloudSolrServer;
>> >> >>>>>
>> >> >>>>>     private Logger log = Logger.getLogger(CloudSolrHttpServerImpl.class);
>> >> >>>>>
>> >> >>>>>     public CloudSolrHttpServerImpl(String zookeeperEndpoints, String[] endpoints,
>> >> >>>>>             int clientTimeout, int connectTimeout, String cloudCollection) {
>> >> >>>>>         try {
>> >> >>>>>             BinaryLBHttpSolrServer lbSolrServer = new BinaryLBHttpSolrServer(endpoints);
>> >> >>>>>             this.cloudSolrServer = new CloudSolrServer(zookeeperEndpoints, lbSolrServer);
>> >> >>>>>             this.cloudSolrServer.setZkConnectTimeout(connectTimeout);
>> >> >>>>>             this.cloudSolrServer.setZkClientTimeout(clientTimeout);
>> >> >>>>>             this.cloudSolrServer.setDefaultCollection(cloudCollection);
>> >> >>>>>         } catch (MalformedURLException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         }
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     @Override
>> >> >>>>>     public QueryResponse search(SolrQuery query) throws SolrServerException {
>> >> >>>>>         return cloudSolrServer.query(query, METHOD.POST);
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     // Tries to add the bean up to 4 times.
>> >> >>>>>     @Override
>> >> >>>>>     public boolean index(DocumentBean user) {
>> >> >>>>>         boolean indexed = false;
>> >> >>>>>         int retries = 0;
>> >> >>>>>         do {
>> >> >>>>>             indexed = addBean(user);
>> >> >>>>>             retries++;
>> >> >>>>>         } while (!indexed && retries < 4);
>> >> >>>>>         return indexed;
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     // Tries to add the document up to 4 times.
>> >> >>>>>     @Override
>> >> >>>>>     public boolean update(SolrInputDocument updateDoc) {
>> >> >>>>>         boolean update = false;
>> >> >>>>>         int retries = 0;
>> >> >>>>>         do {
>> >> >>>>>             update = addSolrInputDocument(updateDoc);
>> >> >>>>>             retries++;
>> >> >>>>>         } while (!update && retries < 4);
>> >> >>>>>         return update;
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     @Override
>> >> >>>>>     public void commit() {
>> >> >>>>>         try {
>> >> >>>>>             cloudSolrServer.commit();
>> >> >>>>>         } catch (SolrServerException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (IOException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         }
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     @Override
>> >> >>>>>     public boolean delete(String... ids) {
>> >> >>>>>         boolean deleted = false;
>> >> >>>>>         List<String> idList = Arrays.asList(ids);
>> >> >>>>>         try {
>> >> >>>>>             this.cloudSolrServer.deleteById(idList);
>> >> >>>>>             this.cloudSolrServer.commit(true, true);
>> >> >>>>>             deleted = true;
>> >> >>>>>         } catch (SolrServerException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (IOException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         }
>> >> >>>>>         return deleted;
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     @Override
>> >> >>>>>     public void optimize() {
>> >> >>>>>         try {
>> >> >>>>>             this.cloudSolrServer.optimize();
>> >> >>>>>         } catch (SolrServerException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (IOException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         }
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     /* Getters & setters */
>> >> >>>>>
>> >> >>>>>     public CloudSolrServer getSolrServer() {
>> >> >>>>>         return cloudSolrServer;
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     public void setSolrServer(CloudSolrServer solrServer) {
>> >> >>>>>         this.cloudSolrServer = solrServer;
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     private boolean addBean(DocumentBean user) {
>> >> >>>>>         boolean added = false;
>> >> >>>>>         try {
>> >> >>>>>             this.cloudSolrServer.addBean(user, 100);
>> >> >>>>>             this.commit();
>> >> >>>>>             // Was missing: without it index() always retried 4 times
>> >> >>>>>             // and returned false even when the add succeeded.
>> >> >>>>>             added = true;
>> >> >>>>>         } catch (IOException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (SolrServerException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (SolrException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         }
>> >> >>>>>         return added;
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     private boolean addSolrInputDocument(SolrInputDocument updateDoc) {
>> >> >>>>>         boolean added = false;
>> >> >>>>>         try {
>> >> >>>>>             this.cloudSolrServer.add(updateDoc, 100);
>> >> >>>>>             this.commit();
>> >> >>>>>             added = true;
>> >> >>>>>         } catch (IOException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (SolrServerException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (SolrException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         }
>> >> >>>>>         return added;
>> >> >>>>>     }
>> >> >>>>> }
>> >> >>>>>
>> >> >>>>> Thank you very much, Mark.
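
On capturing stack traces from inside a deployed web application, as asked in this message: one option that needs neither jstack nor JConsole is the JDK's own ThreadMXBean. The following is a minimal sketch under that assumption. ThreadDumper is a hypothetical helper name, not part of Solr, SolrJ or Spring.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Sketch only: ThreadDumper is a hypothetical helper, not a Solr or Spring class.
class ThreadDumper {

    /** Returns a jstack-like text dump of every live thread in this JVM. */
    static String dumpAllThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip locked monitor / synchronizer details.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }
}
```

The returned string could be written to the application log when the engine stops answering. Threads that stay in BLOCKED or WAITING across several dumps are the ones worth looking at.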
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> - Luis Cappa
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> 2013/3/13 Mark Miller <markrmil...@gmail.com>
>> >> >>>>>
>> >> >>>>>> Could you capture some thread stack traces in the 'engine' and
>> >> >>>>>> see if there are any blocking methods?
>> >> >>>>>>
>> >> >>>>>> - Mark
>> >> >>>>>>
>> >> >>>>>> On Mar 13, 2013, at 1:34 PM, Luis Cappa Banda
>> >> >>>>>> <luisca...@gmail.com> wrote:
>> >> >>>>>>
>> >> >>>>>>> Just one correction:
>> >> >>>>>>>
>> >> >>>>>>> When I said:
>> >> >>>>>>>
>> >> >>>>>>> - I've checked SolrCloud via Solr Admin interface and it's OK:
>> >> >>>>>>> everything is green, and I cant execute queries directly into
>> >> >>>>>>> Solr.
>> >> >>>>>>>
>> >> >>>>>>> I mean:
>> >> >>>>>>>
>> >> >>>>>>> - I've checked SolrCloud via Solr Admin interface and it's OK:
>> >> >>>>>>> everything is green, and *I can* execute queries directly into
>> >> >>>>>>> Solr.
>> >> >>>>>>>
>> >> >>>>>>> Thanks!
>> >> >>>>>>>
>> >> >>>>>>> - Luis Cappa
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>> 2013/3/13 Luis Cappa Banda <luisca...@gmail.com>
>> >> >>>>>>>
>> >> >>>>>>>> Hello, guys!
>> >> >>>>>>>>
>> >> >>>>>>>> I've been experiencing some annoying behavior with my current
>> >> >>>>>>>> production scenario. Here is the snapshot:
>> >> >>>>>>>>
>> >> >>>>>>>> - SolrCloud: 2 shards
>> >> >>>>>>>> - Zookeeper ensemble: 3 nodes on *different machines* (most of
>> >> >>>>>>>> the tutorials install 3 Zookeeper nodes on the same machine).
>> >> >>>>>>>> - This is the zoo.cfg from every node:
>> >> >>>>>>>>
>> >> >>>>>>>> tickTime=2000 // I've also tried with 60000
>> >> >>>>>>>> initLimit=10
>> >> >>>>>>>> syncLimit=5
>> >> >>>>>>>> dataDir=/var/lib/zookeeper
>> >> >>>>>>>> clientPort=9000
>> >> >>>>>>>> server.1=zoohost1:2888:3888
>> >> >>>>>>>> server.2=zoohost1:2888:3888
>> >> >>>>>>>> server.3=zoohost1:2888:3888
>> >> >>>>>>>>
>> >> >>>>>>>> - I've developed a Java application with a REST API (let's call
>> >> >>>>>>>> it *engine*) that dispatches queries into SolrCloud. It's a
>> >> >>>>>>>> wrapper around CloudSolrServer, so it's mandatory to specify
>> >> >>>>>>>> some Zookeeper configuration params too. They are loaded
>> >> >>>>>>>> dynamically when the application is deployed in a Tomcat
>> >> >>>>>>>> server, but the current values that I'm using are as follows:
>> >> >>>>>>>>
>> >> >>>>>>>> cloudSolrServer.setZkConnectTimeout(60000)
>> >> >>>>>>>> cloudSolrServer.setZkClientTimeout(60000)
>> >> >>>>>>>>
>> >> >>>>>>>> *THE PROBLEM*
>> >> >>>>>>>>
>> >> >>>>>>>> Everything goes OK, but after two days more or less (yes, I've
>> >> >>>>>>>> checked that this behavior occurs periodically, more or less)
>> >> >>>>>>>> the *engine blocks* and cannot dispatch any query to SolrCloud.
>> >> >>>>>>>>
>> >> >>>>>>>> - The *engine* log only outputs "updating Zookeeper..." one last
>> >> >>>>>>>> time, but never updates.
>> >> >>>>>>>> - I've checked SolrCloud via Solr Admin interface and it's OK:
>> >> >>>>>>>> everything is green, and I cant execute queries directly into
>> >> >>>>>>>> Solr.
>> >> >>>>>>>> - So Solr appears to be OK, and the next step is to restart the
>> >> >>>>>>>> *engine*, but "updating Zookeeper..." appears again.
>> >> >>>>>>>> Unfortunately switch off + switch on doesn't work here :-(
>> >> >>>>>>>> - I've also checked the Zookeeper logs and some connection
>> >> >>>>>>>> logouts appear, but the ensemble appears to be OK too.
>> >> >>>>>>>> - *The end:* If I restart the Zookeeper nodes one by one, restart
>> >> >>>>>>>> SolrCloud, and restart the engine, the problem is solved. I'm
>> >> >>>>>>>> using Amazon AWS for hosting, so I rule out connection problems
>> >> >>>>>>>> between instances.
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> Has anyone experienced something similar? Can anybody shed some
>> >> >>>>>>>> light on this problem?
>> >> >>>>>>>>
>> >> >>>>>>>> Thank you very much.
>> >> >>>>>>>>
>> >> >>>>>>>> Regards,
>> >> >>>>>>>>
>> >> >>>>>>>> - Luis Cappa
>
> --
> Luis Cappa Banda
>
> *Phone*: (0034) 686 200 375
> *Skype*: luiscappabanda
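
One detail in the zoo.cfg quoted in the original report may be worth double-checking: all three server.N entries point at zoohost1, although the ensemble is described as running on three different machines. That may just be a simplification made for the email. If the real files look the same, an ensemble on three machines would normally list three distinct hosts, roughly as in the fragment below. zoohost2 and zoohost3 are hypothetical hostnames, and every other value is copied from the original message.

```
# Sketch only: zoohost2 and zoohost3 are hypothetical hostnames.
# Comments go on their own line starting with '#'; a trailing "// ..."
# after a value would be read as part of that value.
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=9000
server.1=zoohost1:2888:3888
server.2=zoohost2:2888:3888
server.3=zoohost3:2888:3888
```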