date:20140322

Re: Best approach to handle large volume of documents with constantly high incoming rate?

2014-03-22 Thread Jack Krupansky

I defer to Erick on this level of detail and experience. Let's continue the discussion - some of it will be a matter of how to configure and tune Solr, how to select, configure, and tune hardware, the need for further Lucene/Solr improvements, and how much further we have to go to get to th

Re: Best approach to handle large volume of documents with constantly high incoming rate?

2014-03-22 Thread shushuai zhu

Erick, thanks a lot for the detailed answers. They are very helpful and I did get some ideas from them. As for our searches, we will mainly do term and field (AND/OR) searches, histograms, and faceting. Generally the queries are bound by time (e.g., last hour, last day, last week, or even last mon

Re: Best approach to handle large volume of documents with constantly high incoming rate?

2014-03-22 Thread Erick Erickson

Well, the "commonsense limits" Jack is referring to in that post are more (IMO) scales at which you should count on having to do some _serious_ prototyping/configuring/etc. As you scale out, you'll run into edge cases that aren't the common variety, aren't reliably tested every night, etc. I mean how would

Re: Best approach to handle large volume of documents with constantly high incoming rate?

2014-03-22 Thread shushuai zhu

Jack, thanks for your reply. Sorry for the confusion about 4 nodes. What I meant was to use 4 nodes to do some POC, mainly focusing on handling the high incoming rate in a few days instead of storing data over one year. You estimated the required nodes (6,308) and storage (322TB) based on the

Re: Rounding errors with SOLR score

2014-03-22 Thread William Bell

I will send the debugQuery. They are exactly the same. On Fri, Mar 21, 2014 at 2:59 AM, Raymond Wiker wrote: > Are you sure that SOLR is rounding incorrectly, and not simply differently > from what you expect? I was surprised myself at some of the rounding > behaviour I saw with SOLR, but acco

Re: Solr4.7 No live SolrServers available to handle this request

2014-03-22 Thread Michael Sokolov

Excellent, thanks Shalin! On 3/22/2014 3:32 PM, Shalin Shekhar Mangar wrote: Thanks Michael! I just committed your fix. It will be released with 4.7.1 On Fri, Mar 21, 2014 at 8:30 PM, Michael Sokolov wrote: I just managed to track this down -- as you said the disconnect was a red herring. Ul

Re: Solr Cloud collection keep going down?

2014-03-22 Thread Shawn Heisey

On 3/22/2014 1:23 PM, Software Dev wrote: > We have 2 collections with 1 shard each replicated over 5 servers in the > cluster. We see a lot of flapping (down or recovering) on one of the > collections. When this happens the other collection hosted on the same > machine is still marked as active. W

Re: using SolrJ with SolrCloud, searching multiple indexes.

2014-03-22 Thread Shawn Heisey

On 3/22/2014 7:34 AM, Russell Taylor wrote: > Yeah sorry didn't explain myself there, one of the three zookeepers will > return me one of the solrcloud machines for me to access the index. I either > need to know which machine it returned(is this feasible I can't seem to find > a way to access i

Re: Limit on # of collections -SolrCloud

2014-03-22 Thread Chris W

I figured out that most of the startup time seems to be spent waiting for replicas to recover. It waits from 6 seconds all the way up to 600 seconds for replicas to recover before trying again; sometimes it succeeds, and otherwise it marks the core as down. Is there a way to reduce the timeout whi

Re: Solr4.7 No live SolrServers available to handle this request

2014-03-22 Thread Shalin Shekhar Mangar

Thanks Michael! I just committed your fix. It will be released with 4.7.1 On Fri, Mar 21, 2014 at 8:30 PM, Michael Sokolov wrote: > I just managed to track this down -- as you said the disconnect was a red > herring. > > Ultimately the problem was caused by a custom analysis component we wrote >

Re: Solr Cloud collection keep going down?

2014-03-22 Thread Software Dev

Some logs; the core in question is "items". - WARN - 2014-03-22 02:37:13.344; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for zkNodeName=10.0.14.101:8983_solr_itemscore=items WARN

Solr Cloud collection keep going down?

2014-03-22 Thread Software Dev

We have 2 collections with 1 shard each replicated over 5 servers in the cluster. We see a lot of flapping (down or recovering) on one of the collections. When this happens the other collection hosted on the same machine is still marked as active. When this happens it takes a fairly long time (~30

Re: Best approach to handle large volume of documents with constantly high incoming rate?

2014-03-22 Thread Jack Krupansky

20K docs/sec = 20,000 * 60 * 60 * 24 = 1,728,000,000 = 1.7 billion docs/day. 1,728,000,000 * 365 = 630,720,000,000 = 631 billion docs/yr. At 100 million docs/node, that is 6,308 nodes! And you think you can do it with 4 nodes? Oh, and that's before replication! 0.5K/doc * 631 billion docs = 322 TB. -- Jack Krupans
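
The estimate above can be reproduced with a short script. This is a rough sketch, not part of the original thread: the variable names are illustrative, and it assumes that "0.5K" means 512 bytes and that 1 TB means 10^12 bytes, which is the reading that gives roughly 322 TB. Replication is not included, as in the message.

```python
# Back-of-the-envelope sizing for a constant ingest rate.
# Assumptions (not stated in the thread): 0.5K = 512 bytes, 1 TB = 10**12 bytes.

DOCS_PER_SEC = 20_000            # incoming rate quoted in the thread
SECONDS_PER_DAY = 60 * 60 * 24
DAYS_PER_YEAR = 365
DOCS_PER_NODE = 100_000_000      # rule-of-thumb capacity used in the estimate
BYTES_PER_DOC = 512              # "0.5K" read as half a kibibyte

docs_per_day = DOCS_PER_SEC * SECONDS_PER_DAY
docs_per_year = docs_per_day * DAYS_PER_YEAR

# Integer ceiling division: a partially filled node still counts as a node.
nodes_needed = -(-docs_per_year // DOCS_PER_NODE)

storage_tb = docs_per_year * BYTES_PER_DOC / 10**12

print(f"docs/day:   {docs_per_day:,}")
print(f"docs/year:  {docs_per_year:,}")
print(f"nodes:      {nodes_needed:,}")
print(f"storage TB: {storage_tb:.1f}")
```

Changing DOCS_PER_NODE or the retention period (DAYS_PER_YEAR) shows how sensitive the node count is to those two inputs, which is the point of prototyping before committing to a cluster size.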

Re: Best approach to handle large volume of documents with constantly high incoming rate?

2014-03-22 Thread shushuai zhu

Any thoughts? Can Solr Cloud support such a use case with acceptable performance? On Thursday, March 20, 2014 7:51 PM, shushuai zhu wrote: Hi, I am looking for some advice on handling a large volume of documents with a very high incoming rate. The size of each document is about 0.5 KB and the i

RE: using SolrJ with SolrCloud, searching multiple indexes.

2014-03-22 Thread Russell Taylor

Yeah, sorry, I didn't explain myself there: one of the three zookeepers will return one of the solrcloud machines for me to access the index. I either need to know which machine it returned (is this feasible? I can't seem to find a way to access information in SolrCloudServer) and then add the ext

setting up solr on tomcat

2014-03-22 Thread anupamk

Hi, is the SolrTomcat wiki article valid for solr-4.7.0? http://wiki.apache.org/solr/SolrTomcat I am not able to deploy solr after following the instructions there. When I try to access the solr admin page I get a 404. I followed every step exactly as mentioned in the wiki, still no dice. A

16 matches
