I defer to Erick on this level of detail and experience.
Let's continue the discussion - some of it will be a matter of how to
configure and tune Solr, how to select, configure, and tune hardware, the
need for further Lucene/Solr improvements, and how much further we have to
go to get to th
Erick,
Thanks a lot for the detailed answers. They are very helpful and I do get some
idea from them.
As per our searches, we will mainly do term and field (AND/OR) searches,
histogram, and faceting. Generally the queries are bound by time (e.g., last
hour, last day, last week, or even last mon
Well, the "commonsense limits" Jack is referring to in that post are
more (IMO) scales at which you should count on having to do some _serious_
prototyping/configuring/etc. As you scale out, you'll run into edge
cases that aren't the common variety, aren't reliably tested every
night, etc. I mean how would
Jack, thanks for your reply.
Sorry for the confusion about 4 nodes. What I meant was to use 4 nodes to do
some POC, mainly focusing on handling the high incoming rate in a few days
instead of storing data over one year.
You estimated the required nodes (6,308) and storage (322TB) based on the
I will send the debugQuery. They are exactly the same.
On Fri, Mar 21, 2014 at 2:59 AM, Raymond Wiker wrote:
> Are you sure that SOLR is rounding incorrectly, and not simply differently
> from what you expect? I was surprised myself at some of the rounding
> behaviour I saw with SOLR, but acco
Excellent, thanks Shalin!
On 3/22/2014 3:32 PM, Shalin Shekhar Mangar wrote:
Thanks Michael! I just committed your fix. It will be released with 4.7.1.
On Fri, Mar 21, 2014 at 8:30 PM, Michael Sokolov
wrote:
I just managed to track this down -- as you said the disconnect was a red
herring.
Ul
On 3/22/2014 1:23 PM, Software Dev wrote:
> We have 2 collections with 1 shard each replicated over 5 servers in the
> cluster. We see a lot of flapping (down or recovering) on one of the
> collections. When this happens the other collection hosted on the same
> machine is still marked as active. W
On 3/22/2014 7:34 AM, Russell Taylor wrote:
> Yeah sorry didn't explain myself there, one of the three zookeepers will
> return me one of the solrcloud machines for me to access the index. I either
> need to know which machine it returned (is this feasible? I can't seem to find
> a way to access i
I figured out that most of the startup time seems to be spent waiting for
replicas to recover. It waits from 6 seconds all the way up to 600 seconds
for replicas to recover before trying again; sometimes it succeeds, and
otherwise it marks the core as down. Is there a way to reduce the timeout
whi
Thanks Michael! I just committed your fix. It will be released with 4.7.1.
On Fri, Mar 21, 2014 at 8:30 PM, Michael Sokolov
wrote:
> I just managed to track this down -- as you said the disconnect was a red
> herring.
>
> Ultimately the problem was caused by a custom analysis component we wrote
>
Some logs; the core in question is "items".
-
WARN - 2014-03-22 02:37:13.344;
org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for
zkNodeName=10.0.14.101:8983_solr_itemscore=items
WARN
We have 2 collections with 1 shard each replicated over 5 servers in the
cluster. We see a lot of flapping (down or recovering) on one of the
collections. When this happens the other collection hosted on the same
machine is still marked as active. When this happens it takes a fairly long
time (~30
20K docs/sec = 20,000 * 60 * 60 * 24 = 1,728,000,000 = 1.7 billion docs/day
* 365 = 630,720,000,000 = 631 billion docs/yr
At 100 million docs/node = 6,308 nodes!
And you think you can do it with 4 nodes?
Oh, and that's before replication!
0.5K/doc * 631 billion docs = 322 TB.
-- Jack Krupans
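
The sizing arithmetic above can be re-checked with a short sketch. It assumes "0.5K" means 512 bytes and "TB" means 10^12 bytes; neither is stated in the post, and the variable names are illustrative. Under those assumptions the storage figure comes out near 322.9 TB, which is consistent with the 322 TB quoted.

```python
# Re-check of the capacity estimate quoted in the post above.
# Assumptions (not stated in the post): 0.5K = 512 bytes, 1 TB = 10**12 bytes.
DOCS_PER_SEC = 20_000
DOCS_PER_NODE = 100_000_000
DOC_SIZE_BYTES = 512

docs_per_day = DOCS_PER_SEC * 60 * 60 * 24
docs_per_year = docs_per_day * 365

# Integer ceiling division: a partially filled node still counts as a node.
nodes = -(-docs_per_year // DOCS_PER_NODE)

# Raw document storage only, before replication and index overhead.
storage_tb = docs_per_year * DOC_SIZE_BYTES / 10**12

print(f"docs/day : {docs_per_day:,}")
print(f"docs/year: {docs_per_year:,}")
print(f"nodes    : {nodes:,}")
print(f"storage  : {storage_tb:.1f} TB")
```

Replication multiplies both the node count and the storage figure by the number of copies kept, which is the point of the "before replication" remark.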
Any thoughts? Can Solr Cloud support such a use case with acceptable performance?
On Thursday, March 20, 2014 7:51 PM, shushuai zhu wrote:
Hi,
I am looking for some advice on handling a large volume of documents with a very
high incoming rate. The size of each document is about 0.5 KB and the i
Yeah sorry didn't explain myself there, one of the three zookeepers will return
me one of the solrcloud machines for me to access the index. I either need to
know which machine it returned (is this feasible? I can't seem to find a way to
access information in SolrCloudServer) and then add the ext
Hi,
Is the SolrTomcat wiki article valid for solr-4.7.0?
http://wiki.apache.org/solr/SolrTomcat
I am not able to deploy solr after following the instructions there.
When I try to access the solr admin page I get a 404.
I followed every step exactly as mentioned in the wiki, still no dice.
A
16 matches