Cacti, Nagios, you name it, already in use :) Well, I'm the CTO, so I'm the one who is really, really interested in estimating perf.
The ids come from a db initially and are later used for retrieval from a distributed on-disk caching system which I have written. I'm in the process of moving from MySQL to HBase or Hypertable.

/M

On Tue, Jun 10, 2008 at 10:03 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:

> Marcus,
>
> It sounds like you may just want to use a good server monitoring package
> that collects server data and prints out pretty charts. Then you can show
> them to your IT/budget people when the charts start showing increased query
> latency times, very little available RAM, swapping, high CPU usage and such.
> Nagios, Ganglia, any of those things will do.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
> > From: Marcus Herou <[EMAIL PROTECTED]>
> > To: solr-user@lucene.apache.org
> > Sent: Tuesday, June 10, 2008 3:29:40 PM
> > Subject: Re: Num docs
> >
> > Well guys you are right... Still I want to have a clue about how much each
> > machine stores to predict when we need more machines (measure performance
> > degradation per new document). But it's harder to collect that kind of data.
> > It sure is doable no doubt and is a normal sharding "algo" for MySQL.
> >
> > The best approach I think is to have some bg threads run X number of queries
> > and collect the response times, throw away the n lowest/highest response
> > times and calc an avg time which is used for sharding and query lb'ing.
> >
> > Little off topic but interesting....
> > What would you guys say about a good correlation between the index size on
> > disk (no stored text content) and available RAM and having good response
> > times.
> >
> > How long is a rope would you perhaps say...but I think some rule of thumb
> > could be established...
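The background-thread measurement described in the quoted message above (run a batch of queries, drop the n lowest and highest response times, average the rest) amounts to a trimmed mean. A minimal Python sketch, assuming the response times have already been collected per shard; the names `trimmed_mean` and `fastest_shard` and the sample numbers are hypothetical and not from the thread:

```python
from statistics import mean


def trimmed_mean(response_times_ms, trim_count):
    """Average the samples after dropping the trim_count lowest and highest."""
    if trim_count < 0:
        raise ValueError("trim_count must be non-negative")
    if len(response_times_ms) <= 2 * trim_count:
        raise ValueError("not enough samples left after trimming")
    ordered = sorted(response_times_ms)
    kept = ordered[trim_count:len(ordered) - trim_count]
    return mean(kept)


def fastest_shard(samples_by_shard, trim_count):
    """Return the shard whose trimmed mean response time is lowest.

    samples_by_shard is a hypothetical mapping of shard name -> list of
    response times in milliseconds gathered by the background threads.
    """
    scores = {
        shard: trimmed_mean(times, trim_count)
        for shard, times in samples_by_shard.items()
    }
    return min(scores, key=scores.get)


if __name__ == "__main__":
    # Made-up sample data: shard-a has one slow outlier and one cached hit.
    samples = {
        "shard-a": [10, 12, 11, 500, 1],
        "shard-b": [20, 21, 22, 23, 24],
    }
    for name, times in samples.items():
        print(name, trimmed_mean(times, trim_count=1))
    print("route next query to:", fastest_shard(samples, trim_count=1))
```

Trimming keeps a single GC pause or a cached hit from dominating the average, which matters if the number feeds a load balancer. Tracking the same figure over time against the document count would give the "degradation per new document" trend asked about above.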
> >
> > One of the schemas of concern
> >
> > required="true" />
> > required="true" />
> > required="false" />
> > stored="false" required="true" />
> > required="true" />
> > required="true" />
> > required="false" />
> > required="true" />
> > required="true" />
> > required="false" />
> > required="false" multiValued="true"/>
> > required="false" />
> > required="false" />
> > required="false" />
> > required="false" />
> >
> > and a normal solr query (taken from the log):
> > /select start=0&q=(title:(apple)^4+OR+description:(apple))&version=2.2&rows=15&wt=xml&sort=publishDate+desc
> >
> > //Marcus
> >
> > On Tue, Jun 10, 2008 at 1:15 AM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> >
> > > Exactly. I think I mentioned this once before several months ago. One can
> > > take various hardware specs (# cores, CPU speed, FSB, RAM, etc.),
> > > performance numbers, etc. and come up with a number for each server's
> > > overall capacity.
> > >
> > > As a matter of fact, I think this would be useful to have right in Solr,
> > > primarily for use when allocating and sizing shards for Distributed Search.
> > > JIRA enhancement/feature issue?
> > >
> > > Otis
> > > --
> > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > >
> > > ----- Original Message ----
> > > > From: Alexander Ramos Jardim
> > > > To: solr-user@lucene.apache.org
> > > > Sent: Monday, June 9, 2008 6:42:17 PM
> > > > Subject: Re: Num docs
> > > >
> > > > I even think that such a decision should be based on the overall machine
> > > > performance at a given time, and not the index size. Unless you are talking
> > > > solely about HD space and not having any performance issues.
> > > >
> > > > 2008/6/7 Otis Gospodnetic:
> > > >
> > > > > Marcus,
> > > > >
> > > > > For that you can rely on du, vmstat, iostat, top and such, too.
> > > > > :)
> > > > >
> > > > > Otis
> > > > > --
> > > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > > >
> > > > > ----- Original Message ----
> > > > > > From: Marcus Herou
> > > > > > To: solr-user@lucene.apache.org
> > > > > > Sent: Saturday, June 7, 2008 12:33:10 PM
> > > > > > Subject: Re: Num docs
> > > > > >
> > > > > > Thanks, I wanna ask the indices how much more each shard can handle before
> > > > > > they're considered "full" and scream for a budget to get a new machine :)
> > > > > >
> > > > > > /M
> > > > > >
> > > > > > On Sat, Jun 7, 2008 at 3:07 PM, Otis Gospodnetic wrote:
> > > > > >
> > > > > > > Marcus, check out the Luke request handler. You can get it from its
> > > > > > > output. It may also be possible to get *just* that number, but I'm not
> > > > > > > looking at docs/code right now to know for sure.
> > > > > > >
> > > > > > > Otis
> > > > > > > --
> > > > > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > > > > >
> > > > > > > ----- Original Message ----
> > > > > > > > From: Marcus Herou
> > > > > > > > To: solr-user@lucene.apache.org
> > > > > > > > Sent: Saturday, June 7, 2008 5:09:20 AM
> > > > > > > > Subject: Num docs
> > > > > > > >
> > > > > > > > Hi.
> > > > > > > >
> > > > > > > > Is there a way to retrieve IndexWriter.numDocs() in SOLR?
> > > > > > > >
> > > > > > > > Kindly
> > > > > > > >
> > > > > > > > //Marcus
> > > > > > > >
> > > > > > > > --
> > > > > > > > Marcus Herou CTO and co-founder Tailsweep AB
> > > > > > > > +46702561312
> > > > > > > > [EMAIL PROTECTED]
> > > > > > > > http://www.tailsweep.com/
> > > > > > > > http://blogg.tailsweep.com/
> > > > > >
> > > > > > --
> > > > > > Marcus Herou CTO and co-founder Tailsweep AB
> > > > > > +46702561312
> > > > > > [EMAIL PROTECTED]
> > > > > > http://www.tailsweep.com/
> > > > > > http://blogg.tailsweep.com/
> > > >
> > > > --
> > > > Alexander Ramos Jardim
> >
> > --
> > Marcus Herou CTO and co-founder Tailsweep AB
> > +46702561312
> > [EMAIL PROTECTED]
> > http://www.tailsweep.com/
> > http://blogg.tailsweep.com/

--
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
[EMAIL PROTECTED]
http://www.tailsweep.com/
http://blogg.tailsweep.com/
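
For the original question at the bottom of the thread, the Luke request handler that Otis suggests reports the document count in the `index` section of its output. A rough Python sketch, assuming the handler is mapped at `/admin/luke` in solrconfig.xml and that JSON output (`wt=json`) is available on the Solr version in use; the base URL and the function names are hypothetical:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def num_docs_from_luke(luke_response):
    """Pull numDocs out of an already parsed Luke handler response."""
    return int(luke_response["index"]["numDocs"])


def fetch_num_docs(solr_base_url, timeout_s=5.0):
    """Ask one Solr instance for its document count via the Luke handler.

    Assumption: the handler lives at <base>/admin/luke. numTerms=0 skips
    the per-field top-terms work, since only the index summary is needed.
    """
    params = urlencode({"numTerms": 0, "wt": "json"})
    url = f"{solr_base_url.rstrip('/')}/admin/luke?{params}"
    with urlopen(url, timeout=timeout_s) as resp:
        return num_docs_from_luke(json.load(resp))


if __name__ == "__main__":
    # Hypothetical shard URL; replace with a real instance before running.
    print(fetch_num_docs("http://localhost:8983/solr"))
```

Polling this per shard alongside the response-time figures would give the two series needed to estimate how much more each shard can take.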