Marcus,

It sounds like you may just want to use a good server monitoring package that collects server data and prints out pretty charts. Then you can show them to your IT/budget people when the charts start showing increased query latency, very little available RAM, swapping, high CPU usage and such. Nagios, Ganglia, any of those things will do.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Marcus Herou <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 10, 2008 3:29:40 PM
> Subject: Re: Num docs
>
> Well guys you are right... Still I want to have a clue about how much each
> machine stores to predict when we need more machines (measure performance
> degradation per new document). But it's harder to collect that kind of data.
> It sure is doable no doubt and is a normal sharding "algo" for MySQL.
>
> The best approach I think is to have some bg threads run X number of queries
> and collect the response times, throw away the n lowest/highest response
> times and calc an avg time which is used for sharding and query lb'ing.
>
> Little off topic but interesting....
> What would you guys say about a good correlation between the index size on
> disk (no stored text content) and available RAM and having good response
> times.
>
> How long is a rope would you perhaps say...but I think some rule of thumb
> could be established...
>
> One of the schemas of concern
>
> required="true" />
> required="true" />
> required="false" />
> stored="false" required="true" />
> required="true" />
> required="true" />
> required="false" />
> required="true" />
> required="true" />
> required="false" />
> required="false" multiValued="true"/>
> required="false" />
> required="false" />
> required="false" />
> required="false" />
>
> and a normal solr query (taken from the log):
> /select
> start=0&q=(title:(apple)^4+OR+description:(apple))&version=2.2&rows=15&wt=xml&sort=publishDate+desc
>
> //Marcus
>
> On Tue, Jun 10, 2008 at 1:15 AM, Otis Gospodnetic <
> [EMAIL PROTECTED]> wrote:
>
> > Exactly. I think I mentioned this once before several months ago. One can
> > take various hardware specs (# cores, CPU speed, FSB, RAM, etc.),
> > performance numbers, etc. and come up with a number for each server's
> > overall capacity.
> >
> > As a matter of fact, I think this would be useful to have right in Solr,
> > primarily for use when allocating and sizing shards for Distributed Search.
> > JIRA enhancement/feature issue?
> >
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >
> > ----- Original Message ----
> > > From: Alexander Ramos Jardim
> > > To: solr-user@lucene.apache.org
> > > Sent: Monday, June 9, 2008 6:42:17 PM
> > > Subject: Re: Num docs
> > >
> > > I even think that such a decision should be based on the overall machine
> > > performance at a given time, and not the index size. Unless you are talking
> > > solely about HD space and not having any performance issues.
> > >
> > > 2008/6/7 Otis Gospodnetic :
> > >
> > > > Marcus,
> > > >
> > > > For that you can rely on du, vmstat, iostat, top and such, too. :)
> > > >
> > > > Otis
> > > > --
> > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > >
> > > > ----- Original Message ----
> > > > > From: Marcus Herou
> > > > > To: solr-user@lucene.apache.org
> > > > > Sent: Saturday, June 7, 2008 12:33:10 PM
> > > > > Subject: Re: Num docs
> > > > >
> > > > > Thanks, I wanna ask the indices how much more each shard can handle before
> > > > > they're considered "full" and scream for a budget to get a new machine :)
> > > > >
> > > > > /M
> > > > >
> > > > > On Sat, Jun 7, 2008 at 3:07 PM, Otis Gospodnetic
> > > > > wrote:
> > > > >
> > > > > > Marcus, check out the Luke request handler. You can get it from its
> > > > > > output. It may also be possible to get *just* that number, but I'm not
> > > > > > looking at docs/code right now to know for sure.
> > > > > >
> > > > > > Otis
> > > > > > --
> > > > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > > > >
> > > > > > ----- Original Message ----
> > > > > > > From: Marcus Herou
> > > > > > > To: solr-user@lucene.apache.org
> > > > > > > Sent: Saturday, June 7, 2008 5:09:20 AM
> > > > > > > Subject: Num docs
> > > > > > >
> > > > > > > Hi.
> > > > > > >
> > > > > > > Is there a way to retrieve IndexWriter.numDocs() in SOLR ?
> > > > > > >
> > > > > > > Kindly
> > > > > > >
> > > > > > > //Marcus
> > > > > > >
> > > > > > > --
> > > > > > > Marcus Herou CTO and co-founder Tailsweep AB
> > > > > > > +46702561312
> > > > > > > [EMAIL PROTECTED]
> > > > > > > http://www.tailsweep.com/
> > > > > > > http://blogg.tailsweep.com/
> > > > >
> > > > > --
> > > > > Marcus Herou CTO and co-founder Tailsweep AB
> > > > > +46702561312
> > > > > [EMAIL PROTECTED]
> > > > > http://www.tailsweep.com/
> > > > > http://blogg.tailsweep.com/
> > >
> > > --
> > > Alexander Ramos Jardim
>
> --
> Marcus Herou CTO and co-founder Tailsweep AB
> +46702561312
> [EMAIL PROTECTED]
> http://www.tailsweep.com/
> http://blogg.tailsweep.com/
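
For reference, the averaging idea from Marcus's message in the thread (collect response times, throw away the n lowest and n highest, average what is left) could look roughly like the sketch below. This is only an illustration under stated assumptions: the function name `trimmed_mean` and the sample timings are made up, nothing here is part of Solr, and actually running the queries in background threads is left out.

```python
from statistics import mean


def trimmed_mean(response_times_ms, n):
    """Average the response times after dropping the n lowest and n highest.

    Hypothetical helper for illustration only; not a Solr API.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    if len(response_times_ms) <= 2 * n:
        raise ValueError("need more than 2*n samples to trim n from each end")
    ordered = sorted(response_times_ms)
    # Keep everything except the n smallest and the n largest samples.
    kept = ordered[n:len(ordered) - n]
    return mean(kept)


if __name__ == "__main__":
    # Made-up timings in milliseconds, including one slow and one fast outlier.
    samples = [12.0, 15.0, 11.0, 240.0, 14.0, 13.0, 2.0]
    print(trimmed_mean(samples, 1))
```

Trimming both ends keeps a single cache hit or a single GC pause from skewing the number that would feed a sharding or load-balancing decision. How many samples to take and how large n should be are tuning choices the thread does not settle.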