Marcus,

It sounds like you may just want a good server monitoring package that 
collects server data and draws pretty charts.  Then you can show them to 
your IT/budget people when the charts start showing increased query 
latency, very little available RAM, swapping, high CPU usage and such.  
Nagios, Ganglia, any of those will do.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


----- Original Message ----
> From: Marcus Herou <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 10, 2008 3:29:40 PM
> Subject: Re: Num docs
> 
> Well guys you are right... Still I want to have a clue about how much each
> machine stores to predict when we need more machines (measure performance
> degradation per new document). But it's harder to collect that kind of data.
> It sure is doable no doubt and is a normal sharding "algo" for MySQL.
> 
> The best approach I think is to have some bg threads run X number of queries
> and collect the response times, throw away the n lowest/highest response
> times and calc an avg time which is used for sharding and query lb'ing.
> 
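The averaging step described above (drop the n lowest and n highest response times, then average the rest) is a trimmed mean. A minimal sketch in Python, assuming the timings are in milliseconds; `run_query` is a hypothetical callable standing in for whatever actually sends a query to a shard, and the function names are illustrative, not part of Solr:

```python
import time
from statistics import mean


def time_queries(run_query, queries):
    """Run each query once and return the elapsed times in milliseconds.

    `run_query` is a hypothetical callable supplied by the caller.
    """
    timings = []
    for query in queries:
        start = time.perf_counter()
        run_query(query)
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def trimmed_mean(response_times_ms, trim=1):
    """Average the samples after discarding the `trim` lowest and highest."""
    if trim < 0:
        raise ValueError("trim must be non-negative")
    ordered = sorted(response_times_ms)
    if len(ordered) <= 2 * trim:
        raise ValueError("need more than 2 * trim samples")
    kept = ordered[trim:len(ordered) - trim]
    return mean(kept)
```

For example, `trimmed_mean([100, 12, 10, 11, 1], trim=1)` ignores the 1 ms and 100 ms outliers and averages the remaining three samples. How the resulting number feeds into sharding or load balancing is left open here.
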
> Little off topic but interesting....
> What would you guys say about a good correlation between the index size on
> disk (no stored text content) and available RAM and having good response
> times.
> 
> How long is a rope would you perhaps say...but I think some rule of thumb
> could be established...
> 
> One of the schemas of concern
> 
>         
> required="true" />
>         
> required="true" />
>         
> required="false" />
>         
> stored="false" required="true" />
>         
> required="true" />
>         
> required="true" />
>         
> required="false" />
>         
> required="true" />
>         
> required="true" />
>         
> required="false" />
>         
> required="false" multiValued="true"/>
>         
> required="false" />
>         
> required="false" />
>         
> required="false" />
>         
> required="false" />
> 
> 
> and a normal solr query (taken from the log):
> /select
> start=0&q=(title:(apple)^4+OR+description:(apple))&version=2.2&rows=15&wt=xml&sort=publishDate+desc
> 
> 
> //Marcus
> 
> 
> 
> 
> 
> On Tue, Jun 10, 2008 at 1:15 AM, Otis Gospodnetic <
> [EMAIL PROTECTED]> wrote:
> 
> > Exactly.  I think I mentioned this once before several months ago.  One can
> > take various hardware specs (# cores, CPU speed, FSB, RAM, etc.),
> > performance numbers, etc. and come up with a number for each server's
> > overall capacity.
> >
> >
> > As a matter of fact, I think this would be useful to have right in Solr,
> > primarily for use when allocating and sizing shards for Distributed Search.
> >  JIRA enhancement/feature issue?
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >
> >
> > ----- Original Message ----
> > > From: Alexander Ramos Jardim 
> > > To: solr-user@lucene.apache.org
> > > Sent: Monday, June 9, 2008 6:42:17 PM
> > > Subject: Re: Num docs
> > >
> > > I even think that such a decision should be based on the overall machine
> > > performance at a given time, and not the index size. Unless you are
> > talking
> > > solely about HD space and not having any performance issues.
> > >
> > > 2008/6/7 Otis Gospodnetic :
> > >
> > > > Marcus,
> > > >
> > > >
> > > > For that you can rely on du, vmstat, iostat, top and such, too. :)
> > > >
> > > > Otis
> > > > --
> > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > >
> > > >
> > > > ----- Original Message ----
> > > > > From: Marcus Herou
> > > > > To: solr-user@lucene.apache.org
> > > > > Sent: Saturday, June 7, 2008 12:33:10 PM
> > > > > Subject: Re: Num docs
> > > > >
> > > > > Thanks, I wanna ask the indices how much more each shard can handle
> > > > before
> > > > > they're considered "full" and scream for a budget to get a new
> > machine :)
> > > > >
> > > > > /M
> > > > >
> > > > > On Sat, Jun 7, 2008 at 3:07 PM, Otis Gospodnetic
> > > > > wrote:
> > > > >
> > > > > > Marcus, check out the Luke request handler.  You can get it from
> > its
> > > > > > output.  It may also be possible to get *just* that number, but I'm
> > not
> > > > > > looking at docs/code right now to know for sure.
> > > > > >
> > > > > >  Otis
> > > > > > --
> > > > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > > > >
> > > > > >
> > > > > > ----- Original Message ----
> > > > > > > From: Marcus Herou
> > > > > > > To: solr-user@lucene.apache.org
> > > > > > > Sent: Saturday, June 7, 2008 5:09:20 AM
> > > > > > > Subject: Num docs
> > > > > > >
> > > > > > > Hi.
> > > > > > >
> > > > > > > > Is there a way to retrieve IndexWriter.numDocs() in SOLR?
> > > > > > >
> > > > > > > Kindly
> > > > > > >
> > > > > > > //Marcus
> > > > > > >
> > > > > > > --
> > > > > > > Marcus Herou CTO and co-founder Tailsweep AB
> > > > > > > +46702561312
> > > > > > > [EMAIL PROTECTED]
> > > > > > > http://www.tailsweep.com/
> > > > > > > http://blogg.tailsweep.com/
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Marcus Herou CTO and co-founder Tailsweep AB
> > > > > +46702561312
> > > > > [EMAIL PROTECTED]
> > > > > http://www.tailsweep.com/
> > > > > http://blogg.tailsweep.com/
> > > >
> > > >
> > >
> > >
> > > --
> > > Alexander Ramos Jardim
> >
> >
> 
> 
> -- 
> Marcus Herou CTO and co-founder Tailsweep AB
> +46702561312
> [EMAIL PROTECTED]
> http://www.tailsweep.com/
> http://blogg.tailsweep.com/
