Hmmm distributed BDB brrr :)

On Fri, Jun 13, 2008 at 3:21 AM, Otis Gospodnetic <
[EMAIL PROTECTED]> wrote:

> Or, if you want to go with something older/more stable, go with BDB. :)
>
>
> Otis --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
> ----- Original Message ----
> > From: Marcus Herou <[EMAIL PROTECTED]>
> > To: solr-user@lucene.apache.org
> > Sent: Thursday, June 12, 2008 3:17:52 PM
> > Subject: Re: Num docs
> >
> > Cacti, Nagios you name it already in use :)
> >
> > Well, I'm the CTO, so I'm the one really interested in estimating perf.
> >
> > The ids come from a db initially and are later used for retrieval from a
> > distributed on-disk caching system which I have written.
> > I'm in the process of moving from MySQL to HBase or Hypertable.
> >
> > /M
> >
> > On Tue, Jun 10, 2008 at 10:03 PM, Otis Gospodnetic <
> > [EMAIL PROTECTED]> wrote:
> >
> > > Marcus,
> > >
> > > It sounds like you may just want to use a good server monitoring package
> > > that collects server data and prints out pretty charts.  Then you can show
> > > them to your IT/budget people when the charts start showing increased query
> > > latency times, very little available RAM, swapping, high CPU usage and such.
> > > Nagios, Ganglia, any of those things will do.
> > >
> > >
> > > Otis
> > > --
> > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > >
> > >
> > > ----- Original Message ----
> > > > From: Marcus Herou
> > > > To: solr-user@lucene.apache.org
> > > > Sent: Tuesday, June 10, 2008 3:29:40 PM
> > > > Subject: Re: Num docs
> > > >
> > > > Well guys you are right... Still I want to have a clue about how much each
> > > > machine stores to predict when we need more machines (measure performance
> > > > degradation per new document). But it's harder to collect that kind of data.
> > > > It sure is doable no doubt and is a normal sharding "algo" for MySQL.
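
One way to turn "performance degradation per new document" into a number is to fit a straight line through (document count, average latency) measurements and see where it crosses a latency budget. The sketch below is in Python, which the thread does not mention. The linear model, the function names and the sample figures are all assumptions made for illustration. Real latency tends to jump once the index stops fitting in the OS cache, so treat the result as a rough guide at best.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def docs_at_latency(doc_counts, latencies_ms, limit_ms):
    """Estimate the document count at which latency reaches limit_ms.

    Returns None when latency does not grow with the document count,
    because the fitted line then never reaches the limit.
    """
    slope, intercept = fit_line(doc_counts, latencies_ms)
    if slope <= 0:
        return None
    return (limit_ms - intercept) / slope


# Hypothetical measurements for one shard: document count vs. average latency.
doc_counts = [1_000_000, 2_000_000, 3_000_000]
latencies_ms = [20.0, 30.0, 40.0]
print(docs_at_latency(doc_counts, latencies_ms, limit_ms=100.0))
```

The estimate is only as good as the measurements fed into it, so it makes sense to refit as new data points come in.
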
> > > >
> > > > The best approach I think is to have some bg threads run X number of queries
> > > > and collect the response times, throw away the n lowest/highest response
> > > > times and calc an avg time which is used for sharding and query lb'ing.
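
The "throw away the n lowest/highest and average the rest" step is a trimmed mean. Here is a minimal sketch in Python. The function name, the choice to trim the same number of samples from both ends, and the sample timings are assumptions, and the background threads that would collect the timings are left out.

```python
from statistics import mean


def trimmed_mean(samples, trim):
    """Average of samples after dropping the `trim` lowest and `trim` highest."""
    if trim < 0:
        raise ValueError("trim must be non-negative")
    if len(samples) <= 2 * trim:
        raise ValueError("need more than 2 * trim samples")
    ordered = sorted(samples)
    kept = ordered[trim:len(ordered) - trim]
    return mean(kept)


# Hypothetical response times in milliseconds from one shard.
times_ms = [12, 15, 11, 240, 14, 13, 2, 16]
print(trimmed_mean(times_ms, 1))  # drops 2 and 240, prints 13.5
```

Dropping the extremes keeps a single GC pause or a cached hit from skewing the figure that the sharding and load-balancing logic would act on.
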
> > > >
> > > > Little off topic but interesting....
> > > > What would you guys say about a good correlation between the index size on
> > > > disk (no stored text content) and available RAM and having good response
> > > > times?
> > > >
> > > > How long is a rope would you perhaps say...but I think some rule of thumb
> > > > could be established...
> > > >
> > > > One of the schemas of concern
> > > >
> > > >
> > > > required="true" />
> > > > required="true" />
> > > > required="false" />
> > > > stored="false" required="true" />
> > > > required="true" />
> > > > required="true" />
> > > > required="false" />
> > > > required="true" />
> > > > required="true" />
> > > > required="false" />
> > > > required="false" multiValued="true"/>
> > > > required="false" />
> > > > required="false" />
> > > > required="false" />
> > > > required="false" />
> > > >
> > > >
> > > > and a normal solr query (taken from the log):
> > > > /select start=0&q=(title:(apple)^4+OR+description:(apple))&version=2.2&rows=15&wt=xml&sort=publishDate+desc
> > > >
> > > >
> > > > //Marcus
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Tue, Jun 10, 2008 at 1:15 AM, Otis Gospodnetic <
> > > > [EMAIL PROTECTED]> wrote:
> > > >
> > > > > Exactly.  I think I mentioned this once before several months ago.  One can
> > > > > take various hardware specs (# cores, CPU speed, FSB, RAM, etc.),
> > > > > performance numbers, etc. and come up with a number for each server's
> > > > > overall capacity.
> > > > >
> > > > > As a matter of fact, I think this would be useful to have right in Solr,
> > > > > primarily for use when allocating and sizing shards for Distributed Search.
> > > > > JIRA enhancement/feature issue?
> > > > >
> > > > > Otis
> > > > > --
> > > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > > >
> > > > >
> > > > > ----- Original Message ----
> > > > > > From: Alexander Ramos Jardim
> > > > > > To: solr-user@lucene.apache.org
> > > > > > Sent: Monday, June 9, 2008 6:42:17 PM
> > > > > > Subject: Re: Num docs
> > > > > >
> > > > > > I even think that such a decision should be based on the overall machine
> > > > > > performance at a given time, and not the index size. Unless you are talking
> > > > > > solely about HD space and not having any performance issues.
> > > > > >
> > > > > > 2008/6/7 Otis Gospodnetic :
> > > > > >
> > > > > > > Marcus,
> > > > > > >
> > > > > > >
> > > > > > > For that you can rely on du, vmstat, iostat, top and such, too. :)
> > > > > > >
> > > > > > > Otis
> > > > > > > --
> > > > > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > > > > >
> > > > > > >
> > > > > > > ----- Original Message ----
> > > > > > > > From: Marcus Herou
> > > > > > > > To: solr-user@lucene.apache.org
> > > > > > > > Sent: Saturday, June 7, 2008 12:33:10 PM
> > > > > > > > Subject: Re: Num docs
> > > > > > > >
> > > > > > > > Thanks, I wanna ask the indices how much more each shard can handle before
> > > > > > > > they're considered "full" and scream for a budget to get a new machine :)
> > > > > > > >
> > > > > > > > /M
> > > > > > > >
> > > > > > > > On Sat, Jun 7, 2008 at 3:07 PM, Otis Gospodnetic
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Marcus, check out the Luke request handler.  You can get it from its
> > > > > > > > > output.  It may also be possible to get *just* that number, but I'm not
> > > > > > > > > looking at docs/code right now to know for sure.
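
For illustration, here is a rough Python sketch of reading the document count from the Luke request handler's output. Several things are assumptions rather than details from this thread: the handler being mounted at `/admin/luke` under the Solr base URL, `wt=json` being available, the count sitting under `index.numDocs` in the response, and the host/port and helper names. Check them against your own Solr version before relying on this.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def luke_url(base_url):
    """Build a Luke handler URL; numTerms=0 skips the per-field top-terms work."""
    query = urlencode({"numTerms": 0, "wt": "json"})
    return f"{base_url.rstrip('/')}/admin/luke?{query}"


def num_docs_from_luke(body):
    """Pull index.numDocs out of a JSON Luke response body (str or bytes)."""
    return int(json.loads(body)["index"]["numDocs"])


def fetch_num_docs(base_url, timeout=5.0):
    """Ask a live Solr instance for its document count (needs network access)."""
    with urlopen(luke_url(base_url), timeout=timeout) as response:
        return num_docs_from_luke(response.read())


# Hypothetical, trimmed-down response body used instead of a live server.
sample = '{"responseHeader": {"status": 0}, "index": {"numDocs": 1234, "maxDoc": 1300}}'
print(luke_url("http://localhost:8983/solr"))
print(num_docs_from_luke(sample))
```

Keeping the parsing separate from the HTTP call makes it easy to poll several shards and feed the counts into whatever capacity tracking is in place.
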
> > > > > > > > >
> > > > > > > > >  Otis
> > > > > > > > > --
> > > > > > > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > ----- Original Message ----
> > > > > > > > > > From: Marcus Herou
> > > > > > > > > > To: solr-user@lucene.apache.org
> > > > > > > > > > Sent: Saturday, June 7, 2008 5:09:20 AM
> > > > > > > > > > Subject: Num docs
> > > > > > > > > >
> > > > > > > > > > Hi.
> > > > > > > > > >
> > > > > > > > > > Is there a way to retrieve IndexWriter.numDocs() in Solr?
> > > > > > > > > >
> > > > > > > > > > Kindly
> > > > > > > > > >
> > > > > > > > > > //Marcus
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Marcus Herou CTO and co-founder Tailsweep AB
> > > > > > > > > > +46702561312
> > > > > > > > > > [EMAIL PROTECTED]
> > > > > > > > > > http://www.tailsweep.com/
> > > > > > > > > > http://blogg.tailsweep.com/
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Alexander Ramos Jardim
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> >
> >
>
>


-- 
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
[EMAIL PROTECTED]
http://www.tailsweep.com/
http://blogg.tailsweep.com/
