On Thu, Jan 19, 2012 at 4:51 AM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:
>
> Huge is relative. ;)
> Huge Solr clusters also often have huge hardware. Servers with 16 cores
> and 32 GB RAM are becoming very common, for example.
> Another thing to keep in mind is that while lots of organizations have
> huge indices, only some portions of them may be hot at any one time.  We've
> had a number of clients who index social media or news data and while all
> of them have giant indices, typically only the most recent data is really
> actively searched.
>

So let's say I have an index of 100GB with millions of documents, but 99%
of the queries only hit the latest 200,000 documents in the index. Can I
then easily handle this on a machine which is not so powerful?
So by 'hot' you mean a subset of the whole index. You don't mean that
there is e.g. one huge archive index and an active index in separate Solr
instances?
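
Just to put rough numbers on that scenario, here is a crude back-of-envelope
sketch. It assumes the index size scales linearly with the number of
documents, which ignores shared structures such as the term dictionary, and
the total of 10 million documents is a made-up stand-in for "millions", not
a real figure from my setup:

```python
def hot_set_gb(index_gb, total_docs, hot_docs):
    """Crude estimate: assume index size scales linearly with document count."""
    return index_gb * hot_docs / total_docs

# 100 GB and 200,000 hot documents come from the question above;
# the 10 million total is an assumed figure standing in for "millions".
estimate = hot_set_gb(index_gb=100, total_docs=10_000_000, hot_docs=200_000)
print(f"approx. hot set: {estimate:.1f} GB")
```

Under those assumptions the actively searched part would be a small fraction
of the 100GB, which is what would have to fit in the OS cache.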


>
> > Because I also read often, that the Index size of one shard
> >should fit into RAM.
>
> Nah.  Don't take this as "the whole index needs to fit in RAM".  Just "the
> hot parts of the index should fit in RAM".  This is related to what I wrote
> above.
>

Ah, ok. Good to know. I always tried to split the index over multiple
shards, because I noticed a big performance loss when I tried to put it
on one machine. But maybe this is also connected to the 'hot' and 'not hot'
parts. Thanks.


>
> > Or at least the heap size should be as big as the
> > index size. So I see a lots of limitations hardware-wise. Or am I on the
> > totally wrong track?
>
> Regarding heap - nah, that's not correct.  The heap is usually much
> smaller than the index and RAM is given to the OS to use for data caching.
>

Oh, ok. Thanks for this information. Maybe I can tweak the settings a bit
then. I got several GC errors etc., so I am always trying to modify all
these heap/GC settings, but I haven't found the perfect settings so far.

Thanks.

Daniel


>
> Otis
> ----
> Performance Monitoring SaaS for Solr -
> http://sematext.com/spm/solr-performance-monitoring/index.html
>
>
>
> >On Thu, Jan 19, 2012 at 12:14 AM, Mark Miller <markrmil...@gmail.com> wrote:
> >
> >> You can raise the limit to a point.
> >>
> >> On Jan 18, 2012, at 5:59 PM, Daniel Bruegge wrote:
> >>
> >> > Hi,
> >> >
> >> > I am just wondering how I can 'grow' a distributed Solr setup to an
> >> > index size of a couple of terabytes, when one of the distributed Solr
> >> > limitations is max. 4000 characters in URI limitation. See:
> >> >
> >> >> *The number of shards is limited by number of characters allowed for
> >> >> GET method's URI; most Web servers generally support at least 4000
> >> >> characters, but many servers limit URI length to reduce their
> >> >> vulnerability to Denial of Service (DoS) attacks.*
> >> >>
> >> >> *(via
> >> >> http://lucidworks.lucidimagination.com/display/solr/Distributed+Search+with+Index+Sharding
> >> >> )*
> >> >>
> >> >
> >> > Is the only way then to make multiple distributed solr clusters and
> >> > query them independently and merge them in application code?
> >> >
> >> > Thanks. Daniel
> >>
> >> - Mark Miller
> >> lucidimagination.com
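
For the URI-length question quoted at the bottom of the thread, a minimal
sketch of how a GET URL grows as hosts are added to Solr's `shards`
parameter. The host naming scheme, port, and core path are hypothetical, and
4000 is simply the figure from the quoted documentation, which (as Mark
notes) can be raised to a point on the server side:

```python
from urllib.parse import urlencode

def distributed_query_url(base, shards, query="*:*"):
    """Build a GET URL for a distributed Solr query using the `shards` parameter."""
    params = {"q": query, "shards": ",".join(shards)}
    return base + "/select?" + urlencode(params)

def max_shards_within(limit, base, shard_for):
    """Return the largest shard count whose URL length stays within `limit`."""
    n = 0
    while len(distributed_query_url(base, [shard_for(i) for i in range(n + 1)])) <= limit:
        n += 1
    return n

# Hypothetical naming scheme: solr000.example.com:8983/solr, solr001..., etc.
shard_for = lambda i: f"solr{i:03d}.example.com:8983/solr"
base = "http://solr000.example.com:8983/solr"

print(max_shards_within(4000, base, shard_for))
```

The count depends heavily on how long the host names are and on whether the
client percent-encodes the parameter (as `urlencode` does here), so treat the
result as an order-of-magnitude check rather than a hard limit.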
