Hi Daniel,

>________________________________
> From: Daniel Bruegge <daniel.brue...@googlemail.com>
>Subject: Re: How can a distributed Solr setup scale to TB-data, if URL
>limitations are 4000 for distributed shard search?
>
>But you can read so often about huge solr clusters and I am wondering how
>they do this.

Huge is relative. ;) Huge Solr clusters also often have huge hardware. Servers with 16 cores and 32 GB RAM are becoming very common, for example. Another thing to keep in mind is that while lots of organizations have huge indices, only some portions of them may be hot at any one time. We've had a number of clients who index social media or news data, and while all of them have giant indices, typically only the most recent data is really actively searched.

> Because I also read often, that the Index size of one shard
>should fit into RAM.

Nah. Don't take this as "the whole index needs to fit in RAM". Just "the hot parts of the index should fit in RAM". This is related to what I wrote above.

> Or at least the heap size should be as big as the
> index size. So I see a lots of limitations hardware-wise. Or am I on the
> totally wrong track?

Regarding heap - nah, that's not correct. The heap is usually much smaller than the index, and the remaining RAM is left to the OS to use for data caching.

Otis
----
Performance Monitoring SaaS for Solr - http://sematext.com/spm/solr-performance-monitoring/index.html

>On Thu, Jan 19, 2012 at 12:14 AM, Mark Miller <markrmil...@gmail.com> wrote:
>
>> You can raise the limit to a point.
>>
>> On Jan 18, 2012, at 5:59 PM, Daniel Bruegge wrote:
>>
>> > Hi,
>> >
>> > I am just wondering how I can 'grow' a distributed Solr setup to an index
>> > size of a couple of terabytes, when one of the distributed Solr limitations
>> > is max. 4000 characters in URI limitation. See:
>> >
>> > *The number of shards is limited by number of characters allowed for GET
>> >> method's URI; most Web servers generally support at least 4000 characters,
>> >> but many servers limit URI length to reduce their vulnerability to Denial
>> >> of Service (DoS) attacks.
>> >> *
>> >
>> >> *(via
>> >> http://lucidworks.lucidimagination.com/display/solr/Distributed+Search+with+Index+Sharding
>> >> )*
>> >
>> > Is the only way then to make multiple distributed solr clusters and query
>> > them independently and merge them in application code?
>> >
>> > Thanks. Daniel
>>
>> - Mark Miller
>> lucidimagination.com
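
To put a rough number on the URI limit quoted in the thread, here is a minimal Python sketch that counts how many shard addresses fit into a GET URL before it passes 4000 characters. It is only an illustration, not anything from Solr itself: the base URL, the host naming pattern (solrNNN.example.com:8983), the /solr path and the helper names are all assumptions made up for the example. The only thing taken from the quoted documentation is that the shard list travels in the request URI and that 4000 characters is the commonly supported length.

```python
from urllib.parse import urlencode

# Hypothetical values, chosen only for illustration.
BASE_URL = "http://localhost:8983/solr/select"
HOST_TEMPLATE = "solr{i:03d}.example.com:8983"


def shards_param(hosts, core_path="/solr"):
    """Build a comma-separated shard list such as 'host1:8983/solr,host2:8983/solr'."""
    return ",".join(f"{host}{core_path}" for host in hosts)


def request_url(base_url, query, hosts):
    """Return the full GET URL for a distributed query over the given hosts."""
    params = {"q": query, "shards": shards_param(hosts)}
    # Keep ':', '/' and ',' unescaped so the count reflects a typical shards value.
    return f"{base_url}?{urlencode(params, safe=':/,')}"


def max_shards_for_limit(base_url, query, host_template, limit=4000):
    """Return the largest shard count whose URL still fits within `limit` characters."""
    count = 0
    while True:
        hosts = [host_template.format(i=i) for i in range(count + 1)]
        if len(request_url(base_url, query, hosts)) > limit:
            return count
        count += 1


if __name__ == "__main__":
    n = max_shards_for_limit(BASE_URL, "*:*", HOST_TEMPLATE)
    print(f"About {n} shards fit into a 4000-character URI with these host names")
```

The count depends almost entirely on how long each shard address is, so shorter host names or a raised server-side limit (as Mark mentions) move the ceiling considerably.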