Erick,

Many thanks for your suggestions and pointers. I am proceeding with my study and look forward to doing a POC with Solr.
Thanks again.

On Sun, Sep 25, 2011 at 7:40 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> Well, this is not a neutral forum <G>...
>
> A common use-case for Solr is exactly to replace
> database searches because, as you say, search
> performance in a database is often slow and limited.
> RDBMSs do very complex stuff very well, but they
> are not designed for text searching.
>
> Scaling is accomplished by either replication or
> sharding. Replication is used when the entire index
> fits on a single machine and you can get
> reasonable responses. I've seen 40-50M docs fit
> quite comfortably on one machine. But 150TB
> *probably* indicates that this isn't reasonable in your
> case.
>
> If you can't fit the entire index on one machine, then
> you shard, which splits up the single logical index
> into multiple slices, and Solr will automatically query
> all the shards and assemble the parts into a single
> response.
>
> But you absolutely cannot guess the hardware
> requirements ahead of time. It's like answering
> "How big is a Java program?" There are too
> many variables. But Solr is free, right? So you
> absolutely have to get a copy and put your 2.5M
> docs on it and test (SolrMeter or JMeter are
> good options). If you get adequate throughput, add
> another 1M docs to the machine. Keep on until
> your QPS rate drops and you'll have a good idea how
> many documents you can put on a single machine.
> There's really no other way to answer that question.
>
> Best
> Erick
>
> On Sun, Sep 25, 2011 at 5:55 AM, Raja Ghulam Rasool <the.r...@gmail.com>
> wrote:
> > Hi,
> >
> > I am new to Solr, and I am studying it currently. We are planning to
> > implement Solr in our production setup. We have 15 servers where we are
> > getting the data. The data is huge: we are supposed to keep 150 terabytes
> > of data (in terms of documents it will be around 2592000 documents
> > per server), across all servers (combined). We have the
> > necessary storage capacity. Can anyone let me know whether Solr will be a
> > good solution for our text search needs? We are required to provide text
> > searches on a certain limited number of fields.
> >
> > 1- Does Solr support such an architecture, i.e. multiple servers? What
> > specific area in Solr do I need to explore (shards, cores, etc.)?
> > 2- Any idea whether we will really benefit from a Solr implementation for
> > text searches, vs. let us say Oracle Text Search? Currently our Oracle Text
> > search is giving very bad performance and we are looking to somehow
> > improve our text search performance.
> >
> > Any high-level pointers or help will be greatly appreciated.
> >
> > Thanks in advance, guys
> >
> > --
> > Regards,
> > Raja
> >
>

--
Regards,
Ghulam Rasool.

Blog: http://ghulamrasool.blogspot.com
Mobile: +971506141872
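
For anyone following the thread, here is a rough sketch of the sizing procedure Erick describes: start at about 2.5M docs, add 1M at a time, and stop when throughput is no longer adequate. It also works out the corpus arithmetic from the original question. Everything beyond the numbers quoted in the thread is an assumption: `measure_qps` is a hypothetical callback standing in for a real SolrMeter or JMeter run against the test machine, `min_qps` is an assumed throughput target, terabytes are taken as decimal (10^12 bytes), and `made_up_qps` is an invented curve used only to show the loop working, not measured data.

```python
from typing import Callable

# Figures quoted in the thread.
SERVERS = 15
DOCS_PER_SERVER = 2_592_000
TOTAL_BYTES = 150 * 10**12  # 150 TB, ASSUMING decimal terabytes


def corpus_summary() -> tuple[int, float]:
    """Return (total document count, average bytes per document)."""
    total_docs = SERVERS * DOCS_PER_SERVER
    return total_docs, TOTAL_BYTES / total_docs


def find_single_node_capacity(
    measure_qps: Callable[[int], float],  # hypothetical: load N docs, run the load test, return QPS
    min_qps: float,                       # assumed "adequate throughput" threshold
    start_docs: int = 2_500_000,
    step_docs: int = 1_000_000,
    max_docs: int = 50_000_000,           # safety stop so the loop always terminates
) -> int:
    """Grow the index step by step; return the largest size that still met min_qps (0 if none)."""
    last_ok = 0
    docs = start_docs
    while docs <= max_docs:
        if measure_qps(docs) < min_qps:
            break
        last_ok = docs
        docs += step_docs
    return last_ok


def shards_needed(total_docs: int, docs_per_node: int) -> int:
    """Ceiling division: how many shards are needed at the measured per-node capacity."""
    if docs_per_node <= 0:
        raise ValueError("no index size met the throughput target")
    return -(-total_docs // docs_per_node)


def made_up_qps(docs: int) -> float:
    """INVENTED throughput curve for illustration only; replace with a real measurement."""
    return 200.0 - docs / 100_000


if __name__ == "__main__":
    total_docs, avg_bytes = corpus_summary()
    print(f"total docs: {total_docs:,}, average size: {avg_bytes / 1e6:.2f} MB")
    capacity = find_single_node_capacity(made_up_qps, min_qps=50.0)
    print(f"docs per node (made-up curve): {capacity:,}")
    print(f"shards needed: {shards_needed(total_docs, capacity)}")
```

The arithmetic is worth noting on its own: 15 servers at 2,592,000 documents each is about 38.9M documents, which by count alone is within the 40-50M range Erick mentions, but at roughly 3.9 MB per document on average the data volume is what makes a single machine unlikely. The per-node capacity itself can only come from real measurements, as Erick says.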