Re: Lucene-based Distributed Index Leveraging Hadoop

2008-02-08 Thread Srikant Jakilinki
Hi Ning, In continuation with our offline conversation, here is a public expression of interest in your work and a description of our work. Sorry for the length in advance and I hope that the folk will be able to collaborate and/or share experiences and/or give us some pointers... 1) We are

Solr-303 Re: Solr in a distributed multi-machine high-performance environment

2008-01-23 Thread Srikant Jakilinki
guys, any documentation on Solr-303 please, Srikant Shalin Shekhar Mangar wrote: Look at http://issues.apache.org/jira/browse/SOLR-303 Please note that it is still work in progress. So you may not be able to use it immediately. On Jan 16, 2008 10:53 AM, Srikant Jakilinki <[EMAIL PROTECTED]

Re: Solr feasibility with terabyte-scale data

2008-01-18 Thread Srikant Jakilinki
Nice description of a use-case. My 2 pennies embedded... Phillip Farber wrote: Hello everyone, We are considering Solr 1.2 to index and search a terabyte-scale dataset of OCR. Initially our requirements are simple: basic tokenizing, score sorting only, no faceting. The schema is simple to

Re: Solr in a distributed multi-machine high-performance environment

2008-01-16 Thread Srikant Jakilinki
Thanks for that, Shalin. Looks like I have to wait and keep track of developments. Leaving aside indexes that cannot fit on a single machine (distributed search), are there any links on running Solr in a 2-machine environment? I want to measure how much improvement there will be in performanc

Solr in a distributed multi-machine high-performance environment

2008-01-15 Thread Srikant Jakilinki
Hi All, There is a requirement in our group to index and search several million documents (TREC) in real time with millisecond response times. For the moment we prefer scale-out (throw more commodity machines) approaches to scale-up (faster disks, more RAM). This is in turn in