Why do you think you'd hit OOM errors? How big is "very large"? I've indexed, as a single document, a 26-volume encyclopedia of Civil War records...
As much as I like the technology, though, if I could get away without using two technologies, I would. Are you completely sure you can't get what you want with clever Oracle querying?

Best
Erick

On Tue, Mar 16, 2010 at 3:20 PM, Neil Chaudhuri <nchaudh...@potomacfusion.com> wrote:

> I am working on an application that currently hits a database containing
> millions of very large documents. I use Oracle Text Search at the moment,
> and things work fine. However, there is a request for faceting capability,
> and Solr seems like a technology I should look at. Suffice it to say I am
> new to Solr, but at the moment I see two approaches, each with drawbacks:
>
> 1) Have Solr index document metadata (id, subject, date). Then use
> Oracle Text to do a content search based on the criteria. Finally, query
> the Solr index for all documents whose ids match the set of ids returned
> by Oracle Text. That strikes me as an unmanageable Boolean query (e.g.
> id:4 OR id:33432323 OR ...).
>
> 2) Remove Oracle Text from the equation and use Solr to query document
> content based on search criteria. The indexing process, though, will
> almost certainly encounter an OutOfMemoryError given the number and size
> of the documents.
>
> I am using the embedded server and the Solr Java APIs to do the indexing
> and querying.
>
> I would welcome your thoughts on the best way to approach this situation.
> Please let me know if I should provide additional information.
>
> Thanks.
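[Editor's note: to make approach 1's drawback concrete, here is a minimal sketch of assembling that id-matching query. The `IdQueryBuilder` class and `buildIdQuery` helper are hypothetical, not from the thread; the underlying point is real, though, since Lucene/Solr caps Boolean queries at `maxBooleanClauses` = 1024 clauses by default, so a query over thousands of Oracle-returned ids would fail outright without raising that limit.]

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper illustrating approach 1: turn the set of document
// ids returned by Oracle Text into a single Solr Boolean query string.
public class IdQueryBuilder {

    // Joins each id into an "id:<value>" clause separated by OR.
    // With millions of candidate documents, this string grows without
    // bound and quickly exceeds Solr's default Boolean-clause limit.
    static String buildIdQuery(List<String> ids) {
        return ids.stream()
                  .map(id -> "id:" + id)
                  .collect(Collectors.joining(" OR "));
    }

    public static void main(String[] args) {
        // The two example ids from the original message.
        System.out.println(buildIdQuery(List.of("4", "33432323")));
    }
}
```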