I am working on an application that currently hits a database containing millions of very large documents. I use Oracle Text Search at the moment, and things work fine. However, there is a request for faceting capability, and Solr seems like a technology I should look at. Suffice it to say I am new to Solr, but at the moment I see two approaches, each with drawbacks:
1) Have Solr index only the document metadata (id, subject, date). Then use Oracle Text to do the content search based on the criteria. Finally, query the Solr index for all documents whose ids match the set of ids returned by Oracle Text. That strikes me as an unmanageable Boolean query (e.g. id:4 OR id:33432323 OR ...).

2) Remove Oracle Text from the equation and use Solr to query document content based on the search criteria. The indexing process, though, will almost certainly encounter an OutOfMemoryError given the number and size of the documents.

I am using the embedded server and the Solr Java APIs to do the indexing and querying.

I would welcome your thoughts on the best way to approach this situation. Please let me know if I should provide additional information.

Thanks.
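To make the concern in approach 1 concrete, here is a minimal sketch of how I imagine the id filter would have to be built. It uses only the JDK, so it shows the query strings and not the Solr calls themselves. The class name IdQueryChunker, the method buildQueries, and the chunk size are hypothetical names and values chosen for illustration. The sketch assumes the id list from Oracle Text has to be split so that each query stays under Solr's maxBooleanClauses limit, which I understand defaults to 1024 in solrconfig.xml.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper: splits a list of ids into several "id:(a OR b OR ...)" queries.
final class IdQueryChunker {

    // Assumption: keep each query below Solr's maxBooleanClauses limit (1024 by default).
    static final int DEFAULT_MAX_CLAUSES = 1000;

    // Returns one query string per chunk of at most maxClauses ids.
    static List<String> buildQueries(List<Long> ids, int maxClauses) {
        if (maxClauses < 1) {
            throw new IllegalArgumentException("maxClauses must be >= 1");
        }
        List<String> queries = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += maxClauses) {
            int end = Math.min(start + maxClauses, ids.size());
            // Group the ids under a single field prefix: id:(4 OR 33432323)
            StringJoiner clause = new StringJoiner(" OR ", "id:(", ")");
            for (Long id : ids.subList(start, end)) {
                clause.add(String.valueOf(id));
            }
            queries.add(clause.toString());
        }
        return queries;
    }

    public static void main(String[] args) {
        // Sample ids standing in for the result set returned by Oracle Text.
        List<Long> ids = Arrays.asList(4L, 33432323L, 77L, 1024L, 9L);
        for (String query : buildQueries(ids, 2)) {
            System.out.println(query);
        }
    }
}
```

Even in this form, each chunk is a separate Solr query, so any facet counts would have to be merged across chunks on the client side. For a content search that matches hundreds of thousands of ids, that is the part I consider unmanageable.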