Hi, We have an existing Java-based enterprise application that is bundled as a WAR file, runs on Tomcat, and uses Spring 3.0.5, Hibernate 3.6.2, and Lucene 3.0.3. We use Hibernate annotations that couple it nicely to Lucene to index objects (documents, images, PDFs, etc.) based on key-value pairs. We use Hibernate Search to retrieve the results we are looking for.
We want to extend our indexing to use Tika to extract text and metadata from documents uploaded to the server, and to index that content. When I first read about Solr, I saw that it provides extra functionality on top of Lucene, and I was eager to integrate it with our application. Now that I have read all of "Apache Solr 3 Enterprise Search Server", I feel that my initial impression of Solr was wrong. The book describes using Solr's web services to upload files for indexing, to search, and to download content. I assumed that was just an optional feature, and one I was not interested in, because our application already has a web service interface used by our own home-grown client application, which communicates with the enterprise application above.
I've read about SolrJ / Solr Cell, EmbeddedSolrServer, BackendQueueProcessor, and DIH, and researched them on the web. None of them has shown me how to take a Hibernate-managed object and, inside a transaction, persist the binary data in the database (which we already do), extract the text from the binary file via Tika (a separate issue for a separate thread), and index that text with either Java API code or Java annotations.
It seems that Solr forces one to expose access to its "Cores" (indexes) via its own WAR file. I don't want that. I just want to use the Solr Java API, integrated with our current web services and Hibernate framework, to index text-based documents, and then let our users perform open text searches and use Solr's advanced features such as highlighting, MLT, spell checking, the suggester, and faceting. But I just don't see how to integrate what Solr has to offer with our existing web application.
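To make the requirement concrete, here is a rough sketch of the flow I have in mind. Everything in it is hypothetical: `TextExtractor` stands in for Tika, `TextIndexer` stands in for whatever in-process Solr API might exist (possibly something like EmbeddedSolrServer), and `InMemoryIndexer` is only a toy so the sketch runs without dependencies. None of these names come from Solr, Tika, or our code base.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class InProcessIndexingSketch {

    /** Hypothetical stand-in for Tika: turns uploaded bytes into plain text. */
    interface TextExtractor {
        String extract(byte[] content);
    }

    /** Hypothetical stand-in for an in-process index API (no HTTP involved). */
    interface TextIndexer {
        void index(String id, String text);
        List<String> search(String term);
    }

    /** Toy in-memory indexer; a real one would delegate to Solr/Lucene. */
    static class InMemoryIndexer implements TextIndexer {
        private final Map<String, String> docs = new TreeMap<>();

        @Override
        public void index(String id, String text) {
            docs.put(id, text.toLowerCase(Locale.ROOT));
        }

        @Override
        public List<String> search(String term) {
            String needle = term.toLowerCase(Locale.ROOT);
            List<String> hits = new ArrayList<>();
            for (Map.Entry<String, String> e : docs.entrySet()) {
                if (e.getValue().contains(needle)) {
                    hits.add(e.getKey());
                }
            }
            return hits;
        }
    }

    /** Hypothetical service; in our app this would run inside the Hibernate transaction. */
    static class DocumentService {
        private final TextExtractor extractor;
        private final TextIndexer indexer;

        DocumentService(TextExtractor extractor, TextIndexer indexer) {
            this.extractor = extractor;
            this.indexer = indexer;
        }

        void store(String id, byte[] content) {
            // Step 1 (omitted): persist the binary data via Hibernate, as we do today.
            // Step 2: extract the text from the uploaded bytes.
            String text = extractor.extract(content);
            // Step 3: index the text in-process, with no HTTP round trip.
            indexer.index(id, text);
        }
    }

    public static void main(String[] args) {
        TextIndexer indexer = new InMemoryIndexer();
        // The extractor here just decodes UTF-8; Tika would do the real work.
        DocumentService service = new DocumentService(
                bytes -> new String(bytes, StandardCharsets.UTF_8), indexer);
        service.store("doc-1", "Quarterly Report".getBytes(StandardCharsets.UTF_8));
        System.out.println(indexer.search("report"));
    }
}
```

What I am asking is essentially whether Solr offers something that could sit behind an interface like `TextIndexer` without a separate web application.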
I get the feeling that I have to create a new Solr-based web application and have the current application delegate indexing and searching to it, which I would rather not do if I can avoid it. I've looked through the Solr Javadocs and haven't found anything substantial that would let me use plain Java code, instead of HTTP connections, to index and search data. Could someone let me know whether what I am looking for is outside the scope of Solr's functionality? If there is a way, could you please provide an example of how to accomplish it? Thank you, Todd