Bertrand Delacretaz wrote:
My "Subversion and Solr" presentation from the last Cocoon GetTogether might give you ideas for how to handle this; see the link at http://wiki.apache.org/solr/SolrResources.
Hmm, I'm beginning to think the only way to do this is to write a complete custom front-end to Solr - even a custom analyzer won't do, as analyzers only deal with fields, not a full document (e.g. a PDF file).
Although it does not handle all binary formats out of the box (you might need to write some Java glue code to implement new formats), Cocoon is a good tool for transforming various document formats to XML and filtering the results to generate the appropriate XML for Solr. I wouldn't add functionality to Solr for doing this; it's best to keep things loosely coupled, IMHO.
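For illustration only, here is a minimal sketch of the last step described above: wrapping already-extracted text in Solr's XML update format (`<add><doc><field name="...">`). It is not taken from the presentation and does not use Cocoon. The function name and the field names `id`, `title` and `text` are assumptions and would have to match whatever your Solr schema defines; extracting the text from a PDF or other binary format is assumed to have happened elsewhere.

```python
import xml.etree.ElementTree as ET


def build_solr_add_xml(doc_id, title, body_text):
    """Return a Solr <add> update message for one document.

    The field names ("id", "title", "text") are hypothetical and must
    match the fields declared in the target Solr schema.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in (("id", doc_id), ("title", title), ("text", body_text)):
        field = ET.SubElement(doc, "field", name=name)
        field.text = value  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    # body_text would normally come from a separate extraction step
    print(build_solr_add_xml("doc-1", "Example title", "Extracted body text"))
```

The resulting string would then be POSTed to Solr's update handler by whatever front-end does the extraction, which keeps the conversion code outside Solr itself.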
Cocoon? Thanks for the suggestion, but the last thing I want is yet another "Web Framework". I'm trying to simplify things, not add 90% clutter for 10% functionality.
-- Alan Burlison --