Solr effectively supports only one binary per indexed document. This is because you are not actually indexing the binary itself. You are extracting metadata (e.g. Author) and content fields out of it and mapping them to the "Solr document". So it makes little sense to have two binary fields in one document, because their metadata output would overlap. The binary itself is not stored, and storing it is not recommended either, for performance reasons.
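To make the mapping concrete, here is a rough sketch of how a Solr Cell request is usually parameterized: one binary per request, with extra fields supplied next to it. The `literal.*`, `fmap.*`, `uprefix` and `commit` parameters belong to the extracting request handler; the core name `collection1`, the target field `text`, the `attr_` prefix and the helper `build_extract_url` are only assumptions for illustration, so adjust them to your own schema.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, doc_id, extra_literals=None):
    """Build a Solr Cell (/update/extract) URL for ONE binary file.

    The file itself would be sent as the request body; this helper only
    assembles the query string (hypothetical helper, not part of Solr).
    """
    params = [
        ("literal.id", doc_id),     # unique key supplied by the client
        ("fmap.content", "text"),   # map Tika's extracted body to field "text" (assumed field name)
        ("uprefix", "attr_"),       # prefix for metadata fields not in the schema (assumed prefix)
        ("commit", "true"),
    ]
    # Any other field values travel as literals alongside the single binary.
    for name, value in (extra_literals or {}).items():
        params.append((f"literal.{name}", value))
    return f"{base_url}/update/extract?{urlencode(params)}"


url = build_extract_url(
    "http://localhost:8983/solr/collection1",  # assumed host and core
    "doc1",
    {"title": "a nice doc"},
)
print(url)
```

Since Tika's metadata (Author, content type, and so on) lands in the same Solr document as the literals, a second binary in the same document would write into the same fields.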
You may want to think backwards from what you want to find and then figure out where that data is coming from. Depending on your real scenario, you may then end up with multi-value fields, child documents, flattened documents (e.g. repeated common metadata), etc.

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On Wed, Jul 30, 2014 at 8:00 PM, Tommaso Teofili <tommaso.teof...@gmail.com> wrote:
> Hi all,
>
> while SolrCell works nicely when in need of indexing binary documents, I am
> wondering about the possibility of having Lucene / Solr documents that have
> binaries in specific Lucene fields, e.g. title="a nice doc",
> name="blabla.doc", binary="0x1234...".
>
> In that case the "binary" field should have an indexing analyzer which can
> extract the text from the binary and index it.
>
> Would it make sense to create a Tika based analyzer for that purpose?
>
> Regards,
> Tommaso