Hi, I am using Solr to search my email data. My application is written in C++, so I am using the cURL library to POST the data to Solr for indexing. I post the data in XML format; some of the XML fields contain plain text and some contain binary data. What should I do so that Solr can index both types of data (plain text as well as binary) arriving in a single XML file?
For reference, my XML file looks like this:

<add><doc><field name=mailbox-id>1111</field><field name=folder>INBOX</field><field name=from>solr solr <s...@abc.com></field><field name=to>solr <s...@abc.com></field><field name=email-body>HI I AM EMAIL BODY\r\n\r\nTHANKS</field><field name=email-attachment>Some binary data</field></doc></add>

I tried to use ExtractingUpdateProcessorFactory, but it does not seem to be available in Solr 4.5 (which I am using), or in any other released Solr version. I also think I cannot use ExtractingRequestHandler for my problem, because the document is XML and contains mixed data (text and binary). Am I right? If yes, please suggest how I should proceed; if no, how can I use ExtractingRequestHandler to extract text from some of the binary fields?

Any help is highly appreciated.
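
For context, here is a minimal sketch of how the text part of such an update message could be assembled on the C++ side before it is handed to cURL. The helper names xmlEscape and buildAddDoc are hypothetical and are not part of Solr or libcurl. The sketch quotes the attribute values and escapes the field values, so that content such as an address in angle brackets cannot break the markup. It does not solve the binary-field question: raw binary bytes cannot be embedded in XML as they are, and how to encode and index them is left open here.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: escape the five XML special characters so that a
// field value (for example an address in angle brackets) stays valid XML.
std::string xmlEscape(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += c;        break;
        }
    }
    return out;
}

// A (name, value) pair for one <field> element.
using Field = std::pair<std::string, std::string>;

// Hypothetical helper: build an <add><doc>...</doc></add> message from
// plain-text fields. Binary attachments are deliberately not handled.
std::string buildAddDoc(const std::vector<Field>& fields) {
    std::string xml = "<add><doc>";
    for (const Field& f : fields) {
        xml += "<field name=\"" + xmlEscape(f.first) + "\">";
        xml += xmlEscape(f.second);
        xml += "</field>";
    }
    xml += "</doc></add>";
    return xml;
}
```

The resulting string would then be the request body of the POST; the cURL call itself is omitted because it depends on the Solr URL and core name, which are not given above.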