You can use an update processor. For example, write a JavaScript script for the
stateless script update processor that takes a list of field names and
decodes the Base64 values of those fields to plain text.
Or, decode the Base64 to text before you send the field values to Solr.
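For the second option, here is a minimal C++ sketch of decoding on the client
side before building the XML. The function names base64_decode and xml_escape
are hypothetical helpers, not part of Solr or libcurl, and the sketch assumes
the standard Base64 alphabet and an attachment that is text once decoded.

```cpp
#include <string>

// Hypothetical helper: decodes standard Base64 (A-Z, a-z, 0-9, '+', '/')
// from `in` into `out`. Line breaks are skipped. Returns false if a character
// is outside the alphabet or if data follows '=' padding.
bool base64_decode(const std::string& in, std::string& out) {
    static const std::string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    out.clear();
    unsigned int buffer = 0;  // accumulates 6-bit groups
    int bits = 0;             // number of pending bits in buffer
    bool seen_padding = false;
    for (char c : in) {
        if (c == '\r' || c == '\n') continue;
        if (c == '=') { seen_padding = true; continue; }
        if (seen_padding) return false;
        std::string::size_type pos = alphabet.find(c);
        if (pos == std::string::npos) return false;
        buffer = (buffer << 6) | static_cast<unsigned int>(pos);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<char>((buffer >> bits) & 0xFF));
        }
    }
    return true;
}

// Hypothetical helper: escapes characters that are special in XML so the
// decoded text can be placed inside a <field> element.
std::string xml_escape(const std::string& text) {
    std::string out;
    for (char c : text) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '>': out += "&gt;"; break;
            case '"': out += "&quot;"; break;
            default: out += c;
        }
    }
    return out;
}
```

With the sample value from the question, base64_decode turns
SGkgSSBBTSBBVFRBQ0hNRU5U into "Hi I AM ATTACHMENT", which can then be passed
through xml_escape and posted like any other plain-text field. This only helps
when the decoded bytes are text; a PDF or DOC attachment would still be binary
after decoding and would need text extraction before it is searchable.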
-- Jack Krupansky
-----Original Message-----
From: neerajp
Sent: Saturday, October 26, 2013 12:50 PM
To: solr-user@lucene.apache.org
Subject: Indexing on plain text data and base64 encode data in a single HTTP
POST request
Hi,
I am using Solr for searching my email data. My application is in C++, so I am
using the CURL library to POST the data to Solr for indexing. I am posting data
in XML format; some of the XML fields are in plain text and some are
base64 encoded. I want to know what I should do so that Solr can index both
types of data (plain text as well as base64 encoded data) coming in a single
XML file.
For reference, my XML file looks like:
"<add><doc><field name=mailbox-id>1111</field><field
name=folder>INBOX</field><field name=from>solr solr
<s...@abc.com></field><field name=to>solr <s...@abc.com></field><field
name=email-body>HI I AM EMAIL BODY\r\n\r\nTHANKS</field><field
name=email-attachment>SGkgSSBBTSBBVFRBQ0hNRU5U</doc></add>"
In the above XML, all fields are in plain US-ASCII characters except
email-attachment, which is base64 encoded. The attachment content type could be
PDF, DOC, a text file, etc.
Any help is highly appreciated.