Re: Indexing on plain text and binary data in a single HTTP POST request

2013-12-10 Thread Raymond Wiker
I would index all attachments separately, but with some sort of reference back to the mail message. That way, I could use the update handler for the text and metadata of the mail message, and the update/extract handler for the binary attachment(s) and a restricted set of metadata (file name, co
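A rough sketch of the "separate documents with a back-reference" idea, in Python. The function name `split_mail_into_docs` and the field names (`id`, `mail_id`, `file_name`, `content`) are hypothetical and would need to match whatever the real schema defines; nothing here talks to Solr.

```python
def split_mail_into_docs(mail_id, subject, body, attachments):
    """Return (mail_doc, attachment_docs) for one mail message.

    attachments: list of (file_name, content_bytes) tuples.
    All field names are illustrative assumptions, not a real schema.
    """
    # Text and metadata of the message: a candidate for the plain /update handler.
    mail_doc = {"id": mail_id, "subject": subject, "body": body}

    attachment_docs = []
    for index, (file_name, content) in enumerate(attachments):
        attachment_docs.append({
            "id": "%s-att-%d" % (mail_id, index),  # unique id per attachment
            "mail_id": mail_id,                    # reference back to the message
            "file_name": file_name,
            "content": content,                    # raw bytes, a candidate for /update/extract
        })
    return mail_doc, attachment_docs
```

Each attachment document would then be posted on its own, and a query on `mail_id` could bring a message and its attachments back together.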

Re: Indexing on plain text and binary data in a single HTTP POST request

2013-12-09 Thread neerajp
Pls. find my response in-line: Assuming that your binary fields are MIME attachments to email messages, they will probably already be encoded as base64. Why not just leave them that way in Solr too? You can't do much with them other than store them, right? Or do you have some kind of image pr

Re: Indexing on plain text and binary data in a single HTTP POST request

2013-12-09 Thread Michael Sokolov
On 12/9/2013 11:13 PM, neerajp wrote: Hi, Pls. find my response in-line: That said, the obvious alternative is to use /update/extract instead of /update – this gives you a way of handling up to one binary stream in addition to any number of fields that can be represented as text. In that case, y

Re: Indexing on plain text and binary data in a single HTTP POST request

2013-12-09 Thread neerajp
Hi, Pls. find my response in-line: That said, the obvious alternative is to use /update/extract instead of /update – this gives you a way of handling up to one binary stream in addition to any number of fields that can be represented as text. In that case, you need to construct a POST request that
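As an illustration of the /update/extract suggestion quoted above: the text fields can travel as `literal.<fieldname>` request parameters while the single binary stream goes in the POST body. The Python sketch below only builds the URL; the core URL, the field names, and the helper `build_extract_url` are assumptions made for this example, and the actual POST of the binary body is left out.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, literals, commit=True):
    """Build an /update/extract URL carrying text fields as literal.<name> params."""
    params = [("literal." + name, value) for name, value in literals.items()]
    if commit:
        params.append(("commit", "true"))
    # urlencode() percent-encodes values, so '&' or spaces in a field are safe.
    return base_url.rstrip("/") + "/update/extract?" + urlencode(params)


url = build_extract_url(
    "http://localhost:8983/solr/collection1",  # assumed Solr core URL
    {"id": "msg-1", "subject": "Hello & welcome"},  # hypothetical fields
)
```

The binary attachment would then be sent as the body of a POST to `url`, with a Content-Type matching the attachment.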

Re: Indexing on plain text and binary data in a single HTTP POST request

2013-12-09 Thread neerajp
Thanks, everybody, for sharing your ideas. I now understand that XML cannot carry arbitrary binary data, so I will encode the data in base64 format. Yes, I can write a custom URP that converts the base64-encoded fields to binary fields. Now, I have binary fields in my document.* My question is th

Re: Indexing on plain text and binary data in a single HTTP POST request

2013-12-09 Thread Raymond Wiker
On 09 Dec 2013, at 17:20, neerajp wrote: > > 2) Your binary content is encoded in some way inside XML, right? Not just > random binary, which would make it invalid XML? Like base64 or something? > > [Neeraj]: I want to use random binary (*not base64 encoded*) in some of the > XML fields insi

Re: Indexing on plain text and binary data in a single HTTP POST request

2013-12-09 Thread Shawn Heisey
On 12/9/2013 9:20 AM, neerajp wrote: I tried to use ExtractingUpdateProcessor but soon learned that it has not been released in Solr 4.5. I am not sure how to use ExtractingRequestHandler for an XML document having some of the fields in plain text and some of the fields in random binary for

Re: Indexing on plain text and binary data in a single HTTP POST request

2013-12-09 Thread neerajp
Hi Alexandre, Thanks very much for responding to my post. Pls. find my response in-line: 1) For your email address fields, you are escaping the brackets, right? Not just "solr solr <[hidden email]>" as you show, but the < and > escaped, right? Otherwise, those email addresses become part of XML

Re: Indexing on plain text and binary data in a single HTTP POST request

2013-12-09 Thread Alexandre Rafalovitch
Not a solution, but a couple of thoughts: 1) For your email address fields, you are escaping the brackets, right? Not just "solr solr <[hidden email]>" as you show, but the < and > escaped, right? Otherwise, those email addresses become part of XML markup and mess it all up. 2) Your binary content is encoded in so
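To make the two points above concrete, here is a minimal Python sketch using only the standard library: text values are XML-escaped so that `<` and `>` do not become markup, and binary values are base64-encoded before being placed in the document. The helper `build_add_xml`, the field names `from` and `attachment`, and the sample address are all made up for illustration.

```python
import base64
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_add_xml(text_fields, binary_fields):
    """Build a Solr-style <add><doc> XML string.

    text_fields: dict of field name -> str (XML-escaped).
    binary_fields: dict of field name -> bytes (base64-encoded).
    Field names are illustrative assumptions, not a real schema.
    """
    parts = ["<add><doc>"]
    for name, value in text_fields.items():
        # escape() replaces &, < and > with entities.
        parts.append("<field name=%s>%s</field>" % (quoteattr(name), escape(value)))
    for name, raw in binary_fields.items():
        # base64 output is plain ASCII, so it is safe inside XML.
        encoded = base64.b64encode(raw).decode("ascii")
        parts.append("<field name=%s>%s</field>" % (quoteattr(name), encoded))
    parts.append("</doc></add>")
    return "".join(parts)


doc_xml = build_add_xml(
    {"from": "solr solr <solr@example.com>"},  # hypothetical address
    {"attachment": b"\x00\x01\xff"},           # arbitrary bytes
)
# Raises xml.etree.ElementTree.ParseError if the result is not well-formed.
parsed = ET.fromstring(doc_xml)
```

Without the escaping step, the angle brackets around the address would be read as an XML tag and the document would fail to parse.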