Re: form-data post to ExtractingRequestHandler with utf-8 characters not handled

2011-11-02 Thread kgoess
I finally managed to answer my own question. UTF-8 data in the body is ok, but you need to specify charset=utf-8 in the Content-Type header in each part, to tell the receiver (Solr) that it's not the default ISO-8859-1:

    Content-Disposition: form-data; name=literal.bptitle
    Content-Type: text/p
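
For anyone assembling the request by hand, here is a minimal Python sketch of the fix described above: every text part carries its own Content-Type header with charset=utf-8. The helper name build_multipart, the fixed boundary, and the sample title are made up for illustration, and text/plain is an assumption, since the header in the message above is cut off. It only builds the body; sending it and attaching the PDF part are left out.

```python
import uuid


def build_multipart(fields, boundary=None):
    """Encode text fields as multipart/form-data, labelling each part as UTF-8.

    Hypothetical helper for illustration. Field names are assumed to be ASCII.
    """
    boundary = boundary or uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        # Per-part headers: the charset parameter tells the receiver
        # not to fall back to ISO-8859-1 when decoding this part.
        headers = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n'
            "Content-Type: text/plain; charset=utf-8\r\n"  # text/plain is assumed
            "\r\n"
        )
        chunks.append(headers.encode("ascii"))
        chunks.append(value.encode("utf-8"))  # the part body itself, as UTF-8 bytes
        chunks.append(b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("ascii"))  # closing boundary
    body = b"".join(chunks)
    content_type = f"multipart/form-data; boundary={boundary}"
    return body, content_type


# Sample field value is invented; literal.bptitle is the field from the message above.
body, content_type = build_multipart(
    {"literal.bptitle": "Café résumé"}, boundary="XyZ"
)
```

The returned content_type string goes in the request's top-level Content-Type header, and body is the raw request payload.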

form-data post to ExtractingRequestHandler with utf-8 characters not handled

2011-10-28 Thread kgoess
I'm trying to post a PDF along with a whole bunch of metadata fields to the ExtractingRequestHandler as multipart/form-data. It works fine except for the utf-8 character handling. Here is what my post looks like (abridged):

    POST /solr/update/extract HTTP/1.1
    TE: deflate,gzip;q=0.3
    Conn