Hello, I'm trying to write a custom parser and add it to Tika, but I haven't had much success so far.
I have a binary that converts my custom file type into an XML file, so inside my custom parser I convert the custom file to XML and then call XMLParser on the result. However, when I write the InputStream stream (inside the parse function) to a File, it seems that Solr adds a header and a footer containing metadata, so the file is not converted properly (http://wiki.apache.org/solr/ExtractingRequestHandler#Metadata).

The following text is added as a header:

0000000: 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d  ----------------
0000010: 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 3139  --------------19
0000020: 3230 3862 3937 3764 6637 0d0a 436f 6e74  208b977df7..Cont
0000030: 656e 742d 4469 7370 6f73 6974 696f 6e3a  ent-Disposition:
0000040: 2066 6f72 6d2d 6461 7461 3b20 6e61 6d65   form-data; name
0000050: 3d22 6d79 6669 6c65 223b 2066 696c 656e  ="myfile"; filen
0000060: 616d 653d 2268 7770 322e 6877 7022 0d0a  ame="hwp2.hwp"..
0000070: 436f 6e74 656e 742d 5479 7065 3a20 6170  Content-Type: ap
0000080: 706c 6963 6174 696f 6e2f 6f63 7465 742d  plication/octet-
0000090: 7374 7265 616d 0d0a 0d0a d0cf 11e0 a1b1  stream

The following text is added as a footer:

0002290: 0000 0000 0000 0000 0000 0d0a 2d2d 2d2d  ............----
00022a0: 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d  ----------------
00022b0: 2d2d 2d2d 2d2d 2d2d 2d2d 3139 3230 3862  ----------19208b
00022c0: 3937 3764 6637 2d2d 0d0a                 977df7--..

How can I prevent Solr from adding headers and footers?

Thank you.

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-adding-header-and-footer-to-streamed-documents-tp4003439.html
Sent from the Solr - User mailing list archive at Nabble.com.
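
A note on the header dump: decoding its hex bytes as ASCII may help narrow down where the extra bytes come from. The text looks like a multipart/form-data part header (a boundary line, then Content-Disposition and Content-Type lines, then a blank line), which is what an HTTP client sends when it uploads a file as a form field. The six bytes that follow it (d0cf 11e0 a1b1) match the start of the OLE2 compound-document signature, so they are probably the first bytes of the actual file. The following is only a minimal sketch for inspecting the dump, not part of the parser; the class name `DumpDecoder` and its method names are hypothetical, and the hex strings are copied from the dump above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for inspecting the dump; not part of Tika or Solr.
final class DumpDecoder {
    // Header bytes from the dump above, offsets 0x00 to 0x99 (up to the blank line).
    static final String HEADER_HEX =
          "2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d "
        + "2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 3139 "
        + "3230 3862 3937 3764 6637 0d0a 436f 6e74 "
        + "656e 742d 4469 7370 6f73 6974 696f 6e3a "
        + "2066 6f72 6d2d 6461 7461 3b20 6e61 6d65 "
        + "3d22 6d79 6669 6c65 223b 2066 696c 656e "
        + "616d 653d 2268 7770 322e 6877 7022 0d0a "
        + "436f 6e74 656e 742d 5479 7065 3a20 6170 "
        + "706c 6963 6174 696f 6e2f 6f63 7465 742d "
        + "7374 7265 616d 0d0a 0d0a";

    // The six bytes that follow the blank line in the dump.
    static final String PAYLOAD_START_HEX = "d0cf 11e0 a1b1";

    // Convert a whitespace-separated hex string into raw bytes.
    static byte[] fromHex(String hex) {
        String compact = hex.replaceAll("\\s+", "");
        byte[] out = new byte[compact.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(compact.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // True if the data begins with the first six bytes of the OLE2 signature
    // (D0 CF 11 E0 A1 B1 1A E1); only six bytes are visible in the dump.
    static boolean startsLikeOle2(byte[] data) {
        int[] magic = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1};
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String header = new String(fromHex(HEADER_HEX), StandardCharsets.US_ASCII);
        System.out.print(header);
        System.out.println("payload starts like OLE2: "
            + startsLikeOle2(fromHex(PAYLOAD_START_HEX)));
    }
}
```

If the header really is multipart framing, the footer would be the matching closing boundary (the same boundary string followed by two dashes), and the place to look would be how the file is posted to the ExtractingRequestHandler and how the request body reaches the parser, rather than Solr's metadata handling.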