: user:~/solr/example/exampledocs$ java -jar post.jar test.pdf doesn't work
1) You can use post.jar to send PDFs, but you have to use the option that tells Solr you are sending a PDF file, because by default it assumes you are posting XML. You can see the problem by looking at the output from post.jar and the Solr logs...

hossman@frisbee:~/tmp/solr-4.0-BETA/bin-zip/apache-solr-4.0.0-BETA/example/exampledocs$ java -jar post.jar /tmp/test.pdf
SimplePostTool version 1.5
Posting files to base url http://localhost:8983/solr/update using content-type application/xml..
...

And in the Solr logs...

...
SEVERE: org.apache.solr.common.SolrException: Invalid UTF-8 middle byte 0xe3 (at char #10, byte #-1)
	at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:159)
...

...if you specify the type, things should work fine on the client side. As for the server side...

2) By default Solr's "/update" handler supports Solr documents in XML, JSON, CSV, and JavaBin. If you want to use the "ExtractingRequestHandler" to parse rich documents, you just have to change the URL exactly as noted in the wiki you mentioned ("-Durl=http://localhost:8983/solr/update/extract").

-Hoss
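
Putting points 1 and 2 together, the full command might look like the sketch below. The -Durl value is the one quoted above. The -Dtype property name and the "application/pdf" value are assumptions based on SimplePostTool's usual usage text, so check the tool's help output for your version before relying on them. Depending on your schema, the extract handler may also need a unique id passed as a URL parameter, which is not shown here.

```shell
#!/bin/sh
# Sketch only: builds the post.jar command and prints it for inspection.

# URL of the extract handler, as quoted in the reply above.
SOLR_EXTRACT_URL="http://localhost:8983/solr/update/extract"

# Assumption: -Dtype sets the content type, and application/pdf is the
# right MIME type for the file being posted.
CONTENT_TYPE="application/pdf"

# Placeholder file name taken from the original question.
PDF_FILE="test.pdf"

POST_CMD="java -Dtype=$CONTENT_TYPE -Durl=$SOLR_EXTRACT_URL -jar post.jar $PDF_FILE"
echo "$POST_CMD"

# To actually post the file, run this from the exampledocs directory
# with Solr already running, then uncomment the next line:
# $POST_CMD
```

Printing the command first makes it easy to confirm that the content type is no longer application/xml before anything is sent to Solr.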