Hi! I'm using Solr 3.3 and I have some PDF files that I want to index. I followed the instructions on the wiki page: http://wiki.apache.org/solr/ExtractingRequestHandler The problem is that I can add my documents to Solr, but I cannot query them. Here is what I have:

*solrconfig.xml*:

    <requestHandler name="/update/extract" startup="lazy" class="solr.extraction.ExtractingRequestHandler" >
      <lst name="defaults">
        <str name="fmap.content">text</str>
        <str name="lowernames">true</str>
        <str name="uprefix">ignored_</str>
        <str name="captureAttr">true</str>
        <str name="fmap.a">links</str>
        <str name="fmap.div">ignored_</str>
      </lst>
    </requestHandler>

*schema.xml*:

    <field name="title" type="string" indexed="true" stored="true"/>
    <field name="author" type="string" indexed="true" stored="true" />
    <field name="text" type="text_general" indexed="true" stored="true" multiValued="true"/>

*data-config.xml*:

    ...
    <dataSource type="BinFileDataSource" name="ds-file"/>
    ...
    <entity processor="TikaEntityProcessor" dataSource="ds-file" url="../${document.filename}">
      <field column="Author" name="author" meta="true"/>
      <field column="title" name="title" meta="true"/>
      <field column="text" name="text"/>
    </entity>
    ...

I use Solrj to add documents as follows:

    SolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr");
    ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
    up.addFile(new File("d:\\test.pdf"));
    up.setParam("literal.id", "test");
    up.setParam("extractOnly", "true");
    server.commit();
    NamedList result = server.request(up);
    System.out.println("Result: " + result); // can display information about test.pdf
    QueryResponse rsp = server.query( new SolrQuery( "*:*") );
    System.out.println("rsp: " + rsp); // returns nothing

Any suggestion?

--
View this message in context: http://lucene.472066.n3.nabble.com/Problem-with-pdf-files-indexing-tp3527202p3527202.html
Sent from the Solr - User mailing list archive at Nabble.com.
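
A note on the Solrj snippet above, offered as a likely explanation rather than a confirmed diagnosis: the ExtractingRequestHandler wiki page describes extractOnly=true as returning the content extracted by Tika without indexing the document, and the snippet calls server.commit() before the request is sent. The plain-Java sketch below does not use Solrj and does not contact Solr; it only builds the URL of the equivalent HTTP request so the two parameter sets can be compared side by side. The class and method names (ExtractUrlSketch, buildExtractUrl) are hypothetical, and the base URL and id are the ones from the post.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr or Solrj: it only assembles a URL string.
final class ExtractUrlSketch {

    // Builds the URL for a POST to the /update/extract handler.
    // extractOnly == true  -> Tika output is returned, nothing is indexed.
    // extractOnly == false -> the document is indexed; commit=true asks Solr
    //                         to commit as part of the same request.
    static String buildExtractUrl(String solrBase, String id, boolean extractOnly) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("literal.id", id);
        if (extractOnly) {
            params.put("extractOnly", "true");
        } else {
            params.put("commit", "true");
        }

        StringBuilder url = new StringBuilder(solrBase).append("/update/extract");
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(encode(entry.getKey()))
               .append('=')
               .append(encode(entry.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    // URL-encodes one query-string component as UTF-8.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String base = "http://localhost:8080/solr";
        System.out.println("as posted: " + buildExtractUrl(base, "test", true));
        System.out.println("indexing:  " + buildExtractUrl(base, "test", false));
    }
}
```

In Solrj terms, the second variant would correspond to dropping the up.setParam("extractOnly", "true") line and calling server.commit() after server.request(up) instead of before it. Whether that alone makes the document show up for *:* is an assumption that depends on the rest of the setup.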