: Right. You're requiring that every document have an ID (via uniqueKey), but
: there's nothing magic about DIH that'll automagically parse a PDF file and
: map something into your ID field.
:
: So you have to create a unique ID before you send your doc to Curl. I'm

a) This example isn't using DIH, it's using the extracting request handler
directly.

b) In the example URL provided, Ahson was already using the exact syntax you
mentioned...

: > curl
: > "
: > http://localhost:8983/solr1/update/extract?literal.DocID=123&fmap.content=Contents&commit=true
: > "
: > -F "myfi...@d:/solr/apache-solr-1.4.0/docs/filename1.pdf"

...note the "literal.DocID" param (where "DocID" is the field listed as
uniqueKey in his example).

The actual root of the problem is that the "lowernames" param (which is
declared "true" in the Solr 1.4 example declaration of /update/extract) is
getting applied to all field names, even the literal ones...

http://wiki.apache.org/solr/ExtractingRequestHandler#Order_of_field_operations

Ahson: You could change your uniqueKey field to something that is all
lowercase, or you could set lowernames=false in your config (which will
impact all field names extracted by Tika).

(Personally, I think the order of operations in the ExtractingRequestHandler
makes no sense at all.)

-Hoss
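
To make the mismatch concrete, here is a rough Python sketch of what the message above describes. It is not Solr code: the helper names (lowernames, effective_literals) are made up for illustration, and the normalization rule (lowercase, with non-alphanumeric characters turned into underscores) is an approximation of what the wiki describes for lowernames=true.

```python
import re


def lowernames(field_name: str) -> str:
    """Approximate lowernames=true: lowercase the name and replace any
    character that is not a-z or 0-9 with an underscore (assumed rule)."""
    return re.sub(r"[^a-z0-9]", "_", field_name.lower())


def effective_literals(params: dict, lowernames_enabled: bool) -> dict:
    """Hypothetical helper: collect the literal.* params and, if enabled,
    run their field names through lowernames, as the message describes."""
    out = {}
    for key, value in params.items():
        if not key.startswith("literal."):
            continue
        name = key[len("literal."):]
        out[lowernames(name) if lowernames_enabled else name] = value
    return out


# Request params taken from the example URL in this thread.
params = {"literal.DocID": "123", "fmap.content": "Contents", "commit": "true"}
unique_key = "DocID"  # the uniqueKey field from Ahson's schema

with_lower = effective_literals(params, lowernames_enabled=True)
without_lower = effective_literals(params, lowernames_enabled=False)

# With lowernames on, the literal lands in "docid", so "DocID" is never set.
print(unique_key in with_lower)
# With lowernames off, the literal keeps its name and matches the uniqueKey.
print(unique_key in without_lower)
```

Under these assumptions, either renaming the uniqueKey field to "docid" or turning lowernames off makes the two names line up again, which matches the two options suggested above.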