You mean I should actually read the *entire* post before responding? What an idea!
Thanks for helping out here, I was completely misleading Ahson.

Erick

On Mon, Oct 11, 2010 at 7:25 PM, Chris Hostetter <hossman_luc...@fucit.org> wrote:

>
> : Right. You're requiring that every document have an ID (via uniqueKey), but
> : there's nothing
> : magic about DIH that'll automagically parse a PDF file and map something
> : into your ID
> : field.
> :
> : So you have to create a unique ID before you send your doc to Curl. I'm
>
> a) This example isn't using DIH, it's using the extracting request handler
> directly
>
> b) in the example URL provided, Ahson was already using the exact syntax
> you mentioned...
>
> : > curl
> : > "
> : > http://localhost:8983/solr1/update/extract?literal.DocID=123&fmap.content=Contents&commit=true
> : > "
> : > -F "myfi...@d:/solr/apache-solr-1.4.0/docs/filename1.pdf"
>
> ...note the "literal.DocID" param (where "DocID" is the field listed as
> uniqueKey in his example)
>
> The actual root of the problem is that the "lowernames" param
> (which is declared "true" in the Solr 1.4 example declaration of
> /update/extract) is getting applied to all field names, even the literal
> ones...
>
> http://wiki.apache.org/solr/ExtractingRequestHandler#Order_of_field_operations
>
> Ahson: You could change your uniqueKey field to something that is all
> lowercase, or you could set lowernames=false in your config (which will
> impact all field names extract by Tika)
>
> (Personally, i think the order of operations in the
> ExtractingRequestHandler makes no sense at all)
>
> -Hoss
>
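
For anyone following along, here is a rough sketch of why the literal field goes missing. It is a toy model in Python, not Solr's actual code: the function names (apply_lowernames, map_field_names) are made up for illustration, and the lowercasing rule (non-alphanumeric characters become underscores) is my approximation of what lowernames does. It only assumes the order Hoss describes, with lowernames applied to every field name, literal ones included, before the fmap renames.

```python
import re


def apply_lowernames(name: str) -> str:
    """Approximation of lowernames=true: lowercase the name and
    replace anything that is not a letter or digit with an underscore."""
    return re.sub(r"[^a-z0-9]", "_", name.lower())


def map_field_names(fields, lowernames=True, fmap=None):
    """Toy model of the field-name handling order (hypothetical helper).

    1. Start with all fields, both Tika-generated and literal.* ones.
    2. If lowernames is on, rewrite every field name.
    3. Apply fmap renames to the (possibly lowercased) names.
    """
    fmap = fmap or {}
    result = {}
    for name, value in fields.items():
        if lowernames:
            name = apply_lowernames(name)
        name = fmap.get(name, name)
        result[name] = value
    return result


# "DocID" stands in for literal.DocID=123; the other two stand in for
# fields that Tika might produce. The values are placeholders.
fields = {
    "DocID": "123",
    "content": "extracted text",
    "Content-Type": "application/pdf",
}
fmap = {"content": "Contents"}

with_lower = map_field_names(fields, lowernames=True, fmap=fmap)
without_lower = map_field_names(fields, lowernames=False, fmap=fmap)

# With lowernames on, the uniqueKey field "DocID" is no longer present
# under that name, which is why the document looks like it has no ID.
print("lowernames=true :", sorted(with_lower))
print("lowernames=false:", sorted(without_lower))
```

In this model the literal "DocID" comes out as "docid" when lowernames is on, so a schema whose uniqueKey is "DocID" never sees its ID. That matches both of the suggested fixes: use an all-lowercase uniqueKey, or turn lowernames off and accept that Tika's field names keep their original case.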