You mean I should actually read the #entire# post before responding? What an
idea!

Thanks for helping out here; I was completely misleading Ahson.

Erick

On Mon, Oct 11, 2010 at 7:25 PM, Chris Hostetter
<hossman_luc...@fucit.org> wrote:

>
> : Right. You're requiring that every document have an ID (via uniqueKey), but
> : there's nothing magic about DIH that'll automagically parse a PDF file and
> : map something into your ID field.
> :
> : So you have to create a unique ID before you send your doc to Curl. I'm
>
> a) This example isn't using DIH, it's using the extracting request handler
> directly
>
> b) in the example URL provided, Ahson was already using the exact syntax
> you mentioned...
>
> : > curl "http://localhost:8983/solr1/update/extract?literal.DocID=123&fmap.content=Contents&commit=true"
> : >  -F "myfi...@d:/solr/apache-solr-1.4.0/docs/filename1.pdf"
>
> ...note the "literal.DocID" param (where "DocID" is the field listed as
> uniqueKey in his example)
>
> The actual root of the problem is that the "lowernames" param
> (which is declared "true" in the Solr 1.4 example declaration of
> /update/extract) is getting applied to all field names, even the literal
> ones...
>
>
> http://wiki.apache.org/solr/ExtractingRequestHandler#Order_of_field_operations
>
> Ahson: You could change your uniqueKey field to something that is all
> lowercase, or you could set lowernames=false in your config (which will
> impact all field names extracted by Tika)
>
> (Personally, I think the order of operations in the
> ExtractingRequestHandler makes no sense at all)
>
> -Hoss
>
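
To make the behavior Hoss describes concrete, here is a rough Python sketch of
how "lowernames" can turn literal.DocID into a field called "docid", so the
uniqueKey field "DocID" is never populated. This is not Solr code. The helper
names lowernames() and resolve_literals() are hypothetical, and the exact
mapping rule (lowercase, with non-alphanumeric characters replaced by "_") is
an assumption based on the wiki's description of the parameter.

```python
def lowernames(name: str) -> str:
    """Approximate the lowernames mapping (assumed rule, not Solr's code):
    lowercase letters and digits, replace everything else with '_'."""
    return "".join(ch.lower() if ch.isalnum() else "_" for ch in name)


def resolve_literals(params: dict, lowernames_enabled: bool) -> dict:
    """Return {field_name: value} for every literal.* request parameter."""
    fields = {}
    for key, value in params.items():
        if not key.startswith("literal."):
            continue  # fmap.*, commit, etc. are not literal fields
        field = key[len("literal."):]
        if lowernames_enabled:
            # The mapping is applied to literal field names too,
            # which is the root of the problem in this thread.
            field = lowernames(field)
        fields[field] = value
    return fields


# Request parameters taken from the URL in Ahson's example.
params = {"literal.DocID": "123", "fmap.content": "Contents", "commit": "true"}

print(resolve_literals(params, lowernames_enabled=True))   # {'docid': '123'}
print(resolve_literals(params, lowernames_enabled=False))  # {'DocID': '123'}
```

With lowernames enabled, the value lands in "docid" rather than "DocID", which
matches the two workarounds above: use an all-lowercase uniqueKey, or set
lowernames=false.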
