> Why is it so??

I'm reading your post on my mobile, so I may have missed the point:
other than the date_modified field, what is the problem? The fields with
the "ignored" prefix? Those are perfectly right according to your configuration.

The other fields you declared aren't there because they are not part of
the input file's metadata.
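
To make the point about the prefix concrete, here is a rough sketch of how the extract handler settles on a field name given "lowernames" and "uprefix". It is a simplified model, not Solr's actual code: the function map_field_name is hypothetical, and the set of schema fields is copied from the schema.xml quoted below.

```python
import re

def map_field_name(name, schema_fields, uprefix="ignored_", lowernames=True):
    """Approximate how an extracted metadata name becomes a Solr field name.

    Simplified model (an assumption, not Solr's real implementation):
    normalise the name when lowernames is on, then add the uprefix
    to any name that the schema does not declare.
    """
    if lowernames:
        # lowernames=true: lowercase, and turn non-alphanumerics into underscores
        name = re.sub(r"[^a-z0-9]", "_", name.lower())
    if name in schema_fields:
        return name
    # Unknown field: it gets the prefix, so it lands in the ignored_* dynamicField
    return uprefix + name

# Explicit fields declared in the quoted schema.xml
schema_fields = {"doc_id", "id", "contents", "author", "title",
                 "date_modified", "_version_"}

for raw in ["stream_name", "Content-Type", "title"]:
    print(raw, "->", map_field_name(raw, schema_fields))
```

Under this model, metadata such as stream_name or Content-Type is not declared in the schema, so it shows up with the ignored_ prefix, while a declared field such as title appears only if the input file actually supplies that metadata.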

Best,
Andrea

On 11 Jan 2014 09:00, "sweety" <sweetyshind...@yahoo.com> wrote:
>
> I need to index rich text documents, this is* solrconfig.xml for extract
> handler*:
> <requestHandler name="/update/extract"
> class="solr.extraction.ExtractingRequestHandler" >
> <lst name="defaults">
>
> <str name="lowernames">true</str>
> <str name="uprefix">ignored_</str>
> <str name="captureAttr">true</str>
> </lst>
> </requestHandler>
>
> My *schema.xml* is:
> <field name="doc_id" type="uuid" indexed="true" stored="true" default="NEW"
> multiValued="false"/>
> <field name="id" type="long" indexed="true" stored="true" required="true"
> multiValued="false"/>
> <field name="contents" type="text" indexed="true" stored="true"
> multiValued="false"/>
> <field name="author" type="title_text" indexed="true" stored="true"
> multiValued="true"/>
> <field name="title" type="title_text" indexed="true" stored="true"/>
> <field name="date_modified" type="date" indexed="true" stored="true"
> multivalued="true"/>
> <field name="_version_" type="long" indexed="true" stored="true"
> multiValued="false"/>
> <dynamicField name="ignored_*" type="text" indexed="true" stored="true"
> multiValued="true"/>
>
>
> But after *indexing using this curl*:
> curl "http://localhost:8080/solr/document/update/extract?literal.id=12&commit=true"
> -F"myfile=Coding.pdf"
> when queried as q=id:12, the *output* is :
> <arr name="ignored_stream_source_info">
> <str>myfile</str>
> </arr>
> <arr name="ignored_stream_content_type">
> <str>application/octet-stream</str>
> </arr>
> <arr name="ignored_stream_size">
> <str>3336935</str>
> </arr>
> <arr name="ignored_stream_name">
> <str>Coding.pdf</str>
> </arr>
> <arr name="ignored_content_type">
> <str>application/pdf</str>
> </arr>
> <str name="contents"></str>     ----*Contents not shown*
> <long name="_version_">1456831756526157824</long>
> <str name="doc_id">8eb229e0-5f25-4d26-bba4-6cb67aab7f81</str>
> </doc>
>
> Why is it so??
>
> Also date_modified field does not appear??
>
>
>
> --
> View this message in context:
http://lucene.472066.n3.nabble.com/using-extract-handler-data-not-extracted-tp4110850.html
> Sent from the Solr - User mailing list archive at Nabble.com.
