Are you sure date_modified is a meta-data field in the PDF document you're extracting?
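If the PDF does carry a modification date, Tika may report it under a different metadata name than date_modified, in which case the uprefix rule would route it to an ignored_* field or it would not show up at all. One way to check is to send the file with the handler's extractOnly=true parameter, which returns the extracted content and metadata without indexing anything. Below is a minimal sketch of mapping such a field with an fmap.* default. The source name last_modified is only an assumption for illustration (what a "Last-Modified" attribute would become with lowernames=true); substitute whatever name your document actually reports.

```xml
<requestHandler name="/update/extract"
                class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="lowernames">true</str>
    <str name="uprefix">ignored_</str>
    <str name="captureAttr">true</str>
    <!-- Assumption: the PDF exposes its modification date under a metadata
         name that lowernames turns into "last_modified". Verify the real
         name first; this one is hypothetical. -->
    <str name="fmap.last_modified">date_modified</str>
  </lst>
</requestHandler>
```

With no such mapping and no metadata attribute literally named date_modified, nothing will populate that field.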
Best,
Erick

On Sat, Jan 11, 2014 at 3:00 AM, sweety <sweetyshind...@yahoo.com> wrote:

> I need to index rich text documents, this is *solrconfig.xml for extract
> handler*:
> <requestHandler name="/update/extract"
> class="solr.extraction.ExtractingRequestHandler" >
> <lst name="defaults">
>
> <str name="lowernames">true</str>
> <str name="uprefix">ignored_</str>
> <str name="captureAttr">true</str>
> </lst>
> </requestHandler>
>
> My *schema.xml* is:
> <field name="doc_id" type="uuid" indexed="true" stored="true" default="NEW"
>   multiValued="false"/>
> <field name="id" type="long" indexed="true" stored="true" required="true"
>   multiValued="false"/>
> <field name="contents" type="text" indexed="true" stored="true"
>   multiValued="false"/>
> <field name="author" type="title_text" indexed="true" stored="true"
>   multiValued="true"/>
> <field name="title" type="title_text" indexed="true" stored="true"/>
> <field name="date_modified" type="date" indexed="true" stored="true"
>   multivalued="true"/>
> <field name="_version_" type="long" indexed="true" stored="true"
>   multiValued="false"/>
> <dynamicField name="ignored_*" type="text" indexed="true" stored="true"
>   multiValued="true"/>
>
> But after *indexing using this curl*:
> curl
> "http://localhost:8080/solr/document/update/extract?literal.id=12&commit=true"
> -F"myfile=Coding.pdf"
> when queried as q=id:12, the *output* is :
> <arr name="ignored_stream_source_info">
> <str>myfile</str>
> </arr>
> <arr name="ignored_stream_content_type">
> <str>application/octet-stream</str>
> </arr>
> <arr name="ignored_stream_size">
> <str>3336935</str>
> </arr>
> <arr name="ignored_stream_name">
> <str>Coding.pdf</str>
> </arr>
> <arr name="ignored_content_type">
> <str>application/pdf</str>
> </arr>
> <str name="contents"></str> ----*Contents not shown*
> <long name="_version_">1456831756526157824</long>
> <str name="doc_id">8eb229e0-5f25-4d26-bba4-6cb67aab7f81</str>
> </doc>
>
> Why is it so??
>
> Also date_modified field does not appear??
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/using-extract-handler-data-not-extracted-tp4110850.html
> Sent from the Solr - User mailing list archive at Nabble.com.