Not an issue that I know of. I expect you've got some obscure problem in your definitions, but I'm guessing. Try modifying your schema so the glob pattern maps to a stored field, something like:

    <dynamicField name="*" type="string" multiValued="true" stored="true" />

Remove all other fields except id, remove your mapping, and try it again. If you query with fl=* you should see everything that was extracted. That'll tell you whether it is a problem with Solr/Tika or something in how you're using them.
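For the fl=* check, here is a minimal sketch of building the query URL. It assumes the stock example setup from this thread (Solr at http://localhost:8983/solr, the standard /select handler, document id doc1); the helper name build_select_url is made up for illustration, so adjust all of it to your install.

```python
from urllib.parse import urlencode


def build_select_url(base_url, doc_id):
    """Build a Solr select URL that requests every stored field of one document.

    base_url and the /select handler path are assumptions based on the
    example Jetty setup; they are not taken from any particular config.
    """
    params = {
        "q": f"id:{doc_id}",  # match the single document by its id
        "fl": "*",            # return all stored fields, including extracted ones
    }
    # urlencode percent-escapes ':' and '*'; Solr decodes them on receipt.
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(build_select_url("http://localhost:8983/solr", "doc1"))
```

Paste the printed URL into a browser or hand it to curl. If the extracted text shows up under some field there, the extraction itself is working and the problem is in the field mapping.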
Best
Erick

On Mon, Nov 26, 2012 at 10:19 AM, Brett Melbourne <bmelbou...@halogensoftware.com> wrote:

> Hi Erik,
>
> The document is committed successfully... it is just missing all the
> extracted content from Tika when I query for that document.
>
> i.e. the mapped content field attr_content is empty
> (fmap.content=attr_content)
>
> <result name="response" numFound="1" start="0" maxScore="1.9162908">
> <doc>
> <float name="score">1.9162908</float>
> <arr name="attr_character_count">
> <str>24</str>
> </arr>
> <arr name="attr_content">
> <str></str>
> </arr>
> <arr name="attr_creation_date">
> <str>2009-04-16T11:32:00</str>
> </arr>
> <arr name="attr_date">
> <str>2012-11-23T00:29:39.73</str>
> </arr>
>
> ...
>
> </result>
>
> Brett.
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Sunday, November 25, 2012 9:27 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Problem with Solr 3.6.1 extracting ODT content using
> SolrCell's ExtractingRequestHandler
>
> Did you commit after you added the document but before you tried the
> search?
>
> Best
> Erick
>
> On Fri, Nov 23, 2012 at 6:25 PM, Brett Melbourne <
> bmelbou...@halogensoftware.com> wrote:
>
> > Hi all,
> >
> > I am encountering a problem where Solr 3.6.1 is not able to extract
> > the text content from ODT (Open Office Document) files submitted to
> > the ExtractingRequestHandler. I can reproduce this issue against the
> > example schema running with jetty.
> >
> > Executing a simple index request (based on the example in the wiki):
> >
> > curl "http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&fmap.content=attr_content&commit=true" -F "myfile=@testfile.odt"
> >
> > returns no errors, and does not generate any exceptions in the
> > log/console.
> >
> > A query for doc1 returns an empty attr_content field:
> >
> > <arr name="attr_content"> <str></str> </arr>
> >
> > Oddly enough, executing an "extractOnly=true" request against the
> > ExtractingRequestHandler with the same ODT file correctly returns the
> > text of the file.
> >
> > I am wondering:
> >
> > * Is this a known issue? (I couldn't find any mention of this
> > particular issue anywhere...)
> >
> > * Are there any workarounds or does anyone have any suggestions?
> >
> > Thanks,
> >
> > Brett.