Re: Missing required field: id Using ExtractingRequestHandler

Chris Harris Thu, 19 Mar 2009 16:51:58 -0700

Unless there's a regression in the ExtractingRequestHandler, this error
most likely means that both of the following are true:

A) you have an id field defined in your solr schema file that's marked
as a required field

and

B) you did not specify an ID parameter when you submitted your
document to the handler.

If you don't want your Solr docs to have an id field, then mark that
field as not required in your schema.

If you *do* want your Solr docs to have a required field called id,
then you'll need to specify the ID when you submit your document. One
way is using an ext.literal parameter, more or less like this:

    startofURL...&ext.literal.id=13&...restofURL
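
To make the literal-parameter approach concrete, here is a minimal Python sketch that builds such a URL. It reuses the endpoint and the ext.idx.attr / ext.def.fl parameters from the curl command quoted below; the helper name build_extract_url is made up for illustration, and the id value 13 is just the example from above. It only constructs the URL and does not post anything to Solr.

```python
from urllib.parse import urlencode

# Endpoint assumed from the curl command quoted in this thread.
EXTRACT_URL = "http://localhost:8983/solr/update/extract"


def build_extract_url(doc_id, base_url=EXTRACT_URL):
    """Return an extract URL that supplies the required id as a literal.

    build_extract_url is a hypothetical helper, not part of Solr.
    """
    params = {
        "ext.idx.attr": "true",
        "ext.def.fl": "text",
        "ext.literal.id": str(doc_id),  # value for the required "id" field
    }
    # urlencode escapes the values, so ids containing spaces or "&" stay intact.
    return base_url + "?" + urlencode(params)


print(build_extract_url(13))
```

The resulting string can then be passed to curl (in quotes, so the shell leaves the & characters alone) together with the same -F file argument as before.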

Alternatively, you can try the field mapping mechanism, which is
hopefully described on the wiki page.

Cheers,
Chris

On Thu, Mar 19, 2009 at 3:46 PM, Larry Reid <lcr...@jadesystems.ca> wrote:
> I am trying to index Word, PDF and other documents with Solr. I installed
> the latest nightly build of Solr on March 17. I followed the
> instructions in the Wiki for ExtractingRequestHandler at
> http://wiki.apache.org/solr/ExtractingRequestHandler#head-c95841f9eda007b6b4e4594ead12a04223cf7b6e.
>
> I have produced text output from Tika in the nightly build directories
> from PDF files.
>
> When I try the suggested test curl commands in the "Getting Started with
> the Solr Example" section of the Wiki page, I get the following. Any idea
> what I've done wrong? Thanks in advance for your help.
>
> $ curl http://localhost:8983/solr/update/extract?ext.idx.attr=true\&ext.def.fl=text -F "myfi...@tutorial.pdf"
> <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html;
> charset=ISO-8859-1"/>
> <title>Error 500 </title>
> </head>
> <body><h2>HTTP ERROR: 500</h2><pre>org.apache.solr.common.SolrException:
> Document [null] missing required field: id
>
> org.apache.solr.common.SolrException: org.apache.solr.common.SolrException: Document [null] missing required field: id
>        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:169)
>        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
>        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1333)
>        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
>        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
>        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
>        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
>        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
>        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
>        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
>        at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
>        at org.mortbay.jetty.Server.handle(Server.java:285)
>        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
>        at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
>        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
>        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
>        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
>        at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
>        at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
> Caused by: org.apache.solr.common.SolrException: Document [null] missing required field: id
>        at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:292)
>        at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:59)
>        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(ExtractingDocumentLoader.java:90)
>        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(ExtractingDocumentLoader.java:95)
>        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:157)
>        ... 22 more
> </pre>
> <p>RequestURI=/solr/update/extract</p><p><i><small><a
> href="http://jetty.mortbay.org/">Powered by
> Jetty://</a></small></i></p><br/>
> <br/>
>
> </body>
> </html>
>
>
> Larry Reid
> Principal Consultant, Jade Systems Inc
> Mobile: +1 604.376.8884
> Pragmatic IT Blog | El Blog Technologia Pragmatica | www.jadesystems.ca
>
