OK, I asked another question for it and it has solved, thanks. 2013/4/26 Furkan KAMACI <furkankam...@gmail.com>
> Jack, thanks for your answers. Ok, when I remove -Durl parameter I think > it works, thanks. However I think that I have a problem with my schema. I > get that error: > > Apr 26, 2013 3:52:21 PM org.apache.solr.common.SolrException log > SEVERE: org.apache.solr.common.SolrException: ERROR: > [doc=/home/ll/Desktop/b/lucene-solr-lucene_solr_4_2_1/solr/example/exampledocs/523387.pdf] > multiple values encountered for non multiValued copy field text: > application/pdf > > > 2013/4/26 Jack Krupansky <j...@basetechnology.com> > >> Maybe you are confusing things by mixing instructions - there are >> SEPARATE instructions for directly using SolrCell and implicitly using it >> via post.jar. Pick which you want and stick with it. DO NOT MIX the >> instructions. >> >> You wrote: " I run that command: java -Durl= >> http://localhost:8983/solr/update/extract -jar post.jar 523387.pdf" >> >> Was there a GOOD reason that you chose that URL? >> >> Best to stay with what the post.jar wiki recommends: >> >> Post all CSV, XML, JSON and PDF documents using AUTO mode which detects >> type based on file name: >> >> java -Dauto -jar post.jar *.csv *.xml *.json *.pdf >> >> Or, stick with SolrCell directly, but follow its distinct instructions: >> http://wiki.apache.org/solr/ExtractingRequestHandler >> >> Again, DO NOT MIX the instructions from the two. >> >> post.jar is designed so that you do not need to know or care exactly how >> rich document indexing works. >> >> -- Jack Krupansky >> >> -----Original Message----- From: Furkan KAMACI >> Sent: Friday, April 26, 2013 5:30 AM >> To: solr-user@lucene.apache.org >> Subject: Document is missing mandatory uniqueKey field: id for Solr PDF >> indexing >> >> >> I use Solr 4.2.1 and these are my fields: >> >> <field name="id" type="string" indexed="true" stored="true" >> required="true" >> multiValued="false" /> >> <field name="text" type="text_general" indexed="true" stored="true"/> >> >> >> <!-- Common metadata fields, named specifically to match up with >> SolrCell metadata when parsing rich documents such as Word, PDF. >> Some fields are multiValued only because Tika currently may return >> multiple values for them. Some metadata is parsed from the documents, >> but there are some which come from the client context: >> "content_type": From the HTTP headers of incoming stream >> "resourcename": From SolrCell request param resource.name >> --> >> <field name="title" type="text_general" indexed="true" stored="true" >> multiValued="true"/> >> <field name="subject" type="text_general" indexed="true" stored="true"/> >> <field name="description" type="text_general" indexed="true" >> stored="true"/> >> <field name="comments" type="text_general" indexed="true" stored="true"/> >> <field name="author" type="text_general" indexed="true" stored="true"/> >> <field name="keywords" type="text_general" indexed="true" stored="true"/> >> <field name="category" type="text_general" indexed="true" stored="true"/> >> <field name="resourcename" type="text_general" indexed="true" >> stored="true"/> >> <field name="url" type="text_general" indexed="true" stored="true"/> >> <field name="content_type" type="string" indexed="true" stored="true" >> multiValued="true"/> >> <field name="last_modified" type="date" indexed="true" stored="true"/> >> <field name="links" type="string" indexed="true" stored="true" >> multiValued="true"/> >> >> <!-- Main body of document extracted by SolrCell. >> NOTE: This field is not indexed by default, since it is also copied to >> "text" >> using copyField below. This is to save space. Use this field for returning >> and >> highlighting document content. Use the "text" field to search the content. >> --> >> <field name="content" type="text_general" indexed="false" stored="true" >> multiValued="true"/> >> >> >> <!-- catchall field, containing all other searchable text fields >> (implemented >> via copyField further on in this schema --> >> <!-- >> <field name="text" type="text_general" indexed="true" stored="false" >> multiValued="true"/> >> --> >> <!-- catchall text field that indexes tokens both normally and in reverse >> for efficient >> leading wildcard queries. --> >> <field name="text_rev" type="text_general_rev" indexed="true" >> stored="false" multiValued="true"/> >> >> <!-- non-tokenized version of manufacturer to make it easier to sort or >> group >> results by manufacturer. copied from "manu" via copyField --> >> <field name="manu_exact" type="string" indexed="true" stored="false"/> >> >> <field name="payloads" type="payloads" indexed="true" stored="true"/> >> >> <field name="_version_" type="long" indexed="true" stored="true"/> >> >> I run that command: >> >> java -Durl=http://localhost:8983/solr/update/extract -jar post.jar >> 523387.pdf >> >> However I get that error, any ideas? >> >> Apr 26, 2013 12:26:51 PM org.apache.solr.common.SolrException log >> SEVERE: org.apache.solr.common.SolrException: Document is missing >> mandatory >> uniqueKey field: id >> at >> >> org.apache.solr.update.AddUpdateCommand.getIndexedId(AddUpdateCommand.java:88) >> at >> >> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:464) >> at >> >> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:346) >> at >> >> org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100) >> at >> >> org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(ExtractingDocumentLoader.java:121) >> at >> >> org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(ExtractingDocumentLoader.java:126) >> at >> >> org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228) >> at >> >> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74) >> at >> >> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) >> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817) >> at >> >> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639) >> at >> >> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) >> at >> >> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141) >> at >> >> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307) >> at >> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453) >> at >> >> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) >> at >> >> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560) >> at >> >> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) >> at >> >> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072) >> at >> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382) >> at >> >> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) >> at >> >> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006) >> at >> >> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) >> at >> >> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) >> at >> >> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) >> at >> >> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) >> at org.eclipse.jetty.server.Server.handle(Server.java:365) >> at >> >> org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485) >> at >> >> org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) >> at >> >> org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:937) >> at >> >> org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:998) >> at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:856) >> at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240) >> at >> >> org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) >> at >> >> org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) >> at >> >> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) >> at >> >> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) >> at java.lang.Thread.run(Thread.java:722) >> > >