Yes, that schema is configured for the fields Nutch can generate. After changing the schema it's recommended to remove the Solr index, and then you must reindex.
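As a rough sketch of those steps (copy the schema, clear the index, reindex), something along these lines could work. The function names, the crawl/ paths and the use of curl are assumptions for illustration, not a tested setup; the Solr URL is the one quoted further down in this thread, and the exact arguments are worth checking against the usage output of bin/nutch solrindex.

```shell
#!/bin/sh
# Sketch only: paths, URL and function names are assumptions; adjust to your setup.

# Copy Nutch's schema.xml over Solr's, keeping a backup of the old one.
install_nutch_schema() {
    nutch_home=$1
    solr_home=$2
    src="$nutch_home/conf/schema.xml"
    dst="$solr_home/conf/schema.xml"
    if [ ! -f "$src" ]; then
        echo "missing $src" >&2
        return 1
    fi
    if [ -f "$dst" ]; then
        cp "$dst" "$dst.bak"
    fi
    cp "$src" "$dst"
}

# Delete every document from the index, then commit.
# Run this after restarting the servlet container so the new schema is loaded.
clear_solr_index() {
    solr_url=$1
    curl "$solr_url/update" -H 'Content-Type: text/xml' \
        --data-binary '<delete><query>*:*</query></delete>' &&
    curl "$solr_url/update" -H 'Content-Type: text/xml' \
        --data-binary '<commit/>'
}

# Typical order (not executed here):
#   install_nutch_schema "$NUTCH_HOME" "$SOLR_HOME"
#   (restart Tomcat)
#   clear_solr_index http://localhost:8080/apache-solr-3.1-SNAPSHOT
#   bin/nutch solrindex http://localhost:8080/apache-solr-3.1-SNAPSHOT \
#       crawl/crawldb crawl/linkdb crawl/segments/*
```

The backup copy is there only so the previous schema can be restored if the Nutch one turns out not to fit; SOLR_HOME has to be the directory Solr actually reports as its SolrHome, otherwise the copy has no effect.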

On Tuesday 10 May 2011 17:40:39 Gabriele Kahlout wrote:
> You mean that I should copy it from nutch into solr?
>
> $ cp $NUTCH_HOME/conf/schema.xml $SOLR_HOME/conf/schema.xml
>
> After restarting tomcat, and re-executing the script nothing changed.
>
> On Tue, May 10, 2011 at 5:35 PM, Markus Jelsma
> <markus.jel...@openindex.io> wrote:
> > You need to use the schema.xml shipped with Nutch in Solr. It provides
> > most fields that you need.
> >
> > On Tuesday 10 May 2011 17:31:33 Gabriele Kahlout wrote:
> > > I don't get you, are you talking about conf/schema.xml? That's what I'm
> > > referring to. Am i supposed to do something with the nutch's
> > > conf/schema.xml?
> > >
> > > On Tue, May 10, 2011 at 4:46 PM, Markus Jelsma
> > > <markus.jel...@openindex.io> wrote:
> > > > There is a working example schema in Nutch' conf directory.
> > > >
> > > > On Tuesday 10 May 2011 16:40:02 Gabriele Kahlout wrote:
> > > > > From solr logs:
> > > > >
> > > > > May 10, 2011 4:33:20 PM org.apache.solr.common.SolrException log
> > > > > *SEVERE: org.apache.solr.common.SolrException: ERROR:unknown field 'content'*
> > > > >   at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:321)
> > > > >   at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
> > > > >   at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:147)
> > > > >   at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:77)
> > > > >   at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:55)
> > > > >   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
> > > > >   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360)
> > > > >   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
> > > > >   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
> > > > >   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:244)
> > > > >   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
> > > > >   at org.netbeans.modules.web.monitor.server.MonitorFilter.doFilter(MonitorFilter.java:393)
> > > > >   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:244)
> > > > >   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
> > > > >   at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:240)
> > > > >   at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:161)
> > > > >   at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:164)
> > > > >   at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100)
> > > > >   at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:550)
> > > > >   at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
> > > > >   at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:380)
> > > > >   at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:243)
> > > > >   at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:188)
> > > > >   at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:166)
> > > > >   at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:288)
> > > > >   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > > > >   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > > > >   at java.lang.Thread.run(Thread.java:680)
> > > > >
> > > > > in conf/schema.xml:
> > > > >   <!-- fields for index-basic plugin -->
> > > > >   <field name="host" type="url" stored="false" indexed="true"/>
> > > > >   <field name="site" type="string" stored="false" indexed="true"/>
> > > > >   <field name="url" type="url" stored="true" indexed="true" required="true"/>
> > > > >   *<field name="content" type="text" stored="false" indexed="true"/>*
> > > > >
> > > > > in conf/solrindex-mapping.xml:
> > > > >   <fields>
> > > > >   <field dest="content" source="content"/>
> > > > >
> > > > > In recent solr I think this has been renamed into text?
> > > > >
> > > > > Solr's conf/schema.xml:
> > > > >   via copyField further on in this schema -->
> > > > >   *<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>*
> > > > >
> > > > > On Tue, May 10, 2011 at 4:30 PM, Gabriele Kahlout
> > > > > <gabri...@mysimpatico.com> wrote:
> > > > > > It apparently is normal, and my issue is indeed with nutch.
> > > > > >
> > > > > > I've modified post.sh from the example docs to use the solr in
> > > > > > http://localhost:8080/apache-solr-3.1-SNAPSHOT and now finally data
> > > > > > made it to the index.
> > > > > >
> > > > > > $ post.sh solr.xml monitor.xml
> > > > > >
> > > > > > With nutch I'm at:
> > > > > >
> > > > > > $ svn info
> > > > > > Path: .
> > > > > > URL: http://svn.apache.org/repos/asf/nutch/branches/branch-1.3
> > > > > > Repository Root: http://svn.apache.org/repos/asf
> > > > > > Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
> > > > > > Revision: *1101459*
> > > > > > Node Kind: directory
> > > > > > Schedule: normal
> > > > > > Last Changed Author: markus
> > > > > > Last Changed Rev: 1101280
> > > > > > Last Changed Date: 2011-05-10 02:46:04 +0200 (Tue, 10 May 2011)
> > > > > >
> > > > > > Does this work for you? All I've done is svn co nutch 1.3 and execute
> > > > > > my script which up to now worked.
> > > > > >
> > > > > > On Tue, May 10, 2011 at 4:11 PM, Gabriele Kahlout
> > > > > > <gabri...@mysimpatico.com> wrote:
> > > > > > > Hello,
> > > > > > >
> > > > > > > I'm having trouble getting Solr 3.1 to work with nutch-1.3. I'm not
> > > > > > > sure where the problem is, but I'm wondering why does the solrHome
> > > > > > > path end with /./.
> > > > > > >
> > > > > > > cwd=/Applications/NetBeans/apache-tomcat-7.0.6/bin
> > > > > > > SolrHome=/Users/simpatico/apache-solr-3.1.0/solr/./
> > > > > > >
> > > > > > > In the web.xml of solr:
> > > > > > > <env-entry>
> > > > > > >   <env-entry-name>solr/home</env-entry-name>
> > > > > > >   <env-entry-value>${user.home}/apache-solr-3.1.0/solr</env-entry-value>
> > > > > > >   <env-entry-type>java.lang.String</env-entry-type>
> > > > > > > </env-entry>
> > > > > > >
> > > > > > > --
> > > > > > > Regards,
> > > > > > > K. Gabriele
> > > > > > >
> > > > > > > --- unchanged since 20/9/10 ---
> > > > > > > P.S. If the subject contains "[LON]" or the addressee acknowledges
> > > > > > > the receipt within 48 hours then I don't resend the email.
> > > > > > > subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧
> > > > > > > time(x) < Now + 48h) ⇒ ¬resend(I, this).
> > > > > > >
> > > > > > > If an email is sent by a sender that is not a trusted contact or the
> > > > > > > email does not contain a valid code then the email is not received.
> > > > > > > A valid code starts with a hyphen and ends with "X".
> > > > > > > ∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧
> > > > > > > y ∈ L(-[a-z]+[0-9]X)).
> > > > > >
> > > > > > --
> > > > > > Regards,
> > > > > > K. Gabriele
> > > >
> > > > --
> > > > Markus Jelsma - CTO - Openindex
> > > > http://www.linkedin.com/in/markus17
> > > > 050-8536620 / 06-50258350
> >
> > --
> > Markus Jelsma - CTO - Openindex
> > http://www.linkedin.com/in/markus17
> > 050-8536620 / 06-50258350

--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350