> *5. Create a seed list (the initial urls to fetch)*
>
> mkdir urls *(creates a folder named 'urls')*
>
> echo "http://www.google.it/" > urls/seed.txt
>
> *6. Inject the seed url(s) into the nutch crawldb (execute in the nutch directory)*
>
> bin/nutch inject crawl/crawldb urls
The empty path message is because nutch is unable to find a url in the url
location that you provide.
Kindly ensure there is a url there.
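
For example (a minimal sketch, assuming the directory layout from the steps
quoted above), a quick check before re-running the inject:

  # the seed file must exist and contain at least one url
  cat urls/seed.txt        # should print http://www.google.it/ (or your own seed urls)

  # recreate it if it came out empty (note: no stray ';' before the redirect)
  echo "http://www.google.it/" > urls/seed.txt

  # then run the inject again from the nutch directory
  bin/nutch inject crawl/crawldb urls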
... will be for different domains. So for each domain folder in the urls
folder there has to be a corresponding folder (with the same name) in the
crawl folder.
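
As an illustration of that layout (a sketch only; the domain names, and
running a separate inject per domain, are assumptions rather than something
stated in the thread):

  urls/siteA/seed.txt   ->   crawl/siteA/     # same folder name on both sides
  urls/siteB/seed.txt   ->   crawl/siteB/

  # e.g. inject each domain's seeds into its own crawldb
  for d in siteA siteB; do
    bin/nutch inject crawl/$d/crawldb urls/$d
  done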
I am trying to configure nutch (1.4) with my solr 3.2,
but when I try the crawl command
"bin/nutch inject crawl/crawldb urls"
it does not work, and it replies with "can't convert a empty path".
Why, in your opinion?
Thanks,
a.
techietutorials.blogspot.com/2011/06/how-to-build-and-start-apache-solr.html
I downloaded both packages; however, I am getting the error (*solrUrl is
not set, indexing will be skipped..*) when I try to crawl using Cygwin.

Can anyone please help me fix this issue?
Any other website suggesting how to integrate Apache Nutch and Solr would
also be greatly helpful.

Thanks & Regards,
Serenity
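
In Nutch 1.x that message generally appears when the crawl command is run
without a Solr URL; passing one explicitly should let the indexing step run.
A sketch (the local Solr URL, depth and topN values are assumptions):

  bin/nutch crawl urls -dir crawl -depth 3 -topN 50 -solr http://localhost:8983/solr/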
On Wed, Feb 9, 2011 at 7:09 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:

> The parsed data is only sent to the Solr index if you tell a segment to be
> indexed; solrindex.
>
> If you did this only once after injecting and then the consequent
> fetch, parse, update, index sequence, then you, of course, only see those
> URL's. If you don't index a segment after it's being parsed, you need to do
> it later on.
>
> On Wednesday 09 February 2011 04:29:44 .: Abhishek :. wrote:
> > Hi all,
> >
> > I am a newbie to nutch and solr. Well, relatively much newer to Solr than
> > Nutch :)
> >
> > I have been using nutch for the past two weeks, and I wanted to know if I
> > can query or search on my nutch crawls on the fly (before they complete).
> > I am asking this because the websites I am crawling are really ...
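
For reference, the fetch, parse, update, index cycle Markus mentions might
look roughly like this on the Nutch 1.x command line; the Solr URL and
directory names are assumptions, and the exact solrindex arguments vary a
little between 1.x releases:

  bin/nutch generate crawl/crawldb crawl/segments -topN 100
  SEGMENT=$(ls -d crawl/segments/2* | tail -1)     # newest segment

  bin/nutch fetch $SEGMENT
  bin/nutch parse $SEGMENT
  bin/nutch updatedb crawl/crawldb $SEGMENT

  # segments that were parsed but never indexed can be picked up later:
  bin/nutch invertlinks crawl/linkdb -dir crawl/segments
  bin/nutch solrindex http://localhost:8983/solr/ crawl/crawldb crawl/linkdb crawl/segments/*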
On ... 2010 at 4:21 PM, Anurag wrote:

> Why are you using solrindex in the argument? It is used when we need to
> index the crawled data in Solr.
> For more, read http://wiki.apache.org/nutch/NutchTutorial .
>
> Also, for nutch-solr integration this is a very useful blog:
> http://www.lucidimagination.com/blog/2009/03/09/nutch-solr/
> I integrated nutch and solr and it works well.
>
> Thanks
>
> On Tue, Dec 21, 2010 at 1:57 AM, Adam Estrada-2 [via Lucene] <
> ml-node+2122347-622655030-146...@n3.nabble.com> wrote:
>
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
> at org.apache.nutch.crawl.Injector.inject(Injector.java:217)
> at org.apache.nutch.crawl.Crawl.main(Crawl.java:124)

All,

I have a couple of websites that I need to crawl, and the following command
line used to work, I think. Solr is up and running and everything is fine
there, and I can go through and index the site, but I really need the results
added to Solr after the crawl. Does anyone have any idea on how to make ...