Hi - use the domain URL filter plugin and list the domains, hosts or TLDs you
want to restrict the crawl to.
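As a rough sketch of what that list could look like (assuming the plugin is
enabled as urlfilter-domain in plugin.includes in nutch-site.xml, and that it
reads conf/domain-urlfilter.txt, which I believe is the default file name),
one entry per line. The www.example.org host is only a made-up placeholder:

```
# conf/domain-urlfilter.txt (assumed default file name)
# a domain: should keep every host under techcrunch.com
techcrunch.com
# a single host (hypothetical example)
www.example.org
# a TLD would be listed the same way, e.g. "org" (commented out here)
# org
```

With only techcrunch.com listed, URLs on other domains should be dropped
during the crawl, but please check this against the plugin documentation for
your Nutch version.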
-Original message-
> From:Vivekanand Ittigi
> Sent: Tuesday 29th July 2014 7:17
> To: solr-user@lucene.apache.org
> Subject: crawling all links of same d
Hi,
Can anyone tell me how to crawl all the other pages of the same domain?
For example, I'm feeding the website http://www.techcrunch.com/ in seed.txt.
Following property is added in nutch-site.xml
db.ignore.internal.links
false
If true, when adding new links to a page, links from
the same host ar
No, everything runs on port 8983.
..
2012/2/5 Matthew Parker
> Doesn't tomcat run on port 8080, and not port 8983? Or did you change the
> tomcat's default port to 8983?
> On Feb 5, 2012 5:17 AM, "alessio crisantemi"
> wrote:
>
> > Hi All,
> > I ha
alessio crisantemi-2,
I think you've got it. Check the jars in the Nutch lib directory and see
whether the Solr and SolrJ jars are the same. That could be the issue.
--
View this message in context:
http://lucene.472066.n3.nabble.com/nutch-in-solr-tp3716969p3717542.html
Sent from the Solr - User mailing list archive at
Doesn't tomcat run on port 8080, and not port 8983? Or did you change the
tomcat's default port to 8983?
On Feb 5, 2012 5:17 AM, "alessio crisantemi"
wrote:
> Hi All,
> I have some problems with integration of Nutch in Solr and Tomcat.
>
> I follo Nutch tutoria
Hi All,
I have some problems with integration of Nutch in Solr and Tomcat.
I followed the Nutch tutorial for integration, and now I can crawl a website:
everything works fine.
But if I try the Solr integration, I can't index into Solr.
Below is the Nutch output after the command:
bin/nutch crawl urls