Re: What kind of nutch documents does Solr index?

2015-09-30 Thread Daniel Holmes
t Nutch is using is not unique.
>
> Upayavira
>
> On Mon, Sep 28, 2015, at 10:19 AM, Daniel Holmes wrote:
> > Hi,
> > I am using Apache Nutch 1.7 to crawl and Apache Solr 4.7.2 for indexing. In
> > my tests there is a gap between number of fetched results of Nutch and
>

What kind of nutch documents does Solr index?

2015-09-28 Thread Daniel Holmes
Hi, I am using Apache Nutch 1.7 to crawl and Apache Solr 4.7.2 for indexing. In my tests there is a gap between the number of results fetched by Nutch and the number of documents indexed in Solr. For example, one of the crawls fetched 23343 pages and 1146 images successfully, while in Solr 19250 docs
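
To put a number on the gap described in this post, here is a minimal Python sketch. The fetched and indexed counts come from the message itself; everything else is an assumption made for illustration. The reply in this thread is cut off, but it appears to say that a value Nutch uses is not unique, so the helper `index_by_key` (a hypothetical name, not part of Nutch or Solr) models only that general idea: in an index keyed on a unique field, a later document with the same key replaces the earlier one. Whether this accounts for the gap in the poster's crawl is not confirmed by the snippets shown here.

```python
# Figures quoted in the post above.
fetched_pages = 23343
fetched_images = 1146
indexed_docs = 19250

fetched_total = fetched_pages + fetched_images
gap = fetched_total - indexed_docs


def index_by_key(docs, key="id"):
    """Toy model of an index keyed on a unique field (hypothetical helper).

    A later document with the same key value replaces the earlier one,
    so the index can hold fewer entries than the number of documents sent.
    """
    index = {}
    for doc in docs:
        index[doc[key]] = doc
    return index


# Hypothetical sample data: three fetched documents, two sharing one id.
sample = [
    {"id": "http://example.com/a", "title": "first"},
    {"id": "http://example.com/a", "title": "second"},
    {"id": "http://example.com/b", "title": "third"},
]
toy_index = index_by_key(sample)

print(fetched_total, gap)  # 24489 5239
print(len(toy_index))      # 2
```

With the quoted figures, 24489 items were fetched and 5239 of them have no counterpart among the 19250 indexed documents. The toy sample shows how three submitted documents can end up as two index entries when two of them share a key.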

Re: problem with index size

2015-07-22 Thread Daniel Holmes
d .
> So, do you simply mean you have 4 segments?
> Where is the problem anyway?
> You are also storing content which usually is a big part of the index.
> As Upaya said, I am curious to know why you are so surprised!
>
> Cheers
>
> 2015-07-22 11:27 GMT+01:00 Daniel Holmes

problem with index size

2015-07-22 Thread Daniel Holmes
Hi all, I have a problem with index size in Solr 4.7.2. My OS is Ubuntu 14.10 64-bit. My fields are: In one case, for instance, my segments size is 8.4G while the index size is 28G! It seems unusual. What suggestions do you have to reduce index size? Is there any way to check disk usage d