Re: Re: Re: Some problems when upload data to index in cloud environment

周建二 Mon, 21 Dec 2015 16:54:47 -0800

Erick:


Thank you so much for your advice. We are not indexing a large number of 
files right now, but we may in the future. I will pay more attention to 
ExtractingRequestHandler. Thanks again.


Best regards,
Jianer
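
For the case where many files need indexing, Erick's advice quoted below is to
parse on the client and send Solr only the extracted fields. Here is a minimal
Python sketch of that flow, under stated assumptions: extract_text is a
placeholder that only reads UTF-8 text (a real setup would call Tika or a
similar parser there), the URL assumes a local node on port 8983 with the
cloud-test collection from this thread, and the field names "id" and "content"
are hypothetical and would have to match your schema. The request is built but
not sent.

```python
import json
import tempfile
import urllib.request
from pathlib import Path

# Assumption: a Solr node on localhost:8983 and the "cloud-test" collection.
SOLR_UPDATE_URL = "http://localhost:8983/solr/cloud-test/update?commit=true"


def extract_text(path: Path) -> str:
    """Placeholder for client-side parsing; a real client would call Tika here."""
    return path.read_text(encoding="utf-8", errors="replace")


def build_docs(paths):
    """Turn files into plain Solr documents (field names are hypothetical)."""
    return [{"id": str(p), "content": extract_text(p)} for p in paths]


def build_update_request(docs, url=SOLR_UPDATE_URL):
    """Build, but do not send, a JSON POST for Solr's /update handler."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.txt"
        sample.write_text("hello solr", encoding="utf-8")
        request = build_update_request(build_docs([sample]))
        print(request.get_method(), request.full_url)
        # To actually index, uncomment the next line against a running Solr:
        # urllib.request.urlopen(request)
```

With this split, the Solr nodes only receive already-extracted text, so the
parsing cost stays on whichever machine runs the client.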


> -----Original Message-----
> From: "Erick Erickson" <erickerick...@gmail.com>
> Sent: Tuesday, December 22, 2015
> To: solr-user <solr-user@lucene.apache.org>
> Cc:
> Subject: Re: Re: Some problems when upload data to index in cloud environment
> 
> Jianer:
> 
> Getting your head around the configs is, indeed, "exciting" at times.
> 
> I just wanted to caution you that using ExtractingRequestHandler
> puts the Tika parsing load on the Solr server, which doesn't
> scale as the same machine that's serving queries and indexing
> is _also_ parsing potentially very large files. It may not matter
> if you don't do it often, but if you're going to index a large number
> of files and/or you're going to do this continuously, you probably
> want to move the parsing off Solr. Here's an example with DB
> as well, but the DB bits can be removed easily.
> 
> https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/
> 
> Best,
> Erick
> 
> On Sun, Dec 20, 2015 at 9:29 PM, 周建二 <zhoujia...@ict.ac.cn> wrote:
> > Hi Shawn, thanks for your reply. :)
> >
> >
> > The cause was that the /update/extract handler is not defined in my 
> > collection's solrconfig.xml file, because I uploaded basic_configs/conf to 
> > ZooKeeper. When I upload sample_techproducts_configs instead, everything works.
> >
> >
> > I am new to Solr. Now I am going to learn schema.xml and 
> > solrconfig.xml, and try to build my own config for my dataset based on 
> > basic_configs.
> >
> >
> > Thanks again.
> > Jianer
> >
> >
> >> -----Original Message-----
> >> From: "Shawn Heisey" <apa...@elyograg.org>
> >> Sent: Sunday, December 20, 2015
> >> To: solr-user@lucene.apache.org
> >> Cc:
> >> Subject: Re: Some problems when upload data to index in cloud environment
> >>
> >> On 12/18/2015 6:16 PM, 周建二 wrote:
> >> > I am building a Solr cloud production environment. My Solr version is 
> >> > 5.3.1. The environment consists of three nodes running CentOS 6.5. First I 
> >> > set up ZooKeeper on the three nodes, then ran Solr on the three nodes, 
> >> > and finally created a collection consisting of three shards, each with 
> >> > two replicas. After that we can see the cloud structure on the Solr 
> >> > Admin page.
> >>
> >> <snip>
> >>
> >> > <body><h2>HTTP ERROR 404</h2>
> >> >
> >> > <p>Problem accessing /solr/cloud-test/update/extract. Reason:
> >>
> >> One of two problems is likely:  Either there is no collection named
> >> "cloud-test" on your cloud, or the /update/extract handler is not
> >> defined in that collection's solrconfig.xml file.  The active version of
> >> this file lives in zookeeper when you're running SolrCloud.
> >>
> >> If you're sure a collection with this name exists, how exactly did you
> >> create it?  Was it built with one of the sample configs or with a config
> >> that you built yourself?
> >>
> >> Of the three configsets included with the Solr download,
> >> data_driven_schema_configs and sample_techproducts_configs contain the
> >> /update/extract handler.  The configset named basic_configs does NOT
> >> contain the handler.
> >>
> >> Thanks,
> >> Shawn
> >>
> >
> >
> >
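
Shawn's diagnosis above comes down to whether the collection's solrconfig.xml
declares a requestHandler named /update/extract. Here is a minimal Python
sketch of checking that on a local copy of the file before uploading a
configset to ZooKeeper. The two XML strings are trimmed-down hypothetical
fragments, not the real sample configs, and has_request_handler is a helper
name invented here for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed solrconfig.xml fragments for illustration only.
WITH_EXTRACT = """<config>
  <requestHandler name="/select" class="solr.SearchHandler"/>
  <requestHandler name="/update/extract"
                  class="solr.extraction.ExtractingRequestHandler"/>
</config>"""

WITHOUT_EXTRACT = """<config>
  <requestHandler name="/select" class="solr.SearchHandler"/>
</config>"""


def has_request_handler(solrconfig_xml: str, handler_name: str) -> bool:
    """Return True if a <requestHandler> with the given name is declared."""
    root = ET.fromstring(solrconfig_xml)
    return any(
        handler.get("name") == handler_name
        for handler in root.iter("requestHandler")
    )


if __name__ == "__main__":
    print(has_request_handler(WITH_EXTRACT, "/update/extract"))     # True
    print(has_request_handler(WITHOUT_EXTRACT, "/update/extract"))  # False
```

A False result for the config you uploaded would match the 404 on
/solr/cloud-test/update/extract reported in the thread.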
