Erick:
Thank you so much for your advice. At the moment we do not index a large number of files, but we may in the future. I will pay more attention to ExtractingRequestHandler.

Thanks again.

Best regards,
Jianer

> -----Original Message-----
> From: "Erick Erickson" <erickerick...@gmail.com>
> Sent: Tuesday, December 22, 2015
> To: solr-user <solr-user@lucene.apache.org>
> Cc:
> Subject: Re: Re: Some problems when upload data to index in cloud environment
>
> Jianer:
>
> Getting your head around the configs is, indeed, "exciting" at times.
>
> I just wanted to caution you that using ExtractingRequestHandler
> puts the Tika parsing load on the Solr server, which doesn't
> scale, because the same machine that's serving queries and indexing
> is _also_ parsing potentially very large files. It may not matter
> if you don't do it often, but if you're going to index a large number
> of files and/or you're going to do this continuously, you probably
> want to move the parsing off Solr. Here's an example with DB
> as well, but the DB bits can be removed easily.
>
> https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/
>
> Best,
> Erick
>
> On Sun, Dec 20, 2015 at 9:29 PM, 周建二 <zhoujia...@ict.ac.cn> wrote:
> > Hi Shawn, thanks for your reply. :)
> >
> > The cause was that the /update/extract handler is not defined in my
> > collection's solrconfig.xml file, because I uploaded basic_configs/conf
> > to ZooKeeper. When I upload sample_techproducts_configs to ZooKeeper
> > instead, everything works.
> >
> > I am new to Solr. Now I am going to learn schema.xml and
> > solrconfig.xml, and try to make my own config for my dataset based on
> > basic_configs.
> >
> > Thanks again.
> > Jianer
> >
> >> -----Original Message-----
> >> From: "Shawn Heisey" <apa...@elyograg.org>
> >> Sent: Sunday, December 20, 2015
> >> To: solr-user@lucene.apache.org
> >> Cc:
> >> Subject: Re: Some problems when upload data to index in cloud environment
> >>
> >> On 12/18/2015 6:16 PM, 周建二 wrote:
> >> > I am building a Solr cloud production environment. My Solr version is
> >> > 5.3.1. The environment consists of three nodes running CentOS 6.5.
> >> > First I built the ZooKeeper environment on the three nodes, then ran
> >> > Solr on the three nodes, and finally created a collection of three
> >> > shards, each with two replicas. After that we can see the cloud
> >> > structure on the Solr Admin page.
> >>
> >> <snip>
> >>
> >> > <body><h2>HTTP ERROR 404</h2>
> >> >
> >> > <p>Problem accessing /solr/cloud-test/update/extract. Reason:
> >>
> >> One of two problems is likely: Either there is no collection named
> >> "cloud-test" on your cloud, or the /update/extract handler is not
> >> defined in that collection's solrconfig.xml file. The active version of
> >> this file lives in ZooKeeper when you're running SolrCloud.
> >>
> >> If you're sure a collection with this name exists, how exactly did you
> >> create it? Was it built with one of the sample configs or with a config
> >> that you built yourself?
> >>
> >> Of the three configsets included with the Solr download,
> >> data_driven_schema_configs and sample_techproducts_configs contain the
> >> /update/extract handler. The configset named basic_configs does NOT
> >> contain the handler.
> >>
> >> Thanks,
> >> Shawn
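
For anyone who hits the same 404: the second cause Shawn describes (no /update/extract handler in the collection's solrconfig.xml) can be checked offline against a copy of the file. The following is only a minimal sketch in Python, not something posted in the thread. The function name has_request_handler and the two trimmed XML snippets are hypothetical stand-ins for real configsets; the only thing taken from Solr itself is that handlers are declared as <requestHandler name="..."> elements in solrconfig.xml.

```python
import xml.etree.ElementTree as ET


def has_request_handler(solrconfig_xml: str, handler_name: str) -> bool:
    """Return True if the given solrconfig.xml text declares a
    <requestHandler> element whose name attribute equals handler_name."""
    root = ET.fromstring(solrconfig_xml)
    return any(
        handler.get("name") == handler_name
        for handler in root.iter("requestHandler")
    )


# Hypothetical, heavily trimmed stand-in for a config that has the handler
# (as sample_techproducts_configs does, according to the thread).
WITH_EXTRACT = """<config>
  <requestHandler name="/select" class="solr.SearchHandler"/>
  <requestHandler name="/update/extract" startup="lazy"
                  class="solr.extraction.ExtractingRequestHandler"/>
</config>"""

# Hypothetical stand-in for a config without it (as basic_configs,
# according to the thread).
WITHOUT_EXTRACT = """<config>
  <requestHandler name="/select" class="solr.SearchHandler"/>
</config>"""

if __name__ == "__main__":
    for label, xml_text in (("with handler", WITH_EXTRACT),
                            ("without handler", WITHOUT_EXTRACT)):
        found = has_request_handler(xml_text, "/update/extract")
        print(label, "->", "defined" if found else "missing")
```

To use it on a real setup, read the text of the solrconfig.xml that was actually uploaded to ZooKeeper and pass it in place of the sample strings. A result of "missing" points to the configset problem described above rather than to a wrong collection name.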