TimeoutException, IOException, Read timed out

2017-10-25 Thread Fengtan
Hi,

We run a SolrCloud 6.4.2 cluster with ZooKeeper 3.4.6 on 3 VMs.
Each VM runs RHEL 7 with 16 GB RAM and 8 CPUs and OpenJDK 1.8.0_131; each
VM has one Solr and one ZK instance.
The cluster hosts 1,000 collections; each collection has 1 shard and
between 500 and 50,000 documents.
Documents are indexed incrementally every day; the Solr client mostly does
searching.
Solr runs with -Xms7g -Xmx7g.

Everything has been working fine for about one month but a few days ago we
started to see Solr timeouts: https://pastebin.com/raw/E2prSrQm

Also we have always seen these:
  PERFORMANCE WARNING: Overlapping onDeckSearchers=2


We are not sure what is causing the timeouts, although we have identified a
few things that could be improved:

1) Ignore explicit commits using IgnoreCommitOptimizeUpdateProcessorFactory
-- we are aware that explicit commits are expensive
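As a reference point, a minimal sketch of such a processor chain in
solrconfig.xml might look like the following (the chain name and status
code are illustrative; check the Solr 6.4 reference guide for the exact
parameters):

```xml
<!-- Illustrative sketch: silently swallow explicit commit/optimize
     requests from clients, returning HTTP 200 so callers see no error. -->
<updateRequestProcessorChain name="ignore-commits" default="true">
  <processor class="solr.IgnoreCommitOptimizeUpdateProcessorFactory">
    <int name="statusCode">200</int>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

With this in place, commits would only happen via the autoCommit/autoSoftCommit
settings in solrconfig.xml rather than at the client's whim.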

2) Drop the 1,000 collections and use a single one instead (all our
collections use the same schema/solrconfig.xml), since stability problems
are expected when the number of collections reaches the low hundreds. The
downside is that the new collection would contain 1,000,000 documents, which
may bring new challenges.
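If we did merge everything into one multi-shard collection, Solr's
compositeId document routing could keep each logical collection's documents
together. The collection and document names below are hypothetical, and this
assumes a live Solr node on localhost -- a sketch, not a tested recipe:

```shell
# Index with an id of the form "<logical-collection>!<doc-id>" so documents
# from one logical collection are routed to the same shard.
curl 'http://localhost:8983/solr/merged/update?commitWithin=60000' \
  -H 'Content-Type: application/json' \
  -d '[{"id": "collection42!doc1", "title": "example"}]'

# At query time, the _route_ parameter restricts the search to the shard(s)
# holding that routing key, instead of fanning out to every shard.
curl 'http://localhost:8983/solr/merged/select?q=title:example&_route_=collection42!'
```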

3) Tune the GC and possibly switch from CMS to G1, which reportedly brings
better performance in several benchmarks.
The downside is that Lucene explicitly discourages the usage of G1,
so we are not sure what to expect. We use the default GC settings:
  -XX:NewRatio=3
  -XX:SurvivorRatio=4
  -XX:TargetSurvivorRatio=90
  -XX:MaxTenuringThreshold=8
  -XX:+UseConcMarkSweepGC
  -XX:+UseParNewGC
  -XX:ConcGCThreads=4
  -XX:ParallelGCThreads=4
  -XX:+CMSScavengeBeforeRemark
  -XX:PretenureSizeThreshold=64m
  -XX:+UseCMSInitiatingOccupancyOnly
  -XX:CMSInitiatingOccupancyFraction=50
  -XX:CMSMaxAbortablePrecleanTime=6000
  -XX:+CMSParallelRemarkEnabled
  -XX:+ParallelRefProcEnabled
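If we do experiment with G1, the switch could be sketched in solr.in.sh
roughly as follows (the pause target and region size below are illustrative
guesses, not tested values):

```shell
# Hypothetical GC_TUNE override in solr.in.sh -- replaces the CMS flags above.
GC_TUNE="-XX:+UseG1GC \
  -XX:+PerfDisableSharedMem \
  -XX:+ParallelRefProcEnabled \
  -XX:G1HeapRegionSize=8m \
  -XX:MaxGCPauseMillis=250"
```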

4) Tune the caches, possibly by increasing autowarmCount on filterCache
(our cache configuration did not survive in this archived copy of the
message).
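Purely as an illustrative sketch (the sizes and autowarmCount below are
made-up values, not our real config), a filterCache entry in solrconfig.xml
looks like:

```xml
<!-- Illustrative values only: a small autowarmCount keeps opening a new
     searcher cheap when commits are frequent. -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="32"/>
```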

5) Tweak the timeout settings, although this would not fix the underlying
issue


Do any of these options seem relevant? Is there anything else that might
address the timeouts?

Thanks


Re: TimeoutException, IOException, Read timed out

2017-10-26 Thread Fengtan
Thanks Erick and Emir -- we are going to start with <1> and possibly <2>.

On Thu, Oct 26, 2017 at 7:06 AM, Emir Arnautović <emir.arnauto...@sematext.com> wrote:

> Hi Fengtan,
> I would just add that when merging collections, you might want to use
> document routing (
> https://lucene.apache.org/solr/guide/6_6/shards-and-indexing-data-in-solrcloud.html#ShardsandIndexingDatainSolrCloud-DocumentRouting
> ) - since you are keeping separate collections, I
> guess you have a “collection ID” to use as routing key. This will enable
> you to have one collection but query only shard(s) with data from one
> “collection”.
>
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 25 Oct 2017, at 19:25, Erick Erickson 
> wrote:
> >
> > <1> It's not that explicit commits are expensive, it's that they happen
> > too fast. An explicit commit and an internal autocommit have exactly
> > the same cost. Your "overlapping ondeck searchers" warning is definitely an
> > indication that your commits are happening from somewhere too quickly
> > and are piling up.
> >
> > <2> Likely a good thing, each collection increases overhead. And
> > 1,000,000 documents is quite small in Solr's terms unless the
> > individual documents are enormous. I'd do this for a number of
> > reasons.
> >
> > <3> Certainly an option, but I'd put that last. Fix the commit problem
> first ;)
> >
> > <4> If you do this, make the autowarm count quite small. That said,
> > this will be very little use if you have frequent commits. Let's say
> > you commit every second. The autowarming will warm caches, which will
> > then be thrown out a second later. And will increase the time it takes
> > to open a new searcher.
> >
> > <5> Yeah, this would probably just be a band-aid.
> >
> > If I were prioritizing these, I'd do
> > <1> first. If you control the client, just don't call commit. If you
> > do not control the client, then what you've outlined is fine. Tip: set
> > your soft commit settings to be as long as you can stand. If you must
> > have very short intervals, consider disabling your caches completely.
> > Here's a long article on commits
> > https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> >
> > <2> Actually, this and <1> are pretty close in priority.
> >
> > Then re-evaluate. Fixing the commit issue may buy you quite a bit of
> > time. Having 1,000 collections is pushing the boundaries presently.
> > Each collection will establish watchers on the bits it cares about in
> > ZooKeeper, and reducing the watchers by a factor approaching 1,000 is
> > A Good Thing.
> >
> > Frankly, between these two things I'd pretty much expect your problems
> > to disappear. Wouldn't be the first time I've been totally wrong, but
> > it's where I'd start ;)
> >
> > Best,
> > Erick
> >

Re: TimeoutException, IOException, Read timed out

2017-11-13 Thread Fengtan
I am happy to report that <1> fixed these:
  PERFORMANCE WARNING: Overlapping onDeckSearchers=2

We still occasionally see timeouts, so we may have to explore <2>.





On Thu, Oct 26, 2017 at 12:12 PM, Fengtan  wrote:

> Thanks Erick and Emir -- we are going to start with <1> and possibly <2>.
>

Re: Solrcloud Drupal 7

2017-12-11 Thread Fengtan
Hi,

Drupal can talk to a SolrCloud cluster the same way it talks to a
standalone Solr server, i.e. by using the Search API suite
https://www.drupal.org/project/search_api

You will have to create the collection yourself on Solr though (Drupal will
not do it for you).

If you want to take advantage of SolrCloud's fault tolerance capabilities
then you may have to:
* either use an external load balancer in front of your SolrCloud cluster
* or implement a smart client in Drupal -- I opened this ticket some time
ago https://www.drupal.org/project/search_api_solr/issues/2858645



On Mon, Dec 11, 2017 at 7:48 AM, Per Qvindesland  wrote:

> Hi All
>
> I have a Solr cloud with ZooKeeper 3.4.10 and Solr 5.4.1, running on 3
> CentOS 7 instances in AWS; I intend to use the cluster for Drupal 7 search.
>
> At the moment we have 2 instances running in production with each having
> their own solr installation (no cloud) but I would like to improve the
> redundancy and maybe even the performance, I followed
> http://www.francelabs.com/blog/tutorial-deploying-
> solrcloud-6-on-amazon-ec2/ to get the cloud up and running but I need
> some guidance on how to add in a new shard/collection so I can point the
> drupal instances to the new solr cloud, does anyone have any information on
> how to do this? as you can see i have no experience with solr cloud :)
>
> Regards
> Per
>
>
>
>


Re: Solrcloud Drupal 7

2017-12-11 Thread Fengtan
One way to provide Solr with the config files is to upload them to
ZooKeeper
https://lucene.apache.org/solr/guide/6_6/using-zookeeper-to-manage-configuration-files.html#UsingZooKeepertoManageConfigurationFiles-UploadingConfigurationFilesusingbin_solrorSolrJ

This can be achieved by copying the config files from
search_api_solr/solr-conf/x.x to your Solr server and then running the
'bin/solr zk upconfig' utility to upload them into a new configset. Once
you have done that, you can create a new collection with that configset.
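Assuming Solr 6.x-style tooling (the ZooKeeper hosts, configset name, and
paths below are placeholders), the two steps might look like:

```shell
# 1) Upload the config directory to ZooKeeper as a named configset.
bin/solr zk upconfig -z zk1:2181,zk2:2181,zk3:2181 \
  -n drupal_conf -d /path/to/search_api_solr/solr-conf/5.x

# 2) Create a collection that uses that configset.
bin/solr create_collection -c drupal -n drupal_conf \
  -shards 1 -replicationFactor 2
```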

Not sure if the zk upconfig utility is available in Solr 5 though, so you
may want to have a look at the reference guide for Solr 5:
https://archive.apache.org/dist/lucene/solr/ref-guide/



On Mon, Dec 11, 2017 at 11:06 AM, Per Qvindesland  wrote:

> Hi
>
> Thanks for the information.
>
> Once the collection is created, how do you drop files such as schema.xml
> into the folder locations? I have used rsync -av search_api_solr/solr-conf/5.x
> /var/solr/data/search_shard2_replica1/ on one instance but I can't see
> the files being replicated between the instances.
>
> Regards
> Per


A tool to quickly browse Solr documents ?

2017-01-23 Thread Fengtan
Hi All,

I am looking for a tool to quickly browse/investigate documents indexed in
a Solr core.

The default web admin interface already offers this, but you need to know
the Solr query syntax if you want to list/filter/sort documents.

I have started to build my own tool (https://github.com/fengtan/sophie) but
I don't want to reinvent the wheel -- does anyone know if something similar
already exists?

Thanks