First, I generally prefer to construct the CloudSolrClient using the ZooKeeper ensemble string rather than URLs, although that's probably not a cure for your problem.
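A minimal sketch of what that looks like, assuming SolrJ 7.x and placeholder ZooKeeper host:port values (adjust to your ensemble and chroot, if any):

import java.util.Arrays;
import java.util.List;
import java.util.Optional;

import org.apache.solr.client.solrj.impl.CloudSolrClient;

// Placeholder ZooKeeper ensemble; substitute your real host:port list.
List<String> zkHosts = Arrays.asList("localhost:2181", "localhost:2182", "localhost:2183");

CloudSolrClient solrClient = new CloudSolrClient.Builder(zkHosts, Optional.empty())
        .withConnectionTimeout(10000)
        .withSocketTimeout(60000)
        .build();
solrClient.setDefaultCollection("de_wiki_man");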
Here's what I _think_ is happening. If you're slamming Solr with a lot of updates, you're doing a lot of merging. At some point, when there are a lot of merges going on, incoming updates block until one or more merge threads are done. At that point, I suspect your client is timing out. And (perhaps) if you used the ZooKeeper ensemble instead of HTTP, the cluster state fetch would go away; I suspect another issue would come up instead, but....

It's also possible this would all go away if you increased your timeouts significantly. That's still a "set it and hope" approach rather than a totally robust solution, though.

Let's assume the above works and you start getting timeouts. You can back off the indexing rate at that point, or just go to sleep for a while. This isn't what you'd like as a permanent solution, but it may let you get by.

There's work afoot to separate out update thread pools from query thread pools so _querying_ doesn't suffer when indexing is heavy, but that hasn't been implemented yet. This could also address your cluster state fetch error.

BTW, you will get significantly better throughput if you batch your docs and use client.add(list_of_documents).

Another possibility is to use the new metrics API (since Solr 6.4). It exposes over 200 metrics you can query, and it's quite possible they'd help your clients know when to self-throttle, but AFAIK there's nothing built in to help you there.
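As a rough sketch of what polling the metrics API from SolrJ could look like (the node URL, the "jvm" group, and the idea of keying a throttle off those numbers are my assumptions, not a recipe):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.client.solrj.response.SimpleSolrResponse;
import org.apache.solr.common.params.ModifiableSolrParams;

// Metrics are reported per node, so ask one node directly (placeholder URL).
try (SolrClient node = new HttpSolrClient.Builder("http://localhost:9999/solr").build()) {
    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("group", "jvm");   // e.g. heap/GC figures; pick whatever group/prefix you care about

    GenericSolrRequest req = new GenericSolrRequest(SolrRequest.METHOD.GET, "/admin/metrics", params);
    SimpleSolrResponse rsp = req.process(node);

    // Inspect the returned NamedList and decide whether the indexing threads should slow down.
    System.out.println(rsp.getResponse());
}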
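And to illustrate the batching (plus the back-off-and-sleep idea from above), a minimal sketch; the class name, batch size, and wait times are invented for illustration and would need tuning:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

// One instance per indexing thread; collects documents and sends them in batches.
public class BatchingIndexer {
    private static final int BATCH_SIZE = 500;          // a guess; tune against your hardware
    private final CloudSolrClient solrClient;
    private final List<SolrInputDocument> batch = new ArrayList<>();

    public BatchingIndexer(CloudSolrClient solrClient) {
        this.solrClient = solrClient;
    }

    // Call this instead of solrClient.add(doc) for each document.
    public void add(SolrInputDocument doc) throws InterruptedException {
        batch.add(doc);
        if (batch.size() >= BATCH_SIZE) {
            flush();
        }
    }

    // One request for the whole batch; sleep and retry with a growing pause if Solr pushes back.
    public void flush() throws InterruptedException {
        long wait = 1000;
        while (!batch.isEmpty()) {
            try {
                solrClient.add(batch);
                batch.clear();
            } catch (SolrServerException | IOException e) {
                Thread.sleep(wait);                      // back off before retrying
                wait = Math.min(wait * 2, 60_000);
            }
        }
    }
}

Each thread would call flush() one last time when it runs out of files.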
Best,
Erick

On Wed, Jul 4, 2018 at 2:32 AM, Arturas Mazeika <maze...@gmail.com> wrote:
> Hi Solr Folk,
>
> I am trying to push Solr to the limit and sometimes I succeed. The
> question is how to not go over it, e.g., avoid:
>
> java.lang.RuntimeException: Tried fetching cluster state using the node
> names we knew of, i.e. [192.168.56.1:9998_solr, 192.168.56.1:9997_solr,
> 192.168.56.1:9999_solr, 192.168.56.1:9996_solr]. However, succeeded in
> obtaining the cluster state from none of them. If you think your Solr
> cluster is up and is accessible, you could try re-creating a new
> CloudSolrClient using working solrUrl(s) or zkHost(s).
>     at org.apache.solr.client.solrj.impl.HttpClusterStateProvider.getState(HttpClusterStateProvider.java:109)
>     at org.apache.solr.client.solrj.impl.CloudSolrClient.resolveAliases(CloudSolrClient.java:1113)
>     at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:845)
>     at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:818)
>     at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
>     at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:173)
>     at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:138)
>     at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:152)
>     at com.asc.InsertDEWikiSimple$SimpleThread.run(InsertDEWikiSimple.java:132)
>
> Details:
>
> I am benchmarking a SolrCloud setup on a single machine (an Intel i7 with
> 8 "CPU cores", an SSD as well as an HDD) using the German Wikipedia
> collection. I created a 4-node, 4-shard, replication-factor-2 cluster on
> the same machine (and managed to push the CPU or the SSD to the hardware
> limits, i.e., ~200MB/s, ~100% CPU). Now I wanted to see what happens if I
> push the HDD to the limits. Indexing the files from the SSD (I am able to
> scan the collection at an actual rate of 400-500MB/s) with 16 threads, I
> tried to send them to the Solr cluster with all indexes on the HDD.
>
> Clearly Solr needs to deal with a very slow hard drive (10-20MB/s actual
> rate). If the cluster is not touched, SolrJ may start losing connections
> after a few hours. If one checks the status of the cluster, it may happen
> sooner. After the connection is lost, the cluster calms down with writing
> after half a dozen minutes.
>
> What would be a reasonable way to push to the limit without going over?
>
> The exact parameters are:
>
> - 4 cores running 2 GB RAM
> - Schema:
>
> <fieldType name="ft_wiki_de" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.GermanMinimalStemFilterFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> <fieldType name="ft_url" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> <fieldType name="uuid" class="solr.UUIDField" indexed="true" />
> <field name="id" type="uuid" indexed="true" stored="true" required="true"/>
> <field name="_root_" type="uuid" indexed="true" stored="false" docValues="false" />
>
> <field name="size" type="pint" indexed="true" stored="true"/>
> <field name="time" type="pdate" indexed="true" stored="true"/>
> <field name="content" type="ft_wiki_de" indexed="true" stored="true"/>
> <field name="url" type="ft_url" indexed="true" stored="true"/>
>
> <field name="_version_" type="plong" indexed="false" stored="false"/>
>
> I SolrJ-connect once:
>
> ArrayList<String> urls = new ArrayList<>();
> urls.add("http://localhost:9999/solr");
> urls.add("http://localhost:9998/solr");
> urls.add("http://localhost:9997/solr");
> urls.add("http://localhost:9996/solr");
>
> solrClient = new CloudSolrClient.Builder(urls)
>         .withConnectionTimeout(10000)
>         .withSocketTimeout(60000)
>         .build();
> solrClient.setDefaultCollection("de_wiki_man");
>
> and then execute in 16 threads while there's anything to execute:
>
> Path p = getJobPath();
> String content = new String(Files.readAllBytes(p));
> UUID id = UUID.randomUUID();
> SolrInputDocument doc = new SolrInputDocument();
>
> BasicFileAttributes attr = Files.readAttributes(p, BasicFileAttributes.class);
>
> doc.addField("id", id.toString());
> doc.addField("content", content);
> doc.addField("time", attr.creationTime().toString());
> doc.addField("size", content.length());
> doc.addField("url", p.getFileName().toAbsolutePath().toString());
> solrClient.add(doc);
>
> to go through all the wiki html files.
>
> Cheers,
> Arturas