Hi Erick et al,

Thanks a lot for the response. Your explanation seems very plausible and I'd love to investigate those points further.
Batching the docs (for me surprisingly) improved the numbers:

Buffer size    secs    MB/s          Docs/s
N:500          1117    34.4077538    2400.72695
N:100          1073    35.8186962    2499.17241
N:10           1170    32.849112     2291.97607
N:5            1234    31.1454303    2173.10535
N:3            1433    26.8202798    1871.32729
N:2            1758    21.862037     1525.37656
N:1            2307    16.6594976    1162.38058

It looks like the larger the buffer (in terms of number of documents), the faster the processing, at least up to a point (N:100 was slightly faster than N:500). I thought the gains would not be so high, since (1) Solr buffers updates itself, and (2) the documents are pretty large.

The SolrJ API has changed a bit over the last few releases, and it is becoming incredibly difficult to find working code. You mentioned that I can connect to the zkHost directly. I tried [1], [2], and [3] and their variants without any success (the returned object was null). How would it look in the 7.2+ branch? (I am currently running the embedded ZooKeeper; Solr runs on 9999, so ZooKeeper should be on 10999 [4].)

I am impressed by the number of metrics I can get from Solr with my very limited knowledge. You mentioned that there are 200+ metrics one can query about the system. As the primary source of information, would you recommend:

https://lucene.apache.org/solr/guide/7_4/collections-api.html

Can you maybe expand this list with additional references?
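For reference, here is a minimal sketch of what the ZooKeeper-based connection might look like with the SolrJ 7.2+ Builder. This is an assumption based on the 7.x Builder signatures, not verified against your setup: the two-argument constructor takes the ZK hosts plus an Optional chroot, and passing null there (as in [3] below) rather than Optional.empty() is one plausible reason that variant failed.

```java
import java.util.Collections;
import java.util.Optional;

import org.apache.solr.client.solrj.impl.CloudSolrClient;

public class ZkConnectSketch {
    public static void main(String[] args) throws Exception {
        // Sketch for SolrJ 7.2+: connect via the ZooKeeper ensemble
        // instead of a list of Solr URLs. The second argument is the
        // ZK chroot; use Optional.empty() rather than null.
        CloudSolrClient solrClient = new CloudSolrClient.Builder(
                Collections.singletonList("localhost:10999"), Optional.empty())
            .withConnectionTimeout(10000)
            .withSocketTimeout(60000)
            .build();
        solrClient.setDefaultCollection("de_wiki_man");
        // ... index as before, then:
        solrClient.close();
    }
}
```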
Cheers,
Arturas

Refs:

[1]
String zkHostString = "localhost:10999";
SolrClient solrClient = new CloudSolrClient(zkHostString, true);
solrClient.setDefaultCollection("de_wiki_man");

[2]
String zkHostString = "localhost:10999";
SolrClient solrClient = new CloudSolrClient.Builder().withZkHost(zkHostString).build();

[3]
ArrayList<String> zkHosts = new ArrayList<>();
zkHosts.add("localhost:10999");
solrClient = new CloudSolrClient.Builder(zkHosts, null)
    .withConnectionTimeout(1000000)
    .withSocketTimeout(6000000)
    .build();
solrClient.setDefaultCollection("de_wiki_man");

[4]
C:\WINDOWS\system32>netstat -aon | grep 13984
  TCP    0.0.0.0:9999        0.0.0.0:0          LISTENING    13984
  TCP    0.0.0.0:10999       0.0.0.0:0          LISTENING    13984
  TCP    127.0.0.1:8999      0.0.0.0:0          LISTENING    13984
  TCP    127.0.0.1:62888     127.0.0.1:62889    ESTABLISHED  13984
  TCP    127.0.0.1:62889     127.0.0.1:62888    ESTABLISHED  13984
  TCP    127.0.0.1:62891     127.0.0.1:62892    ESTABLISHED  13984
  TCP    127.0.0.1:62892     127.0.0.1:62891    ESTABLISHED  13984
  TCP    127.0.0.1:62900     127.0.0.1:62901    ESTABLISHED  13984
  TCP    127.0.0.1:62901     127.0.0.1:62900    ESTABLISHED  13984
  TCP    127.0.0.1:62902     127.0.0.1:62903    ESTABLISHED  13984
  TCP    127.0.0.1:62903     127.0.0.1:62902    ESTABLISHED  13984
  TCP    127.0.0.1:62904     127.0.0.1:62905    ESTABLISHED  13984
  TCP    127.0.0.1:62905     127.0.0.1:62904    ESTABLISHED  13984
  TCP    127.0.0.1:62906     127.0.0.1:62907    ESTABLISHED  13984
  TCP    127.0.0.1:62907     127.0.0.1:62906    ESTABLISHED  13984
  TCP    [::]:9999           [::]:0             LISTENING    13984
  TCP    [::]:10999          [::]:0             LISTENING    13984
  TCP    [::1]:10999         [::1]:62893        ESTABLISHED  13984
  TCP    [::1]:62893         [::1]:10999        ESTABLISHED  13984

On Wed, Jul 4, 2018 at 6:06 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> First, I usually prefer to construct your CloudSolrClient by
> using the Zookeeper ensemble string rather than URLs,
> although that's probably not a cure for your problem.
>
> Here's what I _think_ is happening. If you're slamming Solr
> with a lot of updates, you're doing a lot of merging.
> At some point, when there are a lot of merges going on, incoming
> updates block until one or more merge threads are done.
>
> At that point, I suspect your client is timing out. And (perhaps)
> if you used the Zookeeper ensemble instead of HTTP, the
> cluster state fetch would go away. I suspect that another
> issue would come up, but....
>
> It's also possible this would all go away if you increased your
> timeouts significantly. That's still a "set it and hope" approach
> rather than a totally robust solution, though.
>
> Let's assume that the above works and you start getting timeouts.
> You can back off the indexing rate at that point, or just go to
> sleep for a while. This isn't what you'd like for a permanent solution,
> but it may let you get by.
>
> There's work afoot to separate out update thread pools from query
> thread pools so _querying_ doesn't suffer when indexing is heavy,
> but that hasn't been implemented yet. This could also address
> your cluster state fetch error.
>
> You will get significantly better throughput if you batch your
> docs and use client.add(list_of_documents), BTW.
>
> Another possibility is to use the new metrics (since Solr 6.4). They
> provide over 200 metrics you can query, and it's quite
> possible that they'd help your clients know when to self-throttle,
> but AFAIK, there's nothing built in to help you there.
>
> Best,
> Erick
>
> On Wed, Jul 4, 2018 at 2:32 AM, Arturas Mazeika <maze...@gmail.com> wrote:
> > Hi Solr Folk,
> >
> > I am trying to push Solr to the limit and sometimes I succeed. The
> > question is how to not go over it, e.g., avoid:
> >
> > java.lang.RuntimeException: Tried fetching cluster state using the node
> > names we knew of, i.e. [192.168.56.1:9998_solr, 192.168.56.1:9997_solr,
> > 192.168.56.1:9999_solr, 192.168.56.1:9996_solr].
> > However, succeeded in obtaining the cluster state from none of them.
> > If you think your Solr cluster is up and is accessible, you could try
> > re-creating a new CloudSolrClient using working solrUrl(s) or zkHost(s).
> >     at org.apache.solr.client.solrj.impl.HttpClusterStateProvider.getState(HttpClusterStateProvider.java:109)
> >     at org.apache.solr.client.solrj.impl.CloudSolrClient.resolveAliases(CloudSolrClient.java:1113)
> >     at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:845)
> >     at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:818)
> >     at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
> >     at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:173)
> >     at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:138)
> >     at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:152)
> >     at com.asc.InsertDEWikiSimple$SimpleThread.run(InsertDEWikiSimple.java:132)
> >
> > Details:
> >
> > I am benchmarking a SolrCloud setup on a single machine (an Intel i7 with
> > 8 "CPU cores", an SSD as well as an HDD) using the German Wikipedia
> > collection. I created a cluster with 4 nodes, 4 shards, and replication
> > factor 2 on the same machine (and managed to push the CPU or SSD to the
> > hardware limits, i.e., ~200MB/s, ~100% CPU). Now I wanted to see what
> > happens if I push the HDD to the limits. Reading the files from the SSD
> > (I am able to scan the collection at an actual rate of 400-500MB/s) with
> > 16 threads, I tried to send those to the Solr cluster with all indexes
> > on the HDD.
> >
> > Clearly Solr needs to deal with a very slow hard drive (10-20MB/s actual
> > rate). If the cluster is not touched, SolrJ may start losing connections
> > after a few hours. If one checks the status of the cluster, it may happen
> > sooner.
> > After the connection is lost, the cluster calms down with writing
> > after half a dozen minutes.
> >
> > What would be a reasonable way to push to the limit without going over?
> >
> > The exact parameters are:
> >
> > - 4 nodes running with 2GB RAM each
> > - Schema:
> >
> > <fieldType name="ft_wiki_de" class="solr.TextField" positionIncrementGap="100">
> >   <analyzer>
> >     <charFilter class="solr.HTMLStripCharFilterFactory"/>
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.GermanMinimalStemFilterFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> > </fieldType>
> >
> > <fieldType name="ft_url" class="solr.TextField" positionIncrementGap="100">
> >   <analyzer>
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> > </fieldType>
> >
> > <fieldType name="uuid" class="solr.UUIDField" indexed="true"/>
> > <field name="id" type="uuid" indexed="true" stored="true" required="true"/>
> > <field name="_root_" type="uuid" indexed="true" stored="false" docValues="false"/>
> >
> > <field name="size" type="pint" indexed="true" stored="true"/>
> > <field name="time" type="pdate" indexed="true" stored="true"/>
> > <field name="content" type="ft_wiki_de" indexed="true" stored="true"/>
> > <field name="url" type="ft_url" indexed="true" stored="true"/>
> >
> > <field name="_version_" type="plong" indexed="false" stored="false"/>
> >
> > I SolrJ-connect once:
> >
> > ArrayList<String> urls = new ArrayList<>();
> > urls.add("http://localhost:9999/solr");
> > urls.add("http://localhost:9998/solr");
> > urls.add("http://localhost:9997/solr");
> > urls.add("http://localhost:9996/solr");
> >
> > solrClient = new CloudSolrClient.Builder(urls)
> >     .withConnectionTimeout(10000)
> >     .withSocketTimeout(60000)
> >     .build();
> > solrClient.setDefaultCollection("de_wiki_man");
> >
> > and then execute in 16 threads while there is anything to execute:
> >
> > Path
> >     p = getJobPath();
> > String content = new String(Files.readAllBytes(p));
> > UUID id = UUID.randomUUID();
> > SolrInputDocument doc = new SolrInputDocument();
> >
> > BasicFileAttributes attr = Files.readAttributes(p, BasicFileAttributes.class);
> >
> > doc.addField("id", id.toString());
> > doc.addField("content", content);
> > doc.addField("time", attr.creationTime().toString());
> > doc.addField("size", content.length());
> > doc.addField("url", p.getFileName().toAbsolutePath().toString());
> > solrClient.add(doc);
> >
> > to go through all the wiki html files.
> >
> > Cheers,
> > Arturas
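Erick's batching advice (client.add(list_of_documents)) can be sketched independently of SolrJ. Below is a minimal buffering sketch where the flush target is a plain callback; the names (DocBatcher, flush) are illustrative, not SolrJ API. In the indexing loop above, the callback would wrap solrClient.add(batch), and flush() would be called once more when a thread runs out of files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Minimal client-side batching sketch (illustrative names, not SolrJ API):
// buffer up to batchSize documents, then hand the whole list to a flush
// callback in one call instead of one call per document.
public class DocBatcher<D> {
    private final int batchSize;
    private final Consumer<List<D>> flusher;
    private final List<D> buffer = new ArrayList<>();

    public DocBatcher(int batchSize, Consumer<List<D>> flusher) {
        this.batchSize = batchSize;
        this.flusher = flusher;
    }

    public void add(D doc) {
        buffer.add(doc);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Send whatever is buffered; call once more at the end of indexing
    // so the final partial batch is not lost.
    public void flush() {
        if (!buffer.isEmpty()) {
            flusher.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    public static void main(String[] args) {
        List<Integer> batchCounts = new ArrayList<>();
        DocBatcher<String> batcher =
            new DocBatcher<>(100, batch -> batchCounts.add(batch.size()));
        for (int i = 0; i < 250; i++) {
            batcher.add("doc-" + i);
        }
        batcher.flush(); // final partial batch
        System.out.println(batchCounts); // [100, 100, 50]
    }
}
```

With 16 threads, each thread would own its own DocBatcher (or the buffer would need synchronization), which matches the per-thread loop above.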