The NumberFormatException is... odd. Clearly that's too big a number
for an integer. Did anything in the underlying schema change?
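
To make the size point concrete, here is a minimal sketch (fits_in_int is
a hypothetical helper written for this message, not anything from Solr)
that checks whether the value from the trace fits in a signed 32-bit int.
It assumes bash, whose arithmetic is 64-bit, which is wide enough to hold
the value itself.

```shell
#!/bin/bash

# Largest value a signed 32-bit Java int can hold (Integer.MAX_VALUE).
INT_MAX=2147483647

# fits_in_int is a hypothetical helper, not part of Solr.
# Returns 0 (success) when the decimal value fits in a signed 32-bit int.
fits_in_int() {
    local value=$1
    # Bash arithmetic is 64-bit, so this comparison is safe for values
    # up to 9223372036854775807 (Long.MAX_VALUE).
    (( value >= -INT_MAX - 1 && value <= INT_MAX ))
}

value=1578578283947098112   # the input string from the stack trace
if fits_in_int "$value"; then
    echo "$value fits in an int"
else
    echo "$value does not fit in an int"
fi
```

The value is within 64-bit range but far outside 32-bit range, which is
why Integer.parseInt rejects it.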

Best,
Erick

On Wed, Sep 20, 2017 at 3:00 PM, Walter Underwood <wun...@wunderwood.org> wrote:
> Rolling restarts work fine for us. I often include installing new configs 
> with that. Here is our script. Pass it any hostname in the cluster. I use the 
> load balancer name. You’ll need to change the domain and the install 
> directory of course.
>
> #!/bin/bash
>
> cluster=$1
>
> hosts=`curl -s "http://${cluster}:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json" | jq -r '.cluster.live_nodes[]' | sort`
>
> for host in $hosts
> do
>     host="${host}.cloud.cheggnet.com"
>     echo restarting Solr on $host
>     ssh $host 'cd /apps/solr6 ; sudo -u bin bin/solr stop; sudo -u bin bin/solr start -cloud -h `hostname`'
> done
>
>
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
>> On Sep 20, 2017, at 1:42 PM, Bill Oconnor <bocon...@plos.org> wrote:
>>
>> Hello,
>>
>>
>> Background:
>>
>>
>> We have been successfully using Solr for over 5 years and we recently made 
>> the decision to move to SolrCloud. For the most part that has been easy, 
>> but we have repeated problems with our rolling restarts, where servers 
>> remain functional but stay in recovery until they stop trying. We restarted 
>> because we increased the JVM memory from 12GB to 16GB.
>>
>>
>> Does anyone have any insight as to what is going on here?
>>
>> Is there a special procedure I should use for starting and stopping a host?
>>
>> Is it OK to do a rolling restart on all the nodes in a shard?
>>
>>
>> Any insight would be appreciated.
>>
>>
>> Configuration:
>>
>>
>> We have a group of servers with multiple collections. Each collection 
>> consists of one shard and multiple replicates. We are running the latest 
>> stable version of SolrCloud, 6.6, on Ubuntu LTS and Oracle Corporation Java 
>> HotSpot(TM) 64-Bit Server VM 1.8.0_66 25.66-b17.
>>
>>
>> (collection)              (shard)          (replicates)
>>
>> journals_stage   ->  shard1  ->  solr-220 (leader), solr-223, solr-221, solr-222 (replicates)
>>
>>
>> Problem:
>>
>>
>> Restarting the system puts the replicates in a recovery state they never 
>> exit from. They eventually give up after 500 tries.  If I go to the 
>> individual replicates and execute a query the data is still available.
>>
>>
>> Using tcpdump I find the replicates sending this request to the leader (the 
>> leader appears to be active).
>>
>>
>> The exchange goes like this:
>>
>>
>> solr-220 is the leader.
>>
>> Solr-221 to Solr-220
>>
>>
>> 10:18:42.426823 IP solr-221:54341 > solr-220:8983:
>>
>>
>> POST /solr/journals_stage_shard1_replica1/update HTTP/1.1
>> Content-Type: application/x-www-form-urlencoded; charset=UTF-8
>> User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrClient] 1.0
>> Content-Length: 108
>> Host: solr-220:8983
>> Connection: Keep-Alive
>>
>>
>> commit_end_point=true&openSearcher=false&commit=true&softCommit=false&waitSearcher=true&wt=javabin&version=2
>>
>>
>> Solr-220 back to Solr-221
>>
>>
>> IP solr-220:8983 > solr-221:54341: Flags [P.], seq 1:5152, ack 385, win 235, 
>> options [nop,nop,
>> TS val 858155553 ecr 858107069], length 5151
>> ..HTTP/1.1 500 Server Error
>> Content-Type: application/octet-stream
>> Content-Length: 5060
>>
>>
>> .responseHeader..&statusT..%QTimeC.%error..#msg?.For input string: 
>> "1578578283947098112".%trace?.&java.lang.NumberFormatException: For
>> input string: "1578578283947098112"
>> at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>> at java.lang.Integer.parseInt(Integer.java:583)
>> at java.lang.Integer.parseInt(Integer.java:615)
>> at org.apache.lucene.queries.function.docvalues.IntDocValues.getRangeScorer(IntDocValues.java:89)
>> at org.apache.solr.search.function.ValueSourceRangeFilter$1.iterator(ValueSourceRangeFilter.java:83)
>> at org.apache.solr.search.SolrConstantScoreQuery$ConstantWeight.scorer(SolrConstantScoreQuery.java:100)
>> at org.apache.lucene.search.Weight.scorerSupplier(Weight.java:126)
>> at org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:400)
>> at org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:381)
>> at org.apache.solr.update.DeleteByQueryWrapper$1.scorer(DeleteByQueryWrapper.java:90)
>> at org.apache.lucene.index.BufferedUpdatesStream.applyQueryDeletes(BufferedUpdatesStream.java:709)
>> at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:267)
>>
>>
>
