Major GC does not reduce the old gen size

2013-10-21 Thread neoman
Hello everyone,
We are using Solr 4.4 in production with 4 shards. These are our memory
settings:
-d64 -server -Xms8192m -Xmx12288m -XX:MaxPermSize=256m \
-XX:NewRatio=1 -XX:SurvivorRatio=6 \
-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:CMSIncrementalDutyCycleMin=0 \
-XX:CMSIncrementalDutyCycle=10 -XX:+CMSIncrementalPacing \
-XX:+PrintTenuringDistribution -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC \
-XX:+CMSClassUnloadingEnabled -XX:+DisableExplicitGC \
-XX:+UseLargePages \
-XX:+UseParNewGC \
-XX:ConcGCThreads=10 \
-XX:ParallelGCThreads=10 \
-XX:MaxGCPauseMillis=3 \
I notice in production that the old generation becomes full, and no amount
of garbage collection frees up the memory.
This is similar to the issue discussed in this link:
http://grokbase.com/t/lucene/solr-user/12bwydq5jr/permanently-full-old-generation
Has anyone had this problem? Can you please point out anything wrong with the
GC configuration?
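For comparison, one change I am considering is dropping incremental CMS and triggering collections by old-generation occupancy instead, roughly like this (a sketch only; the 75% threshold is an illustrative value, not something we have tested):

-XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
-XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly

I can also run jmap -histo:live <pid> against one of the nodes, which forces a full GC before printing the histogram, to check whether the old generation is really full of live objects rather than uncollected garbage.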





Re: Major GC does not reduce the old gen size

2013-10-22 Thread neoman
Can anyone please reply to this? I appreciate your help.





Re: Major GC does not reduce the old gen size

2013-10-23 Thread neoman
Help, please.





JVM Crashed - SOLR deployed in Tomcat

2013-07-16 Thread neoman
Hello Everyone,
We are using SolrCloud with Tomcat in our production environment.
Here is our configuration.
solr-4.0.0
JVM 1.6.0_25

The JVM keeps crashing every day with the following error. I think it
happens while we index data with the SolrJ APIs.

INFO: [aq-core] webapp=/solr path=/update params={distrib.from=http://solr03-prod:8080/solr/aq-core/&update.distrib=TOLEADER&wt=javabin&version=2} status=0 QTime=1
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0xfd7ffadac771, pid=2411, tid=33662
#
# JRE version: 6.0_25-b06
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.0-b11 mixed mode solaris-amd64 compressed oops)
# Problematic frame:
# J  org.apache.lucene.codecs.PostingsConsumer.merge(Lorg/apache/lucene/index/MergeState;Lorg/apache/lucene/index/DocsEnum;Lorg/apache/lucene/util/FixedBitSet;)Lorg/apache/lucene/codecs/TermStats;
#
# An error report file with more information is saved as:
# /opt/tomcat/hs_err_pid2411.log
Jul 16, 2013 6:27:07 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit{flags=0,_version_=0,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false}
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp



Instructions: (pc=0xfd7ffadac771)
0xfd7ffadac751:   89 4c 24 30 4c 89 44 24 28 4c 89 54 24 18 44 89
0xfd7ffadac761:   5c 24 20 4c 8b 57 10 4d 63 d9 49 8b ca 49 03 cb
0xfd7ffadac771:   44 0f be 01 45 8b d9 41 ff c3 44 89 5f 18 45 85
0xfd7ffadac781:   c0 0f 8c b0 05 00 00 45 8b d0 45 8b da 41 d1 eb 

Register to memory mapping:

RAX=0x14008cf2 is an unknown value
RBX=
[error occurred during error reporting (printing register info), id 0xb]

Stack: [0xfd7de4eff000,0xfd7de4fff000],  sp=0xfd7de4ffe140,  free space=1020k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
J  org.apache.lucene.codecs.PostingsConsumer.merge(Lorg/apache/lucene/index/MergeState;Lorg/apache/lucene/index/DocsEnum;Lorg/apache/lucene/util/FixedBitSet;)Lorg/apache/lucene/codecs/TermStats;

Please let me know if anyone has seen this before. Any input is appreciated. 





Re: JVM Crashed - SOLR deployed in Tomcat

2013-07-18 Thread neoman
Thanks for your reply. Yes, it worked. No more crashes after switching to
1.6.0_30.





Re: Solr cloud shard goes down after SocketException in another shard

2013-09-12 Thread neoman
Thanks, Greg. Currently we have 60 seconds (we reduced it recently); I may
have to reduce it again. Can you please share your timeout value?
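For reference, if solr.xml uses the ${zkClientTimeout:...} placeholder (as the stock 4.x example does), the value can be overridden with a JVM system property; our current 60-second setting would correspond roughly to:

-DzkClientTimeout=60000

(60000 ms matching the 60 seconds mentioned above; this is a sketch of our setup, not a recommendation.)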





Solr cloud shard goes down after SocketException in another shard

2013-09-12 Thread neoman
Exception in shard1 (solr01-prod), primary:
<09/12/13 13:56:46:635|http-bio-8080-exec-66|ERROR|apache.solr.servlet.SolrDispatchFilter|null:ClientAbortException: java.net.SocketException: Broken pipe
at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:406)
at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:342)
at org.apache.catalina.connector.OutputBuffer.writeBytes(OutputBuffer.java:431)
at org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:419)
at org.apache.catalina.connector.CoyoteOutputStream.write(CoyoteOutputStream.java:91)
at org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:214)
at org.apache.solr.common.util.FastOutputStream.write(FastOutputStream.java:95)
at org.apache.solr.common.util.JavaBinCodec.writeStr(JavaBinCodec.java:470)
at org.apache.solr.common.util.JavaBinCodec.writePrimitive(JavaBinCodec.java:545)
at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:232)
at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:149)
at org.apache.solr.common.util.JavaBinCodec.writeSolrDocument(JavaBinCodec.java:320)
at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:257)
at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:149)
at org.apache.solr.common.util.JavaBinCodec.writeArray(JavaBinCodec.java:427)
at org.apache.solr.common.util.JavaBinCodec.writeSolrDocumentList(JavaBinCodec.java:356)


Exception in shard1 (solr08-prod), secondary:

<09/12/13 13:56:46:729|http-bio-8080-exec-50|ERROR|apache.solr.core.SolrCore|org.apache.solr.common.SolrException: ClusterState says we are the leader (http://solr08-prod:8080/solr/aq-core), but locally we don't think so. Request came from http://solr03-prod.phneaz:8080/solr/aq-core/
at org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:381)
at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:243)
at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:428)
at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)

Our configuration: Solr 4.4, Tomcat 7, 3 shards.
Thanks for your help.





Solr node goes down while trying to index records

2013-09-17 Thread neoman
Hello everyone,
One or more of the nodes in the SolrCloud go down randomly when we try to
index data using the SolrJ APIs. The nodes do recover, but when we try to
index again, they go down again.

Our configuration:
3 shards 
Solr 4.4.

I see the following exceptions in the log file.
<09/17/13 15:33:32:976|localhost-startStop-1-SendThread(10.68.129.119:9080)|INFO|org.apache.zookeeper.ClientCnxn|Socket connection established to 10.68.129.119/10.68.129.119:9080, initiating session|
<09/17/13 15:33:32:978|localhost-startStop-1-SendThread(10.68.129.119:9080)|INFO|org.apache.zookeeper.ClientCnxn|Unable to reconnect to ZooKeeper service, session 0x34109f9474b0029 has expired, closing socket connection|
<09/17/13 15:34:36:080|localhost-startStop-1-EventThread|ERROR|apache.solr.cloud.ZkController|There was a problem making a request to the leader:org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://solr02-prod.phneaz:8080/solr
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:431)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
at org.apache.solr.cloud.ZkController.waitForLeaderToSeeDownState(ZkController.java:1421)
at org.apache.solr.cloud.ZkController.registerAllCoresAsDown(ZkController.java:306)
at org.apache.solr.cloud.ZkController.access$100(ZkController.java:86)
at org.apache.solr.cloud.ZkController$1.command(ZkController.java:196)
at org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:117)
at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46)
at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)

We are also getting an IOException on the client side.
Adding chunk 122
Total  Count 12422
org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://solr-prod.com:8443/solr/aq-collection
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:409)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
at com.billmelater.fraudworkstation.data.DataProvider.flushBatch(DataProvider.java:48)
at com.billmelater.fraudworkstation.data.AQDBDataProvider.execute(AQDBDataProvider.java:114)
at com.billmelater.fraudworkstation.data.AQDBDataProvider.main(AQDBDataProvider.java:244)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
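
For context, the flushBatch call in the trace above is essentially a plain SolrJ batch add. A simplified sketch of what the client does is below; the class name, batch size, and timeout values are illustrative stand-ins, not our exact code:

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class FlushBatchSketch {
    public static void main(String[] args) throws Exception {
        // Same base URL as in the stack trace above.
        HttpSolrServer server = new HttpSolrServer("http://solr-prod.com:8443/solr/aq-collection");
        server.setConnectionTimeout(5000);   // ms to open the connection (illustrative)
        server.setSoTimeout(120000);         // ms to wait for a response; the read timeout fires when this is too low (illustrative)

        List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
        for (int i = 0; i < 1000; i++) {     // chunk size is illustrative
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-" + i);
            batch.add(doc);
        }

        server.add(batch);                   // the SolrServer.add call that times out in the trace
        server.commit();
        server.shutdown();
    }
}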

Your help is appreciated.





Re: Solr node goes down while trying to index records

2013-09-17 Thread neoman
Yes, the nodes go down while indexing. If we stop indexing, they do not go
down.


