Thanks for the response. I've checked the system logs and the hard disk SMART
(smartd) info, and found no errors. Any hints on how to locate the problem?


On Wed, Apr 30, 2014 at 9:26 AM, Michael Shuler <mich...@pbandjelly.org> wrote:

> Then you likely need to fix your I/O problem. The most recent error you
> posted is an EOFException - the file being read ended unexpectedly.
> Probably when you ran out of disk space.
>
> --
> Michael
>
>
> On 04/29/2014 07:48 PM, Yatong Zhang wrote:
>
>> Here is another type of exception; they all seem to be I/O related:
>>
>>>  INFO [SSTableBatchOpen:1] 2014-04-29 14:44:35,548 SSTableReader.java (line 223) Opening /data2/cass/system/compaction_history/system-compaction_history-jb-6956 (447252 bytes)
>>>  INFO [SSTableBatchOpen:2] 2014-04-29 14:44:35,553 SSTableReader.java (line 223) Opening /data2/cass/system/compaction_history/system-compaction_history-jb-6958 (257 bytes)
>>>  INFO [SSTableBatchOpen:3] 2014-04-29 14:44:35,554 SSTableReader.java (line 223) Opening /data2/cass/system/compaction_history/system-compaction_history-jb-6957 (257 bytes)
>>>  INFO [main] 2014-04-29 14:44:35,592 ColumnFamilyStore.java (line 248) Initializing system.batchlog
>>>  INFO [main] 2014-04-29 14:44:35,596 ColumnFamilyStore.java (line 248) Initializing system.sstable_activity
>>>  INFO [SSTableBatchOpen:1] 2014-04-29 14:44:35,601 SSTableReader.java (line 223) Opening /data2/cass/system/sstable_activity/system-sstable_activity-jb-8084 (1562 bytes)
>>>  INFO [SSTableBatchOpen:2] 2014-04-29 14:44:35,604 SSTableReader.java (line 223) Opening /data2/cass/system/sstable_activity/system-sstable_activity-jb-8083 (2075 bytes)
>>>  INFO [SSTableBatchOpen:3] 2014-04-29 14:44:35,605 SSTableReader.java (line 223) Opening /data2/cass/system/sstable_activity/system-sstable_activity-jb-8085 (1555 bytes)
>>>  INFO [main] 2014-04-29 14:44:35,687 AutoSavingCache.java (line 114) reading saved cache /data1/saved_caches/system-sstable_activity-KeyCache-b.db
>>>  INFO [main] 2014-04-29 14:44:35,696 ColumnFamilyStore.java (line 248) Initializing system.peer_events
>>>  INFO [SSTableBatchOpen:1] 2014-04-29 14:44:35,697 SSTableReader.java (line 223) Opening /data4/cass/system/peer_events/system-peer_events-jb-181 (12342 bytes)
>>>  INFO [main] 2014-04-29 14:44:35,717 ColumnFamilyStore.java (line 248) Initializing system.compactions_in_progress
>>>  INFO [SSTableBatchOpen:1] 2014-04-29 14:44:35,718 SSTableReader.java (line 223) Opening /data5/cass/system/compactions_in_progress/system-compactions_in_progress-jb-36448 (167 bytes)
>>> ERROR [SSTableBatchOpen:1] 2014-04-29 14:44:35,730 CassandraDaemon.java (line 198) Exception in thread Thread[SSTableBatchOpen:1,5,main]
>>> org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
>>>         at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:110)
>>>         at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:64)
>>>         at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:42)
>>>         at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:458)
>>>         at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:422)
>>>         at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:203)
>>>         at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:184)
>>>         at org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:264)
>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>         at java.lang.Thread.run(Thread.java:744)
>>> Caused by: java.io.EOFException
>>>         at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340)
>>>         at java.io.DataInputStream.readUTF(DataInputStream.java:589)
>>>         at java.io.DataInputStream.readUTF(DataInputStream.java:564)
>>>         at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:85)
>>>         ... 12 more
>>>  INFO [main] 2014-04-29 14:44:35,733 ColumnFamilyStore.java (line 248) Initializing system.hints
>>>  INFO [main] 2014-04-29 14:44:35,734 AutoSavingCache.java (line 114) reading saved cache /data1/saved_caches/system-hints-KeyCache-b.db
>>>  INFO [main] 2014-04-29 14:44:35,737 ColumnFamilyStore.java (line 248) Initializing system.schema_keyspaces
>>>
>>>
>>
>>
>> On Tue, Apr 29, 2014 at 6:07 PM, Yatong Zhang <bluefl...@gmail.com>
>> wrote:
>>
>>> I am pretty sure the disk has plenty of space. I restarted Cassandra
>>> and everything went fine again.
>>>
>>> It's really weird.
>>>
>>>
>>> On Tue, Apr 29, 2014 at 5:58 PM, Sylvain Lebresne <sylv...@datastax.com> wrote:
>>>
>>>> The important part of that stack trace is "java.io.IOException: No space
>>>> left on device": your disks are full (and it's not really a bug that
>>>> Cassandra errors out in that case).
>>>>
>>>> --
>>>> Sylvain
>>>>
>>>>
>>>> On Tue, Apr 29, 2014 at 11:09 AM, Yatong Zhang <bluefl...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi there,
>>>>>
>>>>> Sorry if this is not the right place to report bugs. I am using 2.0.7 and
>>>>> I have a 10-box cluster with about 200TB capacity. I just found that 3
>>>>> boxes had error exceptions. With DataStax OpsCenter I can see that these
>>>>> three nodes lost connection (no response), but after I SSHed into these
>>>>> servers, Cassandra was still running, and 'system.log' still had log
>>>>> entries.
>>>>>
>>>>> I think this might be a bug, so could anyone kindly help investigate
>>>>> it? Thanks~
>>>>>
>>>>>> ERROR [CompactionExecutor:1] 2014-04-29 05:55:15,249 CassandraDaemon.java (line 198) Exception in thread Thread[CompactionExecutor:1,1,main]
>>>>>> FSWriteError in /data2/cass/mydb/images/mydb-images-tmp-jb-98219-Filter.db
>>>>>>         at org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.close(SSTableWriter.java:475)
>>>>>>         at org.apache.cassandra.io.util.FileUtils.closeQuietly(FileUtils.java:212)
>>>>>>         at org.apache.cassandra.io.sstable.SSTableWriter.abort(SSTableWriter.java:301)
>>>>>>         at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:209)
>>>>>>         at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>>>>>>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>>>>>>         at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
>>>>>>         at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>>>>>>         at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:197)
>>>>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>         at java.lang.Thread.run(Thread.java:744)
>>>>>> Caused by: java.io.IOException: No space left on device
>>>>>>         at java.io.FileOutputStream.write(Native Method)
>>>>>>         at java.io.FileOutputStream.write(FileOutputStream.java:295)
>>>>>>         at java.io.DataOutputStream.writeInt(DataOutputStream.java:197)
>>>>>>         at org.apache.cassandra.utils.BloomFilterSerializer.serialize(BloomFilterSerializer.java:34)
>>>>>>         at org.apache.cassandra.utils.Murmur3BloomFilter$Murmur3BloomFilterSerializer.serialize(Murmur3BloomFilter.java:44)
>>>>>>         at org.apache.cassandra.utils.FilterFactory.serialize(FilterFactory.java:41)
>>>>>>         at org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.close(SSTableWriter.java:468)
>>>>>>         ... 13 more
>>>>>> ERROR [CompactionExecutor:1] 2014-04-29 05:55:15,406 StorageService.java (line 367) Stopping gossiper
>>>>>>  WARN [CompactionExecutor:1] 2014-04-29 05:55:15,406 StorageService.java (line 281) Stopping gossip by operator request
>>>>>>  INFO [CompactionExecutor:1] 2014-04-29 05:55:15,406 Gossiper.java (line 1271) Announcing shutdown
>>>>>> ERROR [CompactionExecutor:1] 2014-04-29 05:55:17,406 StorageService.java (line 372) Stopping RPC server
>>>>>>  INFO [CompactionExecutor:1] 2014-04-29 05:55:17,406 ThriftServer.java (line 141) Stop listening to thrift clients
>>>>>> ERROR [CompactionExecutor:1] 2014-04-29 05:55:17,417 StorageService.java (line 377) Stopping native transport
>>>>>>  INFO [CompactionExecutor:1] 2014-04-29 05:55:17,504 Server.java (line 181) Stop listening for CQL clients
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
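
One possible way to follow up on the two errors quoted above ("No space left on device" during compaction, then an EOFException while reading compression metadata at startup) is to check free space on every data volume and to look for compression metadata files that are too short to read. The sketch below is only an illustration under stated assumptions, not a Cassandra tool: the mount points are the ones that appear in the quoted logs, the helper names `free_space_report` and `find_short_components` are hypothetical, and it assumes compression metadata is stored in files ending in `-CompressionInfo.db`. The 2-byte threshold is chosen because the trace fails in `readUnsignedShort`, which needs at least 2 bytes.

```python
import os
import shutil


def free_space_report(paths):
    """Return {path: (free_bytes, free_fraction)} for each path that exists."""
    report = {}
    for path in paths:
        if not os.path.isdir(path):
            continue  # skip mount points that are not present on this host
        usage = shutil.disk_usage(path)
        fraction = usage.free / usage.total if usage.total else 0.0
        report[path] = (usage.free, fraction)
    return report


def find_short_components(root, suffix="-CompressionInfo.db", min_bytes=2):
    """List files under root whose name ends with suffix and whose size is
    below min_bytes (candidates for the EOFException seen at startup)."""
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                full = os.path.join(dirpath, name)
                if os.path.getsize(full) < min_bytes:
                    suspects.append(full)
    return sorted(suspects)


if __name__ == "__main__":
    # Assumption: data volumes as they appear in the quoted logs.
    data_dirs = ["/data1", "/data2", "/data4", "/data5"]
    for path, (free, fraction) in free_space_report(data_dirs).items():
        print(f"{path}: {free} bytes free ({fraction:.1%})")
    for path in data_dirs:
        for suspect in find_short_components(path):
            print(f"suspiciously short: {suspect}")
```

A volume that looks healthy now may still have filled up briefly while a compaction was writing its temporary files, so a clean report here does not rule out the disk-space explanation.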
