[ https://issues.apache.org/jira/browse/CASSANALYTICS-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lukasz Antoniak updated CASSANALYTICS-147:
------------------------------------------
    Description: 
Reading a BTI partition index fails when reading the trailer of a file that is 
not aligned to 4096-byte pages.
{code:java}
Caused by: FSReadError
        at org.apache.cassandra.io.util.ChannelProxy.read(ChannelProxy.java:157)
        at org.apache.cassandra.io.util.SimpleChunkReader.readChunk(SimpleChunkReader.java:52)
        at org.apache.cassandra.io.util.BufferManagingRebufferer.rebuffer(BufferManagingRebufferer.java:88)
        at org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:82)
        at org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:67)
        at org.apache.cassandra.io.util.RebufferingInputStream.readByte(RebufferingInputStream.java:185)
        at org.apache.cassandra.io.util.RebufferingInputStream.readBigEndianPrimitiveSlowly(RebufferingInputStream.java:149)
        at org.apache.cassandra.io.util.RebufferingInputStream.readLong(RebufferingInputStream.java:243)
        at org.apache.cassandra.io.sstable.format.bti.PartitionIndex.load(PartitionIndex.java:226)
        at org.apache.cassandra.spark.reader.ReaderUtils.keysFromIndex(ReaderUtils.java:256)
        at org.apache.cassandra.spark.reader.ReaderUtils.keysFromIndex(ReaderUtils.java:231)
        at org.apache.cassandra.spark.reader.SSTableCache.lambda$keysFromIndex$1(SSTableCache.java:123)
        at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4903)
        at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3574)
        at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2316)
        at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2190)
        at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2080)
        ... 13 more
{code}
Assume an index file of 2527234 bytes. {{PartitionIndex}} first reads 4096 
bytes at position 2523136, and then the remaining 2 bytes starting at position 
2527232. {{BufferingInputStream}} fails to read the last byte because of a +1 
shift in the computed read range. As a consequence, the FINISH marker is added 
too soon and an EOF error is raised.

The unit test implemented in the PR demonstrates the faulty behaviour.
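
The read positions quoted above follow from simple page arithmetic, which the sketch below reproduces. It is only an illustration under stated assumptions: the class {{TailReadSketch}} and its methods are hypothetical names, not code from {{BufferingInputStream}} or the PR, and {{bytesInPageShifted}} merely mimics what a +1 shift in the range computation could look like.

```java
public class TailReadSketch {
    // Page size used in the example from the issue description.
    static final int PAGE_SIZE = 4096;

    /** Start offset of the page that contains the given position. */
    public static long pageStart(long position) {
        return position - (position % PAGE_SIZE);
    }

    /** Bytes readable from pageStart, treating fileSize as an exclusive end offset. */
    public static int bytesInPage(long pageStart, long fileSize) {
        return (int) Math.min(PAGE_SIZE, fileSize - pageStart);
    }

    /**
     * Hypothetical faulty variant: the range start is shifted by one,
     * so the final partial page appears one byte shorter than it is.
     */
    public static int bytesInPageShifted(long pageStart, long fileSize) {
        return (int) Math.min(PAGE_SIZE, fileSize - (pageStart + 1));
    }

    public static void main(String[] args) {
        long fileSize = 2527234L;

        // Page holding the last byte of the file, and the full page before it.
        long lastPage = pageStart(fileSize - 1);
        long previousPage = lastPage - PAGE_SIZE;

        System.out.println("previous page start: " + previousPage);
        System.out.println("previous page bytes: " + bytesInPage(previousPage, fileSize));
        System.out.println("last page start:     " + lastPage);
        System.out.println("last page bytes:     " + bytesInPage(lastPage, fileSize));
        System.out.println("shifted last bytes:  " + bytesInPageShifted(lastPage, fileSize));
    }
}
```

With the file size from the description, the correct computation yields 2 bytes for the final page, while the shifted variant yields only 1, which matches the reported symptom of the last byte never being read.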

 


> BufferingInputStream fails to read last unaligned chunk
> -------------------------------------------------------
>
>                 Key: CASSANALYTICS-147
>                 URL: https://issues.apache.org/jira/browse/CASSANALYTICS-147
>             Project: Apache Cassandra Analytics
>          Issue Type: Bug
>          Components: Reader
>            Reporter: Lukasz Antoniak
>            Priority: Normal


