[ 
https://issues.apache.org/jira/browse/KAFKA-19014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hasil Sharma updated KAFKA-19014:
---------------------------------
    Description: 
A race condition between the threads below leaves a MappedByteBuffer 
referencing a deleted file, and subsequent reads through that buffer can 
crash the JVM.

 

Chain of events:

*Thread - 1 remote-log-reader*

Fetches the offsetIndex from the indexCache, which internally maps the 
physical offset index file as a MappedByteBuffer.

OffsetIndex offsetIndex = 
indexCache.getIndexEntry(segmentMetadata).offsetIndex(); 
([here|https://github.com/apache/kafka/blob/cf7029c0264fd7f7b15c2e98acc874ec8c3403f2/core/src/main/java/kafka/log/remote/RemoteLogManager.java#L1772])

*Thread - 2 index cache thread*

The entry is marked for cleanup, i.e., the physical offset index file is renamed.

*Thread - 3 remote-log-index-cleaner*

The physical offset index file is deleted.

*Thread - 1 remote-log-reader*

Attempts to run a binary search on the MappedByteBuffer, which is now mapped 
to a non-existent file.

long upperBoundOffset = offsetIndex.fetchUpperBoundOffset(startOffsetPosition, 
fetchSize).map(position -> position.offset).orElse(segmentMetadata.endOffset() 
+ 1); 
([here|https://github.com/apache/kafka/blob/3.8/core/src/main/java/kafka/log/remote/RemoteLogManager.java#L1619])
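
For illustration, the lookup that thread 1 performs can be sketched as a binary search over fixed-size entries in a ByteBuffer. This is a simplified sketch, not Kafka's code: the class OffsetIndexSketch, its method names and the sample data are hypothetical, the 8-byte entry layout (4-byte relative offset followed by 4-byte file position) is an assumption, and a heap buffer stands in for the MappedByteBuffer. Each getInt call is the kind of read that touches the mapping in the failing scenario.

```java
import java.nio.ByteBuffer;
import java.util.OptionalLong;

// Hypothetical, simplified stand-in for an offset index lookup.
class OffsetIndexSketch {
    // Assumed entry layout: 4-byte relative offset + 4-byte file position.
    static final int ENTRY_SIZE = 8;

    static int relativeOffset(ByteBuffer buf, int slot) {
        return buf.getInt(slot * ENTRY_SIZE);
    }

    static int physicalPosition(ByteBuffer buf, int slot) {
        return buf.getInt(slot * ENTRY_SIZE + 4);
    }

    // Returns the absolute offset of the first entry whose file position is
    // >= targetPosition, or empty if every entry lies below it.
    static OptionalLong upperBoundOffset(ByteBuffer buf, long baseOffset, int targetPosition) {
        int entries = buf.limit() / ENTRY_SIZE;
        int lo = 0;
        int hi = entries; // search the half-open range [lo, hi)
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (physicalPosition(buf, mid) < targetPosition) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo == entries
                ? OptionalLong.empty()
                : OptionalLong.of(baseOffset + relativeOffset(buf, lo));
    }

    // Three sample entries: (relative offset, file position).
    static ByteBuffer sampleIndex() {
        ByteBuffer buf = ByteBuffer.allocate(3 * ENTRY_SIZE);
        buf.putInt(0).putInt(0);
        buf.putInt(50).putInt(4096);
        buf.putInt(120).putInt(9000);
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer index = sampleIndex();
        System.out.println(upperBoundOffset(index, 1000L, 4000));
        System.out.println(upperBoundOffset(index, 1000L, 10000));
    }
}
```

With a heap buffer an out-of-range read throws an ordinary Java exception. With a mapped buffer whose backing region has become inaccessible, the same read is a raw memory access, which is consistent with the crash being reported in DirectByteBuffer.getInt.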

 

This results in a fatal JVM error (SIGSEGV) with the following stack trace:

 
{code:java}
Stack: [0x000072ee9112d000,0x000072ee9122d000],  sp=0x000072ee9122b360,  free space=1016k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
J 6483 c2 java.nio.DirectByteBuffer.getInt(I)I [email protected] (28 bytes) @ 0x000072f23d2f12f1 [0x000072f23d2f12a0+0x0000000000000051]
j  org.apache.kafka.storage.internals.log.OffsetIndex.relativeOffset(Ljava/nio/ByteBuffer;I)I+5
j  org.apache.kafka.storage.internals.log.OffsetIndex.parseEntry(Ljava/nio/ByteBuffer;I)Lorg/apache/kafka/storage/internals/log/OffsetPosition;+11
j  org.apache.kafka.storage.internals.log.OffsetIndex.parseEntry(Ljava/nio/ByteBuffer;I)Lorg/apache/kafka/storage/internals/log/IndexEntry;+3
j  org.apache.kafka.storage.internals.log.AbstractIndex.binarySearch(Ljava/nio/ByteBuffer;JLorg/apache/kafka/storage/internals/log/IndexSearchType;Lorg/apache/kafka/storage/internals/log/AbstractIndex$SearchResultType;II)I+30
j  org.apache.kafka.storage.internals.log.AbstractIndex.indexSlotRangeFor(Ljava/nio/ByteBuffer;JLorg/apache/kafka/storage/internals/log/IndexSearchType;Lorg/apache/kafka/storage/internals/log/AbstractIndex$SearchResultType;)I+126
j  org.apache.kafka.storage.internals.log.AbstractIndex.smallestUpperBoundSlotFor(Ljava/nio/ByteBuffer;JLorg/apache/kafka/storage/internals/log/IndexSearchType;)I+8
 {code}
 

 

As per the MappedByteBuffer documentation 
([here|https://devdocs.io/openjdk~17/java.base/java/nio/mappedbytebuffer]):

All or part of a mapped byte buffer may become inaccessible at any time, for 
example if the mapped file is truncated. An attempt to access an inaccessible 
region of a mapped byte buffer will not change the buffer's content and will 
cause an unspecified exception to be thrown either at the time of the access or 
at some later time. It is therefore strongly recommended that appropriate 
precautions be taken to avoid the manipulation of a mapped file by this 
program, or by a concurrently running program, except to read or write the 
file's content.
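
One possible precaution, sketched here only as an illustration and not as the fix adopted for this issue, is to guard both index reads and cleanup with a read-write lock so that the file cannot be removed while a lookup is in flight. The class GuardedIndexEntry and its methods are hypothetical, and a heap ByteBuffer stands in for the MappedByteBuffer so the example runs without touching real files.

```java
import java.nio.ByteBuffer;
import java.util.Optional;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

// Hypothetical wrapper: readers share the read lock, cleanup takes the write lock.
class GuardedIndexEntry {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private ByteBuffer buffer;          // stands in for the mapped index buffer
    private boolean cleanedUp = false;

    GuardedIndexEntry(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    // Runs the lookup only if the entry has not been cleaned up yet.
    <T> Optional<T> read(Function<ByteBuffer, T> lookup) {
        lock.readLock().lock();
        try {
            if (cleanedUp) {
                return Optional.empty();
            }
            // duplicate() gives the lookup its own position and limit.
            return Optional.ofNullable(lookup.apply(buffer.duplicate()));
        } finally {
            lock.readLock().unlock();
        }
    }

    // Waits for in-flight readers, then drops the buffer and deletes the file once.
    void cleanup(Runnable deleteFile) {
        lock.writeLock().lock();
        try {
            if (cleanedUp) {
                return;
            }
            cleanedUp = true;
            buffer = null;
            deleteFile.run();
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(8).putInt(0, 7).putInt(4, 4096);
        GuardedIndexEntry entry = new GuardedIndexEntry(buf);
        System.out.println(entry.read(b -> b.getInt(4)));   // lookup succeeds
        entry.cleanup(() -> System.out.println("index file deleted"));
        System.out.println(entry.read(b -> b.getInt(4)));   // entry is gone
    }
}
```

In this sketch a reader that arrives after cleanup receives an empty result and would have to re-fetch the index, instead of reading through a buffer whose backing file is gone. The trade-off is that cleanup blocks until all in-flight lookups finish.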


> Potential race condition in remote-log-reader and remote-log-index-cleaner 
> thread
> ---------------------------------------------------------------------------------
>
>                 Key: KAFKA-19014
>                 URL: https://issues.apache.org/jira/browse/KAFKA-19014
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 3.8.1
>            Reporter: Hasil Sharma
>            Priority: Major
>              Labels: tiered-storage



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
