[ 
https://issues.apache.org/jira/browse/CASSANDRA-20692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17956136#comment-17956136
 ] 

guo Maxwell edited comment on CASSANDRA-20692 at 6/4/25 3:14 PM:
-----------------------------------------------------------------

I opened a PR against trunk: https://github.com/apache/cassandra/pull/4195 . I am 
running the 
[unit tests|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch-5/295/]
 here.
I will prepare a PR for 5.0 once this one is accepted.
I also ran a quick local test on my Mac (a single C* node) with cassandra-stress, just 
to see whether direct IO makes any difference to Cassandra's write path. 
The test is very simple. The client commands were:

{code:java}
./cassandra-stress write duration=10m -rate threads=10 -graph file=/tmp/graph_mmap_wal.html title=mmap_wal_graph
./cassandra-stress write duration=10m -rate threads=10 -graph file=/tmp/graph_direct_wal.html title=direct_wal_graph
{code}
 
I flushed the node after each run (I did two rounds of tests). 
The results look reasonable: TPS in direct IO mode is higher than in mmap mode.

 !SCR-20250604-txwj.png!  

!SCR-20250604-txzv.png! 


> Direct IO commit log does not flush data safely
> -----------------------------------------------
>
>                 Key: CASSANDRA-20692
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20692
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Local/Commit Log
>            Reporter: Ariel Weisberg
>            Assignee: guo Maxwell
>            Priority: Urgent
>             Fix For: 5.0.x, 5.x
>
>         Attachments: SCR-20250604-txwj.png, SCR-20250604-txzv.png
>
>
> [~maxwellguo] spotted this a few days ago.
> The commit log is not safe as currently written with Direct IO. It writes to 
> the file, but doesn't sync the metadata on flush. That means the commit log 
> may claim it has flushed (and made durable) the data, but the filesystem 
> journal has not been flushed so the file length could be wrong and could 
> truncate the file on restart.
> Additionally Direct IO doesn't actually make data durable on disk (emit write 
> barriers) it just bypasses the page cache. [It doesn't even guarantee the 
> disk transaction is 
> complete.|https://man7.org/linux/man-pages/man2/open.2.html#:~:text=The%20O_DIRECT%20flag,for%20further%20discussion.]
>  If the disk cache is volatile then it can lose metadata and data.
> It can probably be fixed pretty trivially by opening the file with {{O_DSYNC}} 
> because the commit log writes up to the entire segment when it flushes so 
> there is no issue with needing to add buffering to avoid too many small 
> writes.
> [~amit_pawar] [~jlewandowski] [~blambov] 
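
To illustrate the two durability options discussed in the description, here is a minimal Java sketch. It is not the patch in the PR above and it does not use Cassandra's commit log classes; the class and method names ({{DurableWriteSketch}}, {{writeWithDsync}}, {{writeThenForce}}) are hypothetical. It assumes the fix amounts to either opening the channel with {{StandardOpenOption.DSYNC}} or calling {{FileChannel.force(true)}} after the write. The direct IO open option is deliberately left out, because it needs block-aligned buffers and platform support that would make the sketch less portable.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not Cassandra code.
public class DurableWriteSketch {

    // Option 1: open with DSYNC, so every update to the file's content is
    // written synchronously to the underlying storage device.
    public static long writeWithDsync(Path segment, byte[] payload) {
        try (FileChannel ch = FileChannel.open(segment,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.DSYNC)) {
            return writeFully(ch, ByteBuffer.wrap(payload));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Option 2: ordinary write, then force(true) to push both content and
    // file metadata (such as the length) to the storage device.
    public static long writeThenForce(Path segment, byte[] payload) {
        try (FileChannel ch = FileChannel.open(segment,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            long written = writeFully(ch, ByteBuffer.wrap(payload));
            ch.force(true);
            return written;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // A single write() call may be partial, so loop until the buffer is drained.
    private static long writeFully(FileChannel ch, ByteBuffer buf) throws IOException {
        long written = 0;
        while (buf.hasRemaining()) {
            written += ch.write(buf);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Path segment = Files.createTempFile("commitlog-sketch", ".log");
        byte[] payload = "mutation".getBytes(StandardCharsets.UTF_8);
        System.out.println("DSYNC wrote " + writeWithDsync(segment, payload) + " bytes");
        System.out.println("force(true) wrote " + writeThenForce(segment, payload) + " bytes");
        Files.deleteIfExists(segment);
    }
}
```

With {{DSYNC}} the cost of durability is paid on every write call, which fits the description's point that the commit log writes large chunks per flush. With {{force(true)}} the cost is paid once per explicit sync. Which of the two the actual fix uses is not stated here.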


