Specifically regarding the behavior in different kernels, from
`man posix_fadvise`: "In kernels before 2.6.6, if len was specified as 0, then
this was interpreted literally as 'zero bytes', rather than as meaning 'all
bytes through to the end of the file'."
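
One way to sidestep the difference would be to never pass a length of 0 and to spell out the range instead. A minimal sketch, assuming the caller already knows the file size (for example from `java.io.File#length()`); `FadviseRange` and `explicitLength` are hypothetical names, not anything in the Cassandra code base:

```java
// Hypothetical helper, not part of Cassandra.
final class FadviseRange {
    /**
     * Returns the length to pass to posix_fadvise so that a requested len of 0
     * ("to end of file") is spelled out explicitly and means the same thing on
     * kernels before and after 2.6.6.
     */
    static long explicitLength(long fileSize, long offset, long len) {
        if (len != 0) {
            return len; // caller already asked for a specific range
        }
        return Math.max(0L, fileSize - offset); // bytes from offset to end of file
    }
}
```

For a 4096-byte file, `explicitLength(4096, 0, 0)` gives 4096. This only holds if the file does not grow between reading its size and issuing the advice, which should be true for an sstable that has been fully written.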

On Oct 18, 2016, at 8:57 AM, Michael Kjellman 
<mkjell...@internalcircle.com> wrote:

Right, so in SSTableReader#GlobalTidy$tidy it does:
// don't ideally want to dropPageCache for the file until all instances have
// been released
CLibrary.trySkipCache(desc.filenameFor(Component.DATA), 0, 0);
CLibrary.trySkipCache(desc.filenameFor(Component.PRIMARY_INDEX), 0, 0);

It seems to me that every time the reference is released on a new sstable we 
immediately tidy() it and then call posix_fadvise with POSIX_FADV_DONTNEED, an 
offset of 0, and a length of 0 (which I'm thinking relies on the API behavior 
in modern Linux kernel builds?). Am I reading things correctly here? It's sorta 
hard to tell, as there are many different code paths through which the 
reference could have tidy() called.

Why would we want to drop the segment we just wrote from the page cache -- 
wouldn't that most likely be the hottest data, and even if it turned out not 
to be, wouldn't it be better in this case to let the kernel be smart at what 
it's best at?

best,
kjellman

On Oct 18, 2016, at 8:50 AM, Jake Luciani 
<jak...@gmail.com> wrote:

The main point is to avoid keeping things in the page cache that are no
longer needed, like compacted data that has been early opened elsewhere.

On Oct 18, 2016 11:29 AM, "Michael Kjellman" 
<mkjell...@internalcircle.com> wrote:

We use posix_fadvise in a bunch of places, and in stereotypical Cassandra
fashion no comments were provided.

There is a check that the OS is Linux (okay, a start) but it turns out the
behavior of providing a length of 0 to posix_fadvise changed in some 2.6
kernels. We don't check the kernel version -- or even note it.
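
To make the missing check concrete, here is a rough sketch of what a kernel version test could look like. It assumes the 2.6.6 cutoff documented in the posix_fadvise man page and that the JVM's `os.version` property carries the kernel release on Linux (it normally matches `uname -r`); `KernelVersionCheck` and its methods are hypothetical, not existing Cassandra code:

```java
// Hypothetical helper, not part of Cassandra.
final class KernelVersionCheck {
    // First kernel release where len == 0 means "through to the end of the file",
    // according to the posix_fadvise man page.
    private static final int[] LEN_ZERO_MEANS_EOF_SINCE = {2, 6, 6};

    /** Takes the first three runs of digits in the release string; missing ones stay 0. */
    static int[] parse(String release) {
        int[] parts = new int[3];
        int filled = 0;
        for (String token : release.split("[^0-9]+")) {
            if (token.isEmpty()) continue; // split() yields "" if the string starts with a non-digit
            parts[filled++] = Integer.parseInt(token);
            if (filled == parts.length) break;
        }
        return parts;
    }

    /** True if the given kernel release is 2.6.6 or newer. */
    static boolean lenZeroMeansWholeFile(String release) {
        int[] v = parse(release);
        for (int i = 0; i < v.length; i++) {
            if (v[i] != LEN_ZERO_MEANS_EOF_SINCE[i]) {
                return v[i] > LEN_ZERO_MEANS_EOF_SINCE[i];
            }
        }
        return true; // exactly 2.6.6
    }

    public static void main(String[] args) {
        // Assumption: on Linux, os.version reports the kernel release.
        String release = System.getProperty("os.version", "0");
        System.out.println(release + " -> len 0 covers whole file: "
                           + lenZeroMeansWholeFile(release));
    }
}
```

A check like this would only be meaningful after the existing "is this Linux" test, since `os.version` means something different on other platforms.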

What is the *expected* outcome of our use of posix_fadvise -- not what
it does or doesn't do today -- but what problem was it added to solve, and
what's the expected behavior regardless of kernel version?

best,
kjellman
