Hi,

With compaction there can be hot and cold data mixed together. So we want
to drop the data and then warm it via early opening so only the hot data is
in the cache.

Some of those cases are for old sstables that have been rewritten or
discarded, so the data is entirely defunct. The files might not get deleted
right away, though, so they add pressure to the cache until they are evicted.

In the instance you are looking at, in the tidier, won't there always be a
reference held in the current view for the column family? I don't think it
would be constantly evicting them, nor closing/reopening and remapping the
file.


Specifically regarding the behavior in different kernels, from `man
posix_fadvise`: "In kernels before 2.6.6, if len was specified as 0, then
this was interpreted literally as "zero bytes", rather than as meaning "all
bytes through to the end of the file"."
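
If we did want to guard on the kernel version, here is a minimal sketch of
what that could look like. The class and method names are hypothetical
(nothing like this is in the tree as far as I know), and it assumes the
release string starts with a dotted numeric version, e.g.
"3.13.0-98-generic". Anything it can't parse is treated as the old behavior.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Cassandra.
final class FadviseKernelCheck
{
    // Leading "major.minor[.patch]" of a release string such as "3.13.0-98-generic".
    private static final Pattern VERSION = Pattern.compile("^(\\d+)\\.(\\d+)(?:\\.(\\d+))?");

    // Per the man page quoted above: in kernels before 2.6.6, len == 0 meant
    // literally zero bytes; from 2.6.6 on it means "through to the end of the file".
    static boolean lenZeroMeansToEndOfFile(String osVersion)
    {
        Matcher m = VERSION.matcher(osVersion);
        if (!m.find())
            return false; // unrecognized format: assume the old behavior to be safe

        int major = Integer.parseInt(m.group(1));
        int minor = Integer.parseInt(m.group(2));
        int patch = m.group(3) == null ? 0 : Integer.parseInt(m.group(3));

        if (major != 2)
            return major > 2;
        if (minor != 6)
            return minor > 6;
        return patch >= 6;
    }
}
```

A caller would pass in something like System.getProperty("os.version"),
which on Linux should be the kernel release string. When the check returns
false, the caller could pass the real file length instead of 0.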

Not ideal, but at least not actively harmful, right? The cache is supposed
to be scan/flush resistant.

Ariel

On Tue, Oct 18, 2016 at 11:57 AM, Michael Kjellman <
mkjell...@internalcircle.com> wrote:

> Right, so in SSTableReader#GlobalTidy$tidy it does:
> // don't ideally want to dropPageCache for the file until all instances
> // have been released
> CLibrary.trySkipCache(desc.filenameFor(Component.DATA), 0, 0);
> CLibrary.trySkipCache(desc.filenameFor(Component.PRIMARY_INDEX), 0, 0);
>
> It seems to me every time the reference is released on a new sstable we
> would immediately tidy() it and then call posix_fadvise with
> POSIX_FADV_DONTNEED with an offset of 0 and a length of 0 (which I'm
> thinking is done with respect to the API behavior in modern Linux kernel
> builds?). Am I reading things correctly here? Sorta hard as there are many
> different code paths through which tidy() could be called on the reference.
>
> Why would we want to drop the segment we just wrote from the page cache --
> wouldn't that most likely be the hottest data, and even if it turned out
> not to be, wouldn't it be better in this case to let the kernel be smart at
> what it's best at?
>
> best,
> kjellman
>
> > On Oct 18, 2016, at 8:50 AM, Jake Luciani <jak...@gmail.com> wrote:
> >
> > The main point is to avoid keeping things in the page cache that are no
> > longer needed like compacted data that has been early opened elsewhere.
> >
> > On Oct 18, 2016 11:29 AM, "Michael Kjellman" <mkjell...@internalcircle.com>
> > wrote:
> >
> >> We use posix_fadvise in a bunch of places, and in stereotypical Cassandra
> >> fashion no comments were provided.
> >>
> >> There is a check that the OS is Linux (okay, a start) but it turns out the
> >> behavior of providing a length of 0 to posix_fadvise changed in some 2.6
> >> kernels. We don't check the kernel version -- or even note it.
> >>
> >> What is the *expected* outcome of our use of posix_fadvise -- not what
> >> does it do or not do today -- but what problem was it added to solve and
> >> what's the expected behavior regardless of kernel versions?
> >>
> >> best,
> >> kjellman
> >>
> >> Sent from my iPhone
>
>
