Re: [DISCUSS] CEP-11: Pluggable memtable implementations

2021-07-22 Thread Michael Burman
On Wed, 21 Jul 2021 at 17:24, Branimir Lambov wrote:

> > Why is flushing control bad to do in CFS and better in the
>   memtable?
>
> I wonder why you would understand this as something that takes away
> control instead of giving it. The CFS is not configurable. With the
> CEP, memtables are configurable at the table level. It is entirely
> possible to implement a memtable wrapper that provides any of the
> examples of functionalities you mention -- and that would be fully
> configurable (just as example, one could very well select a
> time-series-optimized-flush wrapper over skip-list memtable).
>
>
I think this was a bit of a miscommunication. I'm not in favor of keeping it
in the CFS, but at least to me (as a reader) the CEP suggests that flushing
behavior becomes tied to the Memtable implementation rather than being
configurable at the table level. That would not reduce the coupling of
different flush strategies; it would just move it from the CFS into each
Memtable implementation. With multiple Memtable implementations, the
reusable parts of flushing could then become difficult to reuse. If that is
not the intention, then good.
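
To make the reuse concern concrete, here is a minimal sketch of a flush policy
kept separate from the memtable storage, along the lines of the wrapper idea
quoted above. All names here (SimpleMemtable, FlushPolicy,
PolicyWrappedMemtable, and so on) are hypothetical and heavily simplified;
they are not the CEP-11 interfaces or Cassandra's actual Memtable API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified interfaces for illustration only; these are not
// the CEP-11 interfaces or Cassandra's actual Memtable API.
interface SimpleMemtable {
    void put(String key, String value);
    long liveDataSize();
}

interface FlushPolicy {
    boolean shouldFlush(SimpleMemtable memtable);
}

// One storage implementation; another implementation could be swapped in
// without touching the flush policy.
class ListMemtable implements SimpleMemtable {
    private final List<String> entries = new ArrayList<>();
    private long size = 0;

    public void put(String key, String value) {
        entries.add(key + "=" + value);
        size += key.length() + value.length();
    }

    public long liveDataSize() {
        return size;
    }
}

// A reusable policy that only depends on the SimpleMemtable interface.
class SizeThresholdPolicy implements FlushPolicy {
    private final long maxBytes;

    SizeThresholdPolicy(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    public boolean shouldFlush(SimpleMemtable memtable) {
        return memtable.liveDataSize() >= maxBytes;
    }
}

// Wrapper that combines any storage implementation with any policy.
// It only counts flush requests; a real system would switch memtables here.
class PolicyWrappedMemtable implements SimpleMemtable {
    private final SimpleMemtable delegate;
    private final FlushPolicy policy;
    private int flushRequests = 0;

    PolicyWrappedMemtable(SimpleMemtable delegate, FlushPolicy policy) {
        this.delegate = delegate;
        this.policy = policy;
    }

    public void put(String key, String value) {
        delegate.put(key, value);
        if (policy.shouldFlush(delegate)) {
            flushRequests++;
        }
    }

    public long liveDataSize() {
        return delegate.liveDataSize();
    }

    int flushRequests() {
        return flushRequests;
    }
}
```

In this shape the policy is selected per wrapped instance, so the same
SizeThresholdPolicy could sit in front of several storage implementations.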


>
> This is another question that the proposal leaves to the memtable
> implementation (or wrapper), but it does make sense to make sure the
> interfaces provide the necessary support for sharding
>

+1 to this, that's a good limitation of scope to move forward. I think this
was originally touched on in 7282 (where I had it in the memtable impl), but
then got pushed one step outside.

> writesShouldSkipCommitLog is a result of scope reduction (call it
> laziness on my part). I could not find a way to tell if commit log
> data may be required for point-in-time-restore or any other feature,
> and the existing method of turning the commit log off does not have
> the right granularity. I am very open to suggestions here.
>

Could this be limited to a single parameter? I'm not sure that the
combination of "isDurable" + "shouldSkip" is more useful than a single
"shouldWrite" (etc). But I also wonder, in cases where point-in-time restore
is required, how one could achieve it without a commit log (can a persistent
memory memtable be rolled back?). That does have an effect on backups. I have
to read your implementation to see how you intended to rewrite the process
from Keyspace (where the requirement for "isDurable" starts from).
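
As a rough illustration of what I mean by a single parameter: the sketch below
folds the two flags into one decision. The class and method names are
hypothetical, and the rule itself (write only when the keyspace is durable and
the memtable does not ask to skip) is my assumption about the intended
semantics, not something taken from the CEP or the implementation.

```java
// Hypothetical helper for illustration only; not part of Cassandra or CEP-11.
final class CommitLogDecision {
    private CommitLogDecision() {
    }

    // Assumed rule: the commit log is written only if the keyspace asks for
    // durable writes AND the memtable does not provide its own persistence.
    static boolean shouldWrite(boolean keyspaceIsDurable, boolean memtableSkipsCommitLog) {
        return keyspaceIsDurable && !memtableSkipsCommitLog;
    }
}
```

Callers would then check only shouldWrite(...) instead of reasoning about the
two flags separately. It does not answer the point-in-time restore question.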

Although I do feel like persistent memory exceptions make stuff more
complex.



>
>
>
> > Why is streaming in the memtable? [...] the wanted behavior is just
>   disabling automated flushing
>
> Yes, if zero-copy-streaming is not enabled. And that's exactly what
> this method is there for -- to make sure sstables are not copied
> whole, and that a flush is not done at the end.
>
> Regards,
> Branimir
>
> On Wed, Jul 21, 2021 at 4:33 PM bened...@apache.org wrote:
>
> > I would love to help out with this in any way that I can, FYI. Definitely
> > one of the more impactful performance improvements to the codebase, given
> > the benefits to compaction and memory behaviour.
> >
> > From: bened...@apache.org 
> > Date: Wednesday, 21 July 2021 at 14:32
> > To: dev@cassandra.apache.org 
> > Subject: Re: [DISCUSS] CEP-11: Pluggable memtable implementations
> > > memtable-as-a-commitlog-index
> >
> > Heh, based on 7282? Yeah, I’ve had this idea for a while now (actually
> > there was a paper that did this a long time ago), and it could be very
> > nice (if for no other benefit than reducing heap utilisation). I don’t
> > think this requires that they be modelled as the same concept, however,
> > only that the Memtable must be able to receive an address into a commit
> > log entry and to adopt partial ownership over the entry’s lifecycle.
> >
> >
> > From: Branimir Lambov 
> > Date: Wednesday, 21 July 2021 at 14:28
> > To: dev@cassandra.apache.org 
> > Subject: Re: [DISCUSS] CEP-11: Pluggable memtable implementations
> > > In general, I think we need to make up our mind as to whether we
> >   consider the Memtable and CommitLog one logical entity [...], or
> >   whether we want to further untangle those two components from an
> >   architectural perspective which we started down that road on with
> >   the pluggable storage engine work.
> >
> > This CEP is intentionally not attempting to answer this question. FWIW
> > I do not see them as separable (there's evidence to this fact in the
> > codebase), but there are valid secondary uses of the commit log that
> > are served well enough by the current architecture.
> >
> > It is important, however, to let the memtable implementation opt out,
> > to permit it to provide its own solution for data persistence.
> >
> > We should revisit this in the future, especially if Benedict's shared
> > log facility and my plans for a memtable-as-a-commitlog-index
> > evolve.
> >
> > Regards,
> > Branimir
> >
> > On Wed, Jul 21, 2021 at 1:34 PM Michael Burman  wrote:
> >
> > > Hi,
> > >
> > > It is nice to see these going forward (and a great us

[VOTE] Release Apache Cassandra 4.0.0 (third time is the charm)

2021-07-22 Thread Brandon Williams
I am proposing the test build of Cassandra 4.0.0 for release.

sha1: 902b4d31772eaa84f05ffdc1e4f4b7a66d5b17e6
Git: 
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0.0-tentative
Maven Artifacts:
https://repository.apache.org/content/repositories/orgapachecassandra-1244/org/apache/cassandra/cassandra-all/4.0.0/

The Source and Build Artifacts, and Debian and RPM packages and
repositories are available here:
https://dist.apache.org/repos/dist/dev/cassandra/4.0.0/

The vote will be open for 72 hours (longer if needed). Everyone who
has tested the build is invited to vote. Votes by PMC members are
considered binding. A vote passes if there are at least three binding
+1s and no -1's.

[1]: CHANGES.txt:
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0.0-tentative
[2]: NEWS.txt: 
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0.0-tentative

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [VOTE] Release Apache Cassandra 4.0.0 (third time is the charm)

2021-07-22 Thread Ekaterina Dimitrova
+1 (nb)

On Thu, 22 Jul 2021 at 18:40, Brandon Williams wrote:

> I am proposing the test build of Cassandra 4.0.0 for release.
>
> sha1: 902b4d31772eaa84f05ffdc1e4f4b7a66d5b17e6
> Git:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0.0-tentative
> Maven Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-1244/org/apache/cassandra/cassandra-all/4.0.0/
>
> The Source and Build Artifacts, and Debian and RPM packages and
> repositories are available here:
> https://dist.apache.org/repos/dist/dev/cassandra/4.0.0/
>
> The vote will be open for 72 hours (longer if needed). Everyone who
> has tested the build is invited to vote. Votes by PMC members are
> considered binding. A vote passes if there are at least three binding
> +1s and no -1's.
>
> [1]: CHANGES.txt:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0.0-tentative
> [2]: NEWS.txt:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0.0-tentative


Re: [VOTE] Release Apache Cassandra 4.0.0 (third time is the charm)

2021-07-22 Thread Jeff Jirsa
+1

> On Jul 22, 2021, at 3:41 PM, Brandon Williams wrote:
> 
> I am proposing the test build of Cassandra 4.0.0 for release.
> 
> sha1: 902b4d31772eaa84f05ffdc1e4f4b7a66d5b17e6
> Git: 
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0.0-tentative
> Maven Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-1244/org/apache/cassandra/cassandra-all/4.0.0/
> 
> The Source and Build Artifacts, and Debian and RPM packages and
> repositories are available here:
> https://dist.apache.org/repos/dist/dev/cassandra/4.0.0/
> 
> The vote will be open for 72 hours (longer if needed). Everyone who
> has tested the build is invited to vote. Votes by PMC members are
> considered binding. A vote passes if there are at least three binding
> +1s and no -1's.
> 
> [1]: CHANGES.txt:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0.0-tentative
> [2]: NEWS.txt: 
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0.0-tentative