Re: Audit logging to tables.

2019-04-09 Thread Sagar
Thanks Jon.

So, the virtual table will basically have to read the audit log file, which
can certainly be implemented. One thing to consider: suppose, for example,
some external system polls the audit log table on a given node every minute
to get incremental audit logs. It would keep track of the last offset read
from the file and query the virtual table from that point onward. I think
such querying APIs are already present in the current Virtual table
implementation, so in this case the query would read the file from the
specified offset. We could use RandomAccessFile for this, for example (see
the sketch just below).
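
A minimal sketch of that offset-based read, assuming a caller that tracks
its own offsets and a hypothetical log path; this is plain JDK I/O, not an
actual Cassandra API:

    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.charset.StandardCharsets;

    public final class AuditLogTailer
    {
        /** Bytes read, plus the offset to resume from on the next poll. */
        public static final class TailResult
        {
            public final String data;
            public final long nextOffset;

            TailResult(String data, long nextOffset)
            {
                this.data = data;
                this.nextOffset = nextOffset;
            }
        }

        /** Reads everything between {@code offset} and the current end of file. */
        public static TailResult readFrom(String path, long offset) throws IOException
        {
            try (RandomAccessFile raf = new RandomAccessFile(path, "r"))
            {
                long length = raf.length();
                if (offset >= length)
                    return new TailResult("", offset); // nothing new since last poll

                raf.seek(offset);
                // Cap each poll at 1 MB so a single query stays bounded.
                byte[] buf = new byte[(int) Math.min(length - offset, 1 << 20)];
                int read = raf.read(buf);
                return new TailResult(new String(buf, 0, read, StandardCharsets.UTF_8),
                                      offset + read);
            }
        }
    }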

Another thing to consider: what if the audit log files get rolled over (does
this happen?) and the hypothetical external system above has not fully read
the rolled-over file? In that case, it would need to finish reading the old
audit log file and then read the remaining bytes from the new one. One way
to handle this is sketched below.
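
Again purely as a sketch: the rolled-file naming convention here is an
assumption for illustration, not how the audit logger actually names its
files, and it reuses the AuditLogTailer sketched above:

    import java.io.File;
    import java.io.IOException;

    public final class RolloverAwareTailer
    {
        /**
         * Drains the remainder of a rolled-over file before switching to the
         * active log. Assumes (illustratively) that the roller renames
         * "audit.log" to "audit.log.1".
         */
        public static String poll(File logDir, long lastOffset) throws IOException
        {
            File active = new File(logDir, "audit.log");
            File rolled = new File(logDir, "audit.log.1");

            StringBuilder out = new StringBuilder();
            // Heuristic: if our saved offset is past the active file's length,
            // the file we were reading must have been rolled - finish it first.
            if (rolled.exists() && lastOffset > active.length())
            {
                out.append(AuditLogTailer.readFrom(rolled.getPath(), lastOffset).data);
                lastOffset = 0; // the new active file starts from the beginning
            }
            out.append(AuditLogTailer.readFrom(active.getPath(), lastOffset).data);
            return out.toString();
        }
    }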

Thanks!
Sagar.

On Thu, Apr 4, 2019 at 8:59 AM Jon Haddad wrote:

> The virtual table could read the data out of the audit log, just like
> it could read a hosts file or list the output of the ps command.
>
>
> On Wed, Apr 3, 2019 at 8:02 PM Sagar wrote:
> >
> > Thanks Alex.
> >
> > I was going through the implementation of Virtual tables thus far, and
> > the data that we get when we query against them seems to be either
> > point-in-time, like CachesTable, or fairly static, like Settings. Audit
> > log data doesn't fall into either of those two categories: it is more
> > of a stream of events that happen on that node, and almost all events
> > need to be captured. The class AbstractDataSet used by the Virtual
> > tables suggests that it can be built on demand and thrown away after
> > use (which is what happens currently) or can be persisted. IMO, if we
> > need audit logs on Virtual tables, we will have to go the route of
> > persisting the generated events.
> >
> > Sagar.
> >
> > On Sun, Mar 31, 2019 at 11:35 PM Alex Ott wrote:
> >
> > > Hi Sagar
> > >
> > > 3.x/4.x are the versions of the open source variant of the drivers,
> > > while the DSE versions are 1.x/2.x.
> > >
> > > Description of this function is at
> > > https://docs.datastax.com/en/drivers/java/3.6/
> > >
> > > Sagar at "Tue, 26 Mar 2019 22:12:56 +0530" wrote:
> > >  S> Thanks Andy,
> > >
> > >  S> This enhancement is in the DataStax version and not in the Apache
> > >  S> Cassandra driver?
> > >
> > >  S> Thanks!
> > >  S> Sagar.
> > >
> > >  S> On Tue, Mar 26, 2019 at 3:23 AM Andy Tolbert wrote:
> > >
> > >  >> Hello
> > >  >>
> > >  >> > 1) yes its local only. The driver by default does connect to
> > >  >> > each host though so its pretty trivial to have a load balancing
> > >  >> > policy that you can direct to specific hosts (this should
> > >  >> > probably be in the driver so people don't have to keep
> > >  >> > reimplementing it).
> > >  >>
> > >  >> The capability to target a specific host was added to the Java
> > >  >> driver (and others) recently, in anticipation of Virtual Tables,
> > >  >> in version 3.6.0+ via Statement.setHost [1].  This will bypass
> > >  >> the load balancing policy completely and send the request
> > >  >> directly to that Host (assuming it's connected).
> > >  >>
> > >  >> The drivers also parse virtual table metadata.
> > >  >>
> > >  >> [1]:
> > >  >> https://docs.datastax.com/en/drivers/java/3.6/com/datastax/driver/core/Statement.html#setHost-com.datastax.driver.core.Host-
> > >  >>
> > >  >> Thanks!
> > >  >> Andy
> > >  >>
> > >  >> On Mon, Mar 25, 2019 at 11:29 AM Sagar wrote:
> > >  >>
> > >  >> > Thanks Chris. I got caught up with a few things and couldn't
> > >  >> > reply back. So, I looked at this again, and I think virtual
> > >  >> > tables can be used for audit logging, considering that they
> > >  >> > don't have any replication - so we won't be clogging the
> > >  >> > network with replication IO.
> > >  >> >
> > >  >> > In terms of storage, from what I understood, virtual tables
> > >  >> > don't have any associated SSTables. So, is data stored only in
> > >  >> > Memtables? Can you please shed some light on storage and
> > >  >> > retention because of this?
> > >  >> >
> > >  >> > Lastly, on the driver changes: I agree we should make the
> > >  >> > driver able to contact specific hosts with the correct LBP. If
> > >  >> > we do go this route, I can start taking a look at it.
> > >  >> >
> > >  >> > Thanks!
> > >  >> > Sagar.
> > >  >> >
> > >  >> > On Wed, Mar 6, 2019 at 10:42 PM Chris Lohfink wrote:
> > >  >> >
> > >  >> > > 1) yes its local only. The driver by default does connect to
> > >  >> > > each host though so its pretty trivial to have a load
> > >  >> > > balancing policy that you can direct to specific hosts (this
> > >  >> > > should probably be in the driver so people don't have to
> > >  >> > > keep reimplementing it).
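
For reference, a short, hedged sketch of the Statement.setHost usage Andy
describes above, against the 3.6 Java driver. The contact point and the
virtual table queried are placeholders:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Host;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.SimpleStatement;
    import com.datastax.driver.core.Statement;

    public class NodeLocalQuery
    {
        public static void main(String[] args)
        {
            // The contact point is a placeholder; any reachable node works.
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
                 Session session = cluster.connect())
            {
                // Virtual tables are node-local, so ask each host directly.
                for (Host host : cluster.getMetadata().getAllHosts())
                {
                    Statement stmt = new SimpleStatement("SELECT * FROM system_views.caches")
                                         .setHost(host); // bypasses the LBP (driver 3.6.0+)
                    for (Row row : session.execute(stmt))
                        System.out.println(host + " -> " + row);
                }
            }
        }
    }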

Re: Audit logging to tables.

2019-04-09 Thread Chris Lohfink
You should really build it on / wait for
https://issues.apache.org/jira/browse/CASSANDRA-14629 (which could use a
reviewer, please, btw!). Then you can expose each entry of the audit log
binlog as a partition in iterator format, and use the settings to find
where the binlog output lives and go from there.

Chris
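
To make that concrete, here is a minimal, hedged sketch of a virtual table
in the spirit Chris describes: one partition per audit log entry. It
borrows the 4.0 virtual table classes, but the schema, the hard-coded log
path, and the line-per-entry parsing are illustrative assumptions only; a
real CASSANDRA-14629-based implementation would lazily iterate the binlog
format rather than eagerly reading text lines:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.List;

    import org.apache.cassandra.db.marshal.LongType;
    import org.apache.cassandra.db.marshal.UTF8Type;
    import org.apache.cassandra.db.virtual.AbstractVirtualTable;
    import org.apache.cassandra.db.virtual.SimpleDataSet;
    import org.apache.cassandra.dht.LocalPartitioner;
    import org.apache.cassandra.schema.TableMetadata;

    final class AuditLogTable extends AbstractVirtualTable
    {
        AuditLogTable(String keyspace)
        {
            super(TableMetadata.builder(keyspace, "audit_log")
                               .kind(TableMetadata.Kind.VIRTUAL)
                               .partitioner(new LocalPartitioner(LongType.instance))
                               .addPartitionKeyColumn("offset", LongType.instance)
                               .addRegularColumn("entry", UTF8Type.instance)
                               .build());
        }

        @Override
        public DataSet data()
        {
            SimpleDataSet result = new SimpleDataSet(metadata());
            try
            {
                // Illustrative only: real code would locate the binlog via the
                // audit logging settings and decode its entries.
                List<String> lines =
                    Files.readAllLines(Paths.get("/var/log/cassandra/audit/audit.log"));
                long offset = 0;
                for (String line : lines)
                    result.row(offset++).column("entry", line);
            }
            catch (IOException e)
            {
                throw new RuntimeException(e);
            }
            return result;
        }
    }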


Re: Audit logging to tables.

2019-04-09 Thread Sagar
Sure, Chris. I will be watching for that ticket's closure. You had already
pointed out this ticket, and what I had decided was that while it gets
merged, I can work with the folks here and get a design out.

Thanks!
Sagar.


Re: Stabilising Internode Messaging in 4.0

2019-04-09 Thread Joseph Lynch
I am relatively new to these code paths—especially compared to the
committers that have been working on these issues for years such as
the 15066 authors as well as Jason Brown—but like many Cassandra users
I am familiar with many of the classes of issues Aleksey and Benedict
have identified with this patchset (especially related to messaging
correctness, performance and the lack of message backpressure). We
believe that every single fix and feature in this patch is valuable
and we desire that we are able to get them all merged in and
validated. We don’t think it’s even a question if we want to merge
these: we should want these excellent changes. The only questions—in
my opinion—are how do we safely merge them and when do we merge them?

Due to my and Vinay’s relative lack of knowledge of these code paths,
we hope that we can get as many experienced eyes as we can to review
the patch and evaluate the risk-reward tradeoffs of some of the deeper
changes. We don’t feel qualified to make assertions about risk vs
reward in this patchset, but I know there are a number of people on
this mailing list who are qualified and I think we would all
appreciate their insight and help.

I completely understand that we don’t live in an ideal world, but I do
personally feel that in an ideal world it would be possible to pull
the bug fixes (bugs specific to the 4.0 netty refactor) out from the
semantic changes (e.g. droppability, checksumming, back pressure,
handshake changes), code refactors (e.g. verb handler,
MessageIn/MessageOut) and performance changes (various
re-implementations of Netty internals, some optimizations around
dropping dead messages earlier). Then we can review, validate, and
benchmark each change independently and iteratively move towards
better messaging. At the same time, I recognize that it may be hard to
pull these changes apart, but I worry that review and validation of
the patch, as is, may take the testing community many months to
properly vet and will either mean that we cut 4.0 many, many months
from now or we cut 4.0 before we can properly test the patchset.

I think we are all agreed we don’t want an unstable 4.0, so the main
decision point here is: what set of changes from this valuable and
important patch set do we put in 4.0, and which do we try to put in
4.next? Once we determine that, the community can hopefully start
allocating the necessary review, testing, and benchmarking resources
to ensure that 4.0 is our first ever rock solid “.0” release.

-Joey

On Thu, Apr 4, 2019 at 5:56 PM Jon Haddad wrote:

> Given the number of issues that are addressed, I definitely think it's
> worth strongly considering merging this in.  I think it might be a
> little unrealistic to cut the first alpha after the merge though.
> Being realistic, any 20K+ LOC change is going to introduce its own
> bugs, and we should be honest with ourselves about that.  It seems
> likely the issues the patch addressed would have affected the 4.0
> release in some form *anyways*, so the question might be: do we fix
> them now, or after someone's cluster burns down because there's no
> inbound / outbound message load shedding?
>
> Giving it a quick code review and going through the JIRA comments
> (well written, thanks guys), there seem to be some pretty important
> bug fixes in here, as well as paying off a bit of technical debt.
>
> Jon
>
> On Thu, Apr 4, 2019 at 1:37 PM Pavel Yaskevich wrote:
> >
> > Great to see such significant progress made in the area!
> >
> > On Thu, Apr 4, 2019 at 1:13 PM Aleksey Yeschenko wrote:
> >
> > > I would like to propose CASSANDRA-15066 [1] - an important set of
> > > bug fixes and stability improvements to internode messaging code
> > > that Benedict, I, and others have been working on for the past
> > > couple of months.
> > >
> > > First, some context.  This work started off as a review of
> > > CASSANDRA-14503 (Internode connection management is race-prone
> > > [2]), CASSANDRA-13630 (Support large internode messages with netty)
> > > [3], and a pre-4.0 confirmatory review of such a major new feature.
> > >
> > > However, as we dug in, we realized this was insufficient. With more
> > > than 50 bugs uncovered [4] - dozens of them critical to correctness
> > > and/or stability of the system - a substantial rework was necessary
> > > to guarantee a solid internode messaging subsystem for the 4.0
> > > release.
> > >
> > > In addition to addressing all of the uncovered bugs [4] that were
> > > unique to trunk + 13630 [3] + 14503 [2], we used this opportunity
> > > to correct some long-existing, pre-4.0 bugs and stability issues.
> > > For the complete list of notable bug fixes, read the comments to
> > > CASSANDRA-15066 [1]. But I’d like to highlight a few.
> > >
> > > # Lack of message integrity checks
> > >
> > > It’s known that TCP checksums are too weak [5] and Ethernet CRC
> > > cannot be relied upon [6] for integrity. With sufficient scale or
> > > time, you will hit bit flips. Sad
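
As an aside on that last point, an application-level integrity check of
the kind described usually amounts to computing a CRC over each frame and
verifying it on receipt. Below is a generic, illustrative example using
the JDK's CRC32; the actual framing and checksum layout used by 15066 live
in the patch itself:

    import java.nio.ByteBuffer;
    import java.util.zip.CRC32;

    public final class FrameChecksum
    {
        /** Appends a CRC32 of the payload so the receiver can detect bit flips. */
        public static ByteBuffer frame(byte[] payload)
        {
            CRC32 crc = new CRC32();
            crc.update(payload, 0, payload.length);
            ByteBuffer out = ByteBuffer.allocate(payload.length + Long.BYTES);
            out.put(payload).putLong(crc.getValue());
            out.flip();
            return out;
        }

        /** Returns the payload if the trailing CRC matches; throws otherwise. */
        public static byte[] unframe(ByteBuffer in)
        {
            byte[] payload = new byte[in.remaining() - Long.BYTES];
            in.get(payload);
            long expected = in.getLong();
            CRC32 crc = new CRC32();
            crc.update(payload, 0, payload.length);
            if (crc.getValue() != expected)
                throw new IllegalStateException("Corrupt frame: checksum mismatch");
            return payload;
        }
    }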

Re: Stabilising Internode Messaging in 4.0

2019-04-09 Thread Joseph Lynch
Let's try this again, apparently email is hard ...

I am relatively new to these code paths—especially compared to the
committers that have been working on these issues for years such as
the 15066 authors as well as Jason Brown—but like many Cassandra users
I am familiar with many of the classes of issues Aleksey and Benedict
have identified with this patchset (especially related to messaging
correctness, performance and the lack of message backpressure). We
believe that every single fix and feature in this patch is valuable
and we desire that we are able to get them all merged in and
validated. We don’t think it’s even a question if we want to merge
these: we should want these excellent changes. The only questions—in
my opinion—are how do we safely merge them and when do we merge them?

Due to my and Vinay’s relative lack of knowledge of these code paths,
we hope that we can get as many experienced eyes as we can to review
the patch and evaluate the risk-reward tradeoffs of some of the deeper
changes. We don’t feel qualified to make assertions about risk vs
reward in this patchset, but I know there are a number of people on
this mailing list who are qualified and I think we would all
appreciate their insight and help.

I completely understand that we don’t live in an ideal world, but I do
personally feel that in an ideal world it would be possible to pull
the bug fixes (bugs specific to the 4.0 netty refactor) out from the
semantic changes (e.g. droppability, checksumming, back pressure,
handshake changes), code refactors (e.g. verb handler,
MessageIn/MessageOut) and performance changes (various
re-implementations of Netty internals, some optimizations around
dropping dead messages earlier). Then we can review, validate, and
benchmark each change independently and iteratively move towards
better messaging. At the same time, I recognize that it may be hard to
pull these changes apart, but I worry that review and validation of
the patch, as is, may take the testing community many months to
properly vet and will either mean that we cut 4.0 many, many months
from now or we cut 4.0 before we can properly test the patchset.

I think we are all agreed we don’t want an unstable 4.0, so the main
decision point here is: what set of changes from this valuable and
important patch set do we put in 4.0, and which do we try to put in
4.next? Once we determine that, the community can hopefully start
allocating the necessary review, testing, and benchmarking resources
to ensure that 4.0 is our first ever rock solid “.0” release.

-Joey

