Re: Infinite loop in org.apache.cassandra.hadoop.cql3.CqlBulkRecordWriter
I would recommend using the Spark Cassandra Connector instead of the Hadoop-based writers. The Hadoop code has not had a lot of love in a long time. See https://github.com/datastax/spark-cassandra-connector

On Wed, Apr 3, 2019 at 12:21 PM Brett Marcott wrote:

> Hi folks,
>
> I am noticing my Spark jobs getting stuck when using
> org.apache.cassandra.hadoop.cql3.CqlBulkRecordWriter/CqlBulkOutputFormat.
>
> It seems that whenever there is a stream failure, the code's expected
> behavior is to loop forever.
>
> Here are one executor's logs:
>
> 19/04/03 15:35:06 INFO streaming.StreamResultFuture: [Stream
> #59290530-5625-11e9-a2bb-8bc7b49d56b0] Session with /10.82.204.173 is complete
> 19/04/03 15:35:06 WARN streaming.StreamResultFuture: [Stream
> #59290530-5625-11e9-a2bb-8bc7b49d56b0] Stream failed
>
> On stream failure, StreamResultFuture sets the exception on the
> AbstractFuture. AFAIK this should cause the AbstractFuture to throw a new
> ExecutionException from get().
>
> The problem seems to lie in the fact that CqlBulkRecordWriter swallows the
> ExecutionException and continues in a while loop:
>
> https://github.com/apache/cassandra/blob/207c80c1fd63dfbd8ca7e615ec8002ee8983c5d6/src/java/org/apache/cassandra/hadoop/cql3/CqlBulkRecordWriter.java#L256-L274
>
> When taking consecutive thread dumps of the same process, I see that the only
> thread doing work is constantly creating new ExecutionExceptions (the memory
> location of the ExecutionException was different in each thread dump):
>
> java.lang.Throwable.fillInStackTrace(Native Method)
> java.lang.Throwable.fillInStackTrace(Throwable.java:783) => holding Monitor(java.util.concurrent.ExecutionException@80240763)
> java.lang.Throwable.<init>(Throwable.java:310)
> java.lang.Exception.<init>(Exception.java:102)
> java.util.concurrent.ExecutionException.<init>(ExecutionException.java:90)
> com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:476)
> com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:357)
> org.apache.cassandra.hadoop.cql3.CqlBulkRecordWriter.close(CqlBulkRecordWriter.java:257)
> org.apache.cassandra.hadoop.cql3.CqlBulkRecordWriter.close(CqlBulkRecordWriter.java:237)
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$5.apply$mcV$sp(PairRDDFunctions.scala:1131)
> org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1359)
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1131)
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1102)
> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
> org.apache.spark.scheduler.Task.run(Task.scala:99)
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:285)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
>
> It seems the logic right below the while loop in the linked code above, which
> checks for failed hosts/stream sessions, should perhaps have been inside the
> while loop?
>
> Thanks,
>
> Brett
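For reference, here is a minimal, self-contained Java sketch of the retry pattern described above. It is a paraphrase of the linked close() loop, not the actual Cassandra source: the pre-failed CompletableFuture stands in for the SSTable loader's stream future, the class name is made up, and the attempt counter exists only so the demo terminates (the real loop has no such bound).

    import java.io.IOException;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.Future;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;

    public class SwallowedStreamFailureDemo {
        public static void main(String[] args) throws IOException {
            // Stand-in for the stream future after the stream has already failed.
            CompletableFuture<Void> streamResult = new CompletableFuture<>();
            streamResult.completeExceptionally(new RuntimeException("Stream failed"));
            Future<Void> future = streamResult;

            int attempts = 0; // added only so this demo terminates
            while (true) {
                try {
                    future.get(1000, TimeUnit.MILLISECONDS);
                    break; // only reached if the stream eventually succeeds
                } catch (ExecutionException | TimeoutException e) {
                    // A TimeoutException means "still streaming", so retrying is fine.
                    // An ExecutionException means the stream has failed permanently,
                    // so retrying can never succeed; this is the infinite loop.
                    System.out.println("retrying after " + e.getClass().getSimpleName());
                    if (++attempts >= 3) {
                        break;
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IOException(e);
                }
            }
        }
    }

One possible fix (an assumption, not a committed patch) is to treat ExecutionException as terminal, for example by rethrowing it as an IOException wrapping its cause, or to move the failed-host check that currently sits below the loop into the loop itself, as suggested above.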
Re: Can we upgrade Guava to the same version as master on 3.11 branch?
Why does the Beam integration rely on cassandra-all? Does it use the Hadoop formats?

On Sun, Dec 15, 2019, 9:07 PM Tomo Suzuki wrote:

> Hi Cassandra developers,
>
> I want to backport the Guava version upgrade (CASSANDRA-15248) into the
> 3.11 branch, so that cassandra-all:3.11.X works with a higher version of
> Guava. I just created a ticket,
> https://issues.apache.org/jira/browse/CASSANDRA-15453, explaining the
> background.
>
> Before committing anything, I'd like to hear any opinions on the
> backporting. What do you think?
>
> Regards,
> Tomo
Re: Can we upgrade Guava to the same version as master on 3.11 branch?
The Hadoop formats should be compatible with any Cassandra version regardless of which cassandra-all you include, since they communicate through the driver under the hood rather than through Cassandra-internal libraries. This means you should feel free to use Cassandra 4 in your integration without fear of losing backwards compatibility. In fact, it should be able to speak to Cassandra 2.x as well.

On Sun, Dec 15, 2019, 10:24 PM Tomo Suzuki wrote:

> Hi Russell,
>
> Yes, Apache Beam uses the Hadoop format for its Cassandra IO [1]. That test
> (HadoopFormatIOCassandraTest) failed [2] when I tried to upgrade the Guava
> version. Added this information to the ticket.
>
> [1]: https://beam.apache.org/documentation/io/built-in/hadoop/
> [2]: https://github.com/GoogleCloudPlatform/cloud-opensource-java/issues/1028#issuecomment-557680928
>
> On Sun, Dec 15, 2019 at 10:36 PM Russell Spitzer wrote:
>
> > Why does the Beam integration rely on cassandra-all? Does it use the
> > Hadoop formats?
> >
> > On Sun, Dec 15, 2019, 9:07 PM Tomo Suzuki wrote:
> >
> > > Hi Cassandra developers,
> > >
> > > I want to backport the Guava version upgrade (CASSANDRA-15248) into the
> > > 3.11 branch, so that cassandra-all:3.11.X works with a higher version of
> > > Guava. I just created a ticket,
> > > https://issues.apache.org/jira/browse/CASSANDRA-15453, explaining the
> > > background.
> > >
> > > Before committing anything, I'd like to hear any opinions on the
> > > backporting. What do you think?
> > >
> > > Regards,
> > > Tomo
>
> --
> Regards,
> Tomo
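For anyone wiring this up by hand rather than through Beam, the sketch below shows roughly how a Hadoop job is pointed at the CQL output format from cassandra-all. The contact address, keyspace, table, and CQL statement are placeholders, and the exact helper methods can differ between Cassandra versions, so treat it as a sketch rather than a reference configuration; the point is that the writer talks to the cluster over the native protocol via the bundled driver, not through server internals.

    import org.apache.cassandra.hadoop.ConfigHelper;
    import org.apache.cassandra.hadoop.cql3.CqlConfigHelper;
    import org.apache.cassandra.hadoop.cql3.CqlOutputFormat;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class CqlOutputJobSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();

            // Placeholder cluster/table details; the writer connects over the
            // native protocol, so the server version does not need to match the
            // cassandra-all version on the job classpath.
            ConfigHelper.setOutputInitialAddress(conf, "127.0.0.1");
            ConfigHelper.setOutputPartitioner(conf, "Murmur3Partitioner");
            ConfigHelper.setOutputColumnFamily(conf, "my_keyspace", "my_table");
            CqlConfigHelper.setOutputCql(conf, "UPDATE my_keyspace.my_table SET value = ?");

            Job job = Job.getInstance(conf, "cql-output-sketch");
            job.setOutputFormatClass(CqlOutputFormat.class);
            // ... configure mapper/reducer and input format as usual, then
            // job.waitForCompletion(true);
        }
    }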
Re: Github pull requests
This is one of my favorite aspects of how contributions to Spark work. It also makes it easier to have automated testing run automatically on new branches.

-Russ

On Fri, Aug 26, 2016 at 8:45 AM Ben Coverston wrote:

> I think it would certainly make contributing to Cassandra more
> straightforward.
>
> I'm not a committer, so I don't regularly create patches, and every time I
> do I have to search/verify that I'm doing it right.
>
> But pull requests? I make pull requests every day, and GitHub makes that
> process work the same everywhere.
>
> On Fri, Aug 26, 2016 at 9:33 AM, Jonathan Ellis wrote:
>
> > Hi all,
> >
> > Historically we've insisted that people go through the process of creating
> > a Jira issue and attaching a patch or linking a branch to demonstrate
> > intent-to-contribute and to make sure we have a unified record of changes
> > in Jira.
> >
> > But I understand that other Apache projects are now recognizing a GitHub
> > pull request as intent-to-contribute [1], and some are even making GitHub
> > the official repo, with an Apache mirror, rather than the other way
> > around. (Maybe this is required to accept pull requests; I am not sure.)
> >
> > Should we revisit our policy here?
> >
> > [1] e.g. https://github.com/apache/spark/pulls?q=is%3Apr+is%3Aclosed
> >
> > --
> > Jonathan Ellis
> > Project Chair, Apache Cassandra
> > co-founder, http://www.datastax.com
> > @spyced
>
> --
> Ben Coverston
> DataStax -- The Apache Cassandra Company
Re: If reading from materialized view with a consistency level of quorum am I guaranteed to have the most recent view?
PRIMARY KEY ( (Partition key), Clustering Keys) :

On Fri, Feb 10, 2017 at 10:59 AM DuyHai Doan wrote:

> See my blog post to understand how MV is implemented:
> http://www.doanduyhai.com/blog/?p=1930
>
> On Fri, Feb 10, 2017 at 7:48 PM, Benjamin Roth wrote:
>
> > Same partition key:
> >
> > PRIMARY KEY ((a, b), c, d) and
> > PRIMARY KEY ((a, b), d, c)
> >
> > PRIMARY KEY ((a), b, c) and
> > PRIMARY KEY ((a), c, b)
> >
> > Different partition key:
> >
> > PRIMARY KEY ((a, b), c, d) and
> > PRIMARY KEY ((a), b, d, c)
> >
> > PRIMARY KEY ((a), b) and
> > PRIMARY KEY ((b), a)
> >
> > 2017-02-10 19:46 GMT+01:00 Kant Kodali:
> >
> > > Okies, now I understand what you mean by "same" partition key. I think
> > > you are saying
> > >
> > > PRIMARY KEY(col1, col2, col3) == PRIMARY KEY(col2, col1, col3) // so far
> > > I assumed they are different partition keys.
> > >
> > > On Fri, Feb 10, 2017 at 10:36 AM, Benjamin Roth <benjamin.r...@jaumo.com>
> > > wrote:
> > >
> > > > There are use cases where the partition key is the same. For example, if
> > > > you need a sorting within a partition or a filtering different from the
> > > > original clustering keys. We actually use this for some MVs.
> > > >
> > > > If you want "dumb" denormalization with simple append-only cases (or more
> > > > general cases that don't require a read before write on update), you are
> > > > maybe better off with batched denormalized atomic writes.
> > > >
> > > > The main benefit of MVs is if you need denormalization to sort or filter
> > > > by a non-primary-key field.
> > > >
> > > > 2017-02-10 19:31 GMT+01:00 Kant Kodali:
> > > >
> > > > > Yes, thanks for the clarification. But why would I ever have an MV with
> > > > > the same partition key? If it is the same partition key, I could just
> > > > > read from the base table, right? Our MV partition key contains the
> > > > > columns from the base table partition key but in a different order,
> > > > > plus an additional column (which is allowed as of today).
> > > > >
> > > > > On Fri, Feb 10, 2017 at 10:23 AM, Benjamin Roth
> > > > > <benjamin.r...@jaumo.com> wrote:
> > > > >
> > > > > > It depends on your model.
> > > > > > If the base table + MV have the same partition key, then the MV
> > > > > > mutations are applied synchronously, so they are written as soon as
> > > > > > the write request returns.
> > > > > > => In this case you can rely on R + W > RF.
> > > > > >
> > > > > > If the partition key of the MV is different, the partition of the MV
> > > > > > is probably placed on a different host (or, said differently, it
> > > > > > cannot be guaranteed that it is on the same host). In this case, the
> > > > > > MV updates are executed async in a logged batch. So it can be
> > > > > > guaranteed they will be applied eventually, but not at the time the
> > > > > > write request returns.
> > > > > > => You cannot rely on it, and there is no possibility to absolutely
> > > > > > guarantee anything, no matter what CL you choose. An MV update may
> > > > > > always "arrive late". I guess it has been implemented like this so as
> > > > > > not to block in case of a remote request, preferring cluster sanity
> > > > > > over consistency.
> > > > > >
> > > > > > Is it now 100% clear?
> > > > > >
> > > > > > 2017-02-10 19:17 GMT+01:00 Kant Kodali:
> > > > > >
> > > > > > > So R + W > RF doesn't apply for reads on an MV, right? Because say
> > > > > > > I set QUORUM-level consistency for both reads and writes; then
> > > > > > > there can be a scenario where a write is successful to the base
> > > > > > > table and then, say, immediately I do a read through the MV but
> > > > > > > prior to the MV getting the update from the base table. So there
> > > > > > > isn't any way to make sure to read after the MV has been
> > > > > > > successfully updated. Is that correct?
> > > > > > >
> > > > > > > On Fri, Feb 10, 2017 at 6:30 AM, Benjamin Roth
> > > > > > > <benjamin.r...@jaumo.com> wrote:
> > > > > > >
> > > > > > > > Hi Kant
> > > > > > > >
> > > > > > > > Is it clear now?
> > > > > > > > Sorry for the confusion!
> > > > > > > >
> > > > > > > > Have a nice one
> > > > > > > >
> > > > > > > > On 10.02.2017 at 09:17, "Kant Kodali" wrote:
> > > > > > > >
> > > > > > > > thanks!
> > > > > > > >
> > > > > > > > On Thu, Feb 9, 2017 at 8:51 PM, Benjamin Roth
> > > > > > > > <benjamin.r...@jaumo.com> wrote:
> > > > > > > >
> > > > > > > > > Yes it is
> > > > > > > > >
> > > > > > > > > On 10.02.2017 at 00:46, "Kant Kodali" <k...@peernova.com> wrote:
> > > > > > > > >
> > > > > > > > > > If reading from materialized view with a consistency level of
> > > > > > > > > > quorum am I guaranteed to have the most recent view? other
> > > > > > > > > > words is w
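To put the answer above in terms of driver calls, here is a small sketch (assuming the DataStax Java driver 3.x; the keyspace, base table, and view names are made up). It writes to the base table and immediately reads the view at QUORUM. R + W > RF holds for the base table itself, but when the view has a different partition key the view mutation is shipped asynchronously, so the read may legitimately miss the row that was just written.

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ConsistencyLevel;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.SimpleStatement;

    public class MvReadAfterWriteSketch {
        public static void main(String[] args) {
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
                 Session session = cluster.connect("my_keyspace")) {

                // QUORUM write to a hypothetical base table with PRIMARY KEY ((a), b).
                session.execute(new SimpleStatement(
                        "INSERT INTO base_table (a, b, c) VALUES (1, 2, 3)")
                        .setConsistencyLevel(ConsistencyLevel.QUORUM));

                // QUORUM read of a hypothetical view keyed by c, i.e. a different
                // partition key than the base table. The view update travels
                // asynchronously in a logged batch, so this read is not guaranteed
                // to observe the row written above, regardless of consistency level.
                Row row = session.execute(new SimpleStatement(
                        "SELECT * FROM mv_by_c WHERE c = 3")
                        .setConsistencyLevel(ConsistencyLevel.QUORUM)).one();

                System.out.println(row == null ? "view not yet updated" : row.toString());
            }
        }
    }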
Re: Patrick McFadin joins the PMC
Congratulations!

On Wed, Jan 22, 2025 at 10:30 AM Ekaterina Dimitrova wrote:

> Congratulations, Patrick 🎉
>
> On Wed, 22 Jan 2025 at 11:05, Jordan West wrote:
>
>> The PMC's members are pleased to announce that Patrick McFadin has accepted
>> an invitation to become a PMC member.
>>
>> Thanks a lot, Patrick, for everything you have done for the project all
>> these years.
>>
>> Congratulations and welcome!!
>>
>> The Apache Cassandra PMC
Re: Welcome Jeremiah Jordan to the PMC
Congratulations!

On Fri, Feb 14, 2025 at 9:50 AM Dmitry Konstantinov wrote:

> Congratulations, Jeremiah!
>
> On Fri, 14 Feb 2025 at 15:32, Enrico Olivelli wrote:
>
>> Congratulations !
>>
>> Enrico
>>
>> On Fri, Feb 14, 2025 at 16:26, Jacek Lewandowski <lewandowski.ja...@gmail.com> wrote:
>>
>>> Congratulations!!!
>>>
>>> On Fri, Feb 14, 2025, 16:17 Jeremy Hanna wrote:
>>>
>>>> Congratulations Jeremiah - well deserved.
>>>>
>>>> On Feb 14, 2025, at 9:11 AM, Ekaterina Dimitrova wrote:
>>>>
>>>> Congrats!! Well deserved! Thank you for all you do and I really
>>>> appreciate how much you always help everyone, sharing your broad
>>>> knowledge and expertise!
>>>>
>>>> Cheers
>>>>
>>>> On Fri, 14 Feb 2025 at 9:36, Brandon Williams wrote:
>>>>
>>>>> Congratulations Jeremiah!
>>>>>
>>>>> Kind Regards,
>>>>> Brandon
>>>>>
>>>>> On Fri, Feb 14, 2025 at 8:32 AM Benedict Elliott Smith wrote:
>>>>>
>>>>>> The PMC is happy to announce that Jeremiah Jordan has joined its
>>>>>> membership.
>>>>>>
>>>>>> Jeremiah has been a member of this community for almost 15 years. I
>>>>>> hope you will join me in welcoming him to the committee.
>
> --
> Dmitry Konstantinov