path used by the java driver, but I fail to see how
that could be different when using CQLSH (python).
Is anybody more familiar with the read path able to shed some light on
the stack trace?
Thanks,
Stefano
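(For reference, the manual decode mentioned in my original message below
boils down to something like this Python sketch; the hex string is a
hypothetical stand-in for the actual key bytes:)

    # Hypothetical hex dump of the partition key bytes; substitute the
    # bytes reported in the stack trace.
    raw = bytes.fromhex("6d792d66696c652e747874")
    try:
        print(raw.decode("utf-8"))  # a healthy key decodes to a file name
    except UnicodeDecodeError as exc:
        print("invalid UTF-8 in partition key:", exc)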
On Tue, Jan 2, 2018 at 6:44 PM, Stefano Ortolani wrote:
Hi all,
apparently the year started with a node (version 3.0.15) exhibiting some
data corruption (discovered by a spark job enumerating all keys).
The exception is attached below.
The invalid string is a partition key, and it is supposed to be a file
name. If I manually decode the bytes I get so
to blow up that instance
>
>
>
> --
> Jeff Jirsa
>
>
> On Oct 15, 2017, at 1:42 PM, Stefano Ortolani wrote:
>
> Hi Jeff,
>
> this is my third attempt at bootstrapping the node, so I tried several
> tricks that might partially explain the output I am posting.
>
> * T
nymize as needed) nodetool status, nodetool netstats,
> nodetool tpstats, and nodetool compactionstats?
>
> --
> Jeff Jirsa
>
>
> On Oct 15, 2017, at 1:14 PM, Stefano Ortolani wrote:
Hi Jeff,
that would be 3.0.15, single disk, vnodes enabled (num_tokens 256).
Stefano
On Sun, Oct 15, 2017 at 9:11 PM, Jeff Jirsa wrote:
> What version?
>
> Single disk or JBOD?
>
> Vnodes?
>
> --
> Jeff Jirsa
>
>
> On Oct 15, 2017, at 12:49 PM, Stefano Ortola
away.
Does anybody know anything else I could try?
Cheers,
Stefano
On Fri, Oct 13, 2017 at 3:58 PM, Stefano Ortolani
wrote:
> Another small update: at the same time I see the number of pending tasks
> stuck (in this case at 1847); restarting the node doesn't help, so I can't
>
have on other nodes.
Feeling more and more puzzled here :S
On Fri, Oct 13, 2017 at 1:28 PM, Stefano Ortolani
wrote:
I have been trying to add another node to the cluster (after upgrading to
3.0.15) and I just noticed through "nodetool netstats" that all nodes have
been streaming to the joining node approx 1/3 of their SSTables, basically
their whole primary range (using RF=3)?
Is this expected/normal?
I was und
You might find this interesting:
https://medium.com/@foundev/synthetic-sharding-in-cassandra-to-deal-with-large-partitions-2124b2fd788b
Cheers,
Stefano
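In case the link rots, the gist of the technique is to add a synthetic bucket
to the partition key so that one hot URL is spread over several partitions.
A rough Python sketch (keyspace, table, and bucket count are hypothetical,
not taken from the article):

    import zlib
    from cassandra.cluster import Cluster

    NUM_BUCKETS = 16  # arbitrary; size so each bucket stays comfortably small

    session = Cluster(["127.0.0.1"]).connect("my_ks")
    session.execute("""
        CREATE TABLE IF NOT EXISTS inlinks (
            url    text,
            bucket int,
            inlink text,
            PRIMARY KEY ((url, bucket), inlink)
        )
    """)

    def save_inlink(url, inlink):
        # Stable hash so the same inlink always lands in the same bucket.
        bucket = zlib.crc32(inlink.encode("utf-8")) % NUM_BUCKETS
        session.execute(
            "INSERT INTO inlinks (url, bucket, inlink) VALUES (%s, %s, %s)",
            (url, bucket, inlink))

    def load_inlinks(url):
        # Reads fan out over all buckets of the logical partition.
        for bucket in range(NUM_BUCKETS):
            for row in session.execute(
                    "SELECT inlink FROM inlinks WHERE url = %s AND bucket = %s",
                    (url, bucket)):
                yield row.inlink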
On Mon, Sep 18, 2017 at 5:07 AM, Adam Smith wrote:
> Dear community,
>
> I have a table with inlinks to URLs, i.e. many URLs point to
> http://
Hi Kurt,
On Wed, Aug 23, 2017 at 11:32 AM, kurt greaves wrote:
Hi Kurt,
1) You mean restarting the node in the middle of the bootstrap with
join_ring=false? This option would require me to issue a nodetool bootstrap
resume, correct? I didn't know you could instruct the join via JMX. Would
it be the same as the nodetool bootstrap command?
2) Yes, they are stream
Hi Kurt,
sorry, I forgot to specify. I am on 3.0.14.
Cheers,
Stefano
On Wed, Aug 23, 2017 at 12:11 AM, kurt greaves wrote:
> What version are you running? 2.2 has an improvement that will retain
> levels when streaming and this shouldn't really happen. If you're on 2.1
> best bet is to upgrade
the first
compaction at L0 is done with STCS, but 1 TB is way more than twice the
amount of data the node should own in theory, so something else might be
responsible for the over streaming.
Thanks in advance!
Stefano Ortolani
AM, Varun Gupta wrote:
> We upgraded from 2.2.5 to 3.0.11 and it works fine. I would suggest not
> going with 3.0.13; we are seeing some issues with schema mismatch due to
> which we had to roll back to 3.0.11.
>
> Thanks,
> Varun
>
> On May 19, 2017, at 7:43 AM, Stefan
Here (https://github.com/apache/cassandra/blob/cassandra-3.0/NEWS.txt) it is
stated that the minimum supported version for the 2.2.X branch is 2.2.2.
On Fri, May 19, 2017 at 2:16 PM, Nicolas Guyomar
wrote:
> Hi Xihui,
>
> I was also looking for this documentation, but I believe DataStax removed
> i
, 2017 at 5:43 PM, Hannu Kröger wrote:
> This is a bit of guessing but it probably reads sstables in some sort of
> sequence, so even if sstable 2 contains the tombstone, it still scans
> through sstable 1 for possible data to be read.
>
> BR,
> Hannu
>
> On 16 May 2017, at
017 at 5:17 PM, Stefano Ortolani
wrote:
> Yes, that was my intention but I wanted to cross-check with the ML and the
> devs keeping an eye on it first.
>
> On Tue, May 16, 2017 at 5:10 PM, Hannu Kröger wrote:
>
>> Well,
>>
>> sstables contain some statistics abou
timestamp it might be possible to skip some
> data but I’m not sure that Cassandra currently does that. Maybe it would be
> worth a JIRA ticket to see what the devs think about it, and whether
> optimizing this case would make sense.
>
> Hannu
>
> On 16 May 2017, at 18:03, Stefano Ortol
No, because C* has reverse iterators.
On Tue, May 16, 2017 at 4:47 PM, Nitan Kainth wrote:
> If the data is stored in ASC order and the query asks for DESC, wouldn't
> it read the whole partition first and then pick data in reverse order?
>
>
> On May 16, 2017, at 10:03 AM, S
l statistics of cell ages would need to be
> kept in the column index for the skipping and that is probably not there.
>
> Hannu
>
> On 16 May 2017, at 17:33, Stefano Ortolani wrote:
>
> That is another way to see the question: are reverse iterators range
> tombstone aware
> in the beginning of the partition). Eventually you will still end up
> reading a lot of tombstones but you will get a lot of live data first and
> the implicit query limit of 1 probably is reached before you get to the
> tombstones. Therefore you will get an immediate answer.
Hi all,
I am seeing inconsistencies when mixing range tombstones, wide partitions,
and reverse iterators.
I still have to understand whether this behaviour is expected, hence this
message to the mailing list.
The situation is conceptually simple. I am using a table defined as follows:
CREATE TABLE
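(The table definition is truncated in the archive. Purely as an illustration
of the scenario, not the actual schema, a repro sketch with the Python driver
might look like this:)

    from cassandra.cluster import Cluster

    session = Cluster(["127.0.0.1"]).connect("my_ks")  # hypothetical keyspace
    session.execute("""
        CREATE TABLE IF NOT EXISTS events (
            pk    text,
            ck    int,
            value text,
            PRIMARY KEY (pk, ck)
        )
    """)
    # Lay down a range tombstone over part of a wide partition (CQL range
    # deletes are supported from 3.0 onwards):
    session.execute(
        "DELETE FROM events WHERE pk = 'p' AND ck >= 0 AND ck < 1000")
    # Reading the same slice forward and reversed should return the same
    # rows; a mismatch would be the kind of inconsistency described above.
    fwd = {r.ck for r in session.execute(
        "SELECT ck FROM events WHERE pk = 'p' ORDER BY ck ASC")}
    rev = {r.ck for r in session.execute(
        "SELECT ck FROM events WHERE pk = 'p' ORDER BY ck DESC")}
    assert fwd == rev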
leted'
> data from being returned in the read. It's a bit more complicated than
> that, but that's the general idea.
>
>
> On May 12, 2017 at 6:23:01 AM, Stefano Ortolani (ostef...@gmail.com)
> wrote:
>
> Thanks a lot Blake, that definitely helps!
>
>
nsistency issues. It can also
> cause a *lot* of over streaming, so you might want to take a look at how
> much streaming your cluster is doing with full repairs, and incremental
> repairs. It might actually be more efficient to run full repairs.
>
> Hope that helps,
>
> Blake
Hi all,
I am trying to wrap my head around how C* evicts tombstones when using LCS.
Based on what I understood reading the docs, if the ratio of
garbage-collectable tombstones exceeds the "tombstone_threshold", C* should start
compacting and evicting.
I am quite puzzled however by what might happe
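(For context: tombstone_threshold is a compaction subproperty, 0.2 by
default. A single-SSTable tombstone compaction is also rate-limited by
tombstone_compaction_interval and is skipped when the SSTable overlaps
others, unless unchecked_tombstone_compaction is enabled. A minimal sketch
of setting it on a hypothetical table:)

    from cassandra.cluster import Cluster

    session = Cluster(["127.0.0.1"]).connect()
    # Hypothetical table; the values shown here are the defaults.
    session.execute("""
        ALTER TABLE my_ks.my_table
        WITH compaction = {'class': 'LeveledCompactionStrategy',
                           'tombstone_threshold': '0.2',
                           'tombstone_compaction_interval': '86400'}
    """)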
cuted sequentially on each node (no overlapping, next node
waits for the previous to complete).
Regards,
Stefano Ortolani
On Mon, Oct 31, 2016 at 11:18 PM, kurt Greaves wrote:
> Blowing out to 1k SSTables seems a bit full on. What args are you passing to
> repair?
>
> Kurt Greaves
>
ldn't have any impact in theory.
Nodes do not seem that overloaded either, and I don't see any GC spikes
while those mutations are dropped :/
Hitting a dead end here; any idea where to look next?
Regards,
Stefano
On Wed, Aug 10, 2016 at 12:41 PM, Stefano Ortolani wrote:
Did you try the workaround they posted (aka, downgrading Cython)?
Cheers,
Stefano
On Wed, Oct 26, 2016 at 10:01 AM, Zao Liu wrote:
> Same happens on my Ubuntu boxes.
>
> File
> "/home/jasonl/.pex/install/cassandra_driver-3.7.0-cp27-none-linux_x86_64.whl.ebfb31ab99650d53ad134e0b312c7494296cdd2b/
, Sep 27, 2016 at 4:09 PM, Stefano Ortolani wrote:
> Didn't know about (2), and I actually have a time drift between the nodes.
> Thanks a lot Paulo!
>
> Regards,
> Stefano
>
> On Thu, Sep 22, 2016 at 6:36 PM, Paulo Motta
> wrote:
>>
>> There are a coup
as
> repaired, so nodes with different compaction cadences will have different
> data in their unrepaired set, which will cause mismatches in the subsequent
> incremental repairs. CASSANDRA-9143 will hopefully fix that limitation.
>
> 2016-09-22 7:10 GMT-03:00 Stefano Ortolani :
>
Hi,
I am seeing something weird while running repairs.
I am testing 3.0.9 so I am running the repairs manually, node after node,
on a cluster with RF=3. I am using a standard repair command (incremental,
parallel, full range), and I just noticed that the third node detected some
ranges out of sync
ch is leveraged by full range repair,
> which would not work in many cases for partial range repairs, yielding
> higher I/O.
>
> 2016-08-26 10:17 GMT-03:00 Stefano Ortolani :
>
>> I see. Didn't think about it that way. Thanks for clarifying!
>>
>>
>> On Fri,
the problem of re-doing work as in non-inc non-pr repair.
>
> 2016-08-26 7:57 GMT-03:00 Stefano Ortolani :
>
>> Hi Paulo, could you elaborate on 2?
>> I didn't know incremental repairs were not compatible with -pr
>> What is the underlying reason?
>>
Hi Paulo, could you elaborate on 2?
I didn't know incremental repairs were not compatible with -pr
What is the underlying reason?
Regards,
Stefano
On Fri, Aug 26, 2016 at 1:25 AM, Paulo Motta
wrote:
> 1. Migration procedure is no longer necessary after CASSANDRA-8004, and
> since you never ran
Not really related, but note that on Ubuntu 12.04 I had to disable jemalloc,
otherwise nodes would randomly die at startup (
https://issues.apache.org/jira/browse/CASSANDRA-11723)
Regards,
Stefano
On Thu, Aug 11, 2016 at 10:28 AM, Riccardo Ferrari
wrote:
> Hi C* users,
>
> In recent time I had couple
ere else. Generally
> dropped mutations are a signal of cluster overload, so if there's nothing
> else wrong perhaps you need to increase your capacity. What version are you
> on?
>
> 2016-08-10 8:21 GMT-03:00 Stefano Ortolani :
>
>> Not yet. Right now I have it set at
detool
> setcompactionthroughput. Did you try lowering that and checking if that
> improves the dropped mutations?
>
> 2016-08-09 13:32 GMT-03:00 Stefano Ortolani :
Hi all,
I am running incremental repairs on a weekly basis (I can't do it every day,
as one single run takes 36 hours), and every time I have at least one node
dropping mutations as part of the process (almost always during the
anticompaction phase). Ironically this leads to a system where repa
FWIW, I've recently upgraded from 2.1 to 3.0 without issues of any sort,
but admittedly I haven't been using anything too fancy.
Cheers,
Stefano
On Wed, Jul 13, 2016 at 10:28 PM, Alain RODRIGUEZ
wrote:
> Hi Anuj
>
> From
> https://docs.datastax.com/en/latest-upgrade/upgrade/cassandra/upgrdBestP
Replaced OpsCenter with a mix of:
* metrics-graphite-3.1.0.jar installed in the same classpath of C*
* Custom script to push system metrics (cpu/mem/io); see the sketch below
* Grafana to create the dashboard
* Custom repairs script
Still not optimal but getting there...
Stefano
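For the metrics script, the core of it is just Graphite's plaintext protocol;
a minimal Python sketch (host and metric path are made up):

    import socket
    import time

    def push_metric(path, value, host="graphite.example.com", port=2003):
        # Graphite plaintext protocol: one "<path> <value> <timestamp>\n"
        # line per datapoint, sent to the line receiver on port 2003.
        line = "%s %f %d\n" % (path, value, int(time.time()))
        sock = socket.create_connection((host, port))
        try:
            sock.sendall(line.encode("ascii"))
        finally:
            sock.close()

    push_metric("servers.node1.cpu.load", 0.42)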
On Thu, Jul 14, 2016 at 10:18 AM, Rom
e one SSTable at least, but
the pending tasks are still stuck.
Regards,
Stefano
On Tue, Jun 28, 2016 at 10:09 AM, Stefano Ortolani
wrote:
> I am updating the following ticket
> https://issues.apache.org/jira/browse/CASSANDRA-12100 as I discover new
> bits.
>
> Regards,
> Ste
I am updating the following ticket
https://issues.apache.org/jira/browse/CASSANDRA-12100 as I discover new
bits.
Regards,
Stefano
On Tue, Jun 28, 2016 at 9:37 AM, Stefano Ortolani
wrote:
> Hi all,
>
> I've just updated to C* 3.0.7, and I am now seeing some compactions
> gettin
hours ago.
I am fairly confident this issue was not there in C* 3.0.5.
Any idea?
Regards,
Stefano Ortolani
Yes, because you keep a snapshot in the meantime, if I remember correctly.
Regards,
Stefano
On Thu, Jun 23, 2016 at 4:22 PM, Jean Carlo
wrote:
> Cassandra 2.1.12
>
> During a sequential repair -pr, we are experiencing an exponential
> increase in the number of SSTables, for a table using LCS.
Forgot to add the C* version. That would be 3.0.6.
Regards,
Stefano Ortolani
On Thu, Jun 2, 2016 at 3:55 PM, Stefano Ortolani wrote:
> Hi,
>
> While running incremental (parallel) repairs on the first partition range
> (-pr), I rarely see the progress percentage going over 20%/25%.
%)
Nodetool does return normally and no error is found in its output or in the
cassandra logs.
Any idea why? Is this behavior expected?
Regards,
Stefano Ortolani
, Stefano Ortolani wrote:
Hi,
I am experiencing some weird behaviors after upgrading 2 nodes (out of 13)
to C* 3.0.5 (from 2.1.11). Basically, after restarting a second time, there
is a small chance that the node will die without outputting anything to the
logs (not even dmesg).
This happened on both nodes I upgraded. The
As far as I know, the docs are quite inconsistent on the matter.
Based on some research here and on IRC, recent versions of Cassandra do not
require anything specific when migrating to incremental repairs beyond the
-inc switch, even on LCS.
Any confirmation on the matter is more than welcome.
Regards,
I think those were referring to Java7 and G1GC (early versions were buggy).
Cheers,
Stefano
On Fri, Sep 25, 2015 at 5:08 PM, Kevin Burton wrote:
> Any issues with running Cassandra 2.0.16 on Java 8? I remember there is
> long term advice on not changing the GC but not the underlying version of
Hi Jean,
I am trying to solve a similar problem here. I would say that the only
deterministic way is to rebuild the SSTables of that column family via
nodetool scrub.
Otherwise you'd need to :
* decrease tombstone_threshold
* wait for gc_grace_time (see the sketch below)
Cheers,
Stefano
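(Both knobs are table-level settings. A minimal sketch on a hypothetical
table; note that gc_grace_seconds must stay longer than your repair cadence,
or deleted data can resurrect:)

    from cassandra.cluster import Cluster

    session = Cluster(["127.0.0.1"]).connect()
    # One day instead of the default 864000 (10 days); only safe if every
    # node is repaired within the new window.
    session.execute(
        "ALTER TABLE my_ks.my_table WITH gc_grace_seconds = 86400")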
On Tue, May 26, 2015 at 12:51
f83a3631034ea4fa697/system-schema_columnfamilies-ka-7-Data.db
> (922 bytes) for commitlog position ReplayPosition(segmentId=1432265013436,
> position=423408)
> INFO [MigrationStage:1] 2015-05-26 12:12:26,598
> ColumnFamilyStore.java:882 - Enqueuing flush of dogtypes: 0 (0%) on-heap, 0
possible
without downtime, and how fast those values are picked up?
Cheers,
Stefano
On Mon, May 25, 2015 at 1:32 PM, Stefano Ortolani
wrote:
> Hi all,
>
> Thanks for your answers! Yes, I agree that a delete intensive workload is
> not something Cassandra is designed for.
>
>
tion seems to be that
>> leveled compaction is suited for read intensive workloads.
>>
>> Depending on your use case, you might be better off with date tiered or
>> size tiered strategy.
>>
>> regards
>>
>>> On Sun, May 24, 2015
compaction took place)?
Regards,
Stefano Ortolani
Definitely, I think the very same about this issue.
On Thu, Feb 12, 2015 at 7:04 AM, Eric Stevens wrote:
> I definitely find it surprising that a node which was decommissioned is
> willing to rejoin a cluster. I can't think of any legitimate scenario
> where you'd want that, and I'm surprised the
restart after the gc_grace_seconds passed
would have violated consistency permanently?
Cheers,
Stefano
On Wed, Feb 11, 2015 at 10:56 AM, Robert Coli wrote:
> On Tue, Feb 10, 2015 at 9:13 PM, Stefano Ortolani
> wrote:
>
>> I recommissioned a node after decommissioning it.
>
ent requests without having a consistent view of the data.
> A safer approach would be to wipe the data directory and bootstrap it as a
> clean new member.
>
> I'm curious what prompted that cycle of decommission then recommission.
>
> On Tue, Feb 10, 2015 at 10:13 PM, Stefan
Hi,
I recommissioned a node after decommissioning it.
That happened (1) after a successful decommission (checked), (2) without
wiping the data directory on the node, (3) simply by restarting the
cassandra service. The node now reports itself healthy and up and running.
Knowing that I issued the "r