Re: SStable format change in 3.0.18 ?

2019-04-04 Thread Léo FERLIN SUTTON
72c1129c9f > src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java > <https://github.com/apache/cassandra/commit/d60c78358b6f599a83f3c112bfd6ce72c1129c9f#diff-62875acfa21fb24c7167a0a2d761780e> > : > 129 > // md (3.0.18, 3.11.4): corrected sstable min/max cluste

Re: SStable format change in 3.0.18 ?

2019-04-04 Thread Dmitry Saprykin
Hello, I think it was done in the following issue: Sstable min/max metadata can cause data loss (CASSANDRA-14861) https://github.com/apache/cassandra/commit/d60c78358b6f599a83f3c112bfd6ce72c1129c9f src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java <https://github.com/apa

Re: SStable format change in 3.0.18 ?

2019-04-04 Thread Jeff Jirsa
This is CASSANDRA-14861 -- Jeff Jirsa > On Apr 4, 2019, at 8:23 AM, Léo FERLIN SUTTON > wrote: > > Hello ! > > I have noticed something since I upgraded to cassandra 3.0.18. > > Before all my Sstable used to be named this way : > ``` > mc-130817-big-CompressionInfo.db > mc-130817-big-Dat

SStable format change in 3.0.18 ?

2019-04-04 Thread Léo FERLIN SUTTON
Hello ! I have noticed something since I upgraded to cassandra 3.0.18. Before all my Sstable used to be named this way : ``` mc-130817-big-CompressionInfo.db mc-130817-big-Data.db mc-130817-big-Digest.crc32 mc-130817-big-Filter.db mc-130817-big-Index.db mc-130817-big-Statistics.db mc-130817-big-S
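For readers following the thread, here is a minimal sketch of how the file names quoted above can be split into their parts. It assumes the 3.x layout `<version>-<generation>-<format>-<component>`; the function and field names are hypothetical and are not part of any Cassandra API.

```python
from typing import NamedTuple


class SSTableFile(NamedTuple):
    """Parts of an sstable file name (field names are illustrative)."""
    version: str      # format version letters, e.g. "mc"
    generation: int   # sstable generation number
    fmt: str          # on-disk format name, e.g. "big"
    component: str    # component file, e.g. "Data.db"


def parse_sstable_filename(name: str) -> SSTableFile:
    # Split on the first three hyphens only, so the component keeps
    # whatever characters follow the format name.
    version, generation, fmt, component = name.split("-", 3)
    return SSTableFile(version, int(generation), fmt, component)


parsed = parse_sstable_filename("mc-130817-big-Data.db")
print(parsed.version, parsed.generation, parsed.fmt, parsed.component)
```

The leading letters are the format version. The replies in this thread quote a `BigFormat.java` comment that introduces version `md` for 3.0.18 and 3.11.4, which would explain a changed prefix on newly written files.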

Re: SSTable format

2012-07-15 Thread prasenjit mukherjee
Appreciate the insightful replies. Understood Sylvain's argument that having different partitioning locally and globally could create problems in data movement. Edward, for a given sstable in a node, why should having lexicographically closer rows clumped together matter? Anyways the lookups for

Re: SSTable format

2012-07-14 Thread Edward Capriolo
There is a more subtle and profound aspect here as well. The md5 transformation is a hash that is good at spraying data around the hash space evenly. For a given sstable there should be good entropy, whereas if the data were not transformed it could be "clumpy" and the sorted string structure and index

Re: SSTable format

2012-07-14 Thread Sylvain Lebresne
> Any reason rowkeys are not stored by their raws keys on a given node > for RP ? I understand the partitioning across nodes should be > randomized, but on a given node why they are sorted by hash of their > keys and not just by the raw keys ? All the operation that change the topology of the clu

Re: SSTable format

2012-07-13 Thread prasenjit mukherjee
> > It depends on what partitioner you use. You should be using the > RandomPartitioner, and if so, the rows are sorted by the hash of the row > key. there are partitioners that sort based on the raw key value but these > partitioners shouldn't be used as they have problems due to uneven > partitio
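A small sketch of the ordering discussed here: sorting rows by a hash of the key rather than by the raw key. It assumes the token is simply the MD5 digest read as an unsigned integer; the exact token computation in RandomPartitioner may differ in detail, and the keys and the `md5_token` helper are made up for illustration.

```python
import hashlib


def md5_token(key: bytes) -> int:
    # Assumption: the token is the 128-bit MD5 digest read as a big-endian integer.
    return int.from_bytes(hashlib.md5(key).digest(), "big")


# Hypothetical row keys that are lexicographically adjacent.
keys = [b"user:0001", b"user:0002", b"user:0003", b"user:0004"]

by_raw_key = sorted(keys)               # order under a raw-key (order-preserving) scheme
by_token = sorted(keys, key=md5_token)  # order when rows are sorted by the hash of the key

for key in by_token:
    print(f"{md5_token(key):039d}  {key.decode()}")
```

Adjacent raw keys get unrelated tokens, so neighbouring rows in the hash-sorted order are generally unrelated to each other, and the same ordering applies both within a node and across the ring.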

Re: SSTable format

2012-07-13 Thread Dave Brosius
While in memory, Cassandra calls it a MemTable, but yes, sstables are write-once and are later combined with others into new ones through compaction. On 07/13/2012 09:54 PM, Michael Theroux wrote: Thanks for the information, So is the SStable essentially kept in memory, then sorted and written to di
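The lifecycle described in this reply can be sketched as a toy model: writes go to an in-memory table, a flush produces an immutable sorted table, and compaction merges several immutable tables into a new one. This is only an illustration under simplifying assumptions. The class and function names are hypothetical, keys are sorted directly rather than by token, and newer tables win by argument order, whereas Cassandra reconciles cells by write timestamp.

```python
class Memtable:
    """Toy in-memory table: mutable until it is flushed."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = value

    def flush(self):
        # A flush yields an immutable, key-sorted snapshot (the "sstable").
        return tuple(sorted(self._rows.items()))


def compact(*sstables):
    """Merge immutable tables (oldest first) into one new immutable table."""
    merged = {}
    for table in sstables:
        merged.update(table)  # later (newer) tables overwrite older values
    return tuple(sorted(merged.items()))


memtable = Memtable()
memtable.put("b", 1)
memtable.put("a", 2)
older = memtable.flush()
newer = (("a", 3), ("c", 4))
print(compact(older, newer))
```

The inputs to `compact` are never modified; the result is a separate table, which mirrors the write-once behaviour described above.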

Re: SSTable format

2012-07-13 Thread Michael Theroux
Thanks for the information, So is the SStable essentially kept in memory, then sorted and written to disk on flush? After that point, an SStable is not modified, but can be written to another SStable through compaction? -Mike On Jul 13, 2012, at 8:22 PM, Rob Coli wrote: > On Fri, Jul 13, 201

Re: SSTable format

2012-07-13 Thread Rob Coli
On Fri, Jul 13, 2012 at 5:18 PM, Dave Brosius wrote: > It depends on what partitioner you use. You should be using the > RandomPartitioner, and if so, the rows are sorted by the hash of the row > key. there are partitioners that sort based on the raw key value but these > partitioners shouldn't be

Re: SSTable format

2012-07-13 Thread Dave Brosius
On 07/13/2012 08:00 PM, Michael Theroux wrote: Hello, I've been trying to understand in greater detail how SStables are stored, and how information is transferred between Cassandra nodes, especially when a new node is joining a cluster. Specifically, Is information stored to SStables ordered

SSTable format

2012-07-13 Thread Michael Theroux
Hello, I've been trying to understand in greater detail how SStables are stored, and how information is transferred between Cassandra nodes, especially when a new node is joining a cluster. Specifically, Is information stored to SStables ordered by rowkeys? Some of the articles I've read sugg