....it does not support the CQL types ...

do you mean the cql types listed here?
http://docs.datastax.com/en/cql/3.1/cql/cql_reference/cql_data_types_c.html
if so, i ran cassandra 2.2.0 beta1 with this:

cqlsh:jw_schema1> desc table all_data_types;

CREATE TABLE jw_schema1.all_data_types (
    type_ascii ascii PRIMARY KEY,
    type_bigint bigint,
    type_blob blob,
    type_boolean boolean,
    type_decimal decimal,
    type_double double,
    type_float float,
    type_inet inet,
    type_int int,
    type_list list<int>,
    type_map map<text, text>
) WITH bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

cqlsh:jw_schema1> select * from all_data_types;

 type_ascii  | type_bigint | type_blob | type_boolean | type_decimal | type_double | type_float | type_inet | type_int | type_list | type_map
-------------+-------------+-----------+--------------+--------------+-------------+------------+-----------+----------+-----------+------------------
 primary key |  1234567890 |      0x12 |         True |           10 |          33 |      12.34 | 127.0.0.1 |      123 | [1, 2, 3] | {'key': 'value'}

(1 rows)

well, not all of the cql data types per se, but some of them. then flush the
data into an sstable and use sstable2json:

$ ../tools/bin/sstable2json ../data/all_data_types-84ff26d0054311e5b4ecdf2eb198bdde/la-1-big-Data.db
[
{"key": "primary key",
 "cells": [["","",1432823024401968],
           ["type_bigint","1234567890",1432823024401968],
           ["type_blob","12",1432823024401968],
           ["type_boolean","true",1432823024401968],
           ["type_decimal","10",1432823024401968],
           ["type_double","33.0",1432823024401968],
           ["type_float","12.34",1432823024401968],
           ["type_inet","127.0.0.1",1432823024401968],
           ["type_int","123",1432823024401968],
           ["type_list:_","type_list:!",1432823079281149,"t",1432823079],
           ["type_list:462c9f80054511e5b4ecdf2eb198bdde","00000001",1432823079281150],
           ["type_list:462c9f81054511e5b4ecdf2eb198bdde","00000002",1432823079281150],
           ["type_list:462c9f82054511e5b4ecdf2eb198bdde","00000003",1432823079281150],
           ["type_map:_","type_map:!",1432823024401967,"t",1432823024],
           ["type_map:6b6579","76616c7565",1432823024401968]]}
]

for the list and map, i think you need to decode further, but sstable2json
should serve as a start?
https://github.com/apache/cassandra/blob/cassandra-2.2.0-beta1/src/java/org/apache/cassandra/tools/SSTableExport.java

hth

jason

On Tue, May 26, 2015 at 10:16 PM, Malcolm Matalka <malc...@spotify.com> wrote:

> The current code I have is using SSTableReader (like sstable2json) and
> then trying to interpret the bytes with the help of CFMetaData.
>
> Any pointers for where in the Cassandra codebase I could see how this is
> done?
>
> Thanks for the help.
>
>
> 2015-05-26 16:07 GMT+02:00 Tyler Hobbs <ty...@datastax.com>:
> > Trying to parse and export an sstable at a higher, CQL level with the
> > current codebase is going to be pretty tough.  Handling static columns,
> > collections (multi-cell columns), and the four minor variants of sstable
> > formats (sparse vs dense, composite vs simple) is not easy.  If you want to
> > handle things at a CQL level, you should probably go through the normal
> > read path.
> >
> > With that said, CASSANDRA-8099 will substantially change the format of
> > sstables to more closely match CQL, making this more feasible.
> >
> > On Tue, May 26, 2015 at 8:49 AM, Malcolm Matalka <malc...@spotify.com>
> > wrote:
> >
> >> Thanks Tyler,
> >>
> >> The problem with sstable2json is that it does not support the CQL
> >> types as far as I can see and there isn't any indication as to modify
> >> it to do that.  It seems like the CQL things are a layer above the
> >> SSTable.
> >>
> >> 2015-05-26 15:44 GMT+02:00 Tyler Hobbs <ty...@datastax.com>:
> >> > I would start by looking at sstable2json.  It may be simplest for you to
> >> > run sstable2json and then process the resulting json.  If that's not
> >> > adequate, modifying the sstable2json code is probably your best bet.
> >> >
> >> > On Mon, May 25, 2015 at 11:12 AM, Malcolm Matalka <malc...@spotify.com>
> >> > wrote:
> >> >
> >> >> Hello,
> >> >>
> >> >> For efficiency reasons I am trying to parse the raw SSTable files in
> >> >> order to transform them into another format.  I understand this is
> >> >> like poking a sleeping beast and there aren't many guarantees around
> >> >> this but I'm asking if anyone has any pointers to make this possible?
> >> >> In a search I have stumbled upon FullContact's SSTable parser, but it
> >> >> does not parse the complicated data structures that CQL supports.  In
> >> >> attempting to reverse engineer how Cassandra handles the actual data
> >> >> there are a few cases that are unclear and I'm concerned that my
> >> >> attempts to interpret them will result in a fragile result.
> >> >>
> >> >> Are there any suggestions?  Existing libraries?  Tips on how Cassandra
> >> >> parses the data itself?  Pointers into the code to read?  SSTable
> >> >> design doc?
> >> >>
> >> >> Thanks,
> >> >> /Malcolm
> >> >>
> >> >
> >> >
> >> > --
> >> > Tyler Hobbs
> >> > DataStax <http://datastax.com/>
> >>
> >
> >
> > --
> > Tyler Hobbs
> > DataStax <http://datastax.com/>
>
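
As a rough sketch of the "decode further" step for the list and map cells in the sstable2json output above: the hex strings look like the serialized elements, so for this particular table (list<int>, map<text, text>) they can be turned back into values with a few lines of Python. The assumptions here are mine, not something sstable2json documents: list values are taken to be 4-byte big-endian ints, map keys and values are taken to be UTF-8 text, the cells flagged "t" are treated as collection tombstone markers and skipped, and list elements are assumed to appear in list order as they do above. The function names are made up for illustration.

```python
import json

# Collection cells copied from the sstable2json output earlier in this mail.
CELLS = json.loads("""
[["type_list:_","type_list:!",1432823079281149,"t",1432823079],
 ["type_list:462c9f80054511e5b4ecdf2eb198bdde","00000001",1432823079281150],
 ["type_list:462c9f81054511e5b4ecdf2eb198bdde","00000002",1432823079281150],
 ["type_list:462c9f82054511e5b4ecdf2eb198bdde","00000003",1432823079281150],
 ["type_map:_","type_map:!",1432823024401967,"t",1432823024],
 ["type_map:6b6579","76616c7565",1432823024401968]]
""")


def decode_int(hex_str):
    # Assumption: a CQL int is serialized as a 4-byte big-endian signed integer.
    return int.from_bytes(bytes.fromhex(hex_str), "big", signed=True)


def decode_text(hex_str):
    # Assumption: CQL text is serialized as UTF-8 bytes.
    return bytes.fromhex(hex_str).decode("utf-8")


def decode_collections(cells):
    """Rebuild type_list and type_map from sstable2json cells (hypothetical helper)."""
    result = {}
    for cell in cells:
        name, value = cell[0], cell[1]
        # Skip the entries flagged "t" (assumed to be tombstone markers).
        if len(cell) > 3 and cell[3] == "t":
            continue
        column, sep, component = name.partition(":")
        if not sep:
            continue  # a regular, single-cell column; nothing to decode here
        if column == "type_list":
            # The component is the element's timeuuid; only the value is needed.
            result.setdefault(column, []).append(decode_int(value))
        elif column == "type_map":
            # The component is the serialized map key, the value is the map value.
            result.setdefault(column, {})[decode_text(component)] = decode_text(value)
    return result


if __name__ == "__main__":
    print(decode_collections(CELLS))
```

For other element types the two decode helpers would have to be swapped for ones matching the column definition, which is where CFMetaData would come in on the Java side.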