I have a similar set of problems. To set the stage: in the past, for a
variety of reasons, I had to create tables (column families) by time range
for an event processing system.
The main reason was that expiring data (TTL) did not purge easily. It was
easier to simply truncate/drop old column families than to deal with
different, evolving compaction strategies.
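
A minimal sketch of the time-range naming, assuming daily buckets and a hypothetical `events_yyyyMMdd` naming scheme (neither the bucket size nor the name format is stated above):

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class TimeBucketedCf {
    // Hypothetical naming scheme: one column family per UTC day.
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyyMMdd").withZone(ZoneOffset.UTC);

    // Maps an event timestamp to the column family that should hold it.
    static String cfFor(Instant eventTime) {
        return "events_" + DAY.format(eventTime);
    }

    public static void main(String[] args) {
        // prints "events_20160928"
        System.out.println(cfFor(Instant.parse("2016-09-28T12:01:00Z")));
    }
}
```

With a scheme like this, expiring a day of data amounts to dropping that day's column family.
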
The main loop of my program looked like this:

    public void writeThisStuff(List<Event> events) {
        MutationBatch mb = newBatch(); // however your client creates one
        for (Event event : events) {
            mb.add(event);
        }
        maybeCreateNeededTables(mb);
        executeBatch(mb);
    }

    public void maybeCreateNeededTables(MutationBatch batch) {
        // Collect the distinct column families this batch writes to.
        Set<String> columnFamiliesToCreate = new HashSet<>();
        for (Mutation mutation : batch) {
            columnFamiliesToCreate.add(extractColumnFamilyFromMutation(mutation));
        }
        // Create only the ones that do not exist yet.
        for (String cf : columnFamiliesToCreate) {
            if (!hectorAstyanaxFlavoroftheweekclientDoesCfExist(cf)) {
                hectorAstyanaxFlavoroftheweekclientCreateCf(cf);
            }
        }
    }

Batch sizes were in the 5-10K range. For a given batch, the number of
target CFs was typically one, and at most two. That means that in the worst
case one CF would need to be created. Effectively, this meant one metadata
read before the write. (You could cache the already existing column
families as well.) One quick read is not a huge cost when you consider the
savings of batching 5K round trips.
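
The caching idea could look roughly like this. It is only a sketch: `KnownCfCache` and its two callbacks are hypothetical stand-ins for whatever exists/create calls your client library offers.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;
import java.util.function.Predicate;

class KnownCfCache {
    // Column families this process has already seen or created.
    private final Set<String> known = ConcurrentHashMap.newKeySet();
    // Hypothetical hooks into the client: metadata read and CF creation.
    private final Predicate<String> existsInCluster;
    private final Consumer<String> createInCluster;

    KnownCfCache(Predicate<String> existsInCluster, Consumer<String> createInCluster) {
        this.existsInCluster = existsInCluster;
        this.createInCluster = createInCluster;
    }

    // Pays for the metadata read only the first time a CF name is seen.
    void ensureExists(String cf) {
        if (known.contains(cf)) {
            return;
        }
        if (!existsInCluster.test(cf)) {
            createInCluster.accept(cf);
        }
        known.add(cf);
    }
}
```

The cache is local to one process, so it saves metadata reads but does nothing about two writers racing to create the same CF. Entries for column families that are later dropped would also need to be evicted.
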
Even with this type of scenario you can run into a concurrent schema
problem, but you can add whatever gizmo you like to confirm schema
agreement here:

    for (String cf : columnFamiliesToCreate) {
        waitForSchemaToSettleGizmo(); // added
        if (!hectorAstyanaxFlavoroftheweekclientDoesCfExist(cf)) {
            waitForSchemaToSettleGizmo(); // added
            hectorAstyanaxFlavoroftheweekclientCreateCf(cf);
        }
    }

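
One possible shape for such a gizmo is a simple poll-until-agreement loop. This is a sketch under assumptions: the `versions` supplier is hypothetical and would have to be backed by whatever your client exposes for listing the schema versions currently seen in the cluster (Thrift's `describe_schema_versions` is one candidate), and the timeout and poll interval are arbitrary.

```java
import java.util.Set;
import java.util.function.Supplier;

class SchemaAgreement {
    // Polls until all nodes report a single schema version or the timeout passes.
    // Returns true on agreement, false on timeout or interruption.
    static boolean waitForAgreement(Supplier<Set<String>> versions,
                                    long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (versions.get().size() == 1) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and give up.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

This narrows the window for a clash but does not close it: two clients can still both see agreement, both find the CF missing, and both try to create it.
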
On Wed, Sep 28, 2016 at 12:01 PM, Aleksey Yeschenko wrote:
> No way to do that via Thrift I’m afraid, nor will there be one. Sorry.
>
> --
> AY
>
> On 28 September 2016 at 16:43:58, Roman Bielik (roman.bielik@
> openmindnetworks.com) wrote:
>
> Hi,
>
> in CQL it is possible to create a table with explicit ID: CREATE TABLE ...
> WITH ID='xyz'.
>
> Is something like this possible via Thrift interface?
> There is an int32 "id" field in CfDef, but it has no effect on the table
> ID.
>
> My problem is, that concurrent create table (add_column_family) requests
> for the same table name result in clash with somewhat unpredictable
> behavior.
>
> This problem was reported in:
> https://issues.apache.org/jira/browse/CASSANDRA-9933
>
> and seems to be related to changes from ticket:
> https://issues.apache.org/jira/browse/CASSANDRA-5202
>
> A workaround for me could be using the same ID in create table, however I'm
> using Thrift interface only.
>
> Thank you.
> Regards,
> Roman