I am facing an issue with Cassandra ordering in tables that have columns of
type list.

Suppose I have a table as follows...
FirstName: <list> of string
LastName:  <list> of string

Now, if I were to issue:
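UPDATE <tableName> SET FirstName = FirstName + ['Leonardo'], LastName = LastName + ['DiCaprio'] WHERE <key> = <value>;
UPDATE <tableName> SET FirstName = FirstName + ['Brad'], LastName = LastName + ['Pitt'] WHERE <key> = <value>;
UPDATE <tableName> SET FirstName = FirstName + ['Matthew'], LastName = LastName + ['Mcconahey'] WHERE <key> = <value>;
UPDATE <tableName> SET FirstName = FirstName + ['Kate'], LastName = LastName + ['Beckinsale'] WHERE <key> = <value>;
UPDATE <tableName> SET FirstName = FirstName + ['Eva'], LastName = LastName + ['Green'] WHERE <key> = <value>;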


If I issue the upserts with some time gap in between, I get the expected
results, i.e.:

cqlsh output:
FirstName: Leonardo | Brad | Matthew   | Kate       | Eva
LastName:  DiCaprio | Pitt | Mcconahey | Beckinsale | Green


But if I upsert in a quick burst (imagine a for loop), I get unexpected
results:

cqlsh output:
FirstName: Leonardo | Brad     | Matthew   | Kate       | Eva
LastName:  Pitt     | DiCaprio | Mcconahey | Beckinsale | Green

As you can see above, typically two or so values in one column (here
LastName) are swapped. With more data than these 5 upsert queries, the
misordering becomes more frequent.
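
A minimal sketch of the two modes described above (paced upserts versus a
quick burst), assuming a Python client. The table name `people`, the key
column `id`, and the `execute` callable are hypothetical stand-ins, since the
actual schema and client code are not shown here. The `%s` placeholders follow
the style the DataStax Python driver uses for non-prepared statements, so
passing `session.execute` as `execute` is one plausible way to run it.

```python
import time
from typing import Any, Callable, Iterable, Tuple

# Hypothetical table and key names; the original schema is not given.
APPEND_CQL = (
    "UPDATE people "
    "SET firstname = firstname + %s, lastname = lastname + %s "
    "WHERE id = %s"
)

NAMES = [
    ("Leonardo", "DiCaprio"),
    ("Brad", "Pitt"),
    ("Matthew", "Mcconahey"),
    ("Kate", "Beckinsale"),
    ("Eva", "Green"),
]


def append_names(
    execute: Callable[[str, Tuple[Any, ...]], Any],
    row_id: Any,
    names: Iterable[Tuple[str, str]],
    delay_s: float = 0.0,
) -> None:
    """Issue one list-append UPDATE per (first, last) pair.

    delay_s > 0 models the paced case; delay_s == 0 models the quick burst.
    """
    for first, last in names:
        # Both list columns are appended to in the same statement.
        execute(APPEND_CQL, ([first], [last], row_id))
        if delay_s > 0:
            time.sleep(delay_s)


if __name__ == "__main__":
    # Stand-in executor that only records what would be sent.
    sent = []
    append_names(lambda query, params: sent.append((query, params)), 1, NAMES)
    for _, params in sent:
        print(params)
```

Calling `append_names(execute, row_id, NAMES, delay_s=1.0)` corresponds to the
paced case, and leaving `delay_s` at its default corresponds to the burst.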

After flushing the tables, I took an SSTable dump using sstabledump
<sstableName> and observed that the ordering shown in the cqlsh output
is exactly the order in which the data is written in the SSTable. This means
that in the second example above, for column FirstName, Leonardo was written
before Brad, while for column LastName, Pitt was written before DiCaprio.

Now I am confused as to why writes that should happen one at a time
appear to be written in a different order across columns. Please note
that a write where the whole pair (FirstName, LastName) changes position
TOGETHER is still acceptable, i.e.:

cqlsh output:
FirstName: Brad | Leonardo | Matthew   | Kate       | Eva
LastName:  Pitt | DiCaprio | Mcconahey | Beckinsale | Green

... would have been completely acceptable, because I retrieve the values by
index in my applications: [0] would mean Brad Pitt and [1] would mean
Leonardo DiCaprio.
But the same index-based retrieval fails in the second example (the burst
case), where [0] would mean Leonardo Pitt and [1] would mean Brad DiCaprio.
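
The index-based retrieval described above can be sketched as follows. This is
an illustrative Python sketch only: the function names `pair_by_index` and
`pairs_preserved` are hypothetical, and the lists simply copy the cqlsh
outputs shown earlier rather than reading from a real cluster.

```python
from typing import List

# The pairs that were originally written together.
EXPECTED_PAIRS = {
    "Leonardo DiCaprio",
    "Brad Pitt",
    "Matthew Mcconahey",
    "Kate Beckinsale",
    "Eva Green",
}

# Burst case: only the LastName column has two entries swapped.
BURST_FIRST = ["Leonardo", "Brad", "Matthew", "Kate", "Eva"]
BURST_LAST = ["Pitt", "DiCaprio", "Mcconahey", "Beckinsale", "Green"]

# Acceptable case: the whole pair moved together in both columns.
SHIFTED_FIRST = ["Brad", "Leonardo", "Matthew", "Kate", "Eva"]
SHIFTED_LAST = ["Pitt", "DiCaprio", "Mcconahey", "Beckinsale", "Green"]


def pair_by_index(first_names: List[str], last_names: List[str]) -> List[str]:
    """Combine element i of one list with element i of the other."""
    return [f"{first} {last}" for first, last in zip(first_names, last_names)]


def pairs_preserved(first_names: List[str], last_names: List[str]) -> bool:
    """True if index-based pairing yields the original pairs, in any order."""
    return set(pair_by_index(first_names, last_names)) == EXPECTED_PAIRS


if __name__ == "__main__":
    print(pair_by_index(BURST_FIRST, BURST_LAST)[:2])
    print(pairs_preserved(BURST_FIRST, BURST_LAST))
    print(pair_by_index(SHIFTED_FIRST, SHIFTED_LAST)[:2])
    print(pairs_preserved(SHIFTED_FIRST, SHIFTED_LAST))
```

A check like `pairs_preserved` tolerates a pair moving as a unit but flags the
burst case, which is the distinction that matters for the application.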

Please help with any insights; I would be really grateful.
