Re: Optimization for partitions with high number of rows

2023-04-17 Thread Gil Ganz
Thank you, I will look into that option. On Mon, Apr 17, 2023 at 3:29 AM Bowen Song via user <user@cassandra.apache.org> wrote: > Using a frozen UDT for all the non-key columns is a good starting point. > You can go a step further and use frozen UDTs for the partition keys and > clustering keys

Re: Optimization for partitions with high number of rows

2023-04-16 Thread Bowen Song via user
Using a frozen UDT for all the non-key columns is a good starting point. You can go a step further and use frozen UDTs for the partition keys and clustering keys too if appropriate. This alone will dramatically reduce the number of cells per row from 13 to 3, and save 77% of deserialisatio
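
As a quick check of the 77% figure quoted above, here is a minimal sketch of the arithmetic. It assumes deserialisation cost scales roughly linearly with the number of cells per row, which is a simplification; the function name `cell_reduction` is made up for illustration and is not part of any Cassandra tooling.

```python
def cell_reduction(cells_before: int, cells_after: int) -> float:
    """Return the fraction of per-row cells removed by a schema change.

    Assumption: per-row deserialisation work is proportional to the
    number of cells, so this fraction approximates the work saved.
    """
    if cells_before <= 0 or not 0 <= cells_after <= cells_before:
        raise ValueError("expected 0 <= cells_after <= cells_before and cells_before > 0")
    return (cells_before - cells_after) / cells_before


# 13 cells per row today, 3 after grouping columns into frozen UDTs.
saving = cell_reduction(13, 3)
print(f"{saving:.0%} fewer cells per row")  # prints "77% fewer cells per row"
```

The actual speed-up depends on the column types and sizes involved, so it is worth measuring on real data rather than relying on this estimate.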

Re: Optimization for partitions with high number of rows

2023-04-11 Thread Gil Ganz
Is there something I can do to speed up the deserialisation? In this example I did a count query, but in reality I need the actual data. Write pattern in this table is such that all data for a given row is written at the same time, so I know I can use a frozen UDT instead of this, making it faster,

Re: Optimization for partitions with high number of rows

2023-04-11 Thread Bowen Song via user
Reading 4MB from 70k rows and 13 columns (0.91 million cells) from disk in 120ms doesn't sound bad. That's a lot of deserialisation to do. If you want it to be faster, you can store the number of rows elsewhere if that's the only thing you need. On 11/04/2023 07:13, Gil Ganz wrote: Hey I hav
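
To put the numbers above in perspective, here is a rough back-of-the-envelope sketch. It attributes the whole 120 ms to cell handling, which overstates the per-cell cost since disk I/O and other overhead are included; the helper names are illustrative only.

```python
def total_cells(rows: int, columns: int) -> int:
    """Number of cells read when every column of every row is populated."""
    return rows * columns


def avg_ns_per_cell(elapsed_ms: float, cells: int) -> float:
    """Average wall-clock nanoseconds per cell (an upper bound on
    deserialisation cost, since it includes all other read overhead)."""
    return elapsed_ms * 1_000_000 / cells


cells = total_cells(70_000, 13)      # 910,000 cells, i.e. 0.91 million
per_cell = avg_ns_per_cell(120, cells)
print(f"{cells:,} cells, about {per_cell:.0f} ns per cell")
```

At roughly 130 ns per cell on average, the read is already doing little work per cell, which is why reducing the number of cells is a more promising direction than trying to make each cell cheaper.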

Optimization for partitions with high number of rows

2023-04-10 Thread Gil Ganz
Hey, I have a 4.0.4 cluster, with reads of partitions that are a bit on the bigger side taking longer than I would expect. Reading an entire partition that has ~70k rows, with a total partition size of 4 MB, takes 120 ms; I would expect it to take less. This is after major compaction, so there is only one s