Re: Cluster sizing for huge dataset

2019-09-29 Thread Jeff Jirsa
> On Sep 29, 2019, at 12:30 AM, DuyHai Doan wrote:
>
> Thank you Jeff for the hints
>
> We are targeting to reach 20 TB/machine using TWCS and 8 vnodes (using the new token allocation algo). Also we will try the new zstd compression.

I'd probably still be inclined to run two instances pe…

Re: Challenge with initial data load with TWCS

2019-09-29 Thread DuyHai Doan
Thanks Jeff for sharing the ideas. I have some questions though: - CQLSSTableWriter and explicitly break between windows --> Even if you break between windows, if we have one year's worth of data it would require us to use CQLSSTableWriter over that full year (365 days), because the write time taken into ac…
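The "break between windows" idea above can be sketched as follows. This is a hypothetical illustration, not code from the thread: the window size, function names, and sample rows are all assumptions. The point is that historical rows are bucketed by TWCS-style time window, so a bulk loader (e.g. one batch of SSTables written via CQLSSTableWriter per bucket) never mixes data from two windows in the same SSTable.

```python
from datetime import datetime, timezone

# Assumed window size: 1 day, matching a TWCS setting of
# compaction_window_unit = 'DAYS', compaction_window_size = 1.
WINDOW_SECONDS = 24 * 3600

def window_start(ts: datetime, window_seconds: int = WINDOW_SECONDS) -> int:
    """Return the epoch-second start of the window containing ts."""
    epoch = int(ts.timestamp())
    return epoch - (epoch % window_seconds)

def group_by_window(rows):
    """Group (timestamp, payload) rows by their window start,
    so each group can be written out as its own SSTable batch."""
    buckets = {}
    for ts, payload in rows:
        buckets.setdefault(window_start(ts), []).append((ts, payload))
    return buckets

# Made-up sample data: two rows on Jan 1, one on Jan 2.
rows = [
    (datetime(2019, 1, 1, 3, 0, tzinfo=timezone.utc), "a"),
    (datetime(2019, 1, 1, 22, 0, tzinfo=timezone.utc), "b"),
    (datetime(2019, 1, 2, 5, 0, tzinfo=timezone.utc), "c"),
]
buckets = group_by_window(rows)  # two buckets: one per calendar day (UTC)
```

This only demonstrates the bucketing arithmetic; actually writing the SSTables and preserving the original write time (the concern raised above) still requires the loader to set an explicit timestamp per row.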

Re: Cluster sizing for huge dataset

2019-09-29 Thread DuyHai Doan
Thank you Jeff for the hints We are targeting to reach 20 TB/machine using TWCS and 8 vnodes (using the new token allocation algo). Also we will try the new zstd compression. About transient replication, the underlying trade-offs and semantics are hard to understand for common people (for example,…
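A table configured along the lines discussed here might combine TWCS with the ZstdCompressor introduced in Cassandra 4.0. This is a sketch only; the keyspace, table, and column names are made up for illustration:

```
CREATE TABLE sensor_data.readings (
    sensor_id uuid,
    ts timestamp,
    value double,
    PRIMARY KEY (sensor_id, ts)
) WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': 1
} AND compression = {
    'class': 'ZstdCompressor'
};
```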