Re: multi datacenter cluster, without fibre speeds

2011-11-14 Thread M Vieira
Broadband here is fairly stable; to be honest, I don't remember the last time I had problems such as larger-than-expected latency or downtime (ISP: Bethere, UK). My application can cope fine with up to 10 min of lag (data freshness). However, taking your input into consideration, I agree with you, so I don't think

Re: multi datacenter cluster, without fibre speeds

2011-11-14 Thread M Vieira
of better consistency, community support, no single point of failure and some! Thanks, Marco On Nov 11, 2011 7:09 PM, "Radim Kolar" wrote: > On 11.11.2011 19:14, M Vieira wrote: > >> Has anyone experimented running cassandra clusters in geographically >> separa

multi datacenter cluster, without fibre speeds

2011-11-11 Thread M Vieira
Has anyone experimented with running Cassandra clusters in geographically separated locations connected through ordinary broadband? By ordinary broadband I mean 30 Mbps or 50 Mbps. Thanks, Marco
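For a rough sense of what those link speeds mean for replication traffic, here is a back-of-the-envelope sketch. It assumes the full advertised rate is available in the sending direction with no protocol overhead, which ordinary broadband rarely delivers, so treat the results as upper bounds. The function names and the 1 GB payload are hypothetical and chosen only for illustration; the 10-minute window is the freshness tolerance mentioned in the follow-up message.

```python
def transfer_seconds(payload_bytes, link_mbps):
    """Ideal time to push payload_bytes over a link of link_mbps megabits/s."""
    bits = payload_bytes * 8
    return bits / (link_mbps * 1_000_000)


def bytes_in_window(link_mbps, window_seconds):
    """Ideal number of bytes that fit through the link in window_seconds."""
    return link_mbps * 1_000_000 * window_seconds / 8


ONE_GB = 10**9  # hypothetical payload, decimal gigabyte

for mbps in (30, 50):
    secs = transfer_seconds(ONE_GB, mbps)
    window = bytes_in_window(mbps, 10 * 60)
    print(f"{mbps} Mbps: 1 GB takes about {secs:.0f} s; "
          f"10 min carries at most {window / 1e9:.2f} GB")
```

If the write volume between sites stays well under the per-window figure, a lag tolerance of several minutes leaves some headroom; if it does not, the link rather than Cassandra becomes the bottleneck.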

Thrift transport error

2011-10-05 Thread M Vieira
@Sylvain thanks for the eye-opening hint on 3213. There are some critical issues with Thrift 0.6 that were fixed in 0.7. Thrift 0.6 critical issues: https://issues.apache.org/jira/browse/THRIFT-788 https://issues.apache.org/jira/browse/THRIFT-1067 @Jonathan you're right, the error message is relate

Thrift transport error

2011-10-05 Thread M Vieira
I'm using Thrift 0.7 with Cassandra 0.8.6 and "Cassandra Cluster Admin" to work around my single node [testing] cluster. All seems to work fine, but I'm getting a constant error message "CustomTThreadPoolServer.java (line 197) Thrift transport e

Re: Very large rows VS small rows

2011-09-29 Thread M Vieira
Thank you very much! Just read some stuff in the wiki, such as limitations and secondary index. Adding to what you said, the search in large rows, by which I mean rows with millions of columns, seems to be like searching a normal hash instead of B-tree style. So model A it is! Once again thank yo

Very large rows VS small rows

2011-09-29 Thread M Vieira
What would be the best approach? A) millions of ~2Kb rows, where each row could have ~6 columns B) hundreds of ~100Gb rows, where each row could have ~1 million columns Considerations: Most entries will be searched for (read+write) at least once a day but no more than 3 times a day. Cheap hardware a
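To make the two options easier to compare, the following sketch works out the average column size each one implies. It assumes "Kb" and "Gb" in the question mean kilobytes and gigabytes (binary units), which the message does not state; the function name is hypothetical.

```python
KB = 1024
GB = 1024 ** 3


def avg_column_bytes(row_bytes, columns_per_row):
    """Average bytes per column for a row of the given size."""
    return row_bytes / columns_per_row


# Model A: ~2 KB rows with ~6 columns each
model_a = avg_column_bytes(2 * KB, 6)

# Model B: ~100 GB rows with ~1 million columns each
model_b = avg_column_bytes(100 * GB, 1_000_000)

print(f"Model A: about {model_a:.0f} bytes per column")
print(f"Model B: about {model_b / KB:.0f} KB per column")
```

Under these assumptions, model B implies columns a few hundred times larger than model A, in addition to much larger rows, so the two options differ in value size as well as in row width.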

Cassandra data modeling

2011-09-29 Thread M Vieira
I'm trying to get my head around Cassandra data modeling, but I can't quite see what would be the best approach to the problem I have. The supposed scenario: You have around 100 domains, each domain has from a few hundred to millions of possible URLs (think of different combinations of GET args, ex