Bulk loading and timestamps

2012-05-03 Thread Oleg Proudnikov
Hello, group. Will the bulk loader preserve the original column timestamps? Thank you very much, Oleg

Re: Question regarding major compaction.

2012-05-01 Thread Oleg Proudnikov
Henrik Schröder writes: > But what's the difference between doing an extra read from that > One Big File, than doing an extra read from whatever SSTable > happen to be largest in the course of automatic minor compaction? There is this note regarding major compaction in the tuning gu

Re: Bulkload into a different CF

2012-05-01 Thread Oleg Proudnikov
Benoit Perroud writes: > > You can copy the sstables (renaming them accordingly) and > call nodetool refresh. > Thank you, Benoit. In that case could I try snapshot+move&rename+refresh on a live system? Regards, Oleg

Bulkload into a different CF

2012-05-01 Thread Oleg Proudnikov
Hello, Is it possible to create an exact replica of a CF by these steps?
1. Take a snapshot
2. Isolate sstables for CF1
3. Rename sstables into CF2
4. Bulk load renamed sstables into newly created CF2 within the same Keyspace
Or would you suggest using sstable2json instead? Thank you very much,
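The rename step in the question above can be sketched in Python. This is only an illustration under a loud assumption: that sstable component files start with the column family name followed by a hyphen (for example `CF1-hc-1-Data.db`). File naming differs between Cassandra versions, so check the actual names in the snapshot directory first. The function names and directory arguments are hypothetical.

```python
import os
import shutil


def renamed_sstable(filename, old_cf, new_cf):
    """Return the file name with the CF prefix swapped, or None if it does not match.

    ASSUMPTION: names look like '<CF>-<rest>', e.g. 'CF1-hc-1-Data.db'.
    """
    prefix = old_cf + "-"
    if not filename.startswith(prefix):
        return None
    return new_cf + "-" + filename[len(prefix):]


def copy_renamed_sstables(snapshot_dir, target_dir, old_cf, new_cf):
    """Copy every file belonging to old_cf into target_dir under the new CF name."""
    copied = []
    for name in sorted(os.listdir(snapshot_dir)):
        new_name = renamed_sstable(name, old_cf, new_cf)
        if new_name is None:
            continue  # belongs to another column family; leave it alone
        shutil.copy2(os.path.join(snapshot_dir, name),
                     os.path.join(target_dir, new_name))
        copied.append(new_name)
    return copied
```

Copying rather than moving keeps the snapshot intact, so the step can be repeated if the load into CF2 has to be redone.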

Re: Linux Filesystem for Cassandra

2012-04-04 Thread Oleg Proudnikov
Thanks, Radim! What OS are you using and would ZFS be a good option under Linux on EC2? Thank you, Oleg On 2012-04-04, at 9:42 AM, Radim Kolar wrote: > >> Would you, please share, what filesystem you are using? > > zfs 28

Linux Filesystem for Cassandra

2012-04-04 Thread Oleg Proudnikov
Hi, There has been no discussion on this list on the choice of a Linux file system for Cassandra. Does this choice make a difference? Would you please share which filesystem you are using? Thank you very much, Oleg

Pushing through major compaction

2012-03-27 Thread Oleg Proudnikov
Hello, Could you please share your experience on pushing through a major compaction on a CF with a large number of sstables? I get an OOM even after dropping CFs that I can drop and increasing JVM heap to the limit. My caches are minimal and memtables are empty. This only happens on a single nod

One or Two clusters?

2012-03-26 Thread Oleg Proudnikov
Hi, Could someone please help me understand the benefits of having a single large cluster vs. having two smaller clusters separated by the pattern of use? One, MOSTLY WRITE cluster could incrementally accumulate large amounts of data throughout the day. The daily increment would be processed, s

Quick advice needed - CF backup and restore

2011-09-26 Thread Oleg Proudnikov
Hi, What is the easiest way to save/backup a single column family across the cluster and later reload it? Thank you very much, Oleg

Unable to read byte sub-column after upgrade to Cassandra 0.8

2011-09-07 Thread Oleg Proudnikov
Hi, Just wanted to share an issue I had to overcome after upgrading to Cassandra 0.8 from 0.7. My app became unable to read a BytesType sub-column. It turned out that ByteBuffer returned as a value of a sub-column can not be assumed to contain only the bytes of the sub-column. As a result one has

Cassandra 0.8 CLI: Inconsistent treatment of literals for keys/columns and values

2011-08-29 Thread Oleg Proudnikov
Hi, After installing Cassandra 0.8 I discovered that my app stopped working. The issue is that the app is now unable to read a row that was inserted by a CLI set command with a numeric string key. CLI in Cassandra 0.8 seems to be treating literals inconsistently. Please let me know if I am missin

Cassandra Thrift calls block indefinitely

2011-02-06 Thread Oleg Proudnikov
Hi All, This is the first time I have seen this. I am using Hector for a bulk load into a 3 node Cassandra 0.7.0 cluster. I have been doing this for a while now but this time the load was more intense compared to the ones before and it was running from a single client machine because I was afraid to ove

New Generation Size guidelines

2011-02-04 Thread Oleg Proudnikov
Hi All, I have a 3 server cluster with RF=2. My heap is 2G out of 4G of RAM. The servers have 4 cores. I used default heap settings. The Eden space ended up around 60M and the Survivor spaces are around 7M. This feels a little bit low for a process that creates so much short-lived garbage. I just w

Re: How to monitor Cassandra's throughput?

2011-02-04 Thread Oleg Proudnikov
The issue has been resolved; the fix is on Hector's GitHub. Oleg Proudnikov writes: > > I have posted on Hector ML: > > http://thread.gmane.org/gmane.comp.db.hector.user/1690 > > Oleg > >

Re: CF Read and Write Latency Histograms

2011-02-04 Thread Oleg Proudnikov
David Dabbs writes: > > Is this 0.7? > Yes

CF Read and Write Latency Histograms

2011-02-04 Thread Oleg Proudnikov
Hi All, I suspect that Write and Read Latency column headers need to be swapped. I am running a bulk load with no reads on this CF but I see Read column with values while the Write column has zeros only. The MBean shows the values correctly. Thank you, Oleg

Re: Unavalible Exception

2011-02-04 Thread Oleg Proudnikov
ruslan usifov writes: > > > 2011/2/4 Oleg Proudnikov > ruslan usifov writes: > > > > Hello Why i can get Unavalible Exception on live cluster (all nodes is up and never shutdown) PS: v 0.7.0 > Can the nodes see each othe

Re: Unavalible Exception

2011-02-04 Thread Oleg Proudnikov
ruslan usifov writes: > > Hello Why i can get Unavalible Exception on live cluster (all nodes is up and never shutdown) PS: v 0.7.0 Can the nodes see each other? Check Cassandra logs for messages regarding other nodes. Oleg

Re: Slow network writes

2011-02-02 Thread Oleg Proudnikov
ruslan usifov writes: > > > 2011/2/3 Oleg Proudnikov > Is it possible that the key "1212" maps to the first node? I am assuming RF=1. > You could try random keys to test this theory... > > > Yes you right "1212"

Re: py_stress error in Cassandra 0.7

2011-02-02 Thread Oleg Proudnikov
Have you generated the Cassandra Thrift interface? You will need to install Thrift first: http://wiki.apache.org/cassandra/InstallThrift Then, in the interface directory under Cassandra's home, you can run: thrift --gen py cassandra.thrift If the above does not install generated cassandra thrift mo

Re: Slow network writes

2011-02-02 Thread Oleg Proudnikov
Is it possible that the key "1212" maps to the first node? I am assuming RF=1. You could try random keys to test this theory... Oleg

Cassandra memory needs

2011-02-02 Thread Oleg Proudnikov
Hi All, I am trying to understand the relationship between data set/SSTable(s) size and Cassandra heap. Q1. Here is the memory calc from the Wiki: For a rough rule of thumb, Cassandra's internal datastructures will require about memtable_throughput_in_mb * 3 * number of hot CFs + 1G + internal
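The rule of thumb quoted above can be turned into a small calculation. This is a rough sketch only: the last term is cut off in the excerpt, so it is treated here as a caller-supplied cache size (an assumption), and the function name and sample numbers are made up for illustration.

```python
def estimate_heap_mb(memtable_throughput_in_mb, hot_cfs, internal_caches_mb=0):
    """Rule of thumb from the quoted wiki text, in megabytes.

    memtable_throughput_in_mb * 3 * number of hot CFs + 1G + internal caches
    ASSUMPTION: 'internal caches' is supplied by the caller; the excerpt is truncated there.
    """
    return memtable_throughput_in_mb * 3 * hot_cfs + 1024 + internal_caches_mb


# Hypothetical inputs: 64 MB memtable throughput, 4 hot CFs, 256 MB of caches.
estimate = estimate_heap_mb(64, 4, internal_caches_mb=256)  # 768 + 1024 + 256 = 2048
```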

Re: How to monitor Cassandra's throughput?

2011-02-01 Thread Oleg Proudnikov
I have posted on Hector ML: http://thread.gmane.org/gmane.comp.db.hector.user/1690 Oleg

Re: How to monitor Cassandra's throughput?

2011-02-01 Thread Oleg Proudnikov
Thanks for the insight, Jonathan! As it turns out using single threaded clients with Hector's LeastActiveBalancingPolicy leads to the first node always winning :-) Is StorageProxy bean the only way to detect this, considering that all nodes are evenly loaded? Oleg

Re: How to monitor Cassandra's throughput?

2011-01-31 Thread Oleg Proudnikov
Thanks, Aaron! Is StorageProxy only exposed on the seed node? I consistently see it only on a single node that happens to be seed. Oleg

How to monitor Cassandra's throughput?

2011-01-31 Thread Oleg Proudnikov
Hi All, Is there a way to tell how many mutations/s my cluster is processing across all column families? Per node value would be OK as well. I see WriteCount per CF per node as well as TotalWriteLatencyMicros. Are they the right metrics to aggregate for this purpose? Thank you very much, Oleg
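One way to turn the per-CF WriteCount counters mentioned above into a rate is to sample them twice and divide the difference by the elapsed time. The sketch below assumes the counters have already been read (for example over JMX) into plain dictionaries keyed by column family name; the function names and sample values are hypothetical. Note also that per-node counters probably count replica writes, so summing across nodes may count each client mutation once per replica.

```python
def write_rate(count_start, count_end, interval_s):
    """Writes per second between two samples of a monotonically increasing counter."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return (count_end - count_start) / interval_s


def node_write_rate(samples_start, samples_end, interval_s):
    """Sum the per-CF write rates for one node.

    samples_start / samples_end map CF name -> WriteCount at each sampling time.
    A CF missing from the first sample is treated as starting at zero.
    """
    return sum(
        write_rate(samples_start.get(cf, 0), end, interval_s)
        for cf, end in samples_end.items()
    )


# Hypothetical samples taken 10 seconds apart.
rate = node_write_rate({"cf1": 1000, "cf2": 500}, {"cf1": 4000, "cf2": 1500}, 10)
```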

count in 0.7.0

2011-01-29 Thread Oleg Proudnikov
Hi All, Does Cassandra 0.7.0 need to deserialize the complete row in order to count all columns? I know from this ML that Cassandra 0.6 did that. Thank you very much, Oleg

Re: Stress test inconsistencies

2011-01-26 Thread Oleg Proudnikov
I returned to periodic commit log fsync. Jonathan Shook writes: > > Would you share with us the changes you made, or problems you found? >

Re: Stress test inconsistencies

2011-01-26 Thread Oleg Proudnikov
Hi All, I was able to run contrib/stress at a very impressive throughput. Single threaded client was able to pump 2,000 inserts per second with 0.4 ms latency. Multithreaded client was able to pump 7,000 inserts per second with 7ms latency. Thank you very much for your help! Oleg
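The two results above can be related through Little's law (average requests in flight = throughput x latency). This is a back-of-the-envelope check rather than anything stated in the thread; the function name is made up, and the thread count used for the multithreaded run is not given in the message.

```python
def implied_concurrency(throughput_per_s, latency_ms):
    """Little's law: average number of requests in flight."""
    return throughput_per_s * latency_ms / 1000.0


# Figures reported in the message above.
single = implied_concurrency(2000, 0.4)  # roughly 0.8, consistent with one client thread
multi = implied_concurrency(7000, 7)     # 49 requests in flight on average
```

Read this way, the single-threaded figure is close to what one thread can issue at 0.4 ms per request, and the multithreaded figure implies about 49 concurrent requests.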

Re: Stress test inconsistencies

2011-01-25 Thread Oleg Proudnikov
buddhasystem writes: > > > Oleg, > > I'm a novice at this, but for what it's worth I can't imagine you can have a > _sustained_ 1kHz insertion rate on a single machine which also does some > reads. If I'm wrong, I'll be glad to learn that I was. It just doesn't seem > to square with a

Re: Stress test inconsistencies

2011-01-25 Thread Oleg Proudnikov
Brandon Williams writes: > > On Tue, Jan 25, 2011 at 1:23 PM, Oleg Proudnikov wrote: > > When I run contrib/stress with a higher thread count, the server does scale to > 200 inserts a second with latency of 200ms. At the same time Windows de

Re: Stress test inconsistencies

2011-01-25 Thread Oleg Proudnikov
Tyler Hobbs writes: > Try using something higher than -t 1, like -t 100. - Tyler > Thank you, Tyler! When I run contrib/stress with a higher thread count, the server does scale to 200 inserts a second with latency of 200ms. At the same time Windows desktop scales to 900 inserts a s

Stress test inconsistencies

2011-01-24 Thread Oleg Proudnikov
Hi All, I am struggling to make sense of a simple stress test I ran against the latest Cassandra 0.7. My server performs very poorly compared to a desktop and even a notebook. Here is the command I execute - a single threaded insert that runs on the same host as Cassandra does (I am using new con

UnserializableColumnFamilyException: Couldn't find cfId

2011-01-20 Thread Oleg Proudnikov
Hi All, Could you please help me understand the impact on my data? I am running a 6 node 0.7-rc4 Cassandra cluster with RF=2. Schema was defined when the cluster was created and did not change. I am doing a batch load with CL=ONE. The cluster is under some stress in memory and I/O. Each node has 1G

Lost MUTATIONS on several Cassandra nodes - no impact on the client

2011-01-20 Thread Oleg Proudnikov
Hi All, Could you please help me understand the impact of this behaviour? I am running a 6 node 0.7-rc4 Cassandra cluster with RF=2. Six Hector clients (one per node), running on the same servers, are performing a single-threaded batch load. CL=ONE. Client performs one simple small query and an insert