Re: pycassa failures in large batch cycling

2013-05-17 Thread John R. Frank
> IMHO you are going to have more success breaking up your workload to work with the current settings. The buffers created by Thrift are going to eat up the server-side memory. They grow dynamically but persist for the life of the connection.  Amen to that. Already refactoring our workload to
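
As a sketch of what that refactoring could look like (not the poster's actual code; keyspace, column family, chunk size, and recycle interval are made-up placeholders, and string column names assume a BytesType/UTF8Type comparator), one way to keep each Thrift frame small is to split an oversized value across many chunk columns and to dispose of the pycassa ConnectionPool every so often so the per-connection buffers on the server are released:

    import pycassa

    CHUNK = 1024 * 1024      # 1 MB per column keeps each Thrift frame modest
    ROWS_PER_POOL = 500      # recycle connections so their buffers are released

    def write_in_chunks(rows, keyspace='Scratch', cf_name='Blobs',
                        servers=('localhost:9160',)):
        """rows: iterable of (row_key, big_byte_string) pairs."""
        pool = pycassa.ConnectionPool(keyspace, server_list=list(servers), timeout=30)
        cf = pycassa.ColumnFamily(pool, cf_name)
        batch = cf.batch(queue_size=10)      # send small mutation batches
        for n, (row_key, blob) in enumerate(rows, 1):
            # one column per chunk instead of one ~100 MB column
            for i in range(0, len(blob), CHUNK):
                batch.insert(row_key, {'chunk-%06d' % (i // CHUNK): blob[i:i + CHUNK]})
            if n % ROWS_PER_POOL == 0:
                batch.send()
                pool.dispose()               # drop connections and their buffers
                pool = pycassa.ConnectionPool(keyspace, server_list=list(servers), timeout=30)
                cf = pycassa.ColumnFamily(pool, cf_name)
                batch = cf.batch(queue_size=10)
        batch.send()
        pool.dispose()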

Re: pycassa failures in large batch cycling

2013-05-17 Thread aaron morton
IMHO you are going to have more success breaking up your workload to work with the current settings. The buffers created by Thrift are going to eat up the server-side memory. They grow dynamically but persist for the life of the connection. Cheers - Aaron Morton Freelance C

Re: pycassa failures in large batch cycling

2013-05-16 Thread John R. Frank
On Tue, 14 May 2013, aaron morton wrote: After several cycles, pycassa starts getting connection failures. Do you have the error stack? Are they TimedOutExceptions or socket timeouts or something else. I figured out the problem here and made this ticket in JIRA: https://issues.apa

Re: pycassa failures in large batch cycling

2013-05-13 Thread aaron morton
> After several cycles, pycassa starts getting connection failures. Do you have the error stack? Are they TimedOutExceptions or socket timeouts or something else. > Would things be any different if we used multiple nodes and scaled the data > and worker count to match? I mean, is there somethin
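
To answer the "which exception" question, here is a minimal sketch of how the failing call could be instrumented on the client (exception class names assume pycassa 1.x and its bundled Thrift types; the column family handle cf is whatever the workers already use):

    import logging
    from pycassa.pool import AllServersUnavailable, MaximumRetryException
    from pycassa.cassandra.ttypes import TimedOutException

    log = logging.getLogger('batch-cycler')

    def fetch_row(cf, row_key):
        try:
            return cf.get(row_key)
        except TimedOutException:
            # coordinator accepted the request but replicas did not answer in time
            log.exception('server-side TimedOutException for %r', row_key)
            raise
        except (AllServersUnavailable, MaximumRetryException):
            # pycassa gave up at the socket/pool level (client-side timeouts, resets)
            log.exception('client-side pool failure for %r', row_key)
            raise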

pycassa failures in large batch cycling

2013-05-09 Thread John R. Frank
C* users, We have a process that loads a large batch of rows from Cassandra into many separate compute workers. The rows are one column wide and range in size from a couple KB to ~100 MB. After manipulating the data for a while, each compute worker writes the data back with *new* row keys com
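
For concreteness, the cycle described above might look roughly like this in pycassa (a sketch only; keyspace, column family, keys, and the transform are placeholders, and very large single-column values like these are exactly what stresses the per-connection Thrift buffers discussed elsewhere in the thread):

    import pycassa

    def recycle_row(cf, old_key, new_key, transform):
        """Read a one-column row, rework its value, write it back under a new key."""
        row = cf.get(old_key)          # {column_name: value}, value may be ~100 MB
        column, value = row.popitem()
        cf.insert(new_key, {column: transform(value)})

    pool = pycassa.ConnectionPool('Scratch', server_list=['localhost:9160'], timeout=30)
    cf = pycassa.ColumnFamily(pool, 'Blobs')
    recycle_row(cf, 'batch-001-item-42', 'batch-002-item-42', lambda v: v.upper())
    pool.dispose()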