Re: Deletion batch mutate

2010-05-06 Thread Weijun Li
Mutation.ColumnOrSuperColumn takes either a super column or a regular column. On Thu, May 6, 2010 at 11:16 AM, Sonny Heer wrote: > The Deletion Class only has a setSuper_column method. Does this work > with regular columns as well? if not, how do you add a mutation for > column delete? >

Re: performance tuning - where does the slowness come from?

2010-05-06 Thread Weijun Li
Using random file access seems to make more sense. mmap should be used when you have enough RAM for the OS to cache most or all of your data files. -Weijun On Thu, May 6, 2010 at 10:49 AM, Vick Khera wrote: > On Thu, May 6, 2010 at 1:06 PM, Weijun Li wrote: > > In this case using mmap wil
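The two read paths contrasted in this reply can be sketched in Python. This is only an illustration of seek-based random access versus a memory-mapped read on an ordinary file. It is not Cassandra's code, and the helper names (`read_with_seek`, `read_with_mmap`, `make_sample_file`) are made up for the example.

```python
import mmap
import os
import tempfile


def read_with_seek(path, offset, length):
    """Random file access: seek to the offset and read only the bytes needed."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def read_with_mmap(path, offset, length):
    """Memory-mapped access: map the whole file and slice it like a bytes object.

    The OS pages data in on demand, so this pays off mainly when the page
    cache can hold most of the file.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[offset:offset + length]


def make_sample_file(size=4096):
    """Create a small temporary file with a repeating 0..255 byte pattern."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(bytes(i % 256 for i in range(size)))
    return path
```

Both functions return the same bytes for the same offset and length. What differs is who manages caching: the application for the seek path, the OS page cache for the mapped path.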

No column index in Cassandra?

2010-05-06 Thread Weijun Li
Hello, it seems that the sstable index file only contains key/position pairs and that an sstable doesn't have a column index. So how does a range slice query work? Does it iterate through every key in the range for column name/value comparison? -Weijun

Re: Anti compaction and readonly compaction?

2010-05-05 Thread Weijun Li
Thanks Roger! It helps a lot! On Wed, May 5, 2010 at 9:20 AM, Roger Schildmeijer wrote: > http://wiki.apache.org/cassandra/Streaming > > On 5 May 2010, at 18:18, Weijun Li wrote: > > What's the purpose of anti-compaction? In what scenario does Cassandra need > to

Anti compaction and readonly compaction?

2010-05-05 Thread Weijun Li
What's the purpose of anti-compaction? In what scenario does Cassandra need to split big sstables into smaller pieces? Also, I noticed readonly compaction in the code. What's the use of this compaction type? Thanks, -Weijun

Re: Use binary memtable to load data

2010-05-05 Thread Weijun Li
On Tue, May 4, 2010 at 8:09 PM, Weijun Li wrote: > > Does anyone use binary memtable to import data into Cassandra? > > Yes. > > > When you do > > this how do you determine the destination node that should own those > data? > > You let the StorageProxy API figure

Re: Cassandra Streaming Service

2010-05-05 Thread Weijun Li
Thank you Jonathan! Good to know. On Tue, May 4, 2010 at 9:13 PM, Jonathan Ellis wrote: > The Streaming service is what moves data around for load balancing, > bootstrap, and decommission operations. > > On Tue, May 4, 2010 at 8:08 PM, Weijun Li wrote: > > A dumb question:

Use binary memtable to load data

2010-05-04 Thread Weijun Li
Does anyone use binary memtable to import data into Cassandra? When you do this, how do you determine the destination node that should own the data? Is the replication factor taken into consideration when you import a binary memtable? Thanks, -Weijun

Cassandra Streaming Service

2010-05-04 Thread Weijun Li
A dumb question: what is the use of the Cassandra streaming service? Any use case or example? Thanks, -Weijun

Re: BloomFilter is taking too much memory

2010-05-04 Thread Weijun Li
cause it stores information about > _all_ keys while the index summary stores every 1/128 key. > > On Tue, May 4, 2010 at 3:47 PM, Weijun Li wrote: > > Hello, > > > > We stored about 47mil keys in one Cassandra node and what a memory dump > > shows for one of the SStabl

BloomFilter is taking too much memory

2010-05-04 Thread Weijun Li
Hello, We stored about 47mil keys in one Cassandra node, and a memory dump shows the following for one of the SSTableReader instances: SSTableReader: 386MB. Of this 386MB, IndexSummary takes about 231MB and BloomFilter takes 155MB, with an embedded huge array long[19.4mil]. It seems that BloomFilter is taking
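The figures in this message can be checked with a little arithmetic. The sketch below uses only the numbers quoted above and the 8-byte size of a Java long. The bits-per-key figure additionally assumes that all 47 million keys live in this one SSTable, which the message does not confirm.

```python
# Figures quoted in the message (approximate).
NUM_KEYS = 47_000_000          # keys stored on the node
BLOOM_ARRAY_LONGS = 19_400_000 # length of the long[] inside BloomFilter
INDEX_SUMMARY_MB = 231
BLOOM_FILTER_MB = 155

BYTES_PER_LONG = 8             # a Java long is 64 bits
BITS_PER_LONG = 64


def bloom_array_bytes(num_longs):
    """Memory held by the long[] backing the filter, ignoring object headers."""
    return num_longs * BYTES_PER_LONG


def bits_per_key(num_longs, num_keys):
    """Filter bits per stored key, assuming every key is in this filter."""
    return num_longs * BITS_PER_LONG / num_keys


array_mb = bloom_array_bytes(BLOOM_ARRAY_LONGS) / 1_000_000
print(f"long[] size: {array_mb:.1f} MB")
print(f"bits per key: {bits_per_key(BLOOM_ARRAY_LONGS, NUM_KEYS):.1f}")
print(f"IndexSummary + BloomFilter: {INDEX_SUMMARY_MB + BLOOM_FILTER_MB} MB")
```

The array alone comes to about 155 MB, which matches the reported BloomFilter size, and the two components add up to the reported 386MB total.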

Re: Cassandra Java Client

2010-04-16 Thread Weijun Li
I'm using spymemcached and it works great! Easy to use, supports sharding and compression, and can handle high volume traffic. http://code.google.com/p/spymemcached/ -Weijun On Fri, Apr 16, 2010 at 3:29 AM, Linton N wrote: > import java.util.List; > import java.io.UnsupportedEncodingException; >

Re: Heap sudden jump during import

2010-04-03 Thread Weijun Li
cense). I successfully load and browse heap > bigger than the available memory on the system. > > Regards, > > Benoit > > 2010/4/3 Weijun Li : > > Thank you Benoit. I did a search but couldn't find any that you > mentioned. > > Both jhat and netbean load entire map

Re: Heap sudden jump during import

2010-04-03 Thread Weijun Li
t exists other tools than jhat to browse a heap dump, which stream > the heap dump instead of loading it full in memory like jhat do. > > Kind regards, > > Benoit. > > 2010/4/3 Weijun Li : > > I'm running a test to write 30 million columns (700bytes each) to > Cassan

Heap sudden jump during import

2010-04-02 Thread Weijun Li
I'm running a test to write 30 million columns (700 bytes each) to Cassandra: the process ran smoothly for about 20mil, then the heap usage suddenly jumped from 2GB to 3GB, which is the upper limit of the JVM. From this point Cassandra will freeze for a long time (terrible latency, no response to nodetool th

RE: compression

2010-04-01 Thread Weijun Li
Thrift client doesn’t seem to compress anything unless you change the thrift protocol or use a transport that supports compression. I modified TSocket to support compression but it occasionally has a broken pipe error due to crappy Java zlib support (so that clients have to reconnect to get around the s
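As a rough illustration of what a compressing transport does, the Python sketch below deflates each payload with zlib and prefixes it with its compressed length, so the receiver knows how many bytes to read before inflating. It is not Thrift's API and not the TSocket modification described above. The function names and the 4-byte length-prefix framing are assumptions made for the example.

```python
import struct
import zlib


def encode_frame(payload: bytes, level: int = 6) -> bytes:
    """Compress a payload and prepend a 4-byte big-endian length header."""
    compressed = zlib.compress(payload, level)
    return struct.pack(">I", len(compressed)) + compressed


def decode_frame(frame: bytes) -> bytes:
    """Read the length header, check the body is complete, and decompress it."""
    if len(frame) < 4:
        raise ValueError("frame shorter than its length header")
    (length,) = struct.unpack(">I", frame[:4])
    body = frame[4:4 + length]
    if len(body) != length:
        raise ValueError("truncated frame")
    return zlib.decompress(body)
```

Framing each message separately means a failed or truncated read can be detected at a message boundary, rather than leaving a long-lived compression stream in an unknown state.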

get_string_property(token map) freezes for cluster

2010-03-17 Thread Weijun Li
get_string_property(token map) worked for one node on localhost, but it froze when I tried to call it against a cluster of 6 nodes. What's the correct way to return the list of all nodes in a cluster? Thanks, -Weijun

nodetool-compact duplicated data files again and again

2010-03-17 Thread Weijun Li
I'm testing the ExpiringColumn patch in 0.6-beta2. I inserted 26GB of data with TTL; after the columns expired I used get_slice to verify that no columns can be retrieved. When I run "nodetool compact" I think all data should be gone. But the problem is: 1) After the first nodetool compact, Cassandra du

Re: question about deleting from cassandra

2010-03-15 Thread Weijun Li
OK I will try to separate them out. On Sat, Mar 13, 2010 at 5:35 AM, Jonathan Ellis wrote: > You should submit your minor change to jira for others who might want to > try it. > > On Sat, Mar 13, 2010 at 3:18 AM, Weijun Li wrote: > > Tried Sylvain's feature in 0.6 beta2

Re: question about deleting from cassandra

2010-03-15 Thread Weijun Li
r 13, 2010 at 3:36 PM, Jonathan Ellis wrote: > >> since they are separate changes, it's much easier to review if they >> are submitted separately. >> >> On 3/13/10, Weijun Li wrote: >> > Sure. I'm making another change for cross multiple DC replicati