Thanks.
The tag 0.6.8 is not available in SVN
On Sat, Nov 13, 2010 at 8:02 AM, Eric Evans wrote:
>
> Greetings,
>
> I have some bad news, and some good news.
>
> The Bad News is that a regression[1] made its way into our latest
> release, 0.6.7. Sorry about that, we try really hard to keep tha
Hi JE,
0.6.6:
org.apache.cassandra.service.AntiEntropyService
I found the rowHash method uses "row.buffer.getData()" directly.
Since row.buffer.getData() is a byte[], and there may be some junk bytes
at the end of the buffer, I think we should use the exact length.
private MerkleTree.
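A minimal sketch of the exact-length point, not the actual AntiEntropyService code. It assumes a buffer that exposes a backing array plus a count of valid bytes; the names `hashExact`, `backing` and `length` are hypothetical, and MD5 is used only for illustration.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class ExactLengthHash {
    // Hash only the first `length` bytes; anything after that in the
    // backing array is treated as junk and ignored.
    static byte[] hashExact(byte[] backing, int length) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            md.update(backing, 0, length);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] row = {1, 2, 3};
        byte[] backing = {1, 2, 3, 0, 0, 42}; // same row plus trailing junk
        // Same digest when only the valid bytes are hashed.
        System.out.println(Arrays.equals(hashExact(backing, 3), hashExact(row, 3)));
        // Different digest when the whole backing array is hashed.
        System.out.println(Arrays.equals(hashExact(backing, backing.length), hashExact(row, 3)));
    }
}
```

Hashing the whole backing array makes the digest depend on whatever bytes happen to follow the row, so two replicas holding identical rows could still produce different hashes.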
Hi Jonathan,
Could you provide info about "the special case where a minor compaction, also
happens to be a major one"?
On Wed, May 19, 2010 at 2:29 PM, Jonathan Ellis wrote:
> No. (Except in the special case where a minor compaction, also
> happens to be a major one.)
>
> On Tue, May 18, 2010
Up
On Sat, Jun 5, 2010 at 4:30 PM, Anty wrote:
> Hi:All
> in the code of SSTableReader.java
> private static final ReferenceQueue finalizerQueue = new
> ReferenceQueue()
> {{
> Runnable runnable = new Runnable()
> {
> public void run()
> {
>
I agree with Peter Schuller.
On Sun, Jul 18, 2010 at 8:40 PM, Jonathan Ellis wrote:
> On Sun, Jul 18, 2010 at 2:45 AM, Schubert Zhang wrote:
> > In a heavy inserting (many client threads), the memtable flush (generate
> new
> > sstable) is frequent (e.g. one in 30s).
>
> T
) Can we implement multi-thread compaction?
Schubert
On Sun, Jul 18, 2010 at 3:34 PM, Schubert Zhang wrote:
> Benjamin,
>
> It is not difficult to stack thousands of SSTables.
> In a heavy inserting (many client threads), the memtable flush (generate
> new sstable) is fren
>
Benjamin,
It is not difficult to stack up thousands of SSTables.
Under heavy inserts (many client threads), the memtable flush (generating a new
sstable) is fren
On Mon, Jun 14, 2010 at 2:03 AM, Benjamin Black wrote:
> On Sat, Jun 12, 2010 at 7:46 PM, Anty wrote:
> > Hi:ALL
> > I have 10 nodes clus
In fact, with my cassandra-0.6.2, I can only get about 40~50 reads/s with
the Key/Row cache disabled.
On Sun, Jul 18, 2010 at 1:02 AM, Schubert Zhang wrote:
> Hi Jonathan,
> The 7k reads/s is very high, could you please make more explain about your
> benchmark?
>
> 7000 reads/s makes
Hi Jonathan,
The 7k reads/s is very high; could you please explain more about your
benchmark?
7000 reads/s means each read operation takes only 0.143ms on average.
Considering the 2 disks in the benchmark, it may be 0.286ms per read on each disk.
But in most random read applications on very large dataset, OS cac
Maybe the OrderPreservingPartitioner should let users define a customized
comparator.
In fact, users can implement their own XXXOrderPreservingPartitioner.
On Tue, Jun 22, 2010 at 8:34 PM, Sylvain Lebresne wrote:
> 2010/6/22 Maxim Kramarenko :
> > Hello!
> >
> > I use OrderPreservingPartitione
I think your read throughput is very high, and it may not be realistic.
For random reads, disk seeks will always be the bottleneck (100% utilization).
There will be about 3 random disk seeks per random read, at about 10ms per
seek. So a random read takes about 30ms.
If you have only one dis
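The seek arithmetic above can be written out as a small sketch. It assumes every read misses the cache and that disks serve seeks independently; the class and method names are hypothetical.

```java
class SeekBoundReads {
    // Latency of one random read: number of seeks times the cost of one seek.
    static double readLatencyMs(int seeksPerRead, double seekMs) {
        return seeksPerRead * seekMs;
    }

    // Upper bound on reads per second when every disk is 100% busy seeking.
    static double maxReadsPerSecond(int disks, int seeksPerRead, double seekMs) {
        return disks * 1000.0 / readLatencyMs(seeksPerRead, seekMs);
    }

    public static void main(String[] args) {
        // Figures from the message: about 3 seeks per read, about 10ms per seek.
        System.out.println(readLatencyMs(3, 10.0) + " ms per read");
        System.out.println(maxReadsPerSecond(1, 3, 10.0) + " reads/s on one disk");
    }
}
```

Under these assumptions a single disk tops out at roughly 33 uncached random reads per second, which is why throughput in the thousands suggests that most reads are served from cache.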
We integrate ganglia
On Mon, Jun 28, 2010 at 1:53 AM, Jonathan Ellis wrote:
> short version:
>
> if o.a.c.concurrent.{ROW-READ-STAGE,ROW-MUTATION-STAGE} and
> o.a.c.db.CompactionManager have
>
> - completed task count increasing
> - pending tasks stable (for RRS and RMS, stable in low hundreds
Yes, I think the current HintedHandOff implementation in 0.6.x cannot support
large hints; it is a risk in a production system.
On Tue, Jun 29, 2010 at 12:31 AM, albert_e wrote:
> In 0.6.2, HH sending MUTATION message using the same OutboundTcpConnection
> with READ message. When HH transfering big
I found that in long-running random-read tests on a large dataset, the performance
with mmap is very bad.
See the attached chart in
https://issues.apache.org/jira/browse/CASSANDRA-1214.
On Fri, Jul 16, 2010 at 12:41 AM, Peter Schuller <
peter.schul...@infidyne.com> wrote:
> > Can someone please explain t
for your apps, how about this schema:
key: website1123
columnName: UserID
...
On Thu, Jul 15, 2010 at 6:13 AM, Aaron Morton wrote:
> The key structure you have should group the keys based on the website There
> are some differences between range queries with RP and OPP this article may
> help
>
For read, the bottleneck is usually the disk.
Use iostat to check the utilization of your disks.
On Tue, Jul 13, 2010 at 2:07 PM, Peter Schuller wrote:
> > Has anyone experimented with different settings for concurrent reads? I
> > have set our servers to 4 ( 2 per processor core ). I have notic
Disk space includes:
1. Live SSTable files (Data, Index, Filter)
2. Garbage (compacted) SSTable files.
For each column, besides the value bytes, there are additional bytes:
(2+columnname+1+8)
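As a rough illustration of the per-column formula above, here is a sketch that applies it to a column name. Reading the constants as a 2-byte name length, a 1-byte flag and an 8-byte timestamp is my assumption; the message only gives the numbers. Names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class ColumnOverhead {
    // Extra bytes per column besides the value, per the formula in the message:
    // 2 + length of column name + 1 + 8.
    static long overheadBytes(String columnName) {
        int nameBytes = columnName.getBytes(StandardCharsets.UTF_8).length;
        return 2 + nameBytes + 1 + 8;
    }

    // Overhead for many columns that all share the same name length.
    static long totalOverheadBytes(String columnName, long columns) {
        return overheadBytes(columnName) * columns;
    }

    public static void main(String[] args) {
        System.out.println(overheadBytes("UserID"));
        System.out.println(totalOverheadBytes("UserID", 1_000_000L));
    }
}
```

With many small values, this per-column overhead can exceed the size of the values themselves, before counting index and filter files.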
On Sat, Jul 10, 2010 at 2:57 AM, Jonathan Ellis wrote:
> you should read the "cassandra disk
Benjamin's answer is exactly right.
On Sun, Jul 11, 2010 at 6:27 AM, Benjamin Black wrote:
> You constructed a pathological case and then got confused at the result.
>
> Consider instead a realistic case: RF=3, CL=QUORUM. Writes should go
> to all of A, B, and C. B is down when the write req
t is ardently discussing @http://news.ycombinator.com/item?id=1502756
Here are my comments:
1. Cassandra is very young! In particular, the design and implementation of
local storage and local indexing are immature and not good.
2. Poor read performance is also due to the poor local storage
implementatio
3. B node is down during write operation, so return failure message to
client, and write a hint to C node.
Will write to the coordinator node.
On Thu, Jul 8, 2010 at 10:04 PM, ChingShen wrote:
> If so, when does hinted handoff work?
>
>
> On Thu, Jul 8, 2010 at 9:55 PM, Anty wrote:
>
>>
>>
>
After running for a long time (hours), we can no longer use nodetool to retrieve
information about cassandra.
[cassan...@nd3-rack0-cloud cassandra]$ ../cassandra/bin/nodetool -h
10.24.1.16 -p 8081 info
Exception in thread "main" java.lang.IllegalArgumentException:
java.lang:type=Memory not found in the conn
tarted before that happens, we clean out the old SSTables at
> startup time.)
>
> On Tue, May 11, 2010 at 10:50 AM, Schubert Zhang
> wrote:
> > In current 0.6.1, after a long time of compation, the old SSTable files
> are
> > still there, with the mark of
Is it a problem for me to have millions of columns in a supercolumn?
You will have a problem, because there is no index for subcolumns within a
supercolumn.
On Tue, May 11, 2010 at 10:03 PM, David Boxenhorn wrote:
> I have a similar issue, but I can't create a CF per type, because types are
> an open-en
In the future, maybe cassandra can provide some "Filter" or "Coprocessor"
interfaces, just like Bigtable does.
But for now, cassandra is too young; there are many things still to do for a clean
core.
On Tue, May 11, 2010 at 11:35 PM, Mike Malone wrote:
> On Mon, May 10, 2010 at 11:36 PM, vd wrote:
In the current 0.6.1, after a long period of compaction, the old SSTable files are
still there, marked by a zero-sized "CFName-id-Compacted" file.
Why not delete them immediately? What is the policy in 0.6.1?
See the following examples.
-rw-rw-r-- 1 cassandra cassandra 0 May 11 23:35 LZO-12
"(I originally saw 3-5 ms read latency with a small amount of data and 1
Keyspace/CF)? "
The 3~5ms latency comes from the filesystem page cache.
Because your dataset is small, it can be cached entirely by the filesystem.
2010/5/11 Peter Schüller
> > isolated requests, obviously in scale the RAID
. Most ORMs these days work by building a
>>> > propositional directed acyclic graph that's serialized to SQL. This
>>> would
>>> > work the same way, but it wouldn't be converted into a 4GL.
>>> > Mike
>>> >
>>> >>
> Maybe... but honestly, it doesn't affect the architecture or interface at
> all. I'm more interested in thinking about how the system should work than
> what things are called. Naming things are important, but that can happen
> later.
>
> Does anyone have any thoughts
Yes, the "column" here is not appropriate.
Maybe we need not create new terms; in Google's Bigtable, the term
"qualifier" is a good one.
On Thu, May 6, 2010 at 3:04 PM, David Boxenhorn wrote:
> That would be a good time to get rid of the confusing "column" term, which
> incorrectly suggests a
Hi Jonathan,
Could you please check this?
On Wed, May 5, 2010 at 6:19 PM, Schubert Zhang wrote:
> Include d...@cassandra.apache.org
>
>
> On Wed, May 5, 2010 at 3:09 PM, Anty wrote:
>
>> HI:All
>>
>> In source code of 0.6.1 ,in SSTableWriter,
>>
Include d...@cassandra.apache.org
On Wed, May 5, 2010 at 3:09 PM, Anty wrote:
> HI:All
>
> In source code of 0.6.1 ,in SSTableWriter,
> private void afterAppend(DecoratedKey decoratedKey, long dataPosition, int
> dataSize) throws IOException
> {
> String diskKey = partitioner.convert
, May 4, 2010 at 1:10 AM, Schubert Zhang wrote:
> We make a patch to 0.6 branch and 0.6.1 for this feature.
>
> https://issues.apache.org/jira/browse/CASSANDRA-1041
>
1. When initially starting up your nodes, please plan the InitialToken of each
node so that the tokens are evenly spaced.
2. standard
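The token planning in item 1 can be sketched as follows, assuming RandomPartitioner with a token space of 0 to 2^127 (an assumption here; the message does not name the partitioner). The class and method names are hypothetical.

```java
import java.math.BigInteger;

class EvenTokens {
    // Assumed size of the RandomPartitioner token space: 2^127.
    static final BigInteger RING = BigInteger.ONE.shiftLeft(127);

    // Token for node i of n: i * 2^127 / n, so nodes are evenly spaced.
    static BigInteger[] tokens(int nodes) {
        BigInteger[] result = new BigInteger[nodes];
        for (int i = 0; i < nodes; i++) {
            result[i] = RING.multiply(BigInteger.valueOf(i))
                            .divide(BigInteger.valueOf(nodes));
        }
        return result;
    }

    public static void main(String[] args) {
        // One InitialToken per node for a 4-node cluster.
        for (BigInteger token : tokens(4)) {
            System.out.println(token);
        }
    }
}
```

Each printed value would go into the InitialToken setting of one node before its first start.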
On Tue, May 4, 2010 at 9:09 PM, Boris Shulman wrote:
> I think that the extra (more than 4GB) memory usage comes from the
> mmaped io, that is why it happens only for reads.
>
> On Tue, May 4, 20
It seems the node you are adding is not a "new" node.
INFO [main] 2010-05-03 08:36:58,993 SystemTable.java (line 164) Saved Token
found: 113225717064305079230489016527619806663
INFO [main] 2010-05-03 08:36:58,994 SystemTable.java (line 179) Saved
ClusterName found: Image Cluster
Above log says, this node
We made a patch against the 0.6 branch and 0.6.1 for this feature.
https://issues.apache.org/jira/browse/CASSANDRA-1041
I once modified the code to set INDEX_INTERVAL = 512 to decrease
memory usage, and it seems to work fine.
Is it right?
2010/4/30 casablinca126.com
> hi,
>It seems changing the INDEX_INTERVAL with conflict with
> AntiEntropyService, right?
>I will reconstruct my sstables.
Thanks!
I want to study Hector in detail.
On Thu, Apr 29, 2010 at 1:39 PM, Ran Tavory wrote:
> Hi Schubert, I'm sorry Hector isn't a good fit for you, so let's see what's
> missing for your.
>
> On Thu, Apr 29, 2010 at 8:22 AM, Schubert Zhang wrote:
Yes, it is true.
Current cassandra has many limitations and poor implementations, especially at the
storage level.
In my opinion, these limitations are just implementation issues, not the
original intention of the design.
And I also want to give a suggestion/advice to the project leaders, w
I found that hector is not a good design.
1. We cannot create multiple threads (each thread having its own connection to
the cassandra server) against one cassandra server.
As we know, a cassandra client usually has to be multi-threaded to
achieve good throughput.
2. The implementation is too heavyweight.
3. Introduce
key: stock ID, e.g. AAPL+year
column family: closing price and volume, two CFs.
column name: timestamp, LongType
AAPL+2010-> CF:closingPrice -> {'04-13' : 242, '04-14': 245}
AAPL+2010-> CF:volume -> {'04-13' : 242, '04-14': 245}
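To make the layout above concrete, here is a small in-memory model of one of the two column families. It is only an illustration using sorted maps, not the Cassandra client API; the class and method names are hypothetical, and the column names are the day strings from the example rather than long timestamps.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

class ClosingPriceModel {
    // Row key ("AAPL+2010") -> columns sorted by column name (the day).
    private final Map<String, NavigableMap<String, Integer>> rows = new HashMap<>();

    void put(String symbol, int year, String day, int price) {
        rows.computeIfAbsent(symbol + "+" + year, k -> new TreeMap<String, Integer>())
            .put(day, price);
    }

    // Column slice within one row: all days in [fromDay, toDay].
    SortedMap<String, Integer> slice(String symbol, int year, String fromDay, String toDay) {
        NavigableMap<String, Integer> row = rows.get(symbol + "+" + year);
        if (row == null) {
            return new TreeMap<String, Integer>();
        }
        return row.subMap(fromDay, true, toDay, true);
    }

    public static void main(String[] args) {
        ClosingPriceModel cf = new ClosingPriceModel();
        cf.put("AAPL", 2010, "04-13", 242);
        cf.put("AAPL", 2010, "04-14", 245);
        System.out.println(cf.slice("AAPL", 2010, "04-13", "04-14"));
    }
}
```

Because columns within a row are kept sorted by name, a date range for one stock and year becomes a single contiguous slice of one row.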
On Thu, Apr 22, 2010 at 2:00 AM, Miguel Verde wrote:
> On Wed, Ap
On Wed, Apr 21, 2010 at 10:08 PM, Oleg Anastasjev wrote:
> Hello,
>
> I am testing how cassandra behaves on single node disk failures to know
> what to
> expect when things go bad.
> I had a cluster of 4 cassandra nodes, stress loaded it with client and made
> 2
> tests:
> 1. emulated disk failure
Your schema design is an RDBMS schema, not a Cassandra schema.
On Thu, Apr 15, 2010 at 11:44 PM, Miguel Verde wrote:
> Just to nitpick your representation a little bit, columnB/etc... are
> supercolumnB/etc..., key1/etc... are column1/etc..., and you can probably
> omit valueA/valueD designations
I think your file (as a cassandra column value) is too large.
And I also think Cassandra is not good at storing files.
On Wed, Apr 28, 2010 at 10:24 PM, Jussi P?öri
wrote:
> new try, previous went to wrong place...
>
> Hi all,
>
> i'm trying to run a scenario of adding files from specific folder to
>
I think, at least currently, we should leave the logic of the current
SuperColumn and additional indexing features to the application layer on top of the
cassandra core.
On Wed, Apr 28, 2010 at 6:44 PM, Schubert Zhang wrote:
> I don't think secondary index is necessary for cassandra core, at least it
rving them for internal use).
>
> On Mon, Apr 26, 2010 at 11:05 AM, Schubert Zhang
> wrote:
> > I don't think the SuperColumn is so necessary.
> > I think this level of logic can be leaved to application.
> >
> > Do you think so?
> >
> > If Supe
I think that, even though the real deletion is done at compaction time,
get/get_range_slices should not return keys (or columns) marked as
deleted.
Schubert
On Wed, Apr 28, 2010 at 1:39 PM, Jeff Zhang wrote:
> Thanks Lu, it's helpful.
>
>
> On Wed, Apr 28, 2010 at 11:42 AM, Greg Lu wrote:
> > He
Seems:
ROW-MUTATION-STAGE 32 3349 63897493
is the clue, too many mutation requests are pending.
Yes, I also think cassandra should add a mechanism to avoid too many
requests pending (in queue).
When the queue is full, just reject the request from client.
Seems https://issues.apache.
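The reject-when-full idea above can be sketched with the standard java.util.concurrent classes. This is an illustration of the mechanism, not how Cassandra's stages are built; the class name, method name and capacities are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedStage {
    // Submits `submitted` blocking tasks to a single-worker stage whose queue
    // holds at most `queueCapacity` pending tasks; returns how many were rejected.
    static int countRejected(int queueCapacity, int submitted) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor stage = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy()); // throw when the queue is full
        int rejected = 0;
        for (int i = 0; i < submitted; i++) {
            try {
                stage.execute(() -> {
                    try {
                        release.await(); // simulate a slow mutation
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++; // here a server would return an error to the client
            }
        }
        release.countDown();
        stage.shutdown();
        try {
            stage.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 1 task running + 2 queued, so the remaining 2 of 5 are rejected.
        System.out.println(countRejected(2, 5));
    }
}
```

Rejecting early keeps the pending count bounded, so clients see a fast failure instead of timeouts on requests that sat in an ever-growing queue.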
I don't think the SuperColumn is so necessary.
I think this level of logic can be left to the application.
Do you think so?
If SuperColumn is needed, as in
https://issues.apache.org/jira/browse/CASSANDRA-598, we should build indexes
at both the SuperColumn level and the SubColumn level.
Thus, the levels of index
RandomPartitioner applies to row keys.
#1 no
#2 yes
#3 yes
On Sat, Apr 24, 2010 at 4:33 AM, Larry Root wrote:
> I trying to better understand how using the RandomPartitioner will affect
> my ability to select ranges of keys. Consider my simple example where we
> have many online games across differ
I think you should forget these RDBMS techniques.
On Sat, Apr 24, 2010 at 11:00 AM, aXqd wrote:
> On Sat, Apr 24, 2010 at 1:36 AM, Ned Wolpert
> wrote:
> > There is nothing wrong with what you are asking. Some work has been done
> to
> > get an ORM layer ontop of cassandra, for example, with a RubyO
Hi Jonathan Ellis and Stu Hood,
I think that, eventually, we should provide a user-customizable abstract key class.
Users can define the key type and its class, which defines how keys are
compared.
Schubert
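A minimal sketch of what such a user-defined key class could look like. This is a hypothetical shape for the proposal, not an existing Cassandra interface; `RowKey`, `LongKey` and `encode` are invented names.

```java
class CustomKeys {
    // Hypothetical base class: each key type decides how it is encoded
    // and how it compares with other keys of the same type.
    static abstract class RowKey<K extends RowKey<K>> implements Comparable<K> {
        abstract String encode();
    }

    // Example key type that orders keys numerically rather than as strings.
    static final class LongKey extends RowKey<LongKey> {
        final long value;

        LongKey(long value) {
            this.value = value;
        }

        @Override
        String encode() {
            return Long.toString(value);
        }

        @Override
        public int compareTo(LongKey other) {
            return Long.compare(value, other.value);
        }
    }

    public static void main(String[] args) {
        LongKey nine = new LongKey(9);
        LongKey ten = new LongKey(10);
        // Numeric order: 9 sorts before 10.
        System.out.println(nine.compareTo(ten) < 0);
        // String order of the encoded forms: "9" sorts after "10".
        System.out.println(nine.encode().compareTo(ten.encode()) > 0);
    }
}
```

The two comparisons disagree, which is the motivation for letting the key class, rather than its string encoding, define the order.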
On Sat, Apr 24, 2010 at 1:16 PM, Stu Hood wrote:
> Your keys cannot be an encoded as binary fo
The column index in a row is a sorted-blocked index (like b-tree), just like
bigtable.
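One way to picture a sorted, blocked index is sketched below: sorted column names are split into fixed-size blocks, and a small sorted map from each block's first name to its block number is searched first. This is an illustration only, not Cassandra's on-disk format; all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class BlockIndex {
    // First column name of each block -> block number.
    private final TreeMap<String, Integer> firstNameToBlock = new TreeMap<>();
    private final List<List<String>> blocks = new ArrayList<>();

    // Expects column names already sorted, as they are within a row.
    BlockIndex(List<String> sortedColumnNames, int blockSize) {
        for (int i = 0; i < sortedColumnNames.size(); i += blockSize) {
            int end = Math.min(i + blockSize, sortedColumnNames.size());
            List<String> block = sortedColumnNames.subList(i, end);
            firstNameToBlock.put(block.get(0), blocks.size());
            blocks.add(block);
        }
    }

    // Returns the block number holding `name`, or -1 if it is absent.
    int findBlock(String name) {
        // Last block whose first name is <= the requested name.
        Map.Entry<String, Integer> entry = firstNameToBlock.floorEntry(name);
        if (entry == null) {
            return -1;
        }
        // Only this one block has to be scanned.
        return blocks.get(entry.getValue()).contains(name) ? entry.getValue() : -1;
    }

    public static void main(String[] args) {
        BlockIndex index = new BlockIndex(Arrays.asList("a", "b", "c", "d", "e"), 2);
        System.out.println(index.findBlock("d"));
        System.out.println(index.findBlock("bb"));
    }
}
```

The index stays small because it holds one entry per block rather than one per column, at the cost of scanning a single block on each lookup.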
On Mon, Apr 26, 2010 at 2:43 AM, Stu Hood wrote:
> The indexes within rows are _not_ implemented with Lucene: there is a
> custom index structure that allows for random access within a row. But, you
> should p
I think that is not what cassandra is good at.
On Mon, Apr 26, 2010 at 4:22 AM, Mark Greene wrote:
> http://wiki.apache.org/cassandra/CassandraLimitations
>
>
> On Sun, Apr 25, 2010 at 4:19 PM, S Ahmed wrote:
>
>> Is there a suggested sized maximum that you can set the value of a given
>> key?
>>
Please refer to the code:
org.apache.cassandra.db.ColumnFamilyStore
public String getFlushPath()
{
long guessedSize = 2 * DatabaseDescriptor.getMemtableThroughput() *
1024*1024; // 2* adds room for keys, column indexes
String location =
DatabaseDescriptor.getDataFileLocationF
When starting your cassandra cluster, please configure the InitialToken for
each node so that the key ranges are balanced.
On Mon, Apr 26, 2010 at 6:17 PM, Mark Robson wrote:
> On 26 April 2010 01:18, 刘兵兵 wrote:
>
>> i do some INSERT ,because i will do some scan operations, i use the
>> OrderPres
Since the scale of the GC graph in the slides is different from that of the throughput
ones, I will do another test for this issue.
Thanks for your advice, Masood and Jonathan.
---
Here, I just post my cassandra.in.sh.
JVM_OPTS=" \
-ea \
-Xms128M \
-Xmx6G \
-XX:Tar
ike the slowdown doesn't hit until after several GCs,
>> >> although it's hard to tell since the scale is different on the GC
>> >> graph and the insert throughput ones.
>> >>
>> >> Perhaps this is compaction kicking in, not GCs? Definitel
You can have a look at org.apache.cassandra.service.StorageService
public void initServer() throws IOException
1. If AutoBootstrap=false, it means the node is already bootstrapped (not a new
node).
Usually, the first new node is set to false.
(1) check the system table to find the saved token, if found
=0 \
>
> -Dcom.sun.management.jmxremote.port=8080 \
>
> -Dcom.sun.management.jmxremote.ssl=false \
>
> -Dcom.sun.management.jmxremote.authenticate=false"
>
>
> and my box is normal pc with 2GB ram, Intel E3200 @ 2.40GHz. By the way, I
It seems you should configure a larger JVM heap.
On Tue, Apr 20, 2010 at 9:32 AM, Schubert Zhang wrote:
> Please also post your jvm-heap and GC options, i.e. the seting in
> cassandra.in.sh
> And what about you node hardware?
>
> On Tue, Apr 20, 2010 at 9:22 AM, Ken Sandney wrote:
Please also post your JVM heap and GC options, i.e. the settings in
cassandra.in.sh
And what about your node hardware?
On Tue, Apr 20, 2010 at 9:22 AM, Ken Sandney wrote:
> Hi
> I am doing a insert test with 9 nodes, the command:
>
>> stress.py -n 10 -t 1000 -c 10 -o insert -i 5 -d
>> 10.0.
> > On Sat, Apr 17, 2010 at 11:31 PM, Chris Goffinet
> wrote:
> >> I wonder if that might be related to this:
> >> https://issues.apache.org/jira/browse/CASSANDRA-896
> >> We switched from a Concurrent structure to LinkedBlockingQueue in 0.6.
> >> -Chris
We are testing 0.6.0 and comparing it with 0.5.1, and it seems:
1. 0.6.0 needs more memory/heap.
2. After inserting billions of columns and tens of millions of keys, insert
operations become very slow and jammed.
TimeoutException and UnavailableException are sometimes thrown.
I add more log, suc