I am testing on one node only right now and simply adding data and reading it.
There is no real problem, but I feel I get differing results from the tests
depending on something, and I would like to know what it is.
What is happening between two "points in time"?
--> Here is a case with a slower result:
You insert 500 rows with key "x"
and 1000 rows with key "y".
You make a query getting all rows.
It will only show two rows, the ones with the latest timestamps.
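The overwrite behavior described above can be sketched with a toy model in plain Python (illustrative only, not a client API): Cassandra resolves repeated writes to the same key/column by keeping the value with the highest timestamp, so re-inserting a key overwrites rather than duplicates.

```python
# Toy model of last-write-wins: each (row_key, column) keeps only the
# value with the highest timestamp.
store = {}

def insert(row_key, column, value, timestamp):
    existing = store.get((row_key, column))
    if existing is None or timestamp > existing[0]:
        store[(row_key, column)] = (timestamp, value)

# 500 writes to key "x" and 1000 writes to key "y" ...
for ts in range(500):
    insert("x", "col", "value-%d" % ts, ts)
for ts in range(1000):
    insert("y", "col", "value-%d" % ts, ts)

# ... still leave only two rows, each holding the latest timestamp.
rows = sorted({key for key, _ in store})
print(rows)                    # ['x', 'y']
print(store[("x", "col")][0])  # 499
print(store[("y", "col")][0])  # 999
```

This is why the query returns two rows: 1500 inserts against two keys produce two rows, each reflecting the most recent write.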
/Justus
From: Rana Aich [mailto:aichr...@gmail.com]
Sent: 29 July 2010 08:23
To: user@cassandra.apache.org
Subject: Re: Consequences
Thanks for your reply! I thought in that case a new row would be inserted
with a new timestamp and Cassandra would report the new row. But how will
this affect my range query?
On Wed, Jul 28, 2010 at 7:03 PM, Benjamin Black wrote:
> If you write new data with a key that is already present, the ex
This page is probably not spam; it is a Chinese translation. Note the
page is named "CassandraLimitations_CH".
2010/7/29 Ashwin Jayaprakash :
>
> I see spam on this page -
> http://wiki.apache.org/cassandra/CassandraLimitations
>
> Look at this -
> http://wiki.apache.org/cassandr
Thanks for the help Aaron.
I sorted it out: the problem was that I was using the latest version of
pycassa against cassandra 0.6.x.
When I downloaded the 0.3.0 pycassa and used the previous api, all worked
properly.
Cheers,
db
On Wed, Jul 28, 2010 at 2:44 PM, Aaron Morton wrote:
> Just check
I know there is no native support for "order by", "group by" etc but I
was wondering how it could be accomplished with some custom indexes?
For example, say I have a list of word counts like (notice 2 words have
the same count):
"cassandra" => 100
"foo" => 999
"bar" => 1
"
If you want to process millions of rows at a time, take a look at the Hadoop and Pig integration. Try the Cloudera distro of Hadoop, CDH3; it includes Pig. Pig is an SQL-like language for doing large-scale data analysis that compiles down to jobs run on Hadoop. http://hadoop.apac
Have you considered Redis (http://code.google.com/p/redis/)? It may be more suited to the master-slave configuration you are after. - You can have a master to write to, then replicate to an intermediate slave, then your web heads run a local Redis that replicates from that intermediate slave. - Backup at the master or the s
If you write new data with a key that is already present, the existing
columns are overwritten or new columns are added. There is no way to
cause a duplicate key to be inserted.
On Wed, Jul 28, 2010 at 6:16 PM, Rana Aich wrote:
> Hello,
> I was wondering what may the pitfalls in Cassandra when t
Hi all,
Is there any better way to retrieve data from Cassandra than using
get_range_slices?
Now I'm going to port some programs from MySQL to Cassandra. The
program's query looks like the one
below:
"select * from Table_A where date > 1/1/2008 and date < 12/31/2009 and
locationID = 1"
The result of
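One hedged way to model that WHERE clause for get_range_slices, assuming the OrderPreservingPartitioner: prefix each row key with the locationID and an ISO date, so the whole query becomes one contiguous key range. The key layout and function below are illustrative, not a real client API.

```python
# Row keys shaped as "<locationID>:<ISO date>" so a single ordered key
# range covers: date > 1/1/2008 AND date < 12/31/2009 AND locationID = 1.
rows = {
    "1:2007-06-15": "too early",
    "1:2008-03-01": "match",
    "1:2009-11-30": "match",
    "2:2008-05-05": "wrong location",
}

def range_slice(start, end):
    # Stand-in for get_range_slices over an ordered key range.
    return [k for k in sorted(rows) if start <= k <= end]

keys = range_slice("1:2008-01-02", "1:2009-12-30")
print(keys)  # ['1:2008-03-01', '1:2009-11-30']
```

With the RandomPartitioner, key ranges are not in key order, so this layout only works with an order-preserving partitioner (or by moving the date into column names and slicing columns instead).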
Hello,
I was wondering what the pitfalls may be in Cassandra when the key value is not
UNIQUE.
Will it affect range query performance?
Thanks and regards,
raich
We recently migrated part of our MySQL database to a 3-node Cassandra
cluster with a replication factor of 3. Couple of days ago we noticed
that Cassandra sometimes returns the wrong data. Not corrupted data,
but data for a different key than the one being asked for. This error
appears to be random
Just as a followup, here's what seems to be the resolution:
1. 0.6.4 should fix this problem.
2. Using OPP as the DHT should solve it as well.
3. Prior to 0.6.4, when using RandomPartitioner as the DHT, there's no good
way to guarantee that you see *all* row keys for a column family.
Strategies t
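A common workaround along these lines, sketched in plain Python with a fake paging client (the function names model the paging contract, not a real API): page through keys with get_range_slices, restart each page at the last key seen, and drop that restart key so it is not emitted twice.

```python
# Fake ordered key space standing in for a column family's row keys.
all_keys = sorted("abcdefghij")

def get_range_slices(start_key, count):
    # Stand-in for the Thrift call: keys at or after start_key, up to count.
    keys = [k for k in all_keys if k >= start_key]
    return keys[:count]

def iterate_keys(page_size=3):
    seen = []
    start = ""
    while True:
        page = get_range_slices(start, page_size)
        if start:
            page = page[1:]  # drop the restart key we already returned
        if not page:
            break
        seen.extend(page)
        start = page[-1]
    return seen

print(iterate_keys())  # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
```

Note the caveat from the thread: under the RandomPartitioner the server returns keys in token order, not key order, which is why paging like this could still miss or repeat keys before 0.6.4.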
>Is it possible to configure Cassandra in such a way that a
>node only ever asks itself for the data, and if so what sort of
>effect will that have on read performance?
Check out the RingCache class which lets you make your clients smart enough to
ask the right server. (Also, if all nodes have a
Hi,
I'm currently looking at NoSQL solutions to replace a bespoke system
that we currently have in place. Currently I think the best fit is
Cassandra, but I would like to get some feedback from those who know
it better before spending more time on it.
Our current system is geared to allowing our
On 7/28/10 12:26 PM, YES Linux wrote:
i was wondering what the trade-offs were between the key cache and row
cache? which is more important for a read? if you have a large row
cache can your key cache be small?
- The row cache is a superset of the key cache. If you have a row cache
on a CF,
On 7/28/10 2:43 PM, Dave Viner wrote:
Hi all,
I'm having a strange result in trying to iterate over all row keys for a
particular column family. The iteration works, but I see the same row
key returned multiple times during the iteration.
I'm using cassandra 0.6.3, and I've put the code in use
> "As a result, we designed and built Flume...
> (I wonder if this could deliver into Cassandra :) )
Yes - apparently it's pretty easy to do - I was thinking of doing it but
haven't found the time yet.
https://issues.cloudera.org//browse/FLUME-20
On Jul 28, 2010, at 4:30 PM, Aaron Morton wrote:
Yes, didn't know if you saw the reply in the channel.
This bug has been fixed in the forthcoming 0.6.4 release. It was bug
CASSANDRA-1042 - https://issues.apache.org/jira/browse/CASSANDRA-1042
(0.6.4 will be out really soon)
On Jul 28, 2010, at 4:43 PM, Dave Viner wrote:
> Hi all,
>
> I'm ha
Just checking the obvious: you're connecting to localhost, so is this code running on one of the machines with Cassandra installed? Second, assuming you're using the current GitHub source, put a break point in connection.py at line 191 to see what the actual error is when it tries to connect.
Hi all,
I'm having a strange result in trying to iterate over all row keys for a
particular column family. The iteration works, but I see the same row key
returned multiple times during the iteration.
I'm using cassandra 0.6.3, and I've put the code in use at
http://pastebin.com/zz5xJQ8f
Using
Did you start the repair on all nodes at once or one at a time? Take a look at the streams on the nodes, using either nodetool -h localhost -p 8080 streams or the JMX interface. Check if the numbers are changing. Aaron
On 28 Jul, 2010, at 08:14 AM, Michael Andreasen wrote: I've started repair on 6 n
If you are looking to store web logs and then do ad hoc queries you might/should be using Hadoop (depending on how big your logs are). I agree, take a look at the Cloudera Hadoop CDH3; they include an app called Flume for moving data... "As a result, we designed and built Flume. Flume is a distribu
i was wondering what the trade-offs were between the key cache and row cache?
which is more important for a read? if you have a large row cache
can your key cache be small?
here is some background to my questions:
i have a data set that has a lot of random access for rows using get slices
from
if i have data of the form
Keyspace1 ->
Super2 ->
icecream ->
20100701 -> :vanille 100 :chocolade 2 :riesling-sorbet 900
20100702 -> :vanille 100 :chocolade 200 :riesling-sorbet 100
cake
20100701 -> :cheescake 2 :linzer 100 :apfel 2
20100702 -> :cheescake
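That layout can be modeled as nested dicts (illustrative only, not a client API): a super column family maps row key -> super column name (the date) -> subcolumns, and a slice over super column names gives you a date range.

```python
# Super2 modeled as: row key -> super column (date) -> subcolumn values.
super2 = {
    "icecream": {
        "20100701": {"vanille": 100, "chocolade": 2, "riesling-sorbet": 900},
        "20100702": {"vanille": 100, "chocolade": 200, "riesling-sorbet": 100},
    },
    "cake": {
        "20100701": {"cheescake": 2, "linzer": 100, "apfel": 2},
    },
}

def slice_supercolumns(row_key, start, finish):
    # Stand-in for a get_slice over super column names in [start, finish];
    # dates sort correctly as strings in YYYYMMDD form.
    row = super2.get(row_key, {})
    return {d: row[d] for d in sorted(row) if start <= d <= finish}

day = slice_supercolumns("icecream", "20100701", "20100701")
print(day["20100701"]["riesling-sorbet"])  # 900
```

The YYYYMMDD naming is what makes the date slice work: lexicographic order on the super column names matches chronological order.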
uncle mantis gmail.com> writes:
> Why is everything always in California or Las Vegas?
Because you can convince your employer to pay for your vacation near the ocean or the slots,
I believe ;-)
If you are looking to store web logs and then do ad hoc queries you
might/should be using Hadoop (depending on how big your logs are)
While MongoDB has MapReduce (built in), it is there to simulate SQL GROUP BY
and not for large-scale analytics by any means.
MongoDB uses a global read/write lock p
Hi,
That fixed the problem!
I added the Framed option and like magic things have started working again.
Example:
thrift_client:start_link("localhost", 9160, cassandra_thrift, [ { framed,
true } ] )
JT.
On Tue, Jul 27, 2010 at 10:04 PM, Jonathan Ellis wrote:
> trunk is using framed thrift c
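The framed option above matters because the framed Thrift transport prefixes every message with its 4-byte big-endian length; a server expecting frames cannot parse unframed bytes, and vice versa, which is why the client hung until the transports matched. A minimal sketch of the framing itself:

```python
import struct

def frame(payload):
    # Framed transport: 4-byte big-endian length prefix, then the payload.
    return struct.pack(">i", len(payload)) + payload

def unframe(data):
    # Read the length prefix, then slice out exactly that many bytes.
    (length,) = struct.unpack(">i", data[:4])
    return data[4:4 + length]

msg = b"get_slice request bytes"
assert unframe(frame(msg)) == msg
print(len(frame(msg)) - len(msg))  # 4 extra bytes: the length prefix
```

This only illustrates the wire format; the real framing lives inside Thrift's transport layer, not in application code.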
They have approximately nothing in common. And, no, Cassandra is
definitely not dying off.
On Tue, Jul 27, 2010 at 8:14 AM, Mark wrote:
> Can someone quickly explain the differences between the two? Other than the
> fact that MongoDB supports ad-hoc querying I don't know what's different. It
> al