Yes, one disadvantage of a larger number of CFs in terms of memory utilization
which I see is this: -
if some CF is written less often than other CFs, then its
memtable would consume space in memory until it is flushed; this
memory space could have been put to much better use by a CF that's heavily
written
Thanks Tyler!
I could not fully understand the reason why more column families
would mean more memory. If you have control over parameters like
memtable_throughput & memtable_operations, which are set on a per-column-family
basis, then you can directly control & adjust things by splitting the
memory space
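For what it's worth, here is a rough, untested sketch of making exactly that
per-CF adjustment through the 0.7 Thrift API; the keyspace "MyKeyspace" and the
column family "Events" are made-up names, and the memtable field names should be
double-checked against your interface/cassandra.thrift:

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.CfDef;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class TuneMemtables {
    public static void main(String[] args) throws Exception {
        // framed transport; match whatever your server is configured for
        TFramedTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        client.set_keyspace("MyKeyspace");

        // find the rarely-written CF and lower its memtable thresholds so it
        // flushes earlier and holds less memory
        for (CfDef cf : client.describe_keyspace("MyKeyspace").getCf_defs()) {
            if (cf.getName().equals("Events")) {
                cf.setMemtable_throughput_in_mb(32);          // flush after ~32 MB of writes
                cf.setMemtable_operations_in_millions(0.1);   // ...or after ~100k operations
                client.system_update_column_family(cf);
            }
        }
        transport.close();
    }
}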
On Fri, Feb 4, 2011 at 9:47 PM, Matt Kennedy wrote:
> Found the culprit. There is a new feature in Pig 0.8 that will try to
> reduce the number of splits used to speed up the whole job. Since the
> ColumnFamilyInputFormat lists the input size as zero, this feature
> eliminates all of the splits
Found the culprit. There is a new feature in Pig 0.8 that will try to reduce
the number of splits used to speed up the whole job. Since the
ColumnFamilyInputFormat lists the input size as zero, this feature eliminates
all of the splits except for one.
The workaround is to disable this feature
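If I remember correctly, the split combining in Pig 0.8 is governed by the
pig.splitCombination property (treat that name as an assumption and verify it
against the Pig 0.8 docs), so disabling it would look something like this at the
top of the Pig script:

SET pig.splitCombination false;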
> I read somewhere that more column families are not a good idea, as
> they consume more memory and cause more compactions to occur
This is primarily true, but not in every case.
> But the caching requirements may be different as they cater to two
> different features.
>
This is a great reason to *n
Start with "grep -i down system.log" on each machine
On Fri, Feb 4, 2011 at 7:37 PM, David King wrote:
> We're going to need *way* more information than this
>
> On 03 Feb 2011, at 20:03, ruslan usifov wrote:
>
>> Hello
>>
>> Why do I get an UnavailableException on a live cluster (all nodes are up and
I read somewhere that a larger number of column families is not a good idea, as
it consumes more memory and causes more compactions to occur, & thus I am
trying to reduce the number of column families by adding the rows of
other column families (with similar attributes) as separate rows into
one.
I have two kinds of d
We're going to need *way* more information than this
On 03 Feb 2011, at 20:03, ruslan usifov wrote:
> Hello
>
> Why do I get an UnavailableException on a live cluster (all nodes are up and never
> shut down)
>
> PS: v 0.7.0
Thanks Aaron,
Yes I can put the column names without using the userId in the
timeline row, and when I want to retrieve the row corresponding to
that column name, I will attach the userId to get the row key.
Yes I'll store it as a long & I guess I'll have to write a custom
comparator type (Re
Please provide some information on the client you are using, the client-side
error stack, the command you are running, and the output from nodetool ring.
Aaron
On 5 Feb 2011, at 05:10, Oleg Proudnikov wrote:
> ruslan usifov gmail.com> writes:
>
>>
>>
>> 2011/2/4 Oleg Proudnikov cloudorange.com>
IMHO, if you know the time of the event, store the time as a long rather
than a UUID. It will make it easier to get back to a
time and easier for you to compare columns. TimeUUIDs have a pseudo-random
part as well as the time part; it could be set to a constant. But why
bother if you know the time.
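For illustration, an untested sketch of what "store the time as a long" looks
like at the byte level: the event time serialized as an 8-byte column name, for
a column family whose comparator is LongType. No client calls here, just the
encoding the column name needs:

import java.nio.ByteBuffer;

public class TimeColumnName {
    // Encode a millisecond timestamp as an 8-byte (big-endian) column name,
    // which is what a LongType-compared column family sorts on.
    static ByteBuffer asColumnName(long eventTimeMillis) {
        ByteBuffer name = ByteBuffer.allocate(8);
        name.putLong(eventTimeMillis);
        name.flip();   // make the buffer readable from position 0
        return name;
    }

    public static void main(String[] args) {
        ByteBuffer name = asColumnName(System.currentTimeMillis());
        System.out.println(name.remaining() + " bytes");  // prints: 8 bytes
    }
}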
Brandon,
Thanks for the response. I have also noticed that stress.py's progress
interval gets thrown off in low memory situations.
What did you mean by "contrib/stress on 0.7 instead"? I don't see that
directory in the src version of 0.7.
- Sameer
On Thu, Feb 3, 2011 at 5:22 PM, Brandon Williams w
On Fri, Feb 4, 2011 at 1:45 PM, Oleg Proudnikov wrote:
>
> Hi All,
>
> I have a 3-server cluster with RF=2. My heap is 2G out of 4G of RAM. The
> servers
> have 4 cores. I used default heap settings. The Eden space ended up around 60M
> and the Survivor spaces are around 7M. This feels a little bi
Hi All,
I have a 3-server cluster with RF=2. My heap is 2G out of 4G of RAM. The servers
have 4 cores. I used default heap settings. The Eden space ended up around 60M
and the Survivor spaces are around 7M. This feels a little bit low for a process
that creates so much short-lived garbage. I just w
On Fri, Feb 4, 2011 at 2:44 PM, David Dabbs wrote:
>
> Our data is on sdb, commit logs on sdc.
> So do I read this correctly that we're 'await'ing 6+millis on average for
> data drive (sdb)
> requests to be serviced?
>
>
That is right. Those numbers look pretty good for rotational media. What
sor
What operation are you calling? Are you trying to read the entire row back?
How many SSTables do you have for the CF? Does your data have a lot of
overwrites? Have you modified the default compaction settings?
Do you have row cache enabled ?
How long does the second request take ?
Can you
So do we need to write a script? Or is it something I can do as a system
admin without involving a developer? If yes, please guide me in this case.
On 02/04/2011 10:36 PM, Jonathan Ellis wrote:
In that case, you should shut down the server before removing data files.
On Fri, Feb 4, 2011 at
Thank you both for your advice. See my updated iostats below.
From: sridhar.ba...@gmail.com [mailto:sridhar.ba...@gmail.com] On Behalf Of
sridhar basam
>Sent: Thursday, February 03, 2011 10:58 AM
>To: user@cassandra.apache.org
>Subject: Re: Tracking down read latency
>
>The data provided is als
The issue has been resolved; the fix is on Hector's GitHub.
Oleg Proudnikov cloudorange.com> writes:
>
> I have posted on Hector ML:
>
> http://thread.gmane.org/gmane.comp.db.hector.user/1690
>
> Oleg
>
>
Hi all,
It often takes more than two seconds to load:
- one row of ~450 events comprising ~600k
- cluster size of 1
- client is pycassa 1.04
- timeout on recv
- cold read (I believe)
- load generally < 0.5 on a 4-core machine, 2 EC2 instance store drives for
cassandra
- cpu wait generally < 1%
O
Thanks so much Ryan for the links; I'll definitely take them into
consideration.
Just another thought that came to my mind:
perhaps it may be beneficial to store (or duplicate) some of the data,
like the login credentials & particularly the userId to user's name
mapping, etc. (which is very heavily read
David Dabbs gmail.com> writes:
>
> Is this 0.7?
>
Yes
Is this 0.7?
-Original Message-
From: Oleg Proudnikov [mailto:ol...@cloudorange.com]
Sent: Friday, February 04, 2011 11:42 AM
To: user@cassandra.apache.org
Subject: CF Read and Write Latency Histograms
Hi All,
I suspect that Write and Read Latency column headers need to be swapped. I
am
FWIW, I'm working on migrating a large amount of data out of Oracle into my
test cluster. The data has been warehoused as CSV files on Amazon S3. Having
that in place allows me to not put extra load on the production service when
doing many repeated tests. I then parse the data using the Python CSV module
For the number of files the OP has, why not just use a traditional filesystem
and Solr to index the PDF data? You get to search inside the files for
relevant information.
Sri
On Fri, Feb 4, 2011 at 12:47 PM, buddhasystem wrote:
>
> Even when storage is in NFS, Cassandra can still be quite us
I'm afraid there is no short answer.
The long answer is:
1) Read about Cassandra data modeling at
http://wiki.apache.org/cassandra/ArticlesAndPresentations. It is not
as simple as "one table equals one columnfamily."
2) Write a program to read your data out of SQL Server and write it
into Cassandra; a rough sketch of that approach is below.
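To make item 2 concrete, here is a rough, untested sketch using plain JDBC on
the SQL Server side and the 0.7 Thrift client on the Cassandra side. All names
(keyspace "MyKeyspace", column family "Customers", the customers table, and the
JDBC URL) are placeholders:

import java.nio.ByteBuffer;
import java.sql.*;
import org.apache.cassandra.thrift.*;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class SqlServerToCassandra {
    static ByteBuffer utf8(String s) throws Exception {
        return ByteBuffer.wrap(s.getBytes("UTF-8"));
    }

    public static void main(String[] args) throws Exception {
        // source: SQL Server via JDBC (placeholder URL/credentials)
        Connection sql = DriverManager.getConnection(
                "jdbc:sqlserver://dbhost;databaseName=mydb", "user", "pass");

        // target: Cassandra 0.7 over framed Thrift
        TFramedTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        Cassandra.Client cass = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        cass.set_keyspace("MyKeyspace");

        ColumnParent parent = new ColumnParent("Customers");
        ResultSet rs = sql.createStatement().executeQuery("SELECT id, name, email FROM customers");
        while (rs.next()) {
            ByteBuffer rowKey = utf8(rs.getString("id"));
            long ts = System.currentTimeMillis() * 1000;   // microsecond-style timestamp
            // one Cassandra column per SQL column
            cass.insert(rowKey, parent,
                    new Column(utf8("name"), utf8(rs.getString("name")), ts),
                    ConsistencyLevel.ONE);
            cass.insert(rowKey, parent,
                    new Column(utf8("email"), utf8(rs.getString("email")), ts),
                    ConsistencyLevel.ONE);
        }
        transport.close();
        sql.close();
    }
}

For real volumes you would batch the writes (batch_mutate) and parallelize, but
the row-by-row shape of the job is the same.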
Yes, definitely a database for mapping, of course!
On Fri, Feb 4, 2011 at 11:17 PM, buddhasystem wrote:
>
> Even when storage is in NFS, Cassandra can still be quite useful as a file
> catalog. Your physical storage can change, move etc. Therefore, it's a good
> idea to provide mapping of logical n
Even when storage is in NFS, Cassandra can still be quite useful as a file
catalog. Your physical storage can change, move, etc. Therefore, it's a good
idea to provide a mapping of logical names to physical store points (which in
fact can be many). This is a standard technique used in mass storage.
Can you create a ticket?
On Fri, Feb 4, 2011 at 9:41 AM, Oleg Proudnikov wrote:
> Hi All,
>
> I suspect that Write and Read Latency column headers need to be swapped. I am
> running a bulk load with no reads on this CF but I see Read column with values
> while the Write column has zeros only. The
I am also looking into possible solutions to store PDFs & Word documents.
But why won't you store them in the filesystem instead of a database,
unless your files are too small, in which case it would be recommended
to use a database?
-Aditya
On Fri, Feb 4, 2011 at 5:30 PM, Daniel Doubleday
wrote
Hi All,
I suspect that Write and Read Latency column headers need to be swapped. I am
running a bulk load with no reads on this CF but I see Read column with values
while the Write column has zeros only. The MBean shows the values correctly.
Thank you,
Oleg
In that case, you should shut down the server before removing data files.
On Fri, Feb 4, 2011 at 9:01 AM, wrote:
> I thought truncate() was not available before 0.7 (in 0.6.3), was it?
>
> ---
> Sent from BlackBerry
>
> -Original Message-
I thought truncate() was not available before 0.7 (in 0.6.3), was it?
---
Sent from BlackBerry
-Original Message-
From: Jonathan Ellis
Date: Fri, 4 Feb 2011 08:58:35
To: user
Reply-To: user@cassandra.apache.org
Subject: Re: How to delete
You can't create a row with no columns without tombstones being
involved somehow. :)
There's no distinction between "a row with no columns because the
individual columns were removed," and "a row with no columns because
the row was removed." The latter is just a more efficient expression
of the former.
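The usual client-side consequence: when iterating get_range_slices results,
just skip rows that come back with an empty column list. A rough, untested
sketch against the 0.7 Thrift API ("Users" is a placeholder CF, and the client
is assumed to be already connected with the keyspace set):

import java.nio.ByteBuffer;
import org.apache.cassandra.thrift.*;

public class SkipEmptyRows {
    static int countLiveRows(Cassandra.Client client) throws Exception {
        ColumnParent parent = new ColumnParent("Users");

        // full column range, up to 100 columns per row
        SlicePredicate predicate = new SlicePredicate();
        predicate.setSlice_range(new SliceRange(
                ByteBuffer.allocate(0), ByteBuffer.allocate(0), false, 100));

        // full key range, up to 1000 rows
        KeyRange range = new KeyRange();
        range.setCount(1000);
        range.setStart_key(ByteBuffer.allocate(0));
        range.setEnd_key(ByteBuffer.allocate(0));

        int live = 0;
        for (KeySlice slice : client.get_range_slices(parent, predicate, range,
                                                      ConsistencyLevel.ONE)) {
            if (slice.getColumns().isEmpty()) {
                continue;   // no live columns: a deleted row (range ghost), skip it
            }
            live++;
        }
        return live;
    }
}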
You should use truncate instead. (Then remove the snapshot truncate creates.)
On Fri, Feb 4, 2011 at 2:05 AM, Ali Ahsan wrote:
> Hi All
>
> Is there any way I can delete column family data (not removing the column
> families) from Cassandra without affecting ring integrity? What if I delete
> some
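For reference, calling truncate programmatically through the 0.7 Thrift API
looks roughly like this (untested; "MyKeyspace" and "MyCF" are placeholders):

import org.apache.cassandra.thrift.Cassandra;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class TruncateCf {
    public static void main(String[] args) throws Exception {
        TFramedTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        client.set_keyspace("MyKeyspace");   // placeholder keyspace
        client.truncate("MyCF");             // drops the CF's data, keeps the schema
        transport.close();
    }
}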
Looks like https://issues.apache.org/jira/browse/CASSANDRA-1992, fixed
for 0.7.1.
On Fri, Feb 4, 2011 at 12:18 AM, Stu King wrote:
> I am running a move on one node in a 5 node cluster. There are no writes to
> the cluster during the move.
> I am seeing an exception on one of the nodes (not the n
Some potential problems:
1) sounds like you are using OPP/BOP and not adjusting tokens to
balance the data on each node
2) 8 client threads is not enough to saturate 16 cassandra cores
3) if your commitlog is not on a separate device from your data
directories you will have a lot of contention bet
On Thu, Feb 3, 2011 at 9:12 PM, Aklin_81 wrote:
> Thanks Matthew & Ryan,
>
> The main inspiration behind me trying to generate Ids in sequential
> manner is to reduce the size of the userId, since I am using it for
> heavy denormalization. UUIDs are 16 bytes long, but I can also have a
> unique Id
ruslan usifov gmail.com> writes:
>
>
> 2011/2/4 Oleg Proudnikov cloudorange.com>
> ruslan usifov gmail.com> writes:
> >
> > Hello
> > Why do I get an UnavailableException on a live cluster (all nodes are up
> > and never shutdown)?
> > PS: v 0.7.0
> Can the nodes see each other? Check Cassandra logs for messages
2011/2/4 Oleg Proudnikov
> ruslan usifov gmail.com> writes:
> >
> > Hello
> > Why do I get an UnavailableException on a live cluster (all nodes are up
> > and never shutdown)?
> > PS: v 0.7.0
>
> Can the nodes see each other? Check Cassandra logs for messages regarding
> other nodes.
>
Yes they can, nod
ruslan usifov gmail.com> writes:
>
> Hello
> Why do I get an UnavailableException on a live cluster (all nodes are up and
> never shutdown)?
> PS: v 0.7.0
Can the nodes see each other? Check Cassandra logs for messages regarding other
nodes.
Oleg
create a ReversedIntegerType.
On Fri, Feb 4, 2011 at 5:15 AM, Aditya Narayan wrote:
> Is there any way to sort the columns named as integers in the descending
> order ?
>
>
> Regards
> -Aditya
>
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professi
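If writing a custom comparator is more than you want to take on, one workaround
(a suggestion of the editor's, not something from this thread) is to keep the
comparator as LongType and store the complement of the value as the column
name, so that ascending column order corresponds to descending logical order;
only valid for non-negative values:

import java.nio.ByteBuffer;

public class DescendingLongName {
    // Column name that makes LongType's ascending order equal to descending n.
    static ByteBuffer descendingName(long n) {
        if (n < 0) throw new IllegalArgumentException("only non-negative values");
        ByteBuffer name = ByteBuffer.allocate(8);
        name.putLong(Long.MAX_VALUE - n);
        name.flip();
        return name;
    }

    public static void main(String[] args) {
        // the encoded name for 10 is larger than the one for 20, so an
        // ascending slice returns 20 before 10, i.e. descending by value
        System.out.println(descendingName(10).getLong(0) > descendingName(20).getLong(0)); // true
    }
}

The reader just has to remember to recover the original value as
Long.MAX_VALUE minus the stored column name.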
I have several large SQL Server 2005 tables. I need to load the data in
these tables into Cassandra. FYI, the Cassandra installation is on a
Linux server running CentOS.
Can anyone suggest the best way to accomplish this? I am a newbie to
Cassandra, so any advice would be greatly appreciated
Is there any way to sort the columns named as integers in the descending order ?
Regards
-Aditya
We are doing this with Cassandra.
But we cache a lot. We get around 20 writes/s and 1k reads/s (~100 Mbit/s) for
that particular CF, but only 1% of them hit our Cassandra cluster (5 nodes,
RF=3).
/Daniel
On Feb 4, 2011, at 9:37 AM, Brendan Poole wrote:
> Hi Daniel
>
> When you say "We are do
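As a generic illustration of that pattern (not their actual setup), a
read-through cache in front of the Cassandra read might look like the sketch
below; readFromCassandra is a placeholder for whatever client call fetches the
row:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ReadThroughCache {
    private final Map<String, byte[]> cache = new ConcurrentHashMap<String, byte[]>();

    public byte[] get(String key) {
        byte[] value = cache.get(key);
        if (value == null) {                   // cache miss: the ~1% that reaches the cluster
            value = readFromCassandra(key);    // placeholder for the real client call
            cache.put(key, value);
        }
        return value;
    }

    private byte[] readFromCassandra(String key) {
        // hypothetical: fetch the row/columns for 'key' with your client of choice
        return new byte[0];
    }
}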
Thanks Eric.
I was able to get it running.
-Original Message-
From: Eric Evans [mailto:eev...@rackspace.com]
Sent: Wednesday, February 02, 2011 9:34 PM
To: user@cassandra.apache.org
Subject: Re: CQL
On Wed, 2011-02-02 at 06:57 +, Vivek Mishra wrote:
> I am trying to run CQL from a jav
Hi!
I'm getting tombstones from get_range_slices(). I know that's normal.
But is there a way to know that a key is a tombstone? I know a tombstone
has no columns, but I can create a row without any columns that would
look like a tombstone in get_range_slices().
Regards,
Patrik
Hi All
Is there any way I can delete column family data (not removing the column
families) from Cassandra without affecting ring integrity? What if I
delete some column family data in Linux with the rm command?
--
S.Ali Ahsan
Senior System Engineer
e-Business (Pvt) Ltd
49-C Jail Road, Lahor
On Fri, Feb 4, 2011 at 12:35 AM, Mike Malone wrote:
> On Thu, Feb 3, 2011 at 6:44 AM, Sylvain Lebresne wrote:
>
>> On Thu, Feb 3, 2011 at 3:00 PM, David Boxenhorn wrote:
>>
>>> The advantage would be to enable secondary indexes on supercolumn
>>> families.
>>>
>>
>> Then I suggest opening a ticke
Brendan Poole would like to recall the message, "Using Cassandra to store
files".
The first line on the CouchDB website doesn't fill me with confidence...
"The 1.0.0 release has a critical bug which can lead to data loss in the
default configuration"
Brendan Poole
Hi Daniel
When you say "We are doing this", do you mean via NFS or Cassandra?
Thanks
Brendan
That's because of an issue I found in the ANT scripts while doing the
maven-ant-tasks switch on 0.7.0.
Any jar in build/ will be bundled (so Ivy goes into the bin dist).
When I did the m-a-t version, Eric was wondering why I was including
m-a-t in the bin dist, and I said I was being symmetric w
On Thu, Feb 3, 2011 at 10:39 PM, Yang wrote:
> the pdf at the design doc
>
> https://issues.apache.org/jira/secure/attachment/12459754/Partitionedcountersdesigndoc.pdf
>
> does say so:
> page 2 "- strongly consistent read: requires consistency level ALL.
> (QUORUM is insufficient.)
> "
>
> but th
I am running a move on one node in a 5 node cluster. There are no writes to
the cluster during the move.
I am seeing an exception on one of the nodes (not the node which I am doing
the move on).
The exception stack is
ERROR [CompactionExecutor:1] 2011-02-04 08:10:46,855 PrecompactedRow.java
(lin