A suggestion, or rather food for thought:
Do you expect to read/analyze the written data right away? Or will it be a
batch process, kicked off later? What I am trying to say is that if
the 'read/analysis' part is a) a batch process and b) kicked off later,
then #3 is a fine solution?
Interesting approach Oded.
Is this similar to what has been described here:
http://radar.oreilly.com/2014/07/questioning-the-lambda-architecture.html
Regards,
Shahab
On Sun, Apr 26, 2015 at 4:29 AM, Peer, Oded wrote:
> I would maintain two tables.
>
> An “archive” table that holds all
Spark is not storage; rather, it is a streaming framework meant to be run
on a big-data, distributed architecture (a very high-level intro/definition).
It provides a batched version of in-memory map/reduce-like jobs. It is not
completely streaming like Storm but rather batches collections of tuples an
Nate,
(slightly OT), what client API/library is recommended now that Hector is
sunsetting? Thanks.
Regards,
Shahab
On Wed, Nov 13, 2013 at 9:28 AM, Nate McCall wrote:
> You basically want option (c). Option (d) might work, but you would be
> bending the paradigm a bit, IMO. Certainly do not u
Ahh, yes, 'compaction'. I blanked out while mentioning repair and cleanup.
That is in fact what needs to be done first and what I meant. Thanks
Robert.
Regards,
Shahab
On Wed, Oct 9, 2013 at 1:50 PM, Robert Coli wrote:
> On Wed, Oct 9, 2013 at 7:35 AM, Ravikumar Govindarajan <
> ravikumar.govi
I might be missing something obvious here but can't you afford (time-wise)
to run cleanup or repair after the deletion so that the deleted data is
gone? Assuming that your columns are time-based data?
Regards,
Shahab
On Wed, Oct 9, 2013 at 10:35 AM, Ravikumar Govindarajan <
ravikumar.govindara.
tool
Regards,
Shahab
On Sat, Oct 5, 2013 at 7:06 PM, Shahab Yunus wrote:
> Yes you can:
>
> http://hbase.apache.org/book/regions.arch.html#compaction
> http://hbase.apache.org/book/important_configurations.html (Managed
> Compaction section)
>
> Regards,
> Shahab
Yes you can:
http://hbase.apache.org/book/regions.arch.html#compaction
http://hbase.apache.org/book/important_configurations.html (Managed
Compaction section)
Regards,
Shahab
On Sat, Oct 5, 2013 at 6:02 PM, Sebastian Schmidt wrote:
> Am 06.10.2013 00:00, schrieb Cem Cayiroglu:
> > It will be
A couple of things I could think of. Others might have better ideas.
1- The exception is about an encoding mismatch. Do you know what your
source file's encoding is and what your system's default is? E.g. it can be
ISO8859-1 on Windows, UTF-8 on Linux, etc., and your file has something else.
You c
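A minimal sketch of the encoding check from point 1, in Python purely for illustration. The helper name decode_with_fallback and the candidate list are hypothetical, not taken from the original thread. It only shows how to find out the platform default and how bytes written in one encoding fail to decode in another.

```python
import locale


def decode_with_fallback(raw, encodings=("utf-8", "iso-8859-1")):
    """Try each candidate encoding in order; return (text, encoding_used).

    Hypothetical helper for illustration only.
    """
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# The platform default that open() falls back to when no encoding is given.
system_default = locale.getpreferredencoding(False)

# "café" written as ISO-8859-1: the lone byte 0xE9 is not valid UTF-8,
# so the UTF-8 attempt fails and the fallback encoding is used instead.
raw = "café".encode("iso-8859-1")
text, used = decode_with_fallback(raw)
```

Passing the encoding explicitly when opening the file, instead of relying on system_default, is usually the safer choice once the source encoding is known.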
ich is the last change, right?
> >
> > Yes
> >
> > In MR world, each file COULD be processed by different Mapper, but will
> be sent to the same reducer as both data will be shared same key.
> >
> > If that is the way you are writing it, then yes
> >
>
java8964, are you basically asking what will happen if we put a large
amount of data in one column of one row at once? How will this blob of data,
representing one column and one row (i.e. a cell), be split into multiple
SSTables? Or in such particular cases will it always be one extra-large
SSTab
Have you tried specifying your hostname (not localhost) in cassandra.yaml
and then starting it?
Regards,
Shahab
On Tue, Sep 17, 2013 at 8:39 AM, pradeep kumar wrote:
> I am very new to cassandra. Just started exploring.
>
> I am running a single node cassandra server & facing a problem in seeing
> stat
ore columns at a
time.
Regards,
Shahab
On Thu, Sep 12, 2013 at 1:51 AM, Aaron Turner wrote:
>
>
>
>
> On Wed, Sep 11, 2013 at 4:40 PM, Shahab Yunus wrote:
>
>> Thanks Aaron for the reply. Yes, VMs or the nodes will be in cloud if we
>> don't go the phys
or Safety.
> -- Benjamin Franklin
>
>
>
> On Wed, Sep 11, 2013 at 4:21 PM, Shahab Yunus wrote:
>
>> Hello,
>>
>> We are deciding whether to get VMs or physical machines for a Cassandra
>> cluster. I know this is a very high-level question depending on lot
Hello,
We are deciding whether to get VMs or physical machines for a Cassandra
cluster. I know this is a very high-level question depending on lots of
factors, and in fact I want to know how to tackle this and what
factors we should take into consideration while trying to find the answer.
Also, Sylvain, you have a couple of great posts about the relationships between
CQL3/Thrift entities and naming issues:
http://www.datastax.com/dev/blog/cql3-for-cassandra-experts
http://www.datastax.com/dev/blog/thrift-to-cql3
I always refer to them when I get confused :)
Regards,
Shahab
On Fri, Sep
It only reads up to that column (a sequential scan, I believe) and does not
read the whole row. It uses a row-level column index to reduce the amount
of data read.
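A toy sketch of the idea, in Python for illustration only. It is not Cassandra's actual on-disk format: the class name IndexedRow and the block_size parameter are made up here. The columns of a row are kept sorted and split into blocks, the index stores the first column name of each block, and a read jumps to one block and scans sequentially inside it.

```python
from bisect import bisect_right


class IndexedRow:
    """Toy model of a wide row with a per-row column index (hypothetical)."""

    def __init__(self, columns, block_size=4):
        # Cells are kept sorted by column name, as in a sorted row.
        self.cells = sorted(columns.items())
        self.block_size = block_size
        # The index holds only the first column name of each block.
        self.block_starts = [
            self.cells[i][0] for i in range(0, len(self.cells), block_size)
        ]

    def get(self, name):
        """Return (value, cells_scanned); value is None if the column is absent."""
        block = bisect_right(self.block_starts, name) - 1
        if block < 0:
            return None, 0  # name sorts before the first column
        start = block * self.block_size
        scanned = 0
        # Sequential scan, but only inside the one block the index points to.
        for col, value in self.cells[start:start + self.block_size]:
            scanned += 1
            if col == name:
                return value, scanned
            if col > name:
                break
        return None, scanned


row = IndexedRow({f"col{i:03d}": i for i in range(100)}, block_size=10)
value, scanned = row.get("col057")
```

In this toy run the read touches 8 of the 100 cells: the index selects the block starting at col050 and the scan stops at col057.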
Many more details at (the first 2-3 are must-reads, in fact):
http://thelastpickle.com/blog/2011/07/04/Cassandra-Query-Plans.html
http://www.d
Thanks a lot.
Regards,
Shahab
On Thu, Aug 1, 2013 at 8:32 PM, Robert Coli wrote:
> On Thu, Aug 1, 2013 at 2:34 PM, Shahab Yunus wrote:
>
>> Can you shed some more light (or point towards some other resource) that
>> why you think built-in Secondary Indexes should not
Hi Robert,
Can you shed some more light (or point to some other resource) on
why you think built-in secondary indexes should not be used lightly or
without much consideration? Thanks.
Regards,
Shahab
On Thu, Aug 1, 2013 at 3:53 PM, Robert Coli wrote:
> On Thu, Aug 1, 2013 at 12:49 PM, G
Hi Jan,
One question...you say
"- I must make sure the disks are directly attached, to prevent
problems when multiple nodes flush the commit log at the
same time"
What do you mean by that?
Thanks,
Shahab
On Wed, Jul 31, 2013 at 3:10 AM, Jan Algermissen wrote:
> Jon,
>
> On 31.07.2013, at
You have a lot of questions there, so I can't answer them all, but for the following:
*"Can a user of the system define new jobs in an ad-hoc fashion (like a
query) or do map reduce jobs need to be prepared by a developer (e.g. in
RIAK you do a developer to compile-in the job when you need the performance
of
See this as this was discussed earlier:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Representation-of-dynamically-added-columns-in-table-column-family-schema-using-cqlsh-td7588997.html
Regards,
Shahab
On Fri, Jul 12, 2013 at 11:13 AM, Shahab Yunus wrote:
> A basic quest
Rahul,
See this as it was discussed earlier:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Representation-of-dynamically-added-columns-in-table-column-family-schema-using-cqlsh-td7588997.html
Regards,
Shahab
On Tue, Jul 23, 2013 at 2:51 PM, Rahul Gupta wrote:
> I am using C
//www.thelastpickle.com
>
> On 20/07/2013, at 9:32 AM, sankalp kohli wrote:
>
> With Auto discovery, you can provide the DC you are local to and it will
> only use hosts from that.
>
>
> On Fri, Jul 19, 2013 at 2:08 PM, Shahab Yunus wrote:
>
>> Hello,
>>
>>
I think the former is for client communication to the nodes and the latter
for communication between the nodes themselves, as is evident from the name of
the property. Please feel free to correct me if I am wrong.
Regards,
Shahab
On Saturday, July 20, 2013, Mohammad Hajjat wrote:
> Hi,
>
> What's the diff
Hello,
I want my Thrift client(s) (using hector 1.1-3) to randomly connect to any
node in the Cassandra (1.2.4) cluster.
1- One way is that I pass in a comma-separated list of hosts and ports to
the CassandraHostConfigurator object.
2- The other option is that I configure the auto discovery of ho
Aaron Morton can confirm, but I think one problem could be that creating an
index on a field with a small number of possible values is not a good idea.
Regards,
Shahab
On Sat, Jul 13, 2013 at 9:14 AM, Tristan Seligmann
wrote:
> On Fri, Jul 12, 2013 at 10:38 AM, aaron morton wrote:
>
>> CREATE INDEX ON c
Thanks Eric for the explanation.
Regards,
Shahab
On Fri, Jul 12, 2013 at 11:13 AM, Shahab Yunus wrote:
> A basic question and it seems that I have a gap in my understanding.
>
> I have a simple table in Cassandra with multiple column families. I add
> new columns to each of
A basic question and it seems that I have a gap in my understanding.
I have a simple table in Cassandra with multiple column families. I add new
columns to each of these column families on the fly. When I view (using the
'DESCRIBE table' command) the schema of a particular column family, I see
onl
Aaron,
Can you explain a bit what you mean when you say that the client needs to
support Atomic Batches in 1.2 and Hector doesn't support it? Does it mean that
there is no way of using an atomic batch of inserts through Hector? Or did I
misunderstand
you? Feel free to point me to any link or resource, thanks.
Reg
ing we have key cache enabled)
>
> From: Shahab Yunus [mailto:shahab.yu...@gmail.com]
> Sent: 20 June 2013 14:32
> To: user@cassandra.apache.org
> Subject: Re: block size
>
> Have you seen this?
>
> http://www.da
Have you seen this?
http://www.datastax.com/dev/blog/cassandra-file-system-design
Regards,
Shahab
On Thu, Jun 20, 2013 at 3:17 PM, Kanwar Sangha wrote:
> Hi – What is the block size for Cassandra ? is it taken from the OS
> defaults ?
>
ther in the embedded database.
> >
> > Setup/tear down time is pretty reasonable.
> >
> > Ben
> >
> > From: Shahab Yunus [shahab.yu...@gmail.com]
> > Sent: Wednesday, June 19, 2013 8:46 AM
> > To: user@cassandra.apache.org
> > Subject: Re: Unit Test
ole... That is performance testing.
>
> When searching for the above, you will not get much luck if you are
> looking for them in the context of "unit testing" as those things are
> *outside the scope of unit testing"
>
>
> On Wednesday, 19 June 2013, Shahab Yunus w
Hello Arthur,
What do you mean by "The queries need to be lightened"?
Thanks,
Shahab
On Tue, Jun 18, 2013 at 8:47 PM, Arthur Zubarev wrote:
> Cem hi,
>
> as per http://wiki.apache.org/cassandra/FAQ#dropped_messages
>
>
> Internode messages which are received by a node, but do not get not to b
Hello,
Can anyone suggest good/popular unit test tools/frameworks/utilities out
there for unit testing Cassandra stores? I am looking for testing from a
performance/load and monitoring perspective. I am using 1.2.
Thanks a lot.
Regards,
Shahab
Dynamic columns are not supported in CQL3. We just had a discussion a day
or two ago about this where Eric Stevens explained it. Please see this:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/CQL-3-returning-duplicate-keys-td7588181.html
Regards,
Shahab
On Thu, Jun 6, 2013 at
Though I am a newbie, I just had a thought regarding your question 'How
will it handle requests for data which unavailable?': wouldn't the data be
served in that case from other nodes where it has been replicated?
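To make the thought concrete, here is a toy sketch in Python. It is an assumption-laden model, not how Cassandra is implemented: ToyRing, the MD5-based token function and the "next nodes on the ring" placement are all simplifications made up for illustration. Each key has several replicas, and a read falls through to the next replica when one node is down.

```python
import hashlib
from bisect import bisect_left


def token(key):
    """Map a string to a position on the ring (illustrative hash choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class ToyRing:
    """Hypothetical token ring with simple successor-based replica placement."""

    def __init__(self, nodes, replication_factor=3):
        self.ring = sorted((token(n), n) for n in nodes)
        self.rf = min(replication_factor, len(self.ring))

    def replicas(self, key):
        """Return the rf nodes that hold a copy of the key."""
        tokens = [t for t, _ in self.ring]
        # First node at or after the key's token, wrapping around the ring.
        start = bisect_left(tokens, token(key)) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(self.rf)]

    def read_from(self, key, down=()):
        """Return the first live replica for the key, or None if all are down."""
        for node in self.replicas(key):
            if node not in down:
                return node
        return None


ring = ToyRing(["node1", "node2", "node3", "node4"], replication_factor=3)
owners = ring.replicas("user:42")
served_by = ring.read_from("user:42", down={owners[0]})
```

With the first replica marked as down, the read is served by the second one. Only when every replica of the key is down does read_from return None, which is the case the original question seems to be about.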
Regards,
Shahab
On Wed, Jun 5, 2013 at 5:32 AM, Christopher Wirt wrote:
> Hell
or standard column families representing as one row per key/column
> pair, you can read more about that here:
> http://www.datastax.com/dev/blog/thrift-to-cql3 - this is also in the
> "Mixing static and dynamic" section, a little farther down.
>
>
>
> On Tue, Jun 4, 2013
Thanks Eric for the detailed explanation, but can you point to a source or
document for this restriction in CQL3 tables? Doesn't it take away the main
feature of a NoSQL store? Or am I missing something obvious here?
Regards,
Shahab
On Tue, Jun 4, 2013 at 2:12 PM, Eric Stevens wrote:
> If