[...]'s going to a separate
> hadoop cluster, I don't think you'd need to co-locate task trackers or data
> nodes on your cassandra nodes - it would just need to copy over the network
> though. We also use oozie for job scheduling, fwiw.
>
> On Dec 23, 2011, at 9:12 AM, ravikumar visweswara wrote:
>
> > Hello All,
> >
> > I have a situation to dump cassandra data to hadoop cluster for further
> > analytics. Lot of other relevant data which is not present in cassandra is
> > already available in hdfs for analysis. Both are independent clusters right
> > now.
to refresh the
data in HDFS. I presume it's even speedier if you are running enterprise,
because the Hadoop process is collocated with Cassandra.
-brian
On Fri, Dec 23, 2011 at 10:12 AM, ravikumar visweswara <
talk2had...@gmail.com> wrote:
> Hello All,
>
> I have a situation to dump
Hello All,
I need to dump cassandra data to a hadoop cluster for further
analytics. A lot of other relevant data which is not present in cassandra is
already available in hdfs for analysis. Both are independent clusters right
now.
Is there a suggested way to get the data periodically or