On Tuesday, August 19, 2014 06:33:29 AM Rich Freeman wrote:
> On Tue, Aug 19, 2014 at 5:34 AM, J. Roeleveld <jo...@antarean.org> wrote:
> > On Monday, August 18, 2014 10:53:51 AM Alec Ten Harmsel wrote:
> >> On Mon 18 Aug 2014 10:50:23 AM EDT, Rich Freeman wrote:
> >> > Hadoop is a very specialized tool.  It does what it does very well,
> >> > but if you want to use it for something other than map/reduce then
> >> > consider carefully whether it is the right tool for the job.
> >> 
> >> Agreed; unless you have decent hardware and can comfortably measure
> >> your data in TB, it'll be quicker to use something else once you factor
> >> in the administration time and learning curve.
> > 
> > The benefit of clustering technologies is that you don't need high-end
> > hardware to start with. You can use the old hardware you found collecting
> > dust in the basement.
> > 
> > The learning curve isn't as steep as it used to be. There are plenty of
> > tools to make it easier to start using Hadoop.
> 
> As long as you're counting words and don't mind coding everything in Java. 
> :)
> 
> I found that if you want to avoid using Java, then the available
> documentation plummets, and I'm pretty sure the version I was
> attempting to use was buggy - it was losing records in the sort/reduce
> phase I believe.  Or perhaps I was just using it incorrectly, but the
> same exact code worked just fine when I ran it on a single host with a
> smaller dataset and just piped map | sort | reduce without using
> Hadoop.  The documentation was pretty sparse on how to get Hadoop to
> work via stdin/out with non-Java code and it is quite possible I
> wasn't quite doing things right.  In the end my problem wasn't big
> enough to necessitate using Hadoop and I used GNU parallel instead.

You don't need Java knowledge to develop against Hadoop.
One commercial product that covers this:
http://www.informatica.com/Images/01603_powerexchange-for-hadoop_ds_en-US.pdf
It has a nice and easy graphical interface. The same "code" that works
against a relational database also works with Hadoop; the tool does the
translation.

I would be surprised if there are no other tools that make it easier to
develop code that works with Hadoop. I just haven't had a reason to search
for those yet.
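
For reference, the non-Java route Rich describes above (separate map and
reduce programs talking over stdin/stdout, chained as map | sort | reduce)
can be sketched in a few lines of Python. This is only an illustration of
the pattern, not code from this thread: the word-count task, the function
names and the script name wc.py are all assumptions.

```python
import sys
from itertools import groupby


def mapper(lines):
    """Emit one tab-separated "word<TAB>1" record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_records):
    """Sum the counts per word; records must already be sorted by key."""
    pairs = (rec.rstrip("\n").split("\t", 1) for rec in sorted_records)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def run_local(lines):
    """In-process stand-in for the shell pipeline: map | sort | reduce."""
    return list(reducer(sorted(mapper(lines))))


# Hypothetical command-line use: "wc.py map" or "wc.py reduce".
# Without a recognised stage argument the script does nothing.
if __name__ == "__main__" and len(sys.argv) > 1:
    stages = {"map": mapper, "reduce": reducer}
    if sys.argv[1] in stages:
        for record in stages[sys.argv[1]](sys.stdin):
            print(record)
```

On a single host this would be run as something like
"python wc.py map < input.txt | sort | python wc.py reduce". Hadoop
Streaming takes the same kind of stdin/stdout programs through its -mapper
and -reducer options, though the exact invocation depends on the
installation.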

--
Joost
