Hi,

It would depend mainly on the data volume. Hadoop can be used to refine the data before inserting it into a traditional architecture (like a database).
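To make "refining the data" a bit more concrete, here is a minimal sketch of a Hadoop streaming job in Python. It assumes (hypothetically) CSV input lines of the form category,amount; the function names map_lines, parse_pairs and reduce_pairs are made up for illustration. The output would be small per-category totals, which could then be loaded into a database for reporting.

```python
#!/usr/bin/env python
# Sketch of a Hadoop streaming job: sum an amount per category.
# Assumed input format (hypothetical): one "category,amount" CSV record per line.
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper logic: turn each valid CSV line into a (category, amount) pair."""
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) != 2:
            continue  # skip malformed records
        category, amount = fields
        try:
            yield category, float(amount)
        except ValueError:
            continue  # skip header rows and non-numeric amounts


def parse_pairs(lines):
    """Parse the tab-separated "key<TAB>value" lines handed to the reducer."""
    for line in lines:
        key, value = line.rstrip("\n").split("\t", 1)
        yield key, float(value)


def reduce_pairs(pairs):
    """Reducer logic: sum the values of consecutive pairs sharing a key.

    Relies on the pairs arriving sorted by key.
    """
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield key, sum(value for _, value in group)


if __name__ == "__main__" and sys.argv[1:] in (["map"], ["reduce"]):
    if sys.argv[1] == "map":
        results = map_lines(sys.stdin)
    else:
        results = reduce_pairs(parse_pairs(sys.stdin))
    for key, value in results:
        print("%s\t%s" % (key, value))
```

The same script would be given to the Hadoop streaming jar as both the -mapper (with the argument map) and the -reducer (with the argument reduce). Streaming sorts the mapper output by key before the reducer sees it, which is what the groupby call relies on.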
If you want to write jobs, several solutions have emerged:
* plain mapred/mapreduce APIs (the former is older than the latter, but both are the default plain Java APIs)
* Python or other languages with Hadoop streaming
* Cascading/Crunch..., which provide higher-level Java APIs (with Scalding/Cascalog as Scala/Clojure 'wrappers')
* Pig / Hive if you want a specific high-level language (HiveQL is SQL-ish)
* and then there are commercial products too...

So it really depends on what you want to use it for and what competencies you (your team, your company) have.

Regards

Bertrand

On Wed, Sep 5, 2012 at 10:42 AM, pgaurav <[email protected]> wrote:

>
> Hi Guys,
> I’m 5 days old in hadoop world and trying to analyse this as a long term
> solution to our client.
> I could do some r&d on Amazon EC2 / EMR:
> Load the data, text / csv, to S3
> Write your mapper / reducer / Jobclient and upload the jar to s3
> Start a job flow
> I tried 2 sample code, word count and csv data process.
> My question is that to further analyse the data / reporting / search, what
> should be done? Do I need to implement in Mapper class itself? Do I need to
> dump the data to the database and then write some custom application? What
> is the standard way to analysing the data?
>
> Thanks
> Prashant
>
> --
> View this message in context:
> http://old.nabble.com/Using-hadoop-for-analytics-tp34391246p34391246.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
>

--
Bertrand Dechoux
