Hi Gayatri,

You might want to look at Pangool (http://pangool.net), a low-level enhancement of the default Hadoop API that uses tuples and simplifies grouping, sorting, and joining datasets in Hadoop.
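In case it helps to see what "same keys go to the same reducer" depends on, here is a minimal sketch in plain Java with no Hadoop dependency. The class names `UrlCategoryKey` and `PartitionSketch` are made up for illustration and are not part of Hadoop, Pig, or Pangool. The partition arithmetic mirrors Hadoop's default HashPartitioner; the summing step is only a stand-in for a reducer. I am not claiming this is what goes wrong with BinSedesTuple in your job, only showing the contract a key has to meet: equal field contents must give equal `hashCode()` and `equals()` results.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical composite key (url, category); illustration only. */
final class UrlCategoryKey {
    final String url;
    final String category;

    UrlCategoryKey(String url, String category) {
        this.url = url;
        this.category = category;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof UrlCategoryKey)) return false;
        UrlCategoryKey other = (UrlCategoryKey) o;
        return url.equals(other.url) && category.equals(other.category);
    }

    // Depends only on the field contents, so equal keys always hash alike.
    @Override
    public int hashCode() {
        return Objects.hash(url, category);
    }

    @Override
    public String toString() {
        return url + " " + category;
    }
}

class PartitionSketch {
    // Same arithmetic as Hadoop's default HashPartitioner.
    static int partitionFor(Object key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Stand-in for the reduce step: sum the values of keys that are equal.
    static Map<UrlCategoryKey, Integer> sumByKey(
            List<Map.Entry<UrlCategoryKey, Integer>> emitted) {
        Map<UrlCategoryKey, Integer> totals = new LinkedHashMap<>();
        for (Map.Entry<UrlCategoryKey, Integer> e : emitted) {
            totals.merge(e.getKey(), e.getValue(), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Two separate key objects with the same contents, as a mapper would emit.
        List<Map.Entry<UrlCategoryKey, Integer>> emitted = Arrays.asList(
            new SimpleEntry<>(new UrlCategoryKey("url1", "category1"), 3),
            new SimpleEntry<>(new UrlCategoryKey("url1", "category1"), 2));

        for (Map.Entry<UrlCategoryKey, Integer> e : sumByKey(emitted).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue()); // prints "url1 category1 5"
        }
    }
}
```

In a real job the reducer-side grouping also depends on the comparator registered for the key class, which this sketch does not model, so a key can satisfy the contract above and still need a matching comparator.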

On Mon, Apr 23, 2012 at 7:30 AM, Gayatri Rao <[email protected]> wrote:

> Hello,
>
> I am using BinSedesTuple as a mapper key to emit a tuple of values. But
> somehow same keys do not go to the same reducer and I do not get
> aggregates.
> Is it not suggested to use it as a mapper key?
>
> For example in my mapper I emit
>
> Mapper:
> Output key : BinSedesTuple value: int
>
> Example output:
> tuple.append(url);
> tuple.append(category);
>
> Reducer:
> Input key: BinSedesTuple value: int
> Output key: Text value: int
>
> Example output:
> url1 category1 3
> url1 category1 2
>
> In the reducer output I get output with multiple keys being the same. My
> expected output is
> url1 category 5
>
> Any ideas what might be wrong?
>
> Thanks,
> Gayatri
