Hi Jay!
I think what I need is a bit different. For each teacherId, I want the
30 classIds that have the most students.
For example, with 3 classIds instead of 30:
(File1)
1) Teacher1, Class11, 30
2) Teacher1, Class12, 29
3) Teacher1, Class13, 28
4) Teacher1, Class14, 27
... n ...

n+1) Teacher2, Class21, 45
n+2) Teacher2, Class22, 44
n+3) Teacher2, Class23, 43
n+4) Teacher2, Class24, 42
... n+m ...

=> return lines 1, 2, 3 for Teacher1 and lines n+1, n+2, n+3 for Teacher2
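
A minimal sketch of the grouping described above, written in plain Python
rather than against the Hadoop API. It assumes teacherId is used as the map
output key, so that all classes of one teacher reach the same reduce call.
The function names (map_record, reduce_top_n, top_classes_per_teacher) are
hypothetical, and the in-memory dictionary only stands in for the shuffle
phase.

```python
import heapq
from collections import defaultdict


def map_record(line):
    """Map step: parse 'teacherId, classId, numberOfStudent', key by teacherId."""
    teacher_id, class_id, count = [field.strip() for field in line.split(",")]
    return teacher_id, (int(count), class_id)


def reduce_top_n(values, n):
    """Reduce step: keep the n (count, classId) pairs with the largest count."""
    return heapq.nlargest(n, values)


def top_classes_per_teacher(lines, n):
    # Stand-in for the shuffle phase: group mapped values by teacherId.
    grouped = defaultdict(list)
    for line in lines:
        teacher_id, value = map_record(line)
        grouped[teacher_id].append(value)
    # One reduce call per teacherId; keep only the classId of each winner.
    return {
        teacher_id: [class_id for _, class_id in reduce_top_n(values, n)]
        for teacher_id, values in grouped.items()
    }


# Sample rows taken from the example above (list numbering dropped).
records = [
    "Teacher1, Class11, 30",
    "Teacher1, Class12, 29",
    "Teacher1, Class13, 28",
    "Teacher1, Class14, 27",
    "Teacher2, Class21, 45",
    "Teacher2, Class22, 44",
    "Teacher2, Class23, 43",
    "Teacher2, Class24, 42",
]

result = top_classes_per_teacher(records, 3)
print(result)
```

In this sketch, classes with equal student counts are ordered by classId as
a side effect of comparing tuples. In an actual Hadoop job the same idea
would presumably map to a mapper that emits teacherId as the key and a
reducer that keeps the top 30 values per key.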


On 24 April 2012 at 09:52, Jay Vyas <[email protected]> wrote:

> It's somewhat tricky to understand exactly what you need from your
> explanation, but I believe you want teachers who have the most students in
> a given class.  So for English, I have 10 teachers teaching the class - and
> I want the ones with the highest # of students.
>
> You can output key=<classid>, value=<-1*#ofstudents,teacherid> from the
> mapper.
>
> The values will then be sorted by # of students.  You can thus pick the
> teacher in the first value of your reducer, and that will be the
> teacher for class id = xyz with the highest number of students.
>
> You can also be smart in your mapper by running a combiner to remove the
> teacherids who are clearly not maximal.
>
> On Mon, Apr 23, 2012 at 9:38 PM, Lac Trung <[email protected]> wrote:
>
> > Hello everyone!
> >
> > I have a problem with MapReduce :(
> > I have 4 input files with 3 fields: teacherId, classId, numberOfStudent
> > (numberOfStudent is sorted in descending order for each teacher).
> > The output should be the top 30 classIds by numberOfStudent for each
> > teacher. My approach is a MapReduce job like the Wordcount example, but
> > I don't know how to choose the key for the map function.
> > I have run the Wordcount example and understand its code, but I have no
> > experience programming MapReduce.
> >
> > Can anyone help me solve this problem?
> > Thanks so much!
> >
> >
> > --
> > Lạc Trung
> > 20083535
> >
>
>
>
> --
> Jay Vyas
> MMSB/UCHC
>



-- 
Lạc Trung
20083535
