RE: Limitation of key-value pairs for a particular key.

Utkarsh Gupta Fri, 18 Jan 2013 01:26:37 -0800

You are right.
Actually, we were expecting the values to be sorted.
We tried to reproduce the problem with this simple code:
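        // Fields and map() of a Mapper<LongWritable, Text, IntWritable, Text> subclass.
        private final IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            int N = 30000;
            // Emit 0..N-1 as values, all under the same key (1).
            for (int i = 0; i < N; i++) {
                word.set(Integer.toString(i));
                System.out.println(i);
                context.write(one, word);
            }
        }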
For smaller N the numbers were in order, but for N = 3000000 the order was not maintained.
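
Since only keys are sorted, a job that needs the values of one key in order has to arrange that itself. A minimal sketch of one way to do it, assuming all values for the key fit in memory: the class ValueSorter and the method sortedValues are hypothetical names, and plain Java stands in here for the body of a reducer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Hadoop: models what a reducer could do
// with the values it receives for one key when it needs them in order.
public class ValueSorter {

    // Parses each value as an int and returns the values in ascending order.
    // Assumption: all values for the key fit in memory.
    public static List<Integer> sortedValues(Iterable<String> values) {
        List<Integer> parsed = new ArrayList<>();
        for (String v : values) {
            parsed.add(Integer.parseInt(v));
        }
        Collections.sort(parsed);
        return parsed;
    }

    public static void main(String[] args) {
        // Values as they might arrive at the reducer: complete, but unordered.
        List<String> arrived = Arrays.asList("3", "0", "2", "1");
        System.out.println(sortedValues(arrived)); // prints [0, 1, 2, 3]
    }
}
```

For a key with too many values to buffer, the usual alternative is a secondary sort, where the value is moved into a composite key so that the framework's key sort does the ordering.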

From: Harsh J [mailto:[email protected]]
Sent: Thursday, January 17, 2013 1:57 AM
To: mapreduce-user
Subject: RE: Limitation of key-value pairs for a particular key.

We don't sort values (only keys), nor do we apply any manual limits in MR. Can you 
post a reproducible test case to support your suspicion?
On Jan 16, 2013 4:34 PM, "Utkarsh Gupta" <[email protected]> wrote:
Hi,
Thanks for the response. There were some issues with my code. I have checked 
it in detail.
All the values from the map are present in the reducer, but not in sorted order. 
This happens when the number of values for a key is too large.

Thanks
Utkarsh

From: Vinod Kumar Vavilapalli [mailto:[email protected]]
Sent: Thursday, January 10, 2013 11:00 PM
To: [email protected]
Subject: Re: Limitation of key-value pairs for a particular key.

There isn't any limit like that. Can you reproduce this consistently? If so, 
please file a ticket.

It will definitely help if you can provide a test case which can reproduce this 
issue.

Thanks,
+Vinod

On Thu, Jan 10, 2013 at 12:41 AM, Utkarsh Gupta <[email protected]> wrote:
Hi,

I am using Apache Hadoop 1.0.4 on a 10-node cluster of commodity machines with 
Ubuntu 12.04 Server edition. I am having an issue with my MapReduce code. While 
debugging I found that the reducer can take 262145 values for a particular key. 
If there are more values, they seem to be corrupted. I checked the values while 
emitting from the map and checked them again in the reducer.
I am wondering whether there is any such limitation in Hadoop, or whether it is a 
configuration problem.

Thanks and Regards
Utkarsh Gupta

--
+Vinod
Hortonworks Inc.
http://hortonworks.com/
