Hello,

The script is a simple Pig script that reads files from a remote HDFS and then processes them. It runs three MapReduce jobs on YARN. On the cluster it takes several hours to run over the dataset.
Do you mean that I have to run the script on the server?

Thanks,
Regards

> On 16 Feb 2020, at 04:31, Tushar Kapila <[email protected]> wrote:
>
> Depends on what the script does. If it's only launching a job on a remote
> cluster, then yes. But if the script does something more and needs to run
> for longer, then no.
>
> If the script is on a remote system (not what you asked, but an alternative),
> see
> https://stackoverflow.com/questions/39574653/error-executing-pigserver-in-java
>
> On Sat, 15 Feb 2020, 19:16 Daniel Santos <[email protected]> wrote:
> Hello,
>
> What I was thinking was: launch the Pig script from my laptop, the Hadoop
> cluster would be left executing it, and I could shut down the laptop.
>
> Is this possible?
>
> Thanks,
> Regards
>
>> On 12 Feb 2020, at 02:06, Shashwat Shriparv <[email protected]> wrote:
>>
>> nohup <your pig command> &
>>
>> Warm Regards,
>> Shashwat Shriparv
>> https://about.me/shriparv
>>
>> On Wed, 12 Feb 2020 at 04:48, Daniel Santos <[email protected]> wrote:
>> Hello,
>>
>> I managed to create a properties file with the following contents:
>>
>> fs.defaultFS=hdfs://hadoopnamenode:9000
>> mapreduce.framework.name=yarn
>> yarn.resourcemanager.address=hadoopresourcemanager:8032
>>
>> It is now submitting the jobs to the cluster. I also set HADOOP_HOME on
>> my laptop to point to the same version of Hadoop that is running on the
>> cluster (2.7.0). I am running Pig version 0.17.
>>
>> Then a "main class not found" error happened on the YARN nodes where the
>> job was scheduled to run. I had to add the following to yarn-site.xml and
>> restart YARN and the nodes:
>>
>> <property>
>>   <name>mapreduce.application.classpath</name>
>>   <value>/home/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/*,/home/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/lib/*,/home/hadoop/hadoop-2.7.0/share/hadoop/common/*,/home/hadoop/hadoop-2.7.0/share/hadoop/common/lib/*,/home/hadoop/hadoop-2.7.0/share/hadoop/yarn/*,/home/hadoop/hadoop-2.7.0/share/hadoop/yarn/lib/*,/home/hadoop/hadoop-2.7.0/share/hadoop/hdfs/*,/home/hadoop/hadoop-2.7.0/share/hadoop/hdfs/lib/*</value>
>> </property>
>>
>> After this change, the script ran. But the pig command only returned after
>> the job finished.
>> Does anyone know how to launch the script and exit immediately to the
>> shell? If the job takes a long time I will have to keep the terminal open.
>>
>> Thanks,
>> Regards
>>
>>> On 11 Feb 2020, at 05:25, Vinod Kumar Vavilapalli <[email protected]> wrote:
>>>
>>> It's running the job in local mode (LocalJobRunner), that's why. Please
>>> check your configuration files and make sure that the right directories
>>> are on the classpath. Also look in mapred-site.xml for
>>> mapreduce.framework.name (it should be yarn).
>>>
>>> Thanks
>>> +Vinod
>>>
>>>> On Feb 11, 2020, at 2:09 AM, Daniel Santos <[email protected]> wrote:
>>>>
>>>> Hello all,
>>>>
>>>> I have developed a script on my laptop. The script is now ready to be
>>>> unleashed on a non-secured cluster.
>>>> But when I do pig -x mapreduce <script name> it doesn't return to the
>>>> shell immediately. It prints things like [LocalJobRunner Map Task
>>>> Executor #0].
>>>>
>>>> I have exported the PIG_CLASSPATH shell variable to point to a directory
>>>> with the cluster's configuration, and it is operating on the files
>>>> located there.
>>>> But I would expect the job to be launched, the shell prompt returned,
>>>> and the job left executing independently on the cluster.
>>>>
>>>> Another thing I noticed while developing the script and running it both
>>>> locally and on the cluster: the web page for the resource manager does
>>>> not show the MapReduce jobs that Pig generates. What do I have to do to
>>>> be able to see them?
>>>>
>>>> Thanks,
>>>> Regards
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: [email protected]
>>>> For additional commands, e-mail: [email protected]
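Shashwat's "nohup <your pig command> &" suggestion could be fleshed out as the sketch below. The script name myscript.pig and the log file names are hypothetical placeholders, not from the thread; the point is that nohup plus backgrounding lets the launcher survive the terminal closing, while the redirect keeps the job's console output for later inspection.

```shell
# Detached launch: nohup ignores the hangup signal sent when the
# terminal closes, '&' backgrounds the launcher, and stdout/stderr
# are captured in pig.out. myscript.pig is a placeholder name.
nohup pig -x mapreduce myscript.pig > pig.out 2>&1 &

# Remember the launcher's PID so it can be checked or killed later.
echo $! > pig.pid
```

Note the caveat raised earlier in the thread: this keeps the local Pig client process alive on the launching machine. If that machine is a laptop that gets shut down, the client dies with it, so this pattern is only a complete answer when run on a host that stays up (e.g. an edge node of the cluster).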

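Daniel's properties-file approach can be reproduced roughly as follows. The three property values are the ones quoted in the thread (the host names hadoopnamenode and hadoopresourcemanager belong to his cluster and would need adjusting); the file name cluster.properties is an assumption for illustration.

```shell
# Write the client-side properties Daniel described into a file.
# These tell the Pig client where HDFS and the YARN resource
# manager live, so jobs are submitted to the cluster instead of
# running in local mode (LocalJobRunner).
cat > cluster.properties <<'EOF'
fs.defaultFS=hdfs://hadoopnamenode:9000
mapreduce.framework.name=yarn
yarn.resourcemanager.address=hadoopresourcemanager:8032
EOF

# Pig can load such a file with its -propertyFile option, e.g.:
#   pig -x mapreduce -propertyFile cluster.properties myscript.pig
# (myscript.pig is a placeholder; the pig call is shown commented
# out since it needs a live cluster.)
```

An equivalent alternative, which the thread also uses, is to point PIG_CLASSPATH (or HADOOP_CONF_DIR) at a directory containing the cluster's core-site.xml, mapred-site.xml and yarn-site.xml.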