Jonathan, a small correction if I may. Julia is not JIT; I asked on the Julia Discourse. A much better description is ahead-of-time compilation. Not really important, but "JIT" triggers a certain response from most people.
On Thu, 14 Mar 2019 at 07:31, Jonathan Aquilina <jaquil...@eagleeyet.net> wrote:

> Hi All,
>
> What sets Julia apart is it is not a compiled language but a Just In Time
> (JIT) language. I am still getting into it but it seems to be geared to
> complex and large data sets. As mentioned previously I am still working
> with a colleague on this prototype. With Julia at least there is an IDE so
> to speak for it. It is based on the ATOM IDE with a package that is
> installed specifically for Julia.
>
> I will obviously keep the list updated in regards to Julia and my
> experiences with it but the little I have looked at the language it is easy
> to write code for. It's still in its infancy as the latest version I believe
> is 1.0.1
>
> Regards,
> Jonathan
>
> *From:* Beowulf <beowulf-boun...@beowulf.org> *On Behalf Of *Scott Atchley
> *Sent:* 14 March 2019 01:17
> *To:* Douglas Eadline <deadl...@eadline.org>
> *Cc:* Beowulf Mailing List <beowulf@beowulf.org>
> *Subject:* Re: [Beowulf] Large amounts of data to store and process
>
> I agree with your take about slower progress on the hardware front and
> that software has to improve. DOE funds several vendors to do research to
> improve technologies that will hopefully benefit HPC, in particular, as
> well as the general market. I am reviewing a vendor's latest report on
> micro-architectural techniques to improve performance (e.g., lower latency,
> increase bandwidth). For this study, they use a combination of DOE
> mini-apps/proxies as well as commercial benchmarks. The techniques that
> this vendor investigated showed potential improvements for commercial
> benchmarks but much less, if any, for the DOE apps, which are highly
> optimized.
>
> I will state that I know nothing about Julia, but I assume it is a
> higher-level language than C/C++ (or Fortran for numerical codes). I am
> skeptical that a higher-level language (assuming Julia is) can help. I
> believe the vendor's techniques that I am reviewing benefited commercial
> benchmarks because they are less optimized than the DOE apps. Using a
> high-level language relies on the language's compiler/interpreter and
> runtime. The developer has no idea what is happening or does not have the
> ability to improve it if profiling shows that the issue is in the runtime.
> I believe that if you need more performance, you will have to work for it
> in a lower-level language and there is no more free lunch (i.e., hoping the
> latest hardware will do it for me).
>
> Hope I am wrong.
>
> On Wed, Mar 13, 2019 at 5:23 PM Douglas Eadline <deadl...@eadline.org>
> wrote:
>
> I realize it is bad form to reply to one's own post and
> I forgot to mention something.
>
> Basically the HW performance parade is getting harder
> to celebrate. Clock frequencies have been slowly
> increasing while cores are multiplying rather quickly.
> Single core performance boosts are mostly coming
> from accelerators. Added to the fact that speculation
> technology, when managed for security, slows things down.
>
> What this means, the focus on software performance
> and optimization is going to increase because we can't just
> buy new hardware and improve things anymore.
>
> I believe languages like Julia can help with this situation.
> For a while.
>
> --
> Doug
>
> >> Hi All,
> >> Basically I have sat down with my colleague and we have opted to go down
> >> the route of Julia with JuliaDB for this project. But here is an
> >> interesting thought that I have been pondering if Julia is an up and
> >> coming fast language to work with for large amounts of data how will that
> >> affect HPC and the way it is currently used and HPC systems created?
> >
> > First, IMO good choice.
> >
> > Second a short list of actual conversations.
> >
> > 1) "This code is written in Fortran." I have been met with
> > puzzling looks when I say the word "Fortran." Then it
> > comes, "... ancient language, why not port to modern ..."
> > "If you are asking that question young Padawan you have
> > much to learn, maybe try web pages"
> >
> > 2) I'll just use Python because it works on my Laptop.
> > Later, "It will just run faster on a cluster, right?"
> > and "My little Python program is now kind-of big and has
> > become slow, should I use TensorFlow?"
> >
> > 3) <mcoy>
> > "Dammit Jim, I don't want to learn/write Fortran, C, C++ and MPI.
> > I'm a (fill in domain specific scientific/technical position)"
> > </mcoy>
> >
> > My reply, "I agree and wish there was a better answer to that question.
> > The computing industry has made great strides in HW with
> > multi-core, clusters etc. Software tools have always lagged
> > hardware. In the case of HPC it is a slow process and
> > in HPC the whole programming "thing" is not as "easy" as
> > it is in other sectors, warp drives and transporters
> > take a little extra effort."
> >
> > 4) Then I suggest Julia, "I invite you to try Julia. It is
> > easy to get started, fast, and can grow with your application."
> > Then I might say, "In a way it is HPC BASIC, if you are old
> > enough you will understand what I mean by that."
> >
> > The question with languages like Julia (or Chapel, etc) is:
> >
> > "How much performance are you willing to give up for convenience?"
> >
> > The goal is to keep the programmer close to the problem at hand
> > and away from the nuances of the underlying hardware. Obviously
> > the more performance needed, the closer you need to get to the hardware.
> > This decision goes beyond software tools, there are all kinds
> > of cost/benefits that need to be considered. And, then there
> > is IO ...
> >
> > --
> > Doug
> >
> >> Regards,
> >> Jonathan
> >> -----Original Message-----
> >> From: Beowulf <beowulf-boun...@beowulf.org> On Behalf Of Michael Di Domenico
> >> Sent: 04 March 2019 17:39
> >> Cc: Beowulf Mailing List <beowulf@beowulf.org>
> >> Subject: Re: [Beowulf] Large amounts of data to store and process
> >>
> >> On Mon, Mar 4, 2019 at 8:18 AM Jonathan Aquilina <jaquil...@eagleeyet.net>
> >> wrote:
> >>> As previously mentioned we don’t really need to have anything indexed
> >>> so I am thinking flat files are the way to go my only concern is the
> >>> performance of large flat files.
> >> potentially, there are many factors in the work flow that ultimately
> >> influence the decision as others have pointed out. my flat file example
> >> is only one, where we just repeatedly blow through the files.
> >>> Isn't that what HDFS is for to deal with large flat files.
> >> large is relative. 256GB file isn't "large" anymore. i've pushed TB
> >> files through hadoop and run the terabyte sort benchmark, and yes it can
> >> be done in minutes (time-scale), but you need an astounding amount of
> >> hardware to do it (the last benchmark paper i saw, it was something like
> >> 1000 nodes). you can accomplish the same feat using less and less
> >> complicated hardware/software
> >> and if your devs are willing to adapt to the hadoop ecosystem, you're sunk
> >> right off the dock.
> >> to get a more targeted answer from the numerous smart people on the list,
> >> you'd need to open up the app and workflow to us. there's just too many
> >> variables
_______________________________________________ Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit https://beowulf.org/cgi-bin/mailman/listinfo/beowulf