On Fri, 7 Dec 2018 16:19:30 +0100, you wrote:
>Perhaps for another thread:
>Actually I went to the AWS User Group in the UK on Wednesday. Very
>impressive, and there are the new Lustre filesystems and MPI networking.
>I guess the HPC world will see the same philosophy of building your setup
>using the AWS toolkit as Uber etc. do today.
>Also a lot of noise is being made at the moment about the convergence of
>HPC and Machine Learning workloads.
>Are we going to see the Machine Learning folks adapting their workflows to
>run on HPC on-premise bare metal clusters?
>Or are we going to see them go off and use AWS (Azure, Google?)
I suspect that ML will not go on-premise, for a number of reasons.

First, ignoring cost: companies like Google, Amazon and Microsoft are very good at ML because they are not only driving the research but also need it for their business. So they have the in-house expertise not only to implement cloud systems that are ideal for ML, but to build custom hardware - see Google's Tensor Processing Unit.

Second, setting up a new cluster isn't easy. Finding physical space, making sure enough power and cooling can be supplied, staffing up and so on are not only difficult but inherently take time, when instead you can sign up with a cloud provider and have the project running within 24 hours (there is a rough sketch of this at the end of this mail). Would HPC as we know it exist today if the ability to turn on a cluster instantly had existed at the beginning?

Third - and this is very speculative - I suspect ML is heading towards custom hardware. It has had a very good run on GPUs, and a GPU will likely always be the entry point for desktop ML, but unless Nvidia is holding back due to a lack of competition, it does appear that the GPU is reaching an end to its development, much as CPUs have. The latest hardware from Nvidia is getting lacklustre reviews, and the bolting on of extras like ray tracing is perhaps an indication of limits to how much further the GPU architecture can be pushed. The question then is whether the ML market is big enough to support custom hardware as an OEM product like a GPU, or whether it will remain restricted to places like Google, who can afford to build it without the overheads a consumer product carries.
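As an aside on the "24 hours" point above, here is a minimal sketch, using Python and boto3, of what standing up a small GPU cluster on EC2 can look like. The AMI ID, key pair name, placement group name and instance type are placeholders of my choosing, and you would still need a shared filesystem and a scheduler on top, so treat it as an illustration rather than a recipe:

    # Minimal sketch: launch a small GPU cluster on EC2 with boto3.
    # All identifiers below are placeholders, not recommendations.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # A cluster placement group keeps the instances physically close,
    # which matters for latency-sensitive MPI traffic.
    ec2.create_placement_group(GroupName="ml-scratch-pg", Strategy="cluster")

    resp = ec2.run_instances(
        ImageId="ami-xxxxxxxx",        # placeholder: your ML/HPC image
        InstanceType="p3.16xlarge",    # placeholder: an 8-GPU node
        MinCount=4,
        MaxCount=4,
        KeyName="my-keypair",          # placeholder: an existing key pair
        Placement={"GroupName": "ml-scratch-pg"},
    )
    print([i["InstanceId"] for i in resp["Instances"]])

The point is not the specific calls but that the whole exercise is an API call and a credit card, versus months of procurement and fit-out for bare metal.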