Re: [Beowulf] [tt] Nvidia unveils Tesla, moves into supercomputing

Vincent Diepeveen Thu, 21 Jun 2007 15:27:49 -0700

Instead of the blah-blah that Nvidia and ATI produce, please let them create a few clear PDFs that describe things like, for each specific graphics card, EXACTLY how BIG the caches are on that card.

How on planet earth can you program for a card without knowing how the caches work, let alone how big they are?

Intel and AMD definitely don't make a major secret out of the size of their caches.

What we see in reviews of new graphics cards, however, is that a few hardware sites simply must GUESS how big the cache is.

Nearly all descriptions are written from a graphics programmer's viewpoint instead of a CPU programmer's viewpoint.


That makes it a really big step to get a CPU-intensive program to work on a GPU.

Is it so hard for ATI/Nvidia to write a clear document like that about their latest flagship and put it online for free download?

Additionally, I miss one very important instruction on those GPUs, one that CPUs have had since the 386. If your GPU can only handle 32-bit integer data types, then provide a parallel multiplication that takes 2 x 32 bits of input and produces 2 x 32 bits of output.
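
A minimal sketch of the multiplication asked for above, written in Python only to pin down the semantics: two unsigned 32-bit inputs produce a full 64-bit product that is returned as two 32-bit halves, much like the 386 `MUL` instruction leaves its result in EDX:EAX. The names `mul_wide_32` and `mul_wide_32_vector` are made up for this illustration and do not correspond to any GPU or vendor API.

```python
MASK32 = 0xFFFFFFFF  # largest unsigned 32-bit value


def mul_wide_32(a, b):
    """Multiply two unsigned 32-bit integers, return (high, low) 32-bit halves."""
    if not (0 <= a <= MASK32 and 0 <= b <= MASK32):
        raise ValueError("operands must fit in 32 unsigned bits")
    product = a * b  # up to 64 bits wide
    return (product >> 32) & MASK32, product & MASK32


def mul_wide_32_vector(xs, ys):
    """Apply the widening multiply element by element (the 'parallel' form)."""
    return [mul_wide_32(x, y) for x, y in zip(xs, ys)]


if __name__ == "__main__":
    high, low = mul_wide_32(0xFFFFFFFF, 0xFFFFFFFF)
    print(hex(high), hex(low))
```

Without the high half, a 32-bit-only multiply silently drops the upper bits of the product, which is what makes exact integer arithmetic awkward on such hardware.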

It is a fairy tale that FFT is faster in floating point; it just happens to be the case that most SIMD instruction sets have no integer equivalent so far.
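
As an illustration of what an integer FFT can look like, here is a small sketch of a number-theoretic transform (NTT), which replaces the complex roots of unity with roots of unity modulo a prime. The prime 998244353 and primitive root 3 are a commonly used pair and are an assumption of this sketch, not something taken from the message above; the function names are hypothetical. Every product of two residues below needs more than 32 bits before reduction, which is where a 32 x 32 -> 64 bit multiply comes in.

```python
P = 998244353  # prime equal to 119 * 2**23 + 1 (assumed for this sketch)
G = 3          # a primitive root modulo P


def _transform(values, root):
    """Recursive radix-2 transform; `root` is a primitive len(values)-th root."""
    n = len(values)
    if n == 1:
        return list(values)
    even = _transform(values[0::2], root * root % P)
    odd = _transform(values[1::2], root * root % P)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % P
        out[k] = (even[k] + t) % P
        out[k + n // 2] = (even[k] - t) % P
        w = w * root % P
    return out


def ntt(values, invert=False):
    """Forward or inverse NTT; len(values) must be a power of two <= 2**23."""
    n = len(values)
    root = pow(G, (P - 1) // n, P)
    if invert:
        root = pow(root, P - 2, P)  # modular inverse of the root
    result = _transform(values, root)
    if invert:
        inv_n = pow(n, P - 2, P)
        result = [x * inv_n % P for x in result]
    return result


def multiply(a, b):
    """Exact polynomial product of a and b (coefficients taken modulo P)."""
    out_len = len(a) + len(b) - 1
    size = 1
    while size < out_len:
        size *= 2
    fa = ntt(list(a) + [0] * (size - len(a)))
    fb = ntt(list(b) + [0] * (size - len(b)))
    product = ntt([x * y % P for x, y in zip(fa, fb)], invert=True)
    return product[:out_len]


if __name__ == "__main__":
    print(multiply([1, 2, 3], [4, 5]))
```

Unlike a floating-point FFT, the result is exact as long as the true coefficients stay below the modulus, so there is no rounding error to track.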


Thanks,
Vincent

----- Original Message -----
From: "Eugen Leitl" <[EMAIL PROTECTED]>

To: <Beowulf@beowulf.org>
Sent: Thursday, June 21, 2007 10:44 AM
Subject: [Beowulf] [tt] Nvidia unveils Tesla, moves into supercomputing

----- Forwarded message from Brian Atkins <[EMAIL PROTECTED]> -----

From: Brian Atkins <[EMAIL PROTECTED]>
Date: Wed, 20 Jun 2007 16:23:29 -0500
To: transhumantech <[EMAIL PROTECTED]>
Subject: [tt] Nvidia unveils Tesla, moves into supercomputing
User-Agent: Thunderbird 2.0.0.4 (Windows/20070604)

http://www.tgdaily.com/content/view/32557/135/
Santa Clara (CA) – Nvidia today announced Tesla, a third product line next to the GeForce and Quadro graphics products. The company aims to use Tesla cards and the massive floating point horsepower of its graphics processors to take over a portion of the lucrative supercomputing market.
The core of each Tesla device is a GeForce 8-series GPU as well as the general component layout of the high-end Quadro FX 5600 workstation graphics card with 1.5 GB of memory. The only noteworthy difference between the FX 5600 and a Tesla card is the fact that the supercomputing-targeted devices lack the graphics outputs on the backpanel, which, we were told, allows Nvidia to increase the clock speed on Tesla.

While the actual clock speed of the Tesla GeForce GPU is kept under wraps, Nvidia said that one processor (used in the C870 add-in card) is good for a performance of 518 GFlops, two processors (used in the deskside supercomputer D870, which integrates two C870 cards) will bring 1 TFlops; the Tesla GPU server with four processors will hit 2 TFlops.
In terms of pure number crunching horsepower, Nvidia told us that one GeForce GPU can match the combined performance of 40 x86 processors. In addition to the raw performance, Tesla also makes a case for power efficiency: The C870 is rated at a maximum power consumption of 170 watts and the GPU server at 800 watts, which may sound like a lot at first look. However, 40 low-power x86 processors would run at a typical 1600 watts. With a common power budget of about 25 kilowatts per server rack, a Tesla GPU server rack has a theoretical maximum performance of more than 60 TFlops – which would put the floating point rating of such a device among the 15 fastest supercomputers currently ranked on the Top 500 Supercomputer list.
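
The rack figure in the paragraph above can be reproduced with a few lines of arithmetic. This is only a sketch using the numbers quoted in the article; the function names are illustrative, and the calculation assumes the whole power budget goes to GPU servers, ignoring host systems, networking and cooling.

```python
# Figures as quoted in the article; everything else is an assumption.
RACK_POWER_BUDGET_W = 25_000    # "about 25 kilowatts"
GPU_SERVER_POWER_W = 800        # four-GPU Tesla server
GPU_SERVER_TFLOPS = 2.0
C870_GFLOPS = 518
C870_POWER_W = 170
X86_EQUIVALENT_POWER_W = 1600   # 40 low-power x86 processors


def servers_per_rack(budget_w, server_w):
    """Whole servers that fit into the power budget."""
    return budget_w // server_w


def rack_tflops(budget_w, server_w, server_tflops):
    """Peak TFlops of a rack filled with such servers."""
    return servers_per_rack(budget_w, server_w) * server_tflops


def gflops_per_watt(gflops, watts):
    return gflops / watts


if __name__ == "__main__":
    print(servers_per_rack(RACK_POWER_BUDGET_W, GPU_SERVER_POWER_W))
    print(rack_tflops(RACK_POWER_BUDGET_W, GPU_SERVER_POWER_W, GPU_SERVER_TFLOPS))
    print(gflops_per_watt(C870_GFLOPS, C870_POWER_W))
    print(gflops_per_watt(C870_GFLOPS, X86_EQUIVALENT_POWER_W))
```

With these inputs, 31 servers fit into the budget for 62 TFlops peak, consistent with the "more than 60 TFlops" claim.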


Similarities to ATI’s stream processor card, implications for developers
Readers who have been following recent general-purpose GPU announcements will remember that ATI has a product in its portfolio that is very similar to the Tesla C870 – the stream processor card (which is based on an R580 GPU and 1 GB of memory). Both products follow the same concept: making the massive processing capability provided by shader processors available to run arbitrary code instead of graphics code.
Developers such as John Stone and James Philips, senior research programmers at the Beckman Institute of Advanced Science and Technology at the University of Illinois, have been looking at accelerators such as GPUs for some time, but have been limited mainly by bugs in shader drivers. Stone told us that much of his work with GPUs in the past was focused “on finding driver bugs” and “writing his applications around them” in order to make the technology usable for scientific simulations. “There can be a lot of rounding errors and because of this very fact, I wasn’t very excited about working with GPUs,” he said.
However, both AMD and Nvidia came up with a programming model to solve this problem. On AMD’s side, it is called CTM (“close to metal”) and on Nvidia’s side it is CUDA (“Compute Unified Device Architecture”). At this time, it appears to come down to personal preference which model a developer chooses, as, for example, there are some universities that are working with CTM (such as Stanford’s [EMAIL PROTECTED] project) and there are some that are working with CUDA. Stone and Philips are focusing on the Nvidia model as they claim its C++-based language model is easier to deal with than AMD’s CTM version, which uses a low-level assembly language.
While CUDA works very much like a regular programming model and, according to Stone, can deliver results very quickly, the big challenge in exploiting these devices will be the knowledge needed to write advanced parallelized code for these GPGPUs. Stone believes that especially coders who have written code for (massively parallel) supercomputers before will have an easy transition. Of course, knowledge of the hardware, graphics processing and a good look at the parallelizable parts of applications help to take advantage of the technology.
Shane Ryoo, a graduate research assistant at the University of Illinois at Urbana-Champaign, said that CUDA will allow programmers with some experience in developing threaded applications to get “really good results right off the bat.” However, it will be the fine-tuning process that increases the value of GPGPUs: Ryoo noted that expert knowledge, which allows developers to squeeze the best possible performance out of GPUs, can sometimes accelerate application code by a factor of 5x or greater.
Nvidia is well aware of this challenge and has begun assisting universities in establishing classes and developing course material focusing on massively parallel programming and CUDA in particular. Eventually, the company hopes that GPGPU programming will become a standard part of computer science coursework and help to educate a whole new generation of programmers. So far, Nvidia has taught courses at the University of Illinois, the University of California, the University of North Carolina and Purdue University. Nvidia said that several universities are developing their own courses, including the University of Virginia, the University of Pennsylvania, Oregon State University and the University of Wisconsin. Caltech, MIT, Berkeley and Stanford have been offering “legacy” GPGPU and GPU programming classes, according to Nvidia chief scientist David Kirk.

The payoff: Accelerated applications
If the capabilities of these GPGPUs are exploited, there can be a big payoff. Stone, who is working on Nanoscale Molecular Dynamics (NAMD) as well as Visual Molecular Dynamics (VMD), said that a virus simulation that took 110 CPU hours on a SGI Altix Itanium 2 supercomputer at NCSA required only 27 GPU minutes on a GeForce 8 graphics processor – which translates into a 240x speedup.
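
The speedup quoted above follows directly from the two run times. A quick sketch of the arithmetic (the function name is illustrative; note that this compares aggregate CPU hours with GPU minutes as reported, not wall-clock times):

```python
def speedup(cpu_hours, gpu_minutes):
    """Ratio of CPU time to GPU time, both converted to minutes."""
    return (cpu_hours * 60.0) / gpu_minutes


if __name__ == "__main__":
    # 110 CPU hours versus 27 GPU minutes, as quoted in the article.
    print(round(speedup(110, 27)))
```

The exact ratio is about 244, which the article rounds to 240x.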
In an example that showcases an impact that can touch many lives, Ryoo and his team are working on an interactive, medical MRI application that substantially increases the resolution of MRI scans thanks to the added processing power. As a result, they expect to be able to deliver much finer images, which allow physicians to detect tumors at an earlier stage or differentiate between a blip and an actual tumor.

In a demonstration shown during an Nvidia event, a representative from Headwave, a company that provides geophysical data analysis, highlighted a 4D application, which allows users to visualize gigabytes and apparently even terabytes of data on a three-dimensional scale and even apply a time filter to display changes to geological layers over time. The company claims that GPUs are accelerating their application by about 2000% and are delivering an output of about 2000 MB/s.
In fairness, we should mention that Tesla (or stream processor cards for that matter) will not be able to replace supercomputers, which continue to provide a memory bandwidth a few Tesla cards cannot match. Scientists such as Stone believe that products such as Tesla will make their way into supercomputers to create an overall more balanced environment. “Number crunching was the limiting factor up until now. Now Infiniband will be a problem,” he said.
GPGPUs are likely to have a greater impact on deskside supercomputers in the short term. While scientists today have to apply for expensive supercomputer time and in most cases have to wait several days until their application can be processed – if those requests are not turned down anyway – there is now an opportunity to run many of those tests on a desk right in the lab. Conceivably, GPGPUs will allow more scientists to run more and higher quality simulations in less time.


Cost and impact on the consumer
Nvidia’s Tesla products will start at $1300 for the single-GPU add-in card; the 2-GPU deskside unit will run for $7500 and the 4-GPU server, which soon will also be offered in an 8-GPU version, will sell for $12,000. Leaving out of consideration that, at least to our knowledge, Tesla is not yet available, these apparently lofty price tags turn out to be bargains on closer inspection.
The C870 not only undercuts the ATI stream processor card, which currently sells for about $2000, but also Nvidia’s own workstation products. The C870, at $1300, compares to a Quadro FX 5600 graphics card, which requires an investment in the neighborhood of $3000 and up. Clearspeed’s CSX600 accelerator card, which provides a performance of about 100 GFlops, is selling in volume for about $7500.
A representative of Evolved Machines told us that the company plans to offer a 12 TFlops Tesla server, which will cost somewhere between $60,000 and $70,000, but will be fast enough to match the floating point performance of the 19th fastest supercomputer on the Top-500 list.
Stone told us that even if the GPUs per se may appear to be expensive from a consumer point of view, they “are available for far less money than the next best thing that is available today.”
So, what does that mean for the consumer? Clearly, there is only an indirect benefit for most consumers that we may see in improved research results down the road. However, as with all technologies, these GPUs will get cheaper over time and even today, a $1300 card would be in reach for enthusiasts, who often spend substantially more than $5000 on their rig. The fact is that there is no magic necessary to make these cards work on a PC – and CUDA even works with GeForce 8 graphics cards, which can be had for less than $250 in the case of 8600-series models. The real question is: When will there be applications that take advantage of this technology and will they provide enough incentive for consumers to purchase a GeForce 8 card? Industry experts believe that it will be up to developers to come up with new applications that will take advantage of the capability of GPGPUs on the desktop.
Nvidia CEO Jen-Hsun Huang told TG Daily that Tesla will be strictly focused on the enterprise market and will not be making its way to the consumer market. In the end, it will be up to the GeForce product groups to leverage CUDA on desktop computers, but at least for now, Nvidia has little motivation to push this technology for the average consumer: “Perhaps in the future,” said Huang, “[this technology] could do physics on the PC, but this would need a Windows API.”
--
Brian Atkins
Singularity Institute for Artificial Intelligence
http://www.singinst.org/
_______________________________________________
tt mailing list
[EMAIL PROTECTED]
http://postbiota.org/mailman/listinfo/tt

----- End forwarded message -----
--
Eugen* Leitl <a href="http://leitl.org";>leitl</a> http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

