On Mar 10, 2013, at 9:03 PM, Mark Hahn wrote:

>> Is there any line/point to make a distinction between accelerators and
>> co-processors (that are used in conjunction with the primary CPU to
>> boost performance)? or can these terms be used interchangeably?
>
> IMO, a coprocessor executes the same instruction stream as the
> "primary" processor. this was the case with the x87, for instance,
> though the distinction became less significant once the x87 came
> onchip. (though you certainly notice that the FPU on any of these
> chips is mostly separate - not sharing functional units or register
> files, sometimes even with separate micro-op schedulers.)
>
>> Specifically, the word "accelerator" is used commonly with GPU. On
>> the other hand, the word "co-processor" is used commonly with Xeon Phi.
>
> I don't think it is a useful distinction: both are basically
> independent computers. obviously, the programming model of Phi is
> dramatically more like a conventional processor than Nvidia's.
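(For concreteness, the "conventional" model meant here: the same plain C/OpenMP loop runs natively on the Phi, or gets pushed over PCIe with Intel's offload pragmas. A minimal sketch - the pragma spelling is recalled from the icc docs rather than checked against a compiler, and the function name is just illustrative:

    /* saxpy offloaded to the Phi via Intel's offload (LEO) pragmas;
       the loop body stays ordinary C + OpenMP, not a rewritten kernel. */
    void saxpy(float a, const float *x, float *y, int n)
    {
        #pragma offload target(mic) in(x:length(n)) inout(y:length(n))
        #pragma omp parallel for
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }

the CUDA equivalent would instead be a __global__ kernel plus explicit cudaMalloc/cudaMemcpy traffic around the launch.)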
Mark, that's the marketing talk about Xeon Phi. It's essentially the same, of course, except for the cache coherency; both are big vector processors.

> there is a meaningful distinction between offload and coprocessor
> approaches. that is, offload means you use the device to accelerate a
> set of libraries (offload matrix multiply, eig, fft, etc). to use a
> coprocessor, I think the expectation is that the main code will be
> very much aware of the state of the PCIe-attached hardware.
>
> I suppose one might suggest that "accelerator" to some extent implies
> offload usage: you're accelerating a library.
>
> another interesting example is AMD's upcoming HSA concept: since
> nearly all GPUs are now on-chip, AMD wants to integrate the CPU and
> GPU programming models (at least to some extent). as far as I
> understand it, HSA is based on introducing a quite general
> intermediate ISA that can be executed using all available hardware
> resources: CPU and/or GPU. although Nvidia does have its own
> intermediate ISA, they don't seem to be trying to make it general,
> *and* they don't seem interested in making it work on both CPU and
> GPU. (well, so far at least - I wouldn't be surprised if they _did_
> have a PTX JIT for their ARM-based C/GPU chips...)
>
> I think HSA is potentially interesting for HPC, too. I really expect
> AMD and/or Intel to ship products this year that have a C/GPU chip
> mounted on the same interposer as some high-bandwidth ram.

How can an integrated GPU outperform a GPGPU card? Something like 25 watts versus 250 watts - which will be faster? I assume you will not build 10 nodes with 10 CPUs with integrated GPUs in order to rival a single card.

> a fixed amount of very high performance memory sounds very tasty to
> me. a surprising amount of power in current systems is spent getting
> high-speed signals off-socket.
>
> imagine a package dissipating say 40W containing, say, 4 CPU cores,
> 256 GPU ALUs and 2 GB of gddr5. the point would be to tile 32 of them
> in a 1U box. (dropping socketed, off-package dram would probably make
> it uninteresting for memcached and some space-intensive HPC.)
>
> then again, if you think carefully about the numbers, any code today
> that has a big working set is almost as anachronistic as codes that
> use disk-based algorithms. (same conceptual thing happening: capacity
> is growing much faster than the pipe.)
>
> regards, mark hahn.
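To put rough numbers on that last point - capacity growing faster than the pipe - here is a back-of-envelope sweep-time comparison (the sizes and bandwidths below are illustrative 2013-era guesses, not measurements):

    #include <stdio.h>

    /* time for one full sweep over a working set: size / bandwidth */
    int main(void)
    {
        double size_gb[] = { 2.0,   64.0 };  /* on-package gddr5 vs. big socketed dram */
        double bw_gbps[] = { 200.0, 25.0 };  /* gddr5-ish vs. dual-channel ddr3-ish   */
        for (int i = 0; i < 2; i++)
            printf("%5.1f GB at %5.1f GB/s -> %6.0f ms per sweep\n",
                   size_gb[i], bw_gbps[i], 1000.0 * size_gb[i] / bw_gbps[i]);
        return 0;
    }

roughly 10 ms versus 2560 ms per full sweep: it's the big working set behind the thin pipe that starts to look like a disk-based algorithm.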