Bogdan Costescu wrote:
about on this list: interconnect hardware being able to DMA directly to/from CPU cache. I don't know how useful such a feature is for a
You can do something similar today using Direct Cache Access (DCA) on recent Intel chips with I/OAT. It's an indirect form of cache access: you tag a DMA so that the data is automatically prefetched into the L3 of a specific socket.
It does nothing for latency, since polling will fetch the cache line just as fast, but it works well if there is a delay between the data being delivered and the data being used. The best example is communication overlapped with computation: the cache prefetch is overlapped as well, so the memory latency is hidden.
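To put rough numbers on that, here is a toy model in Python. It is only a sketch: the two latencies are made-up round figures, not measurements, and the function name is mine, not part of any DCA or I/OAT interface. It also assumes the tagged prefetch takes about as long as a demand fetch from memory.

```python
# Toy model only. Both latencies are hypothetical round numbers.
DRAM_NS = 100  # assumed cost of fetching a cache line from memory
L3_NS = 15     # assumed cost of an L3 hit


def first_access_ns(delay_ns, dca):
    """Stall on the first read of a received cache line.

    delay_ns is the time between the DMA delivering the data and the
    CPU touching it; dca says whether the DMA was tagged for prefetch.
    """
    if not dca:
        # The line sits in memory until the CPU asks for it.
        return DRAM_NS
    # The prefetch starts at delivery; the CPU waits for whatever is
    # left of it, and pays at least an L3 hit.
    remaining = max(0, DRAM_NS - delay_ns)
    return max(L3_NS, remaining)


if __name__ == "__main__":
    for delay in (0, 50, 1000):
        print(delay, first_access_ns(delay, False), first_access_ns(delay, True))
```

With a delay of 0 (polling) both cases stall for the full memory fetch, so there is no latency gain. Once the delay is longer than the fetch, as when computation overlaps the communication, only the L3 hit remains.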
Patrick