From: Beowulf <beowulf-boun...@beowulf.org> on behalf of Lawrence Stewart <stew...@serissa.com>
Date: Monday, September 20, 2021 at 9:17 AM
To: Jim Cownie <jcow...@gmail.com>
Cc: Lawrence Stewart <stew...@serissa.com>, Douglas Eadline <deadl...@eadline.org>, "beowulf@beowulf.org" <beowulf@beowulf.org>
Subject: Re: [Beowulf] [EXTERNAL] Re: Deskside clusters
Well said. Expanding on this, caches work because of both temporal locality and spatial locality. Spatial locality is addressed by having cache lines be substantially larger than a byte or word. These days, 64 bytes is pretty common. Some prefetch schemes, like the L1D version that fetches the VA ^ 64, clearly affect spatial locality. Streaming prefetch has an expanded notion of “spatial” I suppose!

What puzzles me is why compilers seem not to have evolved much notion of cache management. It seems like something a smart compiler could do. Instead, it is left to Prof. Goto and the folks at ATLAS and BLIS to figure out how to rewrite algorithms for efficient cache behavior. To my limited knowledge, compilers don’t make much use of PREFETCH or any non-temporal loads and stores either.

It seems to me that once the programmer helps with RESTRICT and so forth, then compilers could perfectly well dynamically move parts of arrays around to maximize cache use.

-L

I suspect that there’s enough variability among cache implementations, and in the wide variety of algorithms that might use them, that writing a smart-enough compiler is “hard” and “expensive”. Leaving it to the library authors is probably the best “bang for the buck”.
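
For anyone following along who hasn't seen what "rewriting an algorithm for cache behavior" looks like, here is a minimal sketch of loop tiling on a square matrix multiply, next to the naive version. It is only an illustration of the general idea, not how GotoBLAS, ATLAS, or BLIS actually do it (they also pack panels and use hand-tuned kernels). The function names and the TILE value of 32 are my own assumptions and are not tuned for any particular cache.

```c
#include <stddef.h>

/* Tile edge in elements. 32 is an assumed, untuned value: three 32x32
 * tiles of doubles are 24 KiB, which would fit a typical 32 KiB L1D. */
#define TILE 32

static size_t min_sz(size_t x, size_t y) { return x < y ? x : y; }

/* Reference version: C += A * B for n x n row-major matrices.
 * The inner loop walks B with stride n, so each access to B touches a
 * different cache line and spatial locality is poor. */
void matmul_naive(size_t n, const double *restrict a,
                  const double *restrict b, double *restrict c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = c[i * n + j];
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Tiled version: same arithmetic, but the work is split into TILE x TILE
 * blocks so the pieces of A, B, and C in use get reused while they are
 * still cached (temporal locality). The innermost loop runs along rows
 * of B and C, so it consumes whole cache lines (spatial locality).
 * restrict promises the compiler that a, b, and c do not overlap. */
void matmul_tiled(size_t n, const double *restrict a,
                  const double *restrict b, double *restrict c)
{
    for (size_t ii = 0; ii < n; ii += TILE) {
        size_t i_end = min_sz(ii + TILE, n);
        for (size_t kk = 0; kk < n; kk += TILE) {
            size_t k_end = min_sz(kk + TILE, n);
            for (size_t jj = 0; jj < n; jj += TILE) {
                size_t j_end = min_sz(jj + TILE, n);
                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        double aik = a[i * n + k];
                        for (size_t j = jj; j < j_end; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
                }
            }
        }
    }
}
```

Both functions accumulate into C, so the caller zeroes C first for a plain product. The right tile size depends on cache size, associativity, and line size, which is exactly the per-machine variability that makes this hard to automate. Explicit prefetch hints do exist for the programmer (GCC and Clang provide `__builtin_prefetch`), but I left them out to keep the sketch portable.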