On Tue, May 19, 2009 at 01:23:57PM -0500, Rahul Nabar wrote:
> Subject: Re: [Beowulf] recommendations for cluster upgrades
>
> On Tue, May 19, 2009 at 1:10 PM, Nifty Tom Mitchell
> <niftyo...@niftyegg.com> wrote:
> > Other compiler vendors have their tools and tricks too.... By working
> > with multiple compilers and taking advantage of the strengths of each
> > compiler important portability and correctness code improvements are
> > possible.
>
> Thanks Tom. Those are important leads for sure. I'll look into those
> flags and pathopt.
>
> > compiler flag appears to result in a bad (different) answer.
>
> That has always stumped me. Compilers can lead to slow or fast code
> but how can they compile to "wrong" code. To me this always seemed
> like a compiler design flaw.
Goofy stuff comes to mind:

    x * (y/1024 + 1024)
    -------------------
             x

Is the answer (y/1024 + 1024)?  Sure, unless x = 0.  Sometimes x is a
complex function, so this may not be obvious...  And "1024+1024 is a
constant"... whoops, My Dear Aunt Sally: precedence says y/1024 happens
before the addition, so there is no constant there to fold.

Lots of code depends on dividing by zero not throwing an error, or
throwing it in the expected way at the expected time.  (A small C
sketch of the cancellation and divide-by-zero points is appended at
the end of this message.)

Then there are issues related to magnitude.  Consider the sum of a
long list of numbers, some very small, some very large.  If you sort
the list from small to large and add, you get a different answer than
if you add from large to small.  Algebra tells us these are
equivalent, but with floating point numbers the results can be very
different.

Since this is a Beowulf list, consider the impact of slicing and
dicing an array of mixed small and large numbers, sorted and unsorted.
If the list is sorted from large to small and added by one processor,
there is one result.  If the same list is split between core A and
core B, the second core sees values of smaller magnitude than the
first core does, and depending on the distribution of values and the
number of cores, interesting deltas in the output are possible.
(Sketches of both the ordering and the splitting effects are appended
below as well.)

Compilers do make simple-to-difficult symbolic transformations as
optimizations, transparently.  Some constant expressions are also
evaluated at compile time.

Another possible numeric issue is FPU rounding rules.  When Intel came
up with an FPU with 80 internal bits, the act of moving floating point
numbers to or from a 64-bit memory location could introduce rounding
differences.  This can happen in some cases with loop unrolling or
with inlining of functions (vs. a function call).  Any time a function
needs more floating point registers than the processor has, the
selection of which value stays in a register and which lives on the
call stack can prove interesting.  (The last appended sketch pokes at
this one.)

Play with these little scripts... we know that 0C is exactly 32F and
100C is exactly 212F.  The simple sixth-grade form of the relationship
gives an almost exact answer.  Play with scale...

--- f2c -----
#!/bin/bash
# f2c: convert Fahrenheit to centigrade.
bc << KELVIN
#scale=3
scale=20
(5 * (${1} - 32)) / 9
KELVIN

--- c2f -----
#!/bin/bash
# c2f: convert centigrade to Fahrenheit.
bc << KELVIN
#scale=3
scale=20
((9/5) * $1) + 32
KELVIN

Later,
        mitch

PS: The above are "GPL"... give credit if you keep them.

-- 
        T o m   M i t c h e l l
        Found me a new hat, now what?
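Appended sketches, as promised above.  First, the cancellation and
divide-by-zero points.  This is a hypothetical C sketch of my own, not
output from any particular compiler; both the literal form and the
"simplified" form are written out by hand, so any C compiler will show
the gap between them:

--- fold.c -----
/* fold.c: the algebraic "simplification" of x*(y/1024+1024)/x to
   (y/1024+1024) is only safe when x != 0.  IEEE arithmetic says 0/0
   is NaN; an optimizer told to assume finite math (e.g. gcc's
   -ffast-math) may cancel the x's anyway.  Hypothetical sketch. */
#include <stdio.h>

int main(void)
{
    double x = 0.0;
    double y = 3.0;

    /* what the source says: 0 * (...) is 0, then 0/0 is NaN */
    double literal = x * (y / 1024.0 + 1024.0) / x;

    /* what the cancelled form computes; note My Dear Aunt Sally:
       this is (y/1024) + 1024, NOT y/(1024+1024) */
    double simplified = y / 1024.0 + 1024.0;

    printf("literal    = %.4f\n", literal);     /* nan       */
    printf("simplified = %.4f\n", simplified);  /* 1024.0029 */

    /* and IEEE division by zero does not trap, it quietly gives inf */
    printf("1.0/0.0    = %g\n", 1.0 / 0.0);     /* inf */
    return 0;
}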
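Next, the ordering point.  A hypothetical sketch assuming IEEE-754
single precision: one value of 1e8 swallows ten million 1.0's when it
is added first, but not when the small values get to accumulate first:

--- sumorder.c -----
/* sumorder.c: the same addends, summed in a different order, give a
   different single precision result.  Hypothetical sketch, assuming
   IEEE-754 floats.  Compile with: cc sumorder.c */
#include <stdio.h>

#define N 10000000              /* ten million small addends */

int main(void)
{
    float big  = 1.0e8f;        /* spacing between floats near 1e8 is 8 */
    float tiny = 1.0f;          /* so adding 1.0 to 1e8 changes nothing */
    float large_first, small_first;
    int i;

    /* large to small: every tiny addend rounds away */
    large_first = big;
    for (i = 0; i < N; i++)
        large_first += tiny;

    /* small to large: the tiny values pile up before meeting big */
    small_first = 0.0f;
    for (i = 0; i < N; i++)
        small_first += tiny;
    small_first += big;

    printf("large first: %.1f\n", large_first);  /* 100000000.0 */
    printf("small first: %.1f\n", small_first);  /* 116777216.0: the
                                                    1.0's stall at 2^24 */
    /* on an old 32-bit x87 build the compiler may carry the sums in
       80-bit registers and change these answers -- which is exactly
       the rounding-rules point above */
    return 0;
}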
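The core A / core B point does not need MPI to show up; two partial
sums combined at the end (a stand-in for a reduce across ranks)
already disagree with a single serial pass over the same sorted data.
Again a hypothetical sketch:

--- splitsum.c -----
/* splitsum.c: one serial pass over a sorted list vs. two "cores"
   (here just two partial sums) over the two halves.  Hypothetical
   sketch of the slicing point. */
#include <stdio.h>

#define N 2000000

static float v[N];

int main(void)
{
    float serial, a, b, split;
    int i;

    /* sorted large to small: a million 1000's, then a million 0.001's */
    for (i = 0; i < N / 2; i++) v[i] = 1000.0f;
    for (; i < N; i++)          v[i] = 0.001f;

    /* one processor walks the whole list: by the time the 0.001's
       arrive the running sum is ~1e9 and they all round away */
    serial = 0.0f;
    for (i = 0; i < N; i++)
        serial += v[i];

    /* core A gets the large half, core B gets the small half */
    a = 0.0f;
    for (i = 0; i < N / 2; i++)
        a += v[i];
    b = 0.0f;
    for (; i < N; i++)
        b += v[i];              /* the small values survive here */
    split = a + b;              /* the "reduce" step */

    printf("serial: %.1f\n", serial);
    printf("split : %.1f\n", split);   /* differs by roughly b */
    return 0;
}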
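Last, the 80-bit register point.  This sketch is deliberately machine
dependent: an x87 build can print "NOT equal" because one side of the
comparison was rounded through a 64-bit store while the other stayed
in an 80-bit register; an SSE or x86-64 build will normally print
"equal":

--- x87.c -----
/* x87.c: 80-bit register vs. 64-bit memory.  Hypothetical sketch;
   try gcc -m32 -mfpmath=387 x87.c, then add -ffloat-store, and
   compare with a plain x86-64 build. */
#include <stdio.h>

int main(void)
{
    volatile double a = 1.0, b = 3.0;  /* volatile: block constant folding */
    volatile double stored = a / b;    /* forced out to 64-bit memory */

    if (a / b == stored)               /* left side may still be 80 bits */
        printf("equal\n");
    else
        printf("NOT equal\n");         /* same expression, different width */
    return 0;
}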