------- Comment #2 from ubizjak at gmail dot com 2010-01-10 17:42 -------
(In reply to comment #0)

> Furthermore, math-library function fminf()/fmaxf() (and fmin()/fmax() for
> double) would benefit from map to intrinsic minss/maxss processing. Now they
> cause math library calls, where they are implemented as minss/maxss.

Use -ffast-math (which I recommend for everything that processes data from the
real world).

> Another optimization adventure would be to be able to unroll that loop, and
> use packed float values in xmm registers to do up to 4 operations in parallel.
> minreg/maxreg/sumreg could be described at C level as:
> float minreg[4];
> and code would have an explicit loop from 0 to 3 processing sample sets.

Done in 4.5. Try with -O2 -ffast-math -ftree-vectorize.


-- 

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |WORKSFORME


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42682