------- Comment #8 from potswa at mac dot com 2009-09-14 21:36 -------
Well, if fast is good, maybe you will consider this:

    const _Distance __d = __gcd(__n, __k), __r = __n / __d;
    for (_Distance __i = 0; __i < __d; ++__i) {
        _ValueType __tmp = *__first;
        _RandomAccessIterator __p = __first + __l;
        for (_Distance __j = 0; ; ) {
            swap(__tmp, *__p);
            if (++__j >= __r) break;
            if (__p >= __middle) __p -= __n - __l;
            else __p += __l;
        }
        ++__first;
    }

It's a few times faster than the current implementation for the case of many
small rings, and seems to be within 1% otherwise for both large and small data
structures. (The way __tmp is used, it can always be optimized out.) There is
a small penalty which seems to go away if I manually unroll the inner loop,
which is annoying.

It's also much, much more elegant.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41351
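
For anyone who wants to try the loop outside libstdc++, here is a rough stand-alone sketch. The fragment does not define __n, __k or __l, so this assumes __n = __last - __first, __k = __middle - __first and __l = __n - __k, as in the surrounding rotate code. The names cycle_rotate and agrees_with_std_rotate are made up for illustration, std::gcd (C++17) stands in for the internal __gcd, and the early return for an empty rotation is my addition.

```cpp
#include <algorithm>  // std::rotate (reference result only)
#include <iterator>   // std::iterator_traits
#include <numeric>    // std::gcd (C++17), std::iota
#include <utility>    // std::swap
#include <vector>

// Hypothetical stand-alone wrapper around the loop; not libstdc++'s std::rotate.
template <typename RandomIt>
void cycle_rotate(RandomIt first, RandomIt middle, RandomIt last)
{
    using Distance = typename std::iterator_traits<RandomIt>::difference_type;
    using Value    = typename std::iterator_traits<RandomIt>::value_type;

    const Distance n = last - first;    // total length (assumed meaning of __n)
    const Distance k = middle - first;  // left-rotation amount (assumed __k)
    if (k == 0 || k == n) return;       // added guard: nothing to rotate
    const Distance l = n - k;           // assumed meaning of __l
    const Distance d = std::gcd(n, k);  // number of independent rings
    const Distance r = n / d;           // elements per ring

    for (Distance i = 0; i < d; ++i) {
        Value tmp = *first;             // element carried around the ring
        RandomIt p = first + l;         // where that element has to land
        for (Distance j = 0; ; ) {
            using std::swap;
            swap(tmp, *p);              // drop it off, pick up the displaced one
            if (++j >= r) break;        // ring closed after r swaps
            if (p >= middle) p -= n - l;  // step by +l modulo n
            else             p += l;
        }
        ++first;                        // start of the next ring
    }
}

// Compares against std::rotate on the sequence 0..n-1 rotated left by k.
bool agrees_with_std_rotate(int n, int k)
{
    std::vector<int> a(n), b;
    std::iota(a.begin(), a.end(), 0);
    b = a;
    cycle_rotate(a.begin(), a.begin() + k, a.end());
    std::rotate(b.begin(), b.begin() + k, b.end());
    return a == b;
}
```

This only checks the result, not speed; the timing claims above would need a proper benchmark against the current implementation.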