------- Comment #4 from kai dot extern at googlemail dot com  2010-02-27 13:46 -------
> You are violating c++ aliasing rules. You access a uint8_t via
> different types.

Actually, I address other types via uint8_t, but that is neither here nor
there. (Oh, I just realized you probably didn't mean the union but the load in
mem2int. But the following comments apply just as well to that part.)

First, as far as I can tell, the code produced is clearly *correct*.

Second, the optimization of everything except uint64_t shows that gcc clearly
understands what I'm trying to do.

In fact, my first attempts had no type-punning whatsoever; however, the
resulting code demonstrated that gcc had no clue what I was trying to do - it
was pretty much completely unoptimized. So I went looking for ways to describe
the problem that gcc would actually understand.

This particular solution grew out of some other gcc bug which compared various
versions of determining endianness, and showed that currently, the version with
unions is the only one gcc can optimize - that seems to be a regression
introduced with 4.0, and the bug seems to be still open.

Anyway, my point here is that gcc *does* understand this idiom. I see two
problems:

1. For some reason, gcc can optimize the byte_sex function to a constant -
except when the integer is 64 bits long. It is not obvious if the problem is in
the 64 bits, or in the 8 loop iterations, but something does not work there
which works in the smaller cases.

2. The actual byte swapping code (which has nothing whatsoever to do with any
aliasing) is clearly suboptimal in some cases. Again, it is not obvious what
property causes gcc to generate this one just fine for some cases, and not very
fine for others. The basic logic is obviously the same.

Anyway, if you can point out a way to write this that is completely
standards-conformant, generates decent code (at least as good as this version),
and does not rely on me telling the compiler what the endianness is, I'm
interested in learning. (I should probably point out that this ought to work
even for inconsistent endianness - I don't recall exactly, but I remember hearing
about a CPU that did something like 3412 byte ordering.)
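To make the requirement concrete, here is a minimal sketch of the kind of code I have in mind: it touches object representations only through memcpy and unsigned char, and it discovers the byte order (including mixed orders such as 3412) instead of being told. The names probe_byte_order, store_le and load_le are hypothetical and do not appear in the testcase; the sketch assumes 8-bit bytes and exact-width unsigned types. I make no claim about how well gcc optimizes it.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of the testcase): for each memory position i,
// record which significance index the host stores there (0 = least
// significant byte). Assumes 8-bit bytes and sizeof(X) <= 8.
template <typename X>
void probe_byte_order(unsigned char (&order)[sizeof(X)]) {
    X pattern = 0;
    for (std::size_t k = 0; k < sizeof(X); ++k)
        pattern = static_cast<X>(pattern | (static_cast<X>(k) << (8 * k)));
    // memcpy into unsigned char is a conformant way to inspect the bytes.
    std::memcpy(order, &pattern, sizeof(X));
}

// Store x into out[] in little-endian order, whatever the host order is.
template <typename X>
void store_le(unsigned char* out, X x) {
    unsigned char order[sizeof(X)];
    unsigned char raw[sizeof(X)];
    probe_byte_order<X>(order);
    std::memcpy(raw, &x, sizeof(X));
    for (std::size_t i = 0; i < sizeof(X); ++i)
        out[order[i]] = raw[i];  // raw[i] has significance order[i]
}

// Load a little-endian value from in[], whatever the host order is.
template <typename X>
X load_le(const unsigned char* in) {
    unsigned char order[sizeof(X)];
    unsigned char raw[sizeof(X)];
    probe_byte_order<X>(order);
    for (std::size_t i = 0; i < sizeof(X); ++i)
        raw[i] = in[order[i]];
    X x;
    std::memcpy(&x, raw, sizeof(X));
    return x;
}
```

With these definitions, store_le<std::uint32_t>(buf, 0x11223344u) should leave the bytes 0x44 0x33 0x22 0x11 in buf on any host, and load_le<std::uint32_t>(buf) should return the original value. Ideally the probe folds to a constant and the loops collapse to a plain load or a byte swap.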

Just for comparison, my first attempt looked like this:

    template < int n, typename X > struct xword {
        void operator=(X x) {
            set(x);
        };
        operator X() {
            return get();
        };
      protected:
        void set(register X x) {
            for (register int i = 0; i < n; i++) {
                m_x[TRAITS::le ? i : n - i - 1] = x & 0xff;
                x >>= 8;
            };
        };
        X get() {
            register X r = 0;
            for (register int i = 0; i < n; i++) {
                r <<= 8;
                r |= m_x[TRAITS::le ? n - i - 1 : i];
            };
            return r;
        };
      private:
        uint8_t m_x[n];
    };

No aliasing issues. Also, no optimization whatsoever.
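
Since the snippet above depends on a TRAITS type that is not shown here, the following is a self-contained variant for anyone who wants to compile it. It is a sketch only: the bool template parameter le is a hypothetical stand-in for TRAITS::le (it selects the storage order of the bytes), and the bytes() accessor is added purely so the storage can be inspected. The register keywords are left out because newer language standards no longer accept them.

```cpp
#include <cstdint>

// Sketch only: `le` is a hypothetical replacement for TRAITS::le and selects
// whether m_x holds the value least significant byte first (true) or most
// significant byte first (false).
template <int n, typename X, bool le>
struct xword_sketch {
    void operator=(X x) { set(x); }
    operator X() const { return get(); }

    // Not in the original: exposed only to inspect the storage order.
    const std::uint8_t* bytes() const { return m_x; }

  private:
    void set(X x) {
        for (int i = 0; i < n; i++) {
            m_x[le ? i : n - i - 1] = static_cast<std::uint8_t>(x & 0xff);
            x = static_cast<X>(x >> 8);
        }
    }
    X get() const {
        X r = 0;  // start from zero before shifting bytes in
        for (int i = 0; i < n; i++) {
            r = static_cast<X>(r << 8);
            r = static_cast<X>(r | m_x[le ? n - i - 1 : i]);
        }
        return r;
    }
    std::uint8_t m_x[n];
};
```

Assigning 0x11223344 to an xword_sketch<4, std::uint32_t, false> should store the bytes 0x11 0x22 0x33 0x44 in that order, and converting back to std::uint32_t should return the same value, independent of the host byte order.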


-- 

kai dot extern at googlemail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kai dot extern at googlemail
                   |                            |dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43197
