https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201

--- Comment #24 from Joel Yliluoma <bisqwit at iki dot fi> ---
The simple horizontal 8-bit add seems to work nicely. Very nice work.

However, the original bug report (that the code snippet quoted below no longer
receives love from the SIMD optimization unless you explicitly say “#pragma omp
simd”) still seems unaddressed.

    #define num_words 2

    typedef unsigned long long E;
    E bytes[num_words];
    unsigned char sum() 
    {
        E b[num_words] = {};
        //#pragma omp simd
        for(unsigned n=0; n<num_words; ++n)
        {
            // Calculate the sum of all bytes in a word
            E temp = bytes[n];
            temp += (temp >> 32);
            temp += (temp >> 16);
            temp += (temp >> 8);
            // Save that number in an array
            b[n] = temp;
        }
        // Calculate sum of those sums
        unsigned char result = 0;
        //#pragma omp simd
        for(unsigned n=0; n<num_words; ++n) result += b[n];
        return result;
    }

Compiler Explorer link: https://godbolt.org/z/XL3cIK
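
For comparison, here is a minimal sketch of the same routine with the pragmas enabled, next to a pragma-free copy. The names sum_simd, sum_plain and check_variants are hypothetical and not part of the report, and the reduction(+:result) clause on the second loop is an addition of this sketch, not something the original snippet has. The pragmas only take effect when OpenMP SIMD support is switched on (for GCC, presumably something like -O3 -fopenmp-simd); otherwise they are ignored and both functions compile identically.

```cpp
#include <cassert>

#define num_words 2

typedef unsigned long long E;
E bytes[num_words];

// Pragma-free variant: same loops as sum() in the report.
unsigned char sum_plain()
{
    E b[num_words] = {};
    for (unsigned n = 0; n < num_words; ++n)
    {
        E temp = bytes[n];
        temp += (temp >> 32);
        temp += (temp >> 16);
        temp += (temp >> 8);
        b[n] = temp;
    }
    unsigned char result = 0;
    for (unsigned n = 0; n < num_words; ++n) result += b[n];
    return result;
}

// Same loops with the pragmas enabled.
// Assumption of this sketch: the reduction clause is added here,
// the report only has a bare "#pragma omp simd" on both loops.
unsigned char sum_simd()
{
    E b[num_words] = {};
    #pragma omp simd
    for (unsigned n = 0; n < num_words; ++n)
    {
        E temp = bytes[n];
        temp += (temp >> 32);
        temp += (temp >> 16);
        temp += (temp >> 8);
        b[n] = temp;
    }
    unsigned char result = 0;
    #pragma omp simd reduction(+:result)
    for (unsigned n = 0; n < num_words; ++n) result += b[n];
    return result;
}

// Hypothetical helper: both variants must agree for the current input.
void check_variants()
{
    assert(sum_simd() == sum_plain());
}
```

Both variants should return the same value for any input; the only difference worth looking at is whether the generated assembly for sum_plain uses vector instructions without the pragmas.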
