https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111354
Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
void
rte_mov128blocks(uint8_t *dst, const uint8_t *src, size_t n)
{
        __m256i ymm0, ymm1, ymm2, ymm3;

        while (n >= 128) {
                ymm0 = _mm256_loadu_si256((const __m256i *)(const void *)
                                          ((const uint8_t *)src + 0 * 32));
                n -= 128;
                ymm1 = _mm256_loadu_si256((const __m256i *)(const void *)
                                          ((const uint8_t *)src + 1 * 32));
                ymm2 = _mm256_loadu_si256((const __m256i *)(const void *)
                                          ((const uint8_t *)src + 2 * 32));
                ymm3 = _mm256_loadu_si256((const __m256i *)(const void *)
                                          ((const uint8_t *)src + 3 * 32));
                src = (const uint8_t *)src + 128;
                _mm256_storeu_si256((__m256i *)(void *)
                                    ((uint8_t *)dst + 0 * 32), ymm0);
                _mm256_storeu_si256((__m256i *)(void *)
                                    ((uint8_t *)dst + 1 * 32), ymm1);
                _mm256_storeu_si256((__m256i *)(void *)
                                    ((uint8_t *)dst + 2 * 32), ymm2);
                _mm256_storeu_si256((__m256i *)(void *)
                                    ((uint8_t *)dst + 3 * 32), ymm3);
                dst = (uint8_t *)dst + 128;
        }
}

I'm curious whether we can distribute the loop above as a memmove. (Of course,
the compiler needs to know that the two arrays don't alias each other.)
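
To make the question concrete, here is a minimal sketch of what such a transformation would amount to at the source level. It is not taken from the bug report: the names mov128blocks_loop, mov128blocks_libcall and versions_agree are hypothetical, the intrinsic loads and stores are replaced by a portable 128-byte temporary, and the non-aliasing knowledge is assumed to come from restrict qualifiers. Under that assumption the loop copies the largest multiple of 128 bytes that fits in n, so a single memcpy (or memmove) call has the same effect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical portable stand-in for the intrinsic loop: each iteration
   reads a whole 128-byte block into a temporary, then writes it out. */
void mov128blocks_loop(uint8_t *dst, const uint8_t *src, size_t n)
{
    uint8_t block[128];

    while (n >= 128) {
        memcpy(block, src, 128);   /* the four 32-byte loads */
        n -= 128;
        src += 128;
        memcpy(dst, block, 128);   /* the four 32-byte stores */
        dst += 128;
    }
}

/* Assumption: restrict promises that dst and src do not overlap.
   The trailing n % 128 bytes are left untouched, as in the loop. */
void mov128blocks_libcall(uint8_t *restrict dst,
                          const uint8_t *restrict src, size_t n)
{
    memcpy(dst, src, n & ~(size_t)127);
}

/* Hypothetical self-check: both versions leave identical destination
   bytes for a 300-byte buffer (two full blocks plus a 44-byte tail). */
int versions_agree(void)
{
    uint8_t src[300], a[300], b[300];

    for (size_t i = 0; i < sizeof src; i++) {
        src[i] = (uint8_t)(i * 7 + 1);
        a[i] = b[i] = 0xAA;        /* sentinel for bytes that must stay untouched */
    }
    mov128blocks_loop(a, src, sizeof src);
    mov128blocks_libcall(b, src, sizeof src);
    assert(memcmp(a, b, sizeof a) == 0);
    return 1;
}
```

Without the non-aliasing guarantee the two versions can differ: if dst lies fewer than 128 bytes above src, a later iteration of the loop reads bytes that an earlier iteration already overwrote, which memmove would not do.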