On 13/09/2016 18:27, Richard Henderson wrote:
> On 09/13/2016 09:10 AM, Paolo Bonzini wrote:
>> @@ -177,16 +231,15 @@ bool test_buffer_is_zero_next_accel(void)
>>
>>  static bool select_accel_fn(const void *buf, size_t len)
>>  {
>> -    uintptr_t ibuf = (uintptr_t)buf;
>>  #ifdef CONFIG_AVX2_OPT
>> -    if (len % 128 == 0 && ibuf % 32 == 0 && (cpuid_cache & CACHE_AVX2)) {
>> +    if (len >= 128 && (cpuid_cache & CACHE_AVX2)) {
>>          return buffer_zero_avx2(buf, len);
>>      }
>> -    if (len % 64 == 0 && ibuf % 16 == 0 && (cpuid_cache & CACHE_SSE4)) {
>> +    if (len >= 64 && (cpuid_cache & CACHE_SSE4)) {
>>          return buffer_zero_sse4(buf, len);
>>      }
>>  #endif
>> -    if (len % 64 == 0 && ibuf % 16 == 0 && (cpuid_cache & CACHE_SSE2)) {
>> +    if (len >= 64 && (cpuid_cache & CACHE_SSE2)) {
>>          return buffer_zero_sse2(buf, len);
>>      }
>
> You've dropped a major change to select_accel_fn here.
>
> (1) The avx2 routine, as written, can support len >= 64, therefore a common
> test works for all of the vectorized functions.
>
> (2) I had saved the pointer to the routine, so that we didn't have to
> repeatedly test multiple cpuid_cache bits.
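For reference, a rough sketch of the shape points (1) and (2) describe:
pick the routine once, store the pointer, and let select_accel_fn do a
single common length test. This is not the code from the earlier patch.
The feature-bit values, the stub_* functions, init_accel, buffer_accel and
last_used are all hypothetical, and the stubs are scalar stand-ins for the
real vectorized routines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical feature bits: names follow the quoted patch, values are invented. */
#define CACHE_SSE2 1u
#define CACHE_SSE4 2u
#define CACHE_AVX2 4u

typedef bool (*accel_zero_fn)(const void *buf, size_t len);

/* Records which routine ran last, only so the dispatch can be observed. */
static unsigned last_used;

/* Plain byte scan shared by every stand-in below. */
static bool scan_zero(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    for (size_t i = 0; i < len; i++) {
        if (p[i]) {
            return false;
        }
    }
    return true;
}

/* Portable fallback for short buffers or CPUs without vector support. */
static bool buffer_zero_int(const void *buf, size_t len)
{
    last_used = 0;
    return scan_zero(buf, len);
}

/* Scalar stand-ins for the vectorized routines (assumption, not real SIMD). */
static bool stub_sse2(const void *buf, size_t len)
{
    last_used = CACHE_SSE2;
    return scan_zero(buf, len);
}

static bool stub_sse4(const void *buf, size_t len)
{
    last_used = CACHE_SSE4;
    return scan_zero(buf, len);
}

static bool stub_avx2(const void *buf, size_t len)
{
    last_used = CACHE_AVX2;
    return scan_zero(buf, len);
}

/* Saved pointer: the feature bits are examined once, in init_accel. */
static accel_zero_fn buffer_accel = buffer_zero_int;

static void init_accel(unsigned cache)
{
    accel_zero_fn fn = buffer_zero_int;

    /* Later tests override earlier ones, so the widest routine wins. */
    if (cache & CACHE_SSE2) {
        fn = stub_sse2;
    }
    if (cache & CACHE_SSE4) {
        fn = stub_sse4;
    }
    if (cache & CACHE_AVX2) {
        fn = stub_avx2;
    }
    buffer_accel = fn;
}

static bool select_accel_fn(const void *buf, size_t len)
{
    assert(buffer_accel != NULL);
    /* One common length test for all vectorized routines, as in point (1). */
    if (len >= 64) {
        return buffer_accel(buf, len);
    }
    return buffer_zero_int(buf, len);
}
```

With this layout init_accel would run once after CPU detection (and again
from a test hook that wants to step through the routines), while the hot
path costs one compare and one indirect call.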
Can you send a replacement for this patch only?

Thanks,

Paolo