On 09/13/2016 09:10 AM, Paolo Bonzini wrote:
> @@ -177,16 +231,15 @@ bool test_buffer_is_zero_next_accel(void)
>  
>  static bool select_accel_fn(const void *buf, size_t len)
>  {
> -    uintptr_t ibuf = (uintptr_t)buf;
>  #ifdef CONFIG_AVX2_OPT
> -    if (len % 128 == 0 && ibuf % 32 == 0 && (cpuid_cache & CACHE_AVX2)) {
> +    if (len >= 128 && (cpuid_cache & CACHE_AVX2)) {
>          return buffer_zero_avx2(buf, len);
>      }
> -    if (len % 64 == 0 && ibuf % 16 == 0 && (cpuid_cache & CACHE_SSE4)) {
> +    if (len >= 64 && (cpuid_cache & CACHE_SSE4)) {
>          return buffer_zero_sse4(buf, len);
>      }
>  #endif
> -    if (len % 64 == 0 && ibuf % 16 == 0 && (cpuid_cache & CACHE_SSE2)) {
> +    if (len >= 64 && (cpuid_cache & CACHE_SSE2)) {
>          return buffer_zero_sse2(buf, len);
>      }

You've dropped a major change to select_accel_fn here.

(1) The avx2 routine, as written, can support len >= 64, so a single
length test works for all of the vectorized functions.

(2) I had saved the pointer to the routine, so that we didn't have to
repeatedly test multiple cpuid_cache bits.
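
To make the two points concrete, here is a rough sketch of the shape I mean.
It is not the code from my earlier patch: buffer_accel, init_accel, the
CACHE_* values, and buffer_zero_generic are placeholder names assumed for
illustration, and the three "vectorized" routines are plain byte loops so
that the example compiles on its own.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder feature bits, standing in for the patch's CACHE_* flags. */
#define CACHE_SSE2 1u
#define CACHE_SSE4 2u
#define CACHE_AVX2 4u

typedef bool (*accel_fn)(const void *buf, size_t len);

/* Portable byte loop, used here as the fallback for short buffers. */
static bool buffer_zero_generic(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    for (size_t i = 0; i < len; i++) {
        if (p[i]) {
            return false;
        }
    }
    return true;
}

/* Stand-ins only: the real routines would use SSE2/SSE4/AVX2 code. */
static bool buffer_zero_sse2(const void *buf, size_t len)
{
    return buffer_zero_generic(buf, len);
}

static bool buffer_zero_sse4(const void *buf, size_t len)
{
    return buffer_zero_generic(buf, len);
}

static bool buffer_zero_avx2(const void *buf, size_t len)
{
    return buffer_zero_generic(buf, len);
}

/* Chosen once at init time; NULL means no vectorized routine is usable. */
static accel_fn buffer_accel;

/* Test the feature bits a single time and remember the best routine. */
static void init_accel(unsigned cache)
{
    accel_fn fn = NULL;
    if (cache & CACHE_SSE2) {
        fn = buffer_zero_sse2;
    }
    if (cache & CACHE_SSE4) {
        fn = buffer_zero_sse4;
    }
    if (cache & CACHE_AVX2) {
        fn = buffer_zero_avx2;
    }
    buffer_accel = fn;
}

/* One common length test, then an indirect call through the saved pointer. */
static bool select_accel_fn(const void *buf, size_t len)
{
    if (len >= 64 && buffer_accel) {
        return buffer_accel(buf, len);
    }
    return buffer_zero_generic(buf, len);
}
```

With this shape the hot path does one length compare and one indirect call;
the cpuid_cache bits are only examined in init_accel.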


r~
