I put some alternative fixes together that make the test pass: https://patchwork.freedesktop.org/series/36612/
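
For anyone following along, here is a rough sketch of the approach suggested in the quoted review below: expand the writemask from 64-bit components to 32-bit components once, before the loop, so that the loop only ever deals with dwords and never multiplies by elem_size_mul. This is only an illustration and not the code from the series above; the helper names widen_writemask_64 and count_store_chunks are made up for this example, and the real visit_store_ssbo does more than count chunks.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bit i of a mask expressed in 64-bit components
 * becomes bits 2*i and 2*i+1 of a mask expressed in 32-bit components. */
static uint32_t widen_writemask_64(uint32_t mask64)
{
    uint32_t mask32 = 0;

    /* Only 16 64-bit components fit into a 32-bit dword mask. */
    assert(mask64 <= 0xffffu);

    for (unsigned i = 0; i < 16; i++) {
        if (mask64 & (1u << i))
            mask32 |= 3u << (2 * i);
    }
    return mask32;
}

/* Hypothetical helper: peel runs of consecutive set bits off a dword
 * mask, at most 4 dwords per run (one buffer store each), and return
 * how many runs were needed. Every iteration clears at least one bit,
 * so the loop always terminates. */
static unsigned count_store_chunks(uint32_t mask32)
{
    unsigned chunks = 0;

    while (mask32) {
        unsigned start = 0;
        unsigned count = 0;

        /* Find the lowest set bit. */
        while (!(mask32 & (1u << start)))
            start++;

        /* Measure the run starting there, capped at 4 dwords. */
        while (count < 4 && start + count < 32 &&
               (mask32 & (1u << (start + count))))
            count++;

        /* Clear the handled bits; any remainder stays in the mask
         * and is picked up by the next iteration. */
        mask32 &= ~(((1u << count) - 1u) << start);
        chunks++;
    }
    return chunks;
}
```

With the mask widened up front, a dvec4 store (64-bit mask 0xf) turns into the dword mask 0xff and is split into two 4-dword chunks, without the mask changing units between iterations.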

On Wed, Jan 17, 2018 at 1:32 PM, Timothy Arceri <[email protected]> wrote:
>
>
> On 17/01/18 22:36, Bas Nieuwenhuizen wrote:
>>
>> On Wed, Jan 17, 2018 at 10:47 AM, Timothy Arceri <[email protected]>
>> wrote:
>>>
>>> Without this count will always be greater than 4 and we will always
>>> set the writemask so the loop can never exit.
>>>
>>> Fixes: 91074bb11bda "radv/ac: Implement Float64 SSBO stores."
>>> ---
>>>   src/amd/common/ac_nir_to_llvm.c | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/src/amd/common/ac_nir_to_llvm.c
>>> b/src/amd/common/ac_nir_to_llvm.c
>>> index ddcd546b93..c24e563695 100644
>>> --- a/src/amd/common/ac_nir_to_llvm.c
>>> +++ b/src/amd/common/ac_nir_to_llvm.c
>>> @@ -2453,6 +2453,7 @@ static void visit_store_ssbo(struct ac_nir_context
>>> *ctx,
>>>                  if (count > 4) {
>>>                          writemask |= ((1u << (count - 4)) - 1u) <<
>>> (start + 4);
>>>                          count = 4;
>>> +                       elem_size_mul = 1;
>>
>>
>> This seems confusing to me. We do the initial iteration writemask
>> check before multiplying by elem_size_mul, so writemask is in terms of
>> 64-bit components, but after the first iteration, writemask is in
>> terms of 32-bit components? Looks like we should expand the bitmask
>> beforehand similarly as in visit_store_var, and then don't multiply by
>> elem_size_mul in the loop.
>
>
> I meant to add somewhere that this doesn't fix the test I'm looking at it
> just stops it locking up my machine. There are a bunch of other problems
> once we get past this point. e.g.
>
> Intrinsic name not mangled correctly for type arguments! Should be:
> llvm.amdgcn.buffer.store.v8f32
> void (<8 x float>, <4 x i32>, i32, i32, i1, i1)*
> @llvm.amdgcn.buffer.store.v4f32
> Intrinsic name not mangled correctly for type arguments! Should be:
> llvm.amdgcn.buffer.store.v8f32
> void (<8 x float>, <4 x i32>, i32, i32, i1, i1)*
> @llvm.amdgcn.buffer.store.v4f32
> Call parameter type does not match function signature!
> <4 x float> bitcast (<2 x double> <double 8.000000e+00, double 9.000000e+00>
> to <4 x float>)
>  <8 x float>  call void @llvm.amdgcn.buffer.store.v4f32(<4 x float> bitcast
> (<2 x double> <double 8.000000e+00, double 9.000000e+00> to <4 x float>), <4
> x i32> %59, i32 0, i32 112, i1 false, i1 false) #3
>
>
> I'm still not really sure how this code is intended to work.
>
> I'm using the ./bin/arb_gpu_shader_fp64-layout-std140-fp64-shader -auto
> piglit test for testing, note it also requires [1] to avoid crashing
> earlier.
>
> [1] https://patchwork.freedesktop.org/patch/197723/
>
>
>
>>
>>>                  }
>>>
>>>                  if (count == 4) {
>>> --
>>> 2.14.3
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> [email protected]
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
