date:20180109

Re: [Mesa-dev] [PATCH v2] i965: allocate a SGVS element when VertexID or InstanceID are read

2018-01-09 Thread Iago Toral

Ken, do you have any comments about this patch? I'd like to push it otherwise. Iago On Thu, 2018-01-04 at 14:24 -0800, Jason Ekstrand wrote: > Reviewed-by: Jason Ekstrand > > Ken? > > On Wed, Jan 3, 2018 at 6:55 PM, Iago Toral Quiroga > wrote: > > Although on gen8+ platforms we can in theory

[Mesa-dev] [PATCH 2/5] i965/miptree: Use cpu tiling/detiling when mapping

2018-01-09 Thread Scott D Phillips

Rename the (un)map_gtt functions to (un)map_map (map by returning a map) and add new functions (un)map_tiled_memcpy that return a shadow buffer populated with the intel_tiled_memcpy functions. --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 95 --- 1 file changed, 86 in

[Mesa-dev] [PATCH 1/5] i965/tiled_memcpy: change linear pointer from (0, 0) to (xt1, yt1)

2018-01-09 Thread Scott D Phillips

In all current uses, the linear surface is only allocated starting at (xt1, yt1) anyway, so this improves the calling ergonomics. --- src/mesa/drivers/dri/i965/intel_pixel_read.c | 2 +- src/mesa/drivers/dri/i965/intel_tex_image.c| 4 ++-- src/mesa/drivers/dri/i965/intel_tiled_memcpy.c | 1

[Mesa-dev] [PATCH 5/5] i965/miptree: Don't gtt map from map_depthstencil

2018-01-09 Thread Scott D Phillips

Instead of gtt mapping, call out to other map functions (map_map or map_tiled_memcpy) for the depth surface. Removes a place where gtt mapping is used. --- This is a bit icky, perhaps something like mapping z_mt with BRW_MAP_DIRECT_BIT could be cleaner (but in that case the depthstencil mapping and

[Mesa-dev] [PATCH 4/5] i965/miptree: Map with movntdqa for linear buffers only

2018-01-09 Thread Scott D Phillips

Removes a place where gtt mapping is used. --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index e4a3f163d2..fa4ae06399 100644 ---

[Mesa-dev] [PATCH 3/5] i965/miptree: Initialize mcs with a linear map

2018-01-09 Thread Scott D Phillips

When initializing mcs, map with MAP_RAW and fill in the linear map. Removes a place where gtt mapping is used. --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri

[Mesa-dev] [PATCH 18/21] r600/sb: use different stacks for tracking lds and queue usage.

2018-01-09 Thread Dave Airlie

From: Dave Airlie The normal ssa renumbering isn't sufficient for LDS queue access, this uses two stacks, one for the lds queue, and one for the lds r/w ordering. The LDS oq values are incremented in their use in a linear fashion. The LDS rw values are incremented in their definitions and used i

[Mesa-dev] [PATCH 21/21] [RFC] hack enable sb for tess

2018-01-09 Thread Dave Airlie

From: Dave Airlie Don't apply this until we have a lot more tests passing this disables SB for barrier usage (as those will be a lot of "fun") --- src/gallium/drivers/r600/r600_shader.c | 11 --- src/gallium/drivers/r600/r600_shader.h | 1 + 2 files changed, 5 insertions(+), 7 deletion

[Mesa-dev] [PATCH 20/21] [RFC] r600/sb: make it work?

2018-01-09 Thread Dave Airlie

From: Dave Airlie This has some hacks in it that in the end make heaven run --- src/gallium/drivers/r600/sb/sb_bc_builder.cpp | 2 +- src/gallium/drivers/r600/sb/sb_bc_decoder.cpp | 1 + src/gallium/drivers/r600/sb/sb_bc_finalize.cpp | 10 +++- src/gallium/drivers/r600/sb/sb_gcm.cpp

[Mesa-dev] [PATCH 19/21] r600/sb: add lds related peepholes.

2018-01-09 Thread Dave Airlie

From: Dave Airlie if no destination: a) convert _RET instructions to non _RET variants if no dst b) set src0 to undefined if it's a READ, this should get DCE then. --- src/gallium/drivers/r600/sb/sb_peephole.cpp | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/src/gall

[Mesa-dev] [PATCH 14/21] r600/sb: add gcm support to avoid clause between lds read/queue read

2018-01-09 Thread Dave Airlie

From: Dave Airlie You have to schedule LDS_READ_RET _, x and MOV reg, LDS_OQ_A_POP in the same basic block/clause. This makes sure once we've issued a MOV we don't add another block until we balance it with an LDS read. --- src/gallium/drivers/r600/sb/sb_gcm.cpp | 15 ++- src/galli

[Mesa-dev] [PATCH 12/21] r600/sb: handle LDS operations in folding.

2018-01-09 Thread Dave Airlie

From: Dave Airlie Don't try and fold LDS using expressions. --- src/gallium/drivers/r600/sb/sb_expr.cpp | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/gallium/drivers/r600/sb/sb_expr.cpp b/src/gallium/drivers/r600/sb/sb_expr.cpp index 7a5d62c8e8..7d43ef1d1d 100644 --- a/sr

[Mesa-dev] [PATCH 17/21] r600/sb: schedule LDS ops in appropriate places.

2018-01-09 Thread Dave Airlie

From: Dave Airlie So LDS ops have to be SLOT_X, and LDS OQ reads have read port restrictions so we try and force those into only having one per slot and avoiding bank swizzles. --- src/gallium/drivers/r600/sb/sb_bc.h | 3 +++ src/gallium/drivers/r600/sb/sb_sched.cpp | 4 2 files change

[Mesa-dev] [PATCH 15/21] r600/sb: adding lds oq tracking to the scheduler

2018-01-09 Thread Dave Airlie

From: Dave Airlie This adds support for tracking the lds oq read/writes so we can avoid scheduling other things in between. This patch just adds the tracking and an assert to show problems. --- src/gallium/drivers/r600/sb/sb_sched.cpp | 13 ++--- src/gallium/drivers/r600/sb/sb_sched.h | 5

[Mesa-dev] [PATCH 16/21] r600/sb: hit the scheduler with a big hammer to avoid lds splits.

2018-01-09 Thread Dave Airlie

From: Dave Airlie This tries to avoid an lds queue read getting scheduled separately from an lds ret read, the non-sb code uses the same style of hammer, this isn't foolproof. We can do better, but it's a bit tricky, as you have to scan ahead and either schedule more lds oq moves and more lds re

[Mesa-dev] [PATCH 13/21] r600/sb: handle lds special dest registers.

2018-01-09 Thread Dave Airlie

From: Dave Airlie This adds lds to the geom emit handling --- src/gallium/drivers/r600/sb/sb_bc_finalize.cpp | 2 +- src/gallium/drivers/r600/sb/sb_sched.cpp | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp b/src/gallium

[Mesa-dev] [PATCH 01/21] r600: emit 0 gds_op for tf write.

2018-01-09 Thread Dave Airlie

From: Dave Airlie This field is ignored for tf writes so should be 0. Signed-off-by: Dave Airlie --- src/gallium/drivers/r600/eg_asm.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/r600/eg_asm.c b/src/gallium/drivers/r600/eg_asm.c index 8f9d1b85f

[Mesa-dev] [PATCH 10/21] r600/sb: add initial support for parsing lds operations.

2018-01-09 Thread Dave Airlie

From: Dave Airlie This handles parsing the LDS ops and queue accesses. --- src/gallium/drivers/r600/sb/sb_bc_parser.cpp | 52 ++-- 1 file changed, 50 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/r600/sb/sb_bc_parser.cpp b/src/gallium/drivers/r600/sb/s

[Mesa-dev] [PATCH 08/21] r600/sb: lds ops have no dst register.

2018-01-09 Thread Dave Airlie

From: Dave Airlie Although these are op3s they don't have a dst reg. --- src/gallium/drivers/r600/sb/sb_bc_dump.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/r600/sb/sb_bc_dump.cpp b/src/gallium/drivers/r600/sb/sb_bc_dump.cpp index 72a1b24467..3b5d9

[Mesa-dev] [PATCH 03/21] r600/sb: fix a bug emitting ar load from a constant.

2018-01-09 Thread Dave Airlie

From: Dave Airlie Some tess shaders were doing MOVA_INT _, c0.x on cayman, and then hitting an assert in sb_bc_finalize.cpp:translate_kcache. This makes sure the toplevel kcache tracker gets updated, and the clause gets fixed up. Signed-off-by: Dave Airlie --- src/gallium/drivers/r600/sb/sb_s

[Mesa-dev] [PATCH 09/21] r600/sb: disable if conversion for hs

2018-01-09 Thread Dave Airlie

From: Dave Airlie This fixes bad interactions with the LDS special values. --- src/gallium/drivers/r600/sb/sb_core.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/r600/sb/sb_core.cpp b/src/gallium/drivers/r600/sb/sb_core.cpp index cdc2862d36..5049b677

[Mesa-dev] [PATCH 06/21] r600/sb: update last_cf if alu is the last clause

2018-01-09 Thread Dave Airlie

From: Dave Airlie It's rare to have a final alu clause on normal shaders (exports) but tess shaders write to LDS as their output, so we see some alu clauses, and the CF_END gets put in the wrong place. This makes sure to update last_cf correctly. Signed-off-by: Dave Airlie --- src/gallium/driv

[Mesa-dev] [PATCH 05/21] r600/sb: start adding GDS support

2018-01-09 Thread Dave Airlie

From: Dave Airlie This adds support for GDS ops to sb backend. This seems to work for atomics and tess factor writes. Signed-off-by: Dave Airlie --- src/gallium/drivers/r600/r600_isa.h| 2 +- src/gallium/drivers/r600/sb/sb_bc.h| 7 src/gallium/drivers/r600/sb/sb

[Mesa-dev] [PATCH 04/21] r600/sb: add tess/compute initial state registers.

2018-01-09 Thread Dave Airlie

From: Dave Airlie This stops them being optimised out. --- src/gallium/drivers/r600/sb/sb_bc_parser.cpp | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/r600/sb/sb_bc_parser.cpp b/src/gallium/drivers/r600/sb/sb_bc_parser.cpp index ae92a767b4..de3984f59

[Mesa-dev] [PATCH 07/21] r600/sb: introduce special register values for lds support.

2018-01-09 Thread Dave Airlie

From: Dave Airlie For LDS read/write ordering we use the LDS_RW value, reads will wait on previous writes. For LDS read/read from LDS queue ordering we use the LDS_OQ values, we define two for now, though initially we'll just support OQA. Also add the check for the lds oq values Signed-off-by:

[Mesa-dev] [PATCH 11/21] r600/sb: add finalising for lds output queue special values.

2018-01-09 Thread Dave Airlie

From: Dave Airlie We need to convert these to the hw special registers. --- src/gallium/drivers/r600/sb/sb_bc_finalize.cpp | 12 1 file changed, 12 insertions(+) diff --git a/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp b/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp index 2ec4

[Mesa-dev] r600 sb tessellation support

2018-01-09 Thread Dave Airlie

This is an attempt to add tessellation support to the SB backend. The main things needed are GDS access which is used for tess factor storage (also used for atomic counters), and LDS access which is needed to pass all the data between stages. The first 19 patches are the stuff I'm happy with, the

[Mesa-dev] [PATCH 02/21] r600/shader: only emit add instruction if param has a value.

2018-01-09 Thread Dave Airlie

From: Dave Airlie Just saves a pointless a = a + 0; Signed-off-by: Dave Airlie --- src/gallium/drivers/r600/r600_shader.c | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index 1b

[Mesa-dev] [PATCH] ac: add load_patch_vertices_in() to the abi

2018-01-09 Thread Timothy Arceri

Fixes the following test for radeonsi nir: tests/spec/arb_tessellation_shader/execution/quads.shader_test Also stops 8 other tests from crashing, they now just fail e.g. tcs-output-array-float-index-rd-after-barrier.shader_test --- src/amd/common/ac_nir_to_llvm.c | 11 ++- src/amd

Re: [Mesa-dev] [PATCH] dri_util: remove ALLOW_RGB10_CONFIGS option (v2)

2018-01-09 Thread Tapani Pälli

Hi Marek; This one works but only if you add DRI_CONF_ALLOW_RGB10_CONFIGS("false") to the DRI_CONF_SECTION_MISCELLANEOUS section in intel_screen. With that change: Reviewed-by: Tapani Pälli On 01/09/2018 04:04 PM, Marek Olšák wrote: From: Marek Olšák This is unused because it's for libG

Re: [Mesa-dev] [RFC 07/10] mesa: add program blob cache functionality

2018-01-09 Thread Tapani Pälli

On 01/09/2018 05:05 PM, Eric Engestrom wrote: On Tuesday, 2018-01-09 09:48:19 +0200, Tapani Pälli wrote: Cache set and get are called in similar fashion as what is happening with disk cache. Functionality requires ARB_get_program_binary and EGL_ANDROID_blob_cache support. Signed-off-by: Tapan

Re: [Mesa-dev] [PATCH] util: fix NORETURN for msvc, add HAVE_FUNC_ATTRIBUTE_NORETURN to c99_compat.h

2018-01-09 Thread Brian Paul

On 01/09/2018 07:15 PM, srol...@vmware.com wrote: From: Roland Scheidegger We've seen some problems internally due to macro redefinition. Fix this by adding HAVE_FUNC_ATTRIBUTE_NORETURN to c99_compat.h, and defining it for msvc. And avoid redefinition just in case. --- include/c99_compat.h |

Re: [Mesa-dev] [PATCH 2/2] radv: Implement VK_EXT_discard_rectangles.

2018-01-09 Thread Dave Airlie

On 10 January 2018 at 12:34, Bas Nieuwenhuizen wrote: > Tested with a modified deferred demo and no regressions in a 1.0.2 > mustpass run. For the series: Reviewed-by: Dave Airlie > --- > src/amd/vulkan/radv_cmd_buffer.c | 51 > +++ > src/amd/vulkan/radv_

[Mesa-dev] [PATCH v4 38/38] nvir/nir: implement intrinsic shader_clock

2018-01-09 Thread Karol Herbst

Signed-off-by: Karol Herbst --- src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 8 1 file changed, 8 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index 0a78c6a593..e60d21bc8a 10

[Mesa-dev] [PATCH v4 21/38] nvir/nir: implement nir_alu_instr handling

2018-01-09 Thread Karol Herbst

Signed-off-by: Karol Herbst v2: use bitfield_insert instead of bfi rework switch helper macros remove some lowering code (LoweringHelper is now used for this) v3: add pack_half_2x16_split add unpack_half_2x16_split_x/y --- .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 486 +++

[Mesa-dev] [PATCH v4 27/38] nvir/nir: implement nir_ssa_undef_instr

2018-01-09 Thread Karol Herbst

v2: use mkOp Signed-off-by: Karol Herbst --- src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 15 +++ 1 file changed, 15 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index ee

[Mesa-dev] [PATCH v4 30/38] nvir/nir: implement vote and ballot

2018-01-09 Thread Karol Herbst

v2: add vote_eq support use the new subop intrinsic helper add ballot v3: add read_(first_)invocation Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 41 ++ 1 file changed, 41 insertions(+) diff --git a/src/gallium/drivers/nouveau

[Mesa-dev] [PATCH v4 33/38] nvir/nir: implement nir_intrinsic_load_ubo

2018-01-09 Thread Karol Herbst

v4: use loadFrom helper Signed-off-by: Karol Herbst --- src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 13 + 1 file changed, 13 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp

[Mesa-dev] [PATCH v4 35/38] nvir/nir: implement images

2018-01-09 Thread Karol Herbst

v3: fix compiler warnings v4: use loadFrom helper Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 276 +++-- 1 file changed, 258 insertions(+), 18 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/galli

[Mesa-dev] [PATCH v4 24/38] nvir/nir: implement nir_intrinsic_load_input

2018-01-09 Thread Karol Herbst

v3: and load_output v4: use smarter getIndirect helper use new getSlotAddress helper Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 38 ++ 1 file changed, 38 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from

[Mesa-dev] [PATCH v4 23/38] nvir/nir: implement nir_intrinsic_store_(per_vertex_)output

2018-01-09 Thread Karol Herbst

v3: add workaround for RA issues indirects have to be multiplied by 0x10 fix indirect access v4: use smarter getIndirect helper use storeTo helper Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 43 ++ 1 file changed, 43 insert

[Mesa-dev] [PATCH v4 25/38] nvir/nir: implement intrinsic_discard(_if)

2018-01-09 Thread Karol Herbst

Signed-off-by: Karol Herbst --- src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 15 +++ 1 file changed, 15 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index 748d7740de..8573

[Mesa-dev] [PATCH v4 26/38] nvir/nir: implement loading system values

2018-01-09 Thread Karol Herbst

v2: support more sys values fixed a bug where for multi component reads all values ended up in x v3: add load_patch_vertices_in v4: add subgroup stuff Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 108 + 1 file changed, 108 insertion

[Mesa-dev] [PATCH v4 28/38] nvir/nir: implement nir_instr_type_tex

2018-01-09 Thread Karol Herbst

a lot of those fields are not valid for a lot of tex ops. Not quite sure if it's worth the effort to check for those or just keep it like that. It seems to kind of work. v2: reworked offset handling add tex support with indirect R/S arguments handle GLSL_SAMPLER_DIM_EXTERNAL drop refer

[Mesa-dev] [PATCH v4 20/38] nvir/nir: add skeleton for nir_intrinsic_instr

2018-01-09 Thread Karol Herbst

Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 19 +++ 1 file changed, 19 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index 7010e4e468..9c0

[Mesa-dev] [PATCH v4 37/38] nvir/nir: implement load_per_vertex_output

2018-01-09 Thread Karol Herbst

v4: use smarter getIndirect helper use new getSlotAddress helper Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 25 ++ 1 file changed, 25 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gall

[Mesa-dev] [PATCH v4 32/38] nvir/nir: implement geometry shader nir_intrinsics

2018-01-09 Thread Karol Herbst

v4: use smarter getIndirect helper use new getSlotAddress helper use loadFrom helper Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 25 ++ 1 file changed, 25 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_

[Mesa-dev] [PATCH v4 34/38] nvir/nir: implement ssbo intrinsics

2018-01-09 Thread Karol Herbst

v4: use loadFrom helper Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 86 ++ 1 file changed, 86 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.

[Mesa-dev] [PATCH v4 36/38] nvir/nir: add memory barriers

2018-01-09 Thread Karol Herbst

Signed-off-by: Karol Herbst --- src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index f7b51339c2..aeeca94f

[Mesa-dev] [PATCH v4 18/38] nvir/nir: implement CFG handling

2018-01-09 Thread Karol Herbst

Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 255 - 1 file changed, 253 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp inde

[Mesa-dev] [PATCH v4 29/38] nvir/nir: add getOperation for intrinsics

2018-01-09 Thread Karol Herbst

Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 24 ++ 1 file changed, 24 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index 58c627371b..ef0

[Mesa-dev] [PATCH v4 31/38] nvir/nir: implement variable indexing

2018-01-09 Thread Karol Herbst

we store those arrays in local memory and reserve some space for each of the arrays. The arrays are stored in a packed format, because we know quite easily the context of each index. We don't do that in TGSI so far. This causes various issues to come up in the MemoryOpt pass, because ld/st with in

[Mesa-dev] [PATCH v4 22/38] nvir/nir: implement nir_intrinsic_load_uniform

2018-01-09 Thread Karol Herbst

v2: use new getIndirect helper fixes symbols for 64 bit types v4: use smarter getIndirect helper simplify address calculation use loadFrom helper Signed-off-by: Karol Herbst --- src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 10 ++ 1 file changed, 10 insertions(+)

[Mesa-dev] [PATCH v4 16/38] nvir/nir: add loadFrom and storeTo helpers

2018-01-09 Thread Karol Herbst

Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 47 ++ 1 file changed, 47 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index f3cd22622d..6ea

[Mesa-dev] [PATCH v4 17/38] nvir/nir: parse NIR shader info

2018-01-09 Thread Karol Herbst

v2: parse a few more fields v3: add special handling for GL_ISOLINES Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 58 ++ 1 file changed, 58 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gall

[Mesa-dev] [PATCH v4 19/38] nvir/nir: implement nir_load_const_instr

2018-01-09 Thread Karol Herbst

Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 20 1 file changed, 20 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp index 66ec4460d9..701

[Mesa-dev] [PATCH v4 09/38] nvc0: add support for NIR

2018-01-09 Thread Karol Herbst

not all those nir options are actually required, it just made the work a little easier. v2: fix asserts parse compute shaders don't lower bitfield_insert v3: fix memory leak v4: don't lower fmod32 Signed-off-by: Karol Herbst --- src/gallium/drivers/nouveau/Makefile.sources | 1 +

[Mesa-dev] [PATCH v4 15/38] nvir/nir: run assignSlots

2018-01-09 Thread Karol Herbst

v2: add support for geometry shaders set idx add some missing mappings fix for 64bit inputs/outputs fix up some FP color output index mess-up parse centroid flag v3: fix arrays in outputs as well fix input/output size calculation for tessellation shaders v4: add getSlotAddress

[Mesa-dev] [PATCH v4 08/38] nvir: add lowering helper

2018-01-09 Thread Karol Herbst

this is mostly useful for lazy IR converters not wanting to deal with 64 bit lowering and other illegal stuff Signed-off-by: Karol Herbst --- src/gallium/drivers/nouveau/Makefile.sources | 2 + .../nouveau/codegen/nv50_ir_lowering_helper.cpp| 250 + .../nouveau/c

[Mesa-dev] [PATCH v4 10/38] nvir/nir: use lowering helper

2018-01-09 Thread Karol Herbst

this helps with a bunch of piglit tests testing 64 bit types Signed-off-by: Karol Herbst --- src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 5 + 1 file changed, 5 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/c

[Mesa-dev] [PATCH v4 13/38] nvir/nir: track defs and provide easy access functions

2018-01-09 Thread Karol Herbst

v2: add helper function for indirects v4: add new getIndirect overload for easier use Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 136 + 1 file changed, 136 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_n

[Mesa-dev] [PATCH v4 14/38] nvir/nir: add nir type helper functions

2018-01-09 Thread Karol Herbst

v4: treat imul as unsigned Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 117 + 1 file changed, 117 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_

[Mesa-dev] [PATCH v4 11/38] nvc0/debug: add env var to make nir default

2018-01-09 Thread Karol Herbst

v2: allow for non debug builds as well v3: move reading out env var more global disable tg4 with multiple offsets with nir disable caps for 64 bit types Signed-off-by: Karol Herbst --- src/gallium/drivers/nouveau/nouveau_screen.c | 4 src/gallium/drivers/nouveau/nouveau_screen.h

[Mesa-dev] [PATCH v4 12/38] nvir/nir: run some passes to make the conversion easier

2018-01-09 Thread Karol Herbst

v2: add constant_folding Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 40 ++ 1 file changed, 40 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir

[Mesa-dev] [PATCH v4 06/38] nvir: print the shader type when dumping headers

2018-01-09 Thread Karol Herbst

this makes debugging the shader header a little easier Signed-off-by: Karol Herbst --- src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c index e615

[Mesa-dev] [PATCH v4 00/38] Nir support for Nouveau

2018-01-09 Thread Karol Herbst

significant changes to last series: * fixing TF with GS for gallium nir drivers * RA fix for 64 bit values and compounds * completed support for 64 bit types * random piglit fixes Tested with unigine heaven/valley, gputest and RealisticRenderer piglit run -x glx -x egl -x streaming-texture-leak -

[Mesa-dev] [PATCH v4 03/38] nir: add vs_inputs_dual_locations compiler option

2018-01-09 Thread Karol Herbst

From: Timothy Arceri Allows nir drivers to use either single or dual locations for vs double inputs. i965 uses dual locations for both OpenGL and Vulkan drivers, for now gallium OpenGL drivers only use a single location. The following patch will also make use of this option when calling nir_s

[Mesa-dev] [PATCH v4 01/38] mesa/st: translate SO info in glsl_to_nir() case

2018-01-09 Thread Karol Herbst

From: Rob Clark This was handled for VS, but not for GS. Fixes for gallium drivers using nir: spec@arb_gpu_shader5@arb_gpu_shader5-xfb-streams-without-invocations spec@arb_gpu_shader5@arb_gpu_shader5-xfb-streams* spec@arb_transform_feedback3@arb_transform_feedback3-ext_interleaved_two_bufs_gs* s

[Mesa-dev] [PATCH v4 05/38] nv50/ir/ra: Fix copying compound for moves

2018-01-09 Thread Karol Herbst

From: Connor Abbott In order to reduce moves when coalescing multiple registers into a larger register, RA will try to coalesce MERGE instructions with their definitions. For example, for something like this in GLSL: uint a = ...; uint b = ...; uint64 x = packUint2x32(a, b); The compiler will t

[Mesa-dev] [PATCH v4 02/38] compiler: tidy up double_inputs_read uses

2018-01-09 Thread Karol Herbst

From: Timothy Arceri First we move double_inputs_read into a vs struct in the union, double_inputs_read is only used for vs inputs so this will save space and also allows us to add a new double_inputs field. We add the new field because c2acf97fcc9b changed the behaviour of double_inputs_read, a

[Mesa-dev] [PATCH v4 04/38] nir: partially revert c2acf97fcc9b32e

2018-01-09 Thread Karol Herbst

From: Timothy Arceri c2acf97fcc9b32e changed the use of double_inputs_read to be inconsistent with its previous meaning. Here we re-enable the gather info code that was removed as the modified code from c2acf97fcc9b32e now uses the double_inputs member rather than double_inputs_read. This change

[Mesa-dev] [PATCH v4 07/38] nvir: move common converter code in base class

2018-01-09 Thread Karol Herbst

v2: remove TGSI related bits Signed-off-by: Karol Herbst --- src/gallium/drivers/nouveau/Makefile.sources | 2 + .../nouveau/codegen/nv50_ir_from_common.cpp| 107 + .../drivers/nouveau/codegen/nv50_ir_from_common.h | 58 +++ .../drivers/nouveau/codeg

Re: [Mesa-dev] [PATCH] r600: Allow egd_tables.py to run with python3 too

2018-01-09 Thread Dave Airlie

On 5 January 2018 at 01:14, Michal Srb wrote: > From: =?UTF-8?q?Tom=C3=A1=C5=A1=20Chv=C3=A1tal?= > > Makes the egd_tables.py compatible with both python 2 and 3. This appears to break the build here, I get a few () lines in the output. I suspect print() needs to be print('') Dave. > --- > sr

[Mesa-dev] [PATCH 2/2] radv: Implement VK_EXT_discard_rectangles.

2018-01-09 Thread Bas Nieuwenhuizen

Tested with a modified deferred demo and no regressions in a 1.0.2 mustpass run. --- src/amd/vulkan/radv_cmd_buffer.c | 51 +++ src/amd/vulkan/radv_device.c | 6 + src/amd/vulkan/radv_extensions.py | 1 + src/amd/vulkan/radv_pipeline.c| 35 ++

[Mesa-dev] [PATCH 1/2] radv: Add mapping between dynamic state mask and external enum.

2018-01-09 Thread Bas Nieuwenhuizen

The EXT values are really large, e.g. VK_DYNAMIC_STATE_DISCARD_RECTANGLE_EXT = 1000099000, so 1 << value is not going to fit into a 32-bit mask. --- src/amd/vulkan/radv_cmd_buffer.c | 36 ++--- src/amd/vulkan/radv_pipeline.c | 49 +++-
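The mapping idea in this entry can be sketched roughly as follows. This is not the radv code: the `DYN_BIT_*` macros and both function names are hypothetical, and only three states are listed. The enum values are the ones from the Vulkan headers (the extension value is 1000099000), which is why the sketch translates each state to a small internal bit instead of shifting by the enum value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined by Vulkan; the local names are shortened for the sketch. */
enum {
   DYNAMIC_STATE_VIEWPORT = 0,
   DYNAMIC_STATE_SCISSOR = 1,
   DYNAMIC_STATE_DISCARD_RECTANGLE = 1000099000 /* EXT value, far above 31 */
};

/* Hypothetical internal bits: small and dense, so they fit in a uint32_t. */
#define DYN_BIT_VIEWPORT          (1u << 0)
#define DYN_BIT_SCISSOR           (1u << 1)
#define DYN_BIT_DISCARD_RECTANGLE (1u << 2)

/* Translate one external enum value to its internal mask bit. */
static uint32_t dyn_state_to_mask(int state)
{
   switch (state) {
   case DYNAMIC_STATE_VIEWPORT:          return DYN_BIT_VIEWPORT;
   case DYNAMIC_STATE_SCISSOR:           return DYN_BIT_SCISSOR;
   case DYNAMIC_STATE_DISCARD_RECTANGLE: return DYN_BIT_DISCARD_RECTANGLE;
   default:                              return 0; /* unknown state: no bit */
   }
}

/* Build a 32-bit mask from a list of external enum values. */
static uint32_t build_dynamic_mask(const int *states, unsigned count)
{
   uint32_t mask = 0;
   assert(states != NULL || count == 0);
   for (unsigned i = 0; i < count; i++)
      mask |= dyn_state_to_mask(states[i]);
   return mask;
}
```

A switch keeps the translation explicit; a lookup table indexed by the enum is not an option here because the extension values are sparse.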

[Mesa-dev] [PATCH] util: fix NORETURN for msvc, add HAVE_FUNC_ATTRIBUTE_NORETURN to c99_compat.h

2018-01-09 Thread sroland

From: Roland Scheidegger We've seen some problems internally due to macro redefinition. Fix this by adding HAVE_FUNC_ATTRIBUTE_NORETURN to c99_compat.h, and defining it for msvc. And avoid redefinition just in case. --- include/c99_compat.h | 1 + src/util/macros.h| 12 2 files
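A guarded definition along the lines described in this entry might look like the sketch below. It is an assumption about the general shape, not the contents of the patch: the `__GNUC__` fallback is added here so the snippet builds without a configure step, and `fatal` and `checked_div` are hypothetical helpers used only to show the macro in use.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Define NORETURN only if nothing else has, to avoid macro redefinition. */
#ifndef NORETURN
#  if defined(_MSC_VER)
#    define NORETURN __declspec(noreturn)
#  elif defined(HAVE_FUNC_ATTRIBUTE_NORETURN) || defined(__GNUC__)
#    define NORETURN __attribute__((__noreturn__))
#  else
#    define NORETURN /* unknown compiler: expand to nothing */
#  endif
#endif

/* Hypothetical helper: prints a message and never returns. */
static NORETURN void fatal(const char *msg)
{
   assert(msg != NULL);
   fprintf(stderr, "fatal: %s\n", msg);
   abort();
}

/* The annotation tells the compiler that control never continues past fatal(). */
static int checked_div(int num, int den)
{
   if (den == 0)
      fatal("division by zero");
   return num / den;
}
```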

[Mesa-dev] [Bug 104553] mat4: m[i][j] incorrect result with row_major UBO

2018-01-09 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=104553 --- Comment #3 from Timothy Arceri --- (In reply to Ilia Mirkin from comment #1) > Ian Romanick (idr) wrote a test generator which generated random shader_test > files with different ubo arrangements. It caught a lot of bugs back in the > day, b

Re: [Mesa-dev] [PATCH 1/2] i965/fs: Use UW types when using V immediates

2018-01-09 Thread Anuj Phogat

I tested the destination register type W => UW change to move 0x76543210V. It fixed 1000+ piglit failures on Cannonlake. On Tue, Jan 9, 2018 at 4:56 PM, Jason Ekstrand wrote: > Gen 10 has a strange hardware bug involving V immediates with W types. > It appears that a mov(8) g2<1>W 0x76543210V wil

[Mesa-dev] [PATCH] r600: add support for ARB_shader_clock.

2018-01-09 Thread Dave Airlie

From: Dave Airlie --- docs/features.txt | 2 +- src/gallium/drivers/r600/r600_pipe.c | 2 +- src/gallium/drivers/r600/r600_shader.c | 29 ++--- src/gallium/drivers/r600/r600_sq.h | 3 ++- 4 files changed, 30 insertions(+), 6 deletions(-) dif

[Mesa-dev] [PATCH 1/2] i965/fs: Use UW types when using V immediates

2018-01-09 Thread Jason Ekstrand

Gen 10 has a strange hardware bug involving V immediates with W types. It appears that a mov(8) g2<1>W 0x76543210V will actually result in g2 getting the value {3, 2, 1, 0, 3, 2, 1, 0}. In particular, the bottom four nibbles are repeated instead of the top four being taken. (A mov of 0x3210V
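To make the lane values concrete, here is a small sketch that decodes a V immediate (eight signed 4-bit values packed into 32 bits, lane 0 in the lowest nibble) and models the misbehaviour as this entry describes it. Both function names are hypothetical, and the second function is only a reading of the description above (the quoted list is taken to run from the highest lane to the lowest), not hardware documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a packed V immediate into eight 16-bit lanes. */
static void decode_v_imm(uint32_t imm, int16_t out[8])
{
   assert(out != NULL);
   for (int lane = 0; lane < 8; lane++) {
      int nibble = (int)((imm >> (4 * lane)) & 0xf);
      /* Each nibble is a signed 4-bit value: 8..15 mean -8..-1. */
      out[lane] = (int16_t)(nibble >= 8 ? nibble - 16 : nibble);
   }
}

/* Model of the reported behaviour with a W destination:
 * the bottom four nibbles are repeated in the upper four lanes. */
static void decode_v_imm_repeated_low(uint32_t imm, int16_t out[8])
{
   int16_t expected[8];
   assert(out != NULL);
   decode_v_imm(imm, expected);
   for (int lane = 0; lane < 8; lane++)
      out[lane] = expected[lane % 4];
}
```

For 0x76543210 the first function yields lanes 0 through 7 in order, while the second yields 0, 1, 2, 3, 0, 1, 2, 3.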

[Mesa-dev] [PATCH 2/2] i965: Use UD types for gl_SampleID setup

2018-01-09 Thread Jason Ekstrand

We already had to switch all of the W types to UW to prevent issues with vector immediates on gen10. We may as well use unsigned types everywhere. --- src/intel/compiler/brw_fs.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/

Re: [Mesa-dev] [PATCH] i915g: fix crashes with wined3d

2018-01-09 Thread Ilia Mirkin

On Tue, Jan 9, 2018 at 5:40 PM, Christopher Egert wrote: > I'm not too familiar with gallium3d, but this fixes > crashes with 3DMark2001 and GTA3 in wine-staging. > > This should be fixed properly in the future. > > Signed-off-by: Christopher Egert > --- > src/gallium/drivers/i915/i915_clear.c

Re: [Mesa-dev] [PATCH 16/29] anv/cmd_buffer: Pass a subpass id into begin_subpass

2018-01-09 Thread Nanley Chery

On Mon, Nov 27, 2017 at 07:06:06PM -0800, Jason Ekstrand wrote: > This is a bit less awkward than passing in the subpass because it means > we don't have to extract the subpass id from the subpass. > --- > src/intel/vulkan/genX_cmd_buffer.c | 12 +--- > 1 file changed, 5 insertions(+), 7 d

Re: [Mesa-dev] [PATCH 15/29] anv/cmd_buffer: Add begin/end_subpass helpers

2018-01-09 Thread Nanley Chery

On Mon, Nov 27, 2017 at 07:06:05PM -0800, Jason Ekstrand wrote: > Having begin/end_subpass is a bit nicer than the begin/next/end hooks > that Vulkan gives us. > --- > src/intel/vulkan/genX_cmd_buffer.c | 55 > +- > 1 file changed, 31 insertions(+), 24 deletion

[Mesa-dev] [PATCH] i915g: fix crashes with wined3d

2018-01-09 Thread Christopher Egert

I'm not too familiar with gallium3d, but this fixes crashes with 3DMark2001 and GTA3 in wine-staging. This should be fixed properly in the future. Signed-off-by: Christopher Egert --- src/gallium/drivers/i915/i915_clear.c| 3 ++- src/gallium/drivers/i915/i915_state_static.c | 4 +++- 2

Re: [Mesa-dev] [PATCH 14/29] anv/cmd_buffer: Apply subpass flushes before set_subpass

2018-01-09 Thread Nanley Chery

On Mon, Nov 27, 2017 at 07:06:04PM -0800, Jason Ekstrand wrote: > This seems slightly more correct because it means that the flushes > happen before any clears or resolves implied by the subpass transition. > --- > src/intel/vulkan/genX_cmd_buffer.c | 8 > 1 file changed, 4 insertions(+),

[Mesa-dev] [PATCH v2 1/3] util/crc32: don't drop the const qualifier

2018-01-09 Thread Grazvydas Ignotas

Signed-off-by: Grazvydas Ignotas --- src/util/crc32.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/util/crc32.c b/src/util/crc32.c index 44d637c..f2e01c6 100644 --- a/src/util/crc32.c +++ b/src/util/crc32.c @@ -109,11 +109,11 @@ util_crc32_table[256] = { * @sa http://

[Mesa-dev] [PATCH v2 2/3] android, configure, meson: define HAVE_ZLIB

2018-01-09 Thread Grazvydas Ignotas

The next change wants to use some optional zlib functionality; however, not all platforms currently use it. Based on Jordan Justen's earlier patches and their review feedback. Signed-off-by: Grazvydas Ignotas --- Android.common.mk | 1 + configure.ac | 1 + meson.build | 1 + 3 files c

[Mesa-dev] [PATCH v2 3/3] util: use faster zlib's CRC32 implementation

2018-01-09 Thread Grazvydas Ignotas

zlib provides a faster slice-by-4 CRC32 implementation than the traditional single byte lookup one used by mesa. As most supported platforms now link zlib unconditionally, we can easily use it. Improvement for a 1MB buffer (avg MB/s, n=100, zlib 1.2.8): i5-6600K C2D E4500 mes
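To make the trade-off concrete, here is a rough sketch of a byte-at-a-time table lookup next to a compile-time switch to zlib. It is not the code from the patch: `crc32_bytewise` and `example_crc32` are hypothetical names, the sketch uses the standard CRC-32 convention (initial value and final inversion), and whether that matches Mesa's own helper exactly is not something the snippet confirms. Only the `HAVE_ZLIB` macro comes from the series itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef HAVE_ZLIB
#include <zlib.h>
#endif

/* Lookup table for the reflected CRC-32 polynomial 0xEDB88320, built lazily. */
static uint32_t crc_table[256];
static int crc_table_ready;

static void crc_table_init(void)
{
    for (uint32_t n = 0; n < 256; n++) {
        uint32_t c = n;
        for (int k = 0; k < 8; k++)
            c = (c & 1) ? 0xEDB88320u ^ (c >> 1) : c >> 1;
        crc_table[n] = c;
    }
    crc_table_ready = 1;
}

/* Traditional single-byte lookup: one table access per input byte. */
static uint32_t crc32_bytewise(const void *data, size_t size)
{
    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    assert(data != NULL || size == 0);
    if (!crc_table_ready)
        crc_table_init();
    while (size--)
        crc = crc_table[(crc ^ *p++) & 0xFF] ^ (crc >> 8);
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical wrapper: use zlib when the build defines HAVE_ZLIB. */
static uint32_t example_crc32(const void *data, size_t size)
{
#ifdef HAVE_ZLIB
    /* zlib takes an unsigned int length, so feed large buffers in chunks. */
    const Bytef *p = data;
    uLong crc = crc32(0L, Z_NULL, 0);
    while (size > 0) {
        uInt chunk = size > 0x40000000u ? 0x40000000u : (uInt)size;
        crc = crc32(crc, p, chunk);
        p += chunk;
        size -= chunk;
    }
    return (uint32_t)crc;
#else
    return crc32_bytewise(data, size);
#endif
}
```

Both paths should agree on the well-known check string: `example_crc32("123456789", 9)` yields `0xCBF43926` for the standard CRC-32. Building with `-DHAVE_ZLIB -lz` selects the zlib path; without it the fallback is used.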

Re: [Mesa-dev] [PATCH 1/2] amd/common: do not rely on the pipeline for the push constants logic

2018-01-09 Thread Bas Nieuwenhuizen

Reviewed-by: Bas Nieuwenhuizen for the series. On Tue, Jan 9, 2018 at 6:09 PM, Samuel Pitoiset wrote: > It makes more sense to rely on nir_intrinsic_load_push_constant > instead of the pipeline layout. > > Signed-off-by: Samuel Pitoiset > --- > src/amd/common/ac_nir_to_llvm.c | 6 +++--- > s

Re: [Mesa-dev] [PATCH 3/3] radv/gfx9: calculate the number of ES VGPRs for merged shaders

2018-01-09 Thread Bas Nieuwenhuizen

Reviewed-by: Bas Nieuwenhuizen for the series. On Tue, Jan 9, 2018 at 4:01 PM, Samuel Pitoiset wrote: > Signed-off-by: Samuel Pitoiset > --- > src/amd/vulkan/radv_shader.c | 13 ++--- > 1 file changed, 10 insertions(+), 3 deletions(-) > > diff --git a/src/amd/vulkan/radv_shader.c b/sr

Re: [Mesa-dev] [PATCH 2/8] intel/isl: Add support to emit clear value address.

2018-01-09 Thread Nanley Chery

On Mon, Jan 08, 2018 at 04:00:37PM -0800, Jason Ekstrand wrote: > On Mon, Jan 8, 2018 at 2:29 PM, Nanley Chery wrote: > > > On Fri, Dec 15, 2017 at 02:53:29PM -0800, Rafael Antognolli wrote: > > > gen10 can emit the clear color by setting it on a buffer somewhere, and > > > then adding only the a

Re: [Mesa-dev] [PATCH 3/8] anv: Make the clear state buffer 64 bytes aligned.

2018-01-09 Thread Nanley Chery

On Tue, Jan 09, 2018 at 11:26:26AM -0800, Jason Ekstrand wrote: > On Tue, Jan 9, 2018 at 10:33 AM, Nanley Chery wrote: > > > On Mon, Jan 08, 2018 at 04:03:47PM -0800, Jason Ekstrand wrote: > > > On Mon, Jan 8, 2018 at 3:00 PM, Nanley Chery > > wrote: > > > > > > > On Fri, Dec 15, 2017 at 02:53:3

Re: [Mesa-dev] [PATCH 3/8] anv: Make the clear state buffer 64 bytes aligned.

2018-01-09 Thread Nanley Chery

On Mon, Jan 08, 2018 at 04:33:25PM -0800, Rafael Antognolli wrote: > On Mon, Jan 08, 2018 at 04:03:47PM -0800, Jason Ekstrand wrote: > > On Mon, Jan 8, 2018 at 3:00 PM, Nanley Chery wrote: > > > > On Fri, Dec 15, 2017 at 02:53:30PM -0800, Rafael Antognolli wrote: > > > On Gen10+, if we us
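For readers unfamiliar with the alignment requirement named in the subject line, rounding an offset up to a 64-byte boundary is usually a one-line mask operation. The following is a generic sketch and not taken from the anv patch: `CLEAR_STATE_ALIGNMENT`, `align_up_u64` and `is_aligned_u64` are illustrative names, and only the value 64 comes from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Alignment taken from the subject line; the macro name is hypothetical. */
#define CLEAR_STATE_ALIGNMENT 64u

/* Round value up to the next multiple of a power-of-two alignment. */
static uint64_t align_up_u64(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Return non-zero when value is already a multiple of the alignment. */
static int is_aligned_u64(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value & (alignment - 1)) == 0;
}
```

With these helpers, `align_up_u64(65, CLEAR_STATE_ALIGNMENT)` gives 128, and an address that is already a multiple of 64 is returned unchanged.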

[Mesa-dev] [PATCH] intel: Add more Coffee Lake PCI IDs

2018-01-09 Thread Anuj Phogat

More Coffee Lake PCI IDs have been added to the spec. Cc: Rodrigo Vivi Signed-off-by: Anuj Phogat --- include/pci_ids/i965_pci_ids.h | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h index 0dd01a4343..96

Re: [Mesa-dev] [PATCH 1/2] i965/fs: Add/use functions to convert to 3src_align1 vstride/hstride

2018-01-09 Thread Scott D Phillips

Matt Turner writes: > On Mon, Jan 8, 2018 at 5:01 PM, Scott D Phillips > wrote: >> Matt Turner writes: >> >>> Some cases weren't handled, such as stride 4 which is needed for 64-bit >>> operations. Presumably fixes the assertion failure mentioned in commit >>> 2d0457203871 (Revert "i965/fs: Use

Re: [Mesa-dev] [PATCH 1/2] i965/fs: Add/use functions to convert to 3src_align1 vstride/hstride

2018-01-09 Thread Matt Turner

On Mon, Jan 8, 2018 at 5:01 PM, Scott D Phillips wrote: > Matt Turner writes: > >> Some cases weren't handled, such as stride 4 which is needed for 64-bit >> operations. Presumably fixes the assertion failure mentioned in commit >> 2d0457203871 (Revert "i965/fs: Use align1 mode on ternary instruc

Re: [Mesa-dev] [PATCH 3/8] anv: Make the clear state buffer 64 bytes aligned.

2018-01-09 Thread Jason Ekstrand

On Tue, Jan 9, 2018 at 10:33 AM, Nanley Chery wrote: > On Mon, Jan 08, 2018 at 04:03:47PM -0800, Jason Ekstrand wrote: > > On Mon, Jan 8, 2018 at 3:00 PM, Nanley Chery > wrote: > > > > > On Fri, Dec 15, 2017 at 02:53:30PM -0800, Rafael Antognolli wrote: > > > > On Gen10+, if we use the clear sta

Re: [Mesa-dev] [PATCH 3/8] anv: Make the clear state buffer 64 bytes aligned.

2018-01-09 Thread Nanley Chery

On Mon, Jan 08, 2018 at 04:03:47PM -0800, Jason Ekstrand wrote: > On Mon, Jan 8, 2018 at 3:00 PM, Nanley Chery wrote: > > > On Fri, Dec 15, 2017 at 02:53:30PM -0800, Rafael Antognolli wrote: > > > On Gen10+, if we use the clear state address field in the surface state > > > instead of the clear c

Re: [Mesa-dev] [PATCH] intel: Apply Geminilake "Barrier Mode" workaround.

2018-01-09 Thread Kenneth Graunke

On Monday, January 8, 2018 3:00:30 PM PST Rafael Antognolli wrote: > On Thu, Jan 04, 2018 at 11:36:48AM -0800, Kenneth Graunke wrote: > > Apparently, Geminilake requires you to whack a chicken bit to select > > either compute or tessellation mode for barriers. The recommendation > > is to switch b
