Re: [Mesa-dev] [PATCH 2/4] nv50ir: fix unnecessary parentheses warning

2019-09-22 Thread Rhys Kidd
On Sat, 21 Sep 2019 at 04:27, Karol Herbst  wrote:

> Signed-off-by: Karol Herbst 
>

Reviewed-by: Rhys Kidd 


> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_util.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h
> index 307c23d5e03..b1766f48205 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h
> @@ -145,7 +145,7 @@ public:
>  #define DLLIST_EMPTY(__list) ((__list)->next == (__list))
>
>  #define DLLIST_FOR_EACH(list, it) \
> -   for (DLList::Iterator (it) = (list)->iterator(); !(it).end();
> (it).next())
> +   for (DLList::Iterator it = (list)->iterator(); !(it).end();
> (it).next())
>
>  class DLList
>  {
> --
> 2.21.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/3] android: aco: add support for libmesa_aco

2019-09-22 Thread Mauro Rossi
Android building rules are added in src/amd/Android.compiler.mk
libmesa_aco static library is built conditionally to radeonsi
as done for vulkan.radv module

This will prevent Android build errors for non x86 systems

filter-out compiler/aco_instruction_selection_setup.cpp source,
as already included by compiler/aco_instruction_selection.cpp
and would cause several multiple definition linker errors

NOTE: libLLVM requires AMDGPU Disassembler to build radv with aco

Fixes: 93c8ebf ("aco: Initial commit of independent AMD compiler")
Fixes: a70a998 ("radv/aco: Setup alternate path in RADV to support the 
experimental ACO compiler")
Signed-off-by: Mauro Rossi 
---
 src/amd/Android.compiler.mk | 93 +
 src/amd/Android.mk  |  1 +
 src/amd/Makefile.sources| 32 +
 src/amd/vulkan/Android.mk   |  4 +-
 4 files changed, 129 insertions(+), 1 deletion(-)
 create mode 100644 src/amd/Android.compiler.mk

diff --git a/src/amd/Android.compiler.mk b/src/amd/Android.compiler.mk
new file mode 100644
index 00..a62c93d9d3
--- /dev/null
+++ b/src/amd/Android.compiler.mk
@@ -0,0 +1,93 @@
+# Copyright © 2018 Valve Corporation
+# Copyright © 2019 Mauro Rossi issor.or...@gmail.com
+
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice (including the next
+# paragraph) shall be included in all copies or substantial portions of the
+# Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+# IN THE SOFTWARE.
+
+LOCAL_PATH := $(call my-dir)
+
+include $(LOCAL_PATH)/Makefile.sources
+
+# ---
+# Build libmesa_aco
+# ---
+
+include $(CLEAR_VARS)
+
+LOCAL_MODULE := libmesa_aco
+
+# filter-out compiler/aco_instruction_selection_setup.cpp because
+# it's already included by compiler/aco_instruction_selection.cpp
+LOCAL_SRC_FILES := \
+   $(filter-out compiler/aco_instruction_selection_setup.cpp, $(ACO_FILES))
+
+LOCAL_CFLAGS += -DFORCE_BUILD_AMDGPU   # instructs LLVM to declare 
LLVMInitializeAMDGPU* functions
+
+LOCAL_CPPFLAGS += -Wall -std=c++14
+
+# generate sources
+LOCAL_MODULE_CLASS := STATIC_LIBRARIES
+intermediates := $(call local-generated-sources-dir)
+LOCAL_GENERATED_SOURCES += $(addprefix $(intermediates)/, 
$(ACO_GENERATED_FILES))
+
+ACO_OPCODES_H_SCRIPT := $(MESA_TOP)/src/amd/compiler/aco_opcodes_h.py
+ACO_OPCODES_CPP_SCRIPT := $(MESA_TOP)/src/amd/compiler/aco_opcodes_cpp.py
+ACO_BUILDER_H_SCRIPT := $(MESA_TOP)/src/amd/compiler/aco_builder_h.py
+
+ACO_DEPS := $(MESA_TOP)/src/amd/compiler/aco_opcodes.py
+
+$(intermediates)/compiler/aco_opcodes.h: $(ACO_OPCODES_H_SCRIPT) $(ACO_DEPS)
+   @mkdir -p $(dir $@)
+   @echo "Gen Header: $(PRIVATE_MODULE) <= $(notdir $(@))"
+   $(hide) $(MESA_PYTHON2) $(ACO_OPCODES_H_SCRIPT) > $@ || ($(RM) $@; 
false)
+
+$(intermediates)/compiler/aco_opcodes.cpp: $(ACO_OPCODES_CPP_SCRIPT) 
$(ACO_DEPS)
+   @mkdir -p $(dir $@)
+   @echo "Gen Header: $(PRIVATE_MODULE) <= $(notdir $(@))"
+   $(hide) $(MESA_PYTHON2) $(ACO_OPCODES_CPP_SCRIPT) > $@ || ($(RM) $@; 
false)
+
+$(intermediates)/compiler/aco_builder.h: $(ACO_BUILDER_H_SCRIPT) $(ACO_DEPS)
+   @mkdir -p $(dir $@)
+   @echo "Gen Header: $(PRIVATE_MODULE) <= $(notdir $(@))"
+   $(hide) $(MESA_PYTHON2) $(ACO_BUILDER_H_SCRIPT) > $@ || ($(RM) $@; 
false)
+
+LOCAL_C_INCLUDES := \
+   $(MESA_TOP)/src/amd \
+   $(MESA_TOP)/src/amd/common \
+   $(MESA_TOP)/src/amd/compiler \
+   $(MESA_TOP)/src/compiler/nir \
+   $(MESA_TOP)/src/mapi \
+   $(MESA_TOP)/src/mesa \
+   $(intermediates)/compiler
+
+LOCAL_EXPORT_C_INCLUDE_DIRS := \
+   $(MESA_TOP)/src/amd/compiler \
+   $(intermediates)/compiler
+
+LOCAL_SHARED_LIBRARIES := \
+   libdrm_amdgpu
+
+LOCAL_STATIC_LIBRARIES := \
+   libmesa_amd_common \
+   libmesa_nir
+
+$(call mesa-build-with-llvm)
+
+include $(MESA_COMMON_MK)
+include $(BUILD_STATIC_LIBRARY)
diff --git a/src/amd/Android.mk b/src/amd/Android.mk
index e40e7da01b..c9dbeafde1 100644
--- a/src/amd/Android.mk
+++ b/src/amd/Android.mk
@@ -28,5 +28,6 @@ include $(LOCAL_PATH)/

[Mesa-dev] [PATCH 2/3] android: compiler/nir: build nir_divergence_analysis.c

2019-09-22 Thread Mauro Rossi
Prerequisite to avoid following radv linking error happening with aco

FAILED: 
out/target/product/x86_64/obj_x86/SHARED_LIBRARIES/vulkan.radv_intermediates/LINKED/vulkan.radv.so
...
external/mesa/src/amd/compiler/aco_instruction_selection_setup.cpp:178:
error: undefined reference to 'nir_divergence_analysis'
clang.real: error: linker command failed with exit code 1 (use -v to see 
invocation)

Fixes: df86c5f ("nir: add divergence analysis pass.")
Signed-off-by: Mauro Rossi 
---
 src/compiler/Makefile.sources | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index c4d2c2be7c..bc49e00525 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -215,6 +215,7 @@ NIR_FILES = \
nir/nir_control_flow_private.h \
nir/nir_deref.c \
nir/nir_deref.h \
+   nir/nir_divergence_analysis.c \
nir/nir_dominance.c \
nir/nir_format_convert.h \
nir/nir_from_ssa.c \
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/4] nv50ir/nir: comparison of integer expressions of different signedness warning

2019-09-22 Thread Rhys Kidd
On Sat, 21 Sep 2019 at 04:27, Karol Herbst  wrote:

> Signed-off-by: Karol Herbst 
>

Reviewed-by: Rhys Kidd 


> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> index 4e86ab8f8cc..95b60d2c7d0 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp
> @@ -1957,7 +1957,7 @@ Converter::visit(nir_intrinsic_instr *insn)
>   }
>   case Program::TYPE_GEOMETRY:
>   case Program::TYPE_VERTEX: {
> -if (info->io.genUserClip > 0 && idx == clipVertexOutput) {
> +if (info->io.genUserClip > 0 && idx ==
> (uint32_t)clipVertexOutput) {
> mkMov(clipVtx[i], src);
> src = clipVtx[i];
>  }
> --
> 2.21.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/3] panfrost: Make sure a clear does not re-use a pre-existing batch

2019-09-22 Thread Boris Brezillon
On Fri, 20 Sep 2019 15:45:33 -0400
Alyssa Rosenzweig  wrote:

> To be clear, if we have a batch and do the following operations:
> 
>   clear red
>   draw 1
>   clear green
>   draw 2
>   flush
> 
> All we should see is #2 on a green background, which this patch handles
> by the second clear invalidating all the clears/draws that came before
> it (provided there is no flush in between). 
> 
> I might just be tripped up by the "freeze" name. That really means throw
> away / free here, I guess?

Nope. Freeze means "stop queuing new draws to this batch". I guess we
could free the batch as well if the result of the previous draws/clear
are really overwritten by this new clear, but that implies checking the
new clear flags to make sure they target the same depth/stencil/color
combination. On the other hand, I'm wondering if it's really the
driver's job to try to optimize silly things the apps might do. I mean,
the sequence you describe does not look like a wise thing to do since
the "clear red+draw 1" end up being overwritten by "clear green + draw
2".

> 
> Provided that's the idea (and we're not somehow saving the original draw
> 1), it's Reviewed-by A R 
> 
> On Fri, Sep 20, 2019 at 04:53:37PM +0200, Boris Brezillon wrote:
> > glClear()s are expected to be the first thing GL apps do before drawing
> > new things. If there's already an existing batch targetting the same
> > FBO that has draws attached to it, we should make sure the new clear
> > gets a new batch assigned to guaranteed that the FB content is actually
> > cleared with the requested color/depth/stencil values.
> > 
> > We create a panfrost_get_fresh_batch_for_fbo() helper for that and
> > call it from panfrost_clear().
> > 
> > Signed-off-by: Boris Brezillon 
> > ---
> >  src/gallium/drivers/panfrost/pan_context.c |  2 +-
> >  src/gallium/drivers/panfrost/pan_job.c | 21 +
> >  src/gallium/drivers/panfrost/pan_job.h |  3 +++
> >  3 files changed, 25 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/gallium/drivers/panfrost/pan_context.c 
> > b/src/gallium/drivers/panfrost/pan_context.c
> > index ac01461a07fe..b2f2a9da7a51 100644
> > --- a/src/gallium/drivers/panfrost/pan_context.c
> > +++ b/src/gallium/drivers/panfrost/pan_context.c
> > @@ -162,7 +162,7 @@ panfrost_clear(
> >  double depth, unsigned stencil)
> >  {
> >  struct panfrost_context *ctx = pan_context(pipe);
> > -struct panfrost_batch *batch = panfrost_get_batch_for_fbo(ctx);
> > +struct panfrost_batch *batch = 
> > panfrost_get_fresh_batch_for_fbo(ctx);
> >  
> >  panfrost_batch_add_fbo_bos(batch);
> >  panfrost_batch_clear(batch, buffers, color, depth, stencil);
> > diff --git a/src/gallium/drivers/panfrost/pan_job.c 
> > b/src/gallium/drivers/panfrost/pan_job.c
> > index d8330bc133a6..4ec2aa0565d7 100644
> > --- a/src/gallium/drivers/panfrost/pan_job.c
> > +++ b/src/gallium/drivers/panfrost/pan_job.c
> > @@ -298,6 +298,27 @@ panfrost_get_batch_for_fbo(struct panfrost_context 
> > *ctx)
> >  return batch;
> >  }
> >  
> > +struct panfrost_batch *
> > +panfrost_get_fresh_batch_for_fbo(struct panfrost_context *ctx)
> > +{
> > +struct panfrost_batch *batch;
> > +
> > +batch = panfrost_get_batch(ctx, &ctx->pipe_framebuffer);
> > +
> > +/* The batch has no draw/clear queued, let's return it directly.
> > + * Note that it's perfectly fine to re-use a batch with an
> > + * existing clear, we'll just update it with the new clear request.
> > + */
> > +if (!batch->last_job.gpu)
> > +return batch;
> > +
> > +/* Otherwise, we need to freeze the existing one and instantiate a 
> > new
> > + * one.
> > + */
> > +panfrost_freeze_batch(batch);
> > +return panfrost_get_batch(ctx, &ctx->pipe_framebuffer);
> > +}
> > +
> >  static bool
> >  panfrost_batch_fence_is_signaled(struct panfrost_batch_fence *fence)
> >  {
> > diff --git a/src/gallium/drivers/panfrost/pan_job.h 
> > b/src/gallium/drivers/panfrost/pan_job.h
> > index e1b1f56a2e64..0bd78bba267a 100644
> > --- a/src/gallium/drivers/panfrost/pan_job.h
> > +++ b/src/gallium/drivers/panfrost/pan_job.h
> > @@ -172,6 +172,9 @@ panfrost_batch_fence_reference(struct 
> > panfrost_batch_fence *batch);
> >  struct panfrost_batch *
> >  panfrost_get_batch_for_fbo(struct panfrost_context *ctx);
> >  
> > +struct panfrost_batch *
> > +panfrost_get_fresh_batch_for_fbo(struct panfrost_context *ctx);
> > +
> >  void
> >  panfrost_batch_init(struct panfrost_context *ctx);
> >  
> > -- 
> > 2.21.0  

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 01/17] panfrost: Extend the panfrost_batch_add_bo() API to pass access flags

2019-09-22 Thread Boris Brezillon
On Fri, 20 Sep 2019 16:53:49 -0400
Alyssa Rosenzweig  wrote:

> > @@ -1121,7 +1134,11 @@ panfrost_emit_for_draw(struct panfrost_context *ctx, 
> > bool with_vertex_data)
> >  
> >  struct panfrost_shader_state *ss = 
> > &all->variants[all->active_variant];
> >  
> > -panfrost_batch_add_bo(batch, ss->bo);
> > +panfrost_batch_add_bo(batch, ss->bo,
> > +  PAN_BO_ACCESS_PRIVATE |
> > +  PAN_BO_ACCESS_READ |  
> 
> > +  PAN_BO_ACCESS_VERTEX_TILER |
> > +  PAN_BO_ACCESS_FRAGMENT);  
> 
> I believe this should be just the access for the stage `i`
> 
> Although actually I am not at all sure what this batch_add_bo is doing
> at all?
> 
> I think this batch_add_bo should probably dropped altogether? This loop
> is dealing with constant buffers; the shaders themselves were added

I'll double check. I couldn't find where BOs containing shader programs
were added last time I looked.

> 
> >  void panfrost_batch_add_fbo_bos(struct panfrost_batch *batch)
> >  {
> > +uint32_t flags = PAN_BO_ACCESS_SHARED | PAN_BO_ACCESS_WRITE |
> > + PAN_BO_ACCESS_VERTEX_TILER |
> > + PAN_BO_ACCESS_FRAGMENT;  
> 
> I think we can drop VERTEX_TILER here...? The buffers are written right
> at the end of the FRAGMENT job, not touched before that.

What about the read done when drawing the wallpaper? I guess it's also
only read by the fragment job, but I wasn't sure.

> 
> If nothing else is broken, this should allow a nice perf boost with
> pipelining, so the vertex/tiler from frame n+1 can run in parallel with
> the fragment of frame n (rather than blocking on frame n finishing with
> the FBOs).

Would require the kernel patches I posted earlier for that to
happen ;-). Right now all jobs touching the same BO are serialized
because of the implicit BO fences added by the kernel driver.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/3] android: aco: fix undefined template 'std::__1::array' build errors

2019-09-22 Thread Mauro Rossi
Fixes a few building errors similar to the following:

In file included from 
external/mesa/src/amd/compiler/aco_instruction_selection.cpp:26:
In file included from external/libcxx/include/algorithm:639:
external/libcxx/include/utility:321:9:
error: implicit instantiation of undefined template 'std::__1::array'
_T2 second;
^

Fixes: 93c8ebf ("aco: Initial commit of independent AMD compiler")
Signed-off-by: Mauro Rossi 
---
 src/amd/compiler/aco_instruction_selection.cpp   | 1 +
 src/amd/compiler/aco_instruction_selection_setup.cpp | 1 +
 src/amd/compiler/aco_print_asm.cpp   | 2 +-
 src/amd/compiler/aco_register_allocation.cpp | 1 +
 src/amd/compiler/aco_validate.cpp| 1 +
 5 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/amd/compiler/aco_instruction_selection.cpp 
b/src/amd/compiler/aco_instruction_selection.cpp
index d52043f3c0..164ac8ed5d 100644
--- a/src/amd/compiler/aco_instruction_selection.cpp
+++ b/src/amd/compiler/aco_instruction_selection.cpp
@@ -24,6 +24,7 @@
  */
 
 #include 
+#include 
 #include 
 
 #include "aco_ir.h"
diff --git a/src/amd/compiler/aco_instruction_selection_setup.cpp 
b/src/amd/compiler/aco_instruction_selection_setup.cpp
index 6c4c408e65..e8522c3d12 100644
--- a/src/amd/compiler/aco_instruction_selection_setup.cpp
+++ b/src/amd/compiler/aco_instruction_selection_setup.cpp
@@ -22,6 +22,7 @@
  *
  */
 
+#include 
 #include 
 #include "aco_ir.h"
 #include "nir.h"
diff --git a/src/amd/compiler/aco_print_asm.cpp 
b/src/amd/compiler/aco_print_asm.cpp
index 31079aa1c4..d3f4c3cb40 100644
--- a/src/amd/compiler/aco_print_asm.cpp
+++ b/src/amd/compiler/aco_print_asm.cpp
@@ -1,4 +1,4 @@
-
+#include 
 #include 
 #include "aco_ir.h"
 #include "llvm-c/Disassembler.h"
diff --git a/src/amd/compiler/aco_register_allocation.cpp 
b/src/amd/compiler/aco_register_allocation.cpp
index d55f1febc6..47ea932f11 100644
--- a/src/amd/compiler/aco_register_allocation.cpp
+++ b/src/amd/compiler/aco_register_allocation.cpp
@@ -27,6 +27,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/src/amd/compiler/aco_validate.cpp 
b/src/amd/compiler/aco_validate.cpp
index 0988d66df3..9919d0a585 100644
--- a/src/amd/compiler/aco_validate.cpp
+++ b/src/amd/compiler/aco_validate.cpp
@@ -24,6 +24,7 @@
 
 #include "aco_ir.h"
 
+#include 
 #include 
 
 namespace aco {
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/4] nv50ir: fix memset on non trivial types warning

2019-09-22 Thread Rhys Kidd
On Sat, 21 Sep 2019 at 04:27, Karol Herbst  wrote:

> Signed-off-by: Karol Herbst 
>

Reviewed-by: Rhys Kidd 


> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir.cpp| 4 +---
>  src/gallium/drivers/nouveau/codegen/nv50_ir.h  | 2 +-
>  src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp | 2 +-
>  3 files changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> index a181a13a3b1..45ee95bb103 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> @@ -903,10 +903,8 @@ Instruction::isCommutationLegal(const Instruction *i)
> const
>  }
>
>  TexInstruction::TexInstruction(Function *fn, operation op)
> -   : Instruction(fn, op, TYPE_F32)
> +   : Instruction(fn, op, TYPE_F32), tex()
>  {
> -   memset(&tex, 0, sizeof(tex));
> -
> tex.rIndirectSrc = -1;
> tex.sIndirectSrc = -1;
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> index b19751ab372..5163e1a7ec2 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> @@ -957,7 +957,7 @@ public:
> class Target
> {
> public:
> -  Target(TexTarget targ = TEX_TARGET_2D) : target(targ) { }
> +  Target(TexTarget targ = TEX_TARGET_1D) : target(targ) { }
>
>const char *getName() const { return descTable[target].name; }
>unsigned int getArgCount() const { return descTable[target].argc; }
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp
> index 5c6d0570ae2..609e7b89290 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp
> @@ -455,7 +455,7 @@ CodeEmitter::addInterp(int ipa, int reg, FixupApply
> apply)
>if (!fixupInfo)
>   return false;
>if (n == 0)
> - memset(fixupInfo, 0, sizeof(FixupInfo));
> + fixupInfo->count = 0;
> }
> ++fixupInfo->count;
>
> --
> 2.21.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 01/17] panfrost: Extend the panfrost_batch_add_bo() API to pass access flags

2019-09-22 Thread Alyssa Rosenzweig
> > Although actually I am not at all sure what this batch_add_bo is doing
> > at all?
> > 
> > I think this batch_add_bo should probably dropped altogether? This loop
> > is dealing with constant buffers; the shaders themselves were added
> 
> I'll double check. I couldn't find where BOs containing shader programs
> were added last time I looked.

Masking a real bug :o

It should probably happen in panfrost_patch_shader_state?

> > >  void panfrost_batch_add_fbo_bos(struct panfrost_batch *batch)
> > >  {
> > > +uint32_t flags = PAN_BO_ACCESS_SHARED | PAN_BO_ACCESS_WRITE |
> > > + PAN_BO_ACCESS_VERTEX_TILER |
> > > + PAN_BO_ACCESS_FRAGMENT;  
> > 
> > I think we can drop VERTEX_TILER here...? The buffers are written right
> > at the end of the FRAGMENT job, not touched before that.
> 
> What about the read done when drawing the wallpaper? I guess it's also
> only read by the fragment job, but I wasn't sure.

As I stated before, I thought we should be adding the BO for
wallpapering when we actually wallpaper, since that's a slow path. Not
wallpapering is the default and ideally what most apps should do.

> > If nothing else is broken, this should allow a nice perf boost with
> > pipelining, so the vertex/tiler from frame n+1 can run in parallel with
> > the fragment of frame n (rather than blocking on frame n finishing with
> > the FBOs).
> 
> Would require the kernel patches I posted earlier for that to
> happen ;-). Right now all jobs touching the same BO are serialized
> because of the implicit BO fences added by the kernel driver.

Sure~ Maybe this sort of bug was the reason you weren't seeing
improvement from those kernel patches?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/3] panfrost: Make sure a clear does not re-use a pre-existing batch

2019-09-22 Thread Alyssa Rosenzweig
> > To be clear, if we have a batch and do the following operations:
> > 
> > clear red
> > draw 1
> > clear green
> > draw 2
> > flush
> > 
> > All we should see is #2 on a green background, which this patch handles
> > by the second clear invalidating all the clears/draws that came before
> > it (provided there is no flush in between). 
> > 
> > I might just be tripped up by the "freeze" name. That really means throw
> > away / free here, I guess?
> 
> Nope. Freeze means "stop queuing new draws to this batch". I guess we
> could free the batch as well if the result of the previous draws/clear
> are really overwritten by this new clear, but that implies checking the
> new clear flags to make sure they target the same depth/stencil/color
> combination. On the other hand, I'm wondering if it's really the
> driver's job to try to optimize silly things the apps might do. I mean,
> the sequence you describe does not look like a wise thing to do since
> the "clear red+draw 1" end up being overwritten by "clear green + draw
> 2".

I'm quite confused how this patch works, then.

A few thoughts: if the app clears all buffers in the middle, then yes
it's silly and yes we may as well optimize it out. (Should that be a thing GL
drivers have to do? I mean, if the other drivers are too...)

If the sequence is more like:

clear all buffers
draw 1
clear color buffer (preserve depth stencil)
draw 2
flush

That second clear should really be done by drawing a full screen quad,
just like if we were wallpapering, except loading its colour from a
uniform instead of a texture.

Similarly, a depth-only clear mid-frame can be emulated by drawing a
full-screen quad with the gl_Position.zw components juryrigged to the
desired depth components, and disabling colour draws by setting the
colour mask to 0x0. That also means you can skip having any shader at
all (literally set the shader pointer to 0x0) so that's faster.

Finally, a stencil-only clear can occur similarly playing tricks with
the stencil test parameters.

I suspect u_blitter or mesa/st is capable of doing these sorts of tricks
on our behalf, but I have not researched it extensively.

In any case, for a partial clear mid-frame, we would much rather do:

clear all buffers
draw 1
draw fullscreen quad (disable Z/S writes)
draw 2
flush

than what it sounds like this patch does

clear all buffers
draw 1
flush

clear all buffers
wallpaper
draw 2
flush

Please do correct me if I've misunderstood the logic here.

> 
> > 
> > Provided that's the idea (and we're not somehow saving the original draw
> > 1), it's Reviewed-by A R 
> > 
> > On Fri, Sep 20, 2019 at 04:53:37PM +0200, Boris Brezillon wrote:
> > > glClear()s are expected to be the first thing GL apps do before drawing
> > > new things. If there's already an existing batch targetting the same
> > > FBO that has draws attached to it, we should make sure the new clear
> > > gets a new batch assigned to guaranteed that the FB content is actually
> > > cleared with the requested color/depth/stencil values.
> > > 
> > > We create a panfrost_get_fresh_batch_for_fbo() helper for that and
> > > call it from panfrost_clear().
> > > 
> > > Signed-off-by: Boris Brezillon 
> > > ---
> > >  src/gallium/drivers/panfrost/pan_context.c |  2 +-
> > >  src/gallium/drivers/panfrost/pan_job.c | 21 +
> > >  src/gallium/drivers/panfrost/pan_job.h |  3 +++
> > >  3 files changed, 25 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/src/gallium/drivers/panfrost/pan_context.c 
> > > b/src/gallium/drivers/panfrost/pan_context.c
> > > index ac01461a07fe..b2f2a9da7a51 100644
> > > --- a/src/gallium/drivers/panfrost/pan_context.c
> > > +++ b/src/gallium/drivers/panfrost/pan_context.c
> > > @@ -162,7 +162,7 @@ panfrost_clear(
> > >  double depth, unsigned stencil)
> > >  {
> > >  struct panfrost_context *ctx = pan_context(pipe);
> > > -struct panfrost_batch *batch = panfrost_get_batch_for_fbo(ctx);
> > > +struct panfrost_batch *batch = 
> > > panfrost_get_fresh_batch_for_fbo(ctx);
> > >  
> > >  panfrost_batch_add_fbo_bos(batch);
> > >  panfrost_batch_clear(batch, buffers, color, depth, stencil);
> > > diff --git a/src/gallium/drivers/panfrost/pan_job.c 
> > > b/src/gallium/drivers/panfrost/pan_job.c
> > > index d8330bc133a6..4ec2aa0565d7 100644
> > > --- a/src/gallium/drivers/panfrost/pan_job.c
> > > +++ b/src/gallium/drivers/panfrost/pan_job.c
> > > @@ -298,6 +298,27 @@ panfrost_get_batch_for_fbo(struct panfrost_context 
> > > *ctx)
> > >  return batch;
> > >  }
> > >  
> > > +struct panfrost_batch *
> > > +panfrost_get_fresh_batch_for_fbo(struct panfrost_context *ctx)
> > > +{
> > > +struct panfrost_batch *batch;
> > > +
> > > +batch = panfrost_get_batch(ctx, &ctx->pipe_framebuffer);
> > > +
> >

Re: [Mesa-dev] [PATCH v3 01/17] panfrost: Extend the panfrost_batch_add_bo() API to pass access flags

2019-09-22 Thread Boris Brezillon
+your collabora address

On Sun, 22 Sep 2019 08:31:40 -0400
Alyssa Rosenzweig  wrote:

> > > Although actually I am not at all sure what this batch_add_bo is doing
> > > at all?
> > > 
> > > I think this batch_add_bo should probably dropped altogether? This loop
> > > is dealing with constant buffers; the shaders themselves were added  
> > 
> > I'll double check. I couldn't find where BOs containing shader programs
> > were added last time I looked.  
> 
> Masking a real bug :o
> 
> It should probably happen in panfrost_patch_shader_state?

Ok, I'll add it there, I wasn't sure this function was called for all
shaders, but looking at the code a second time it seems to be the case.

> 
> > > >  void panfrost_batch_add_fbo_bos(struct panfrost_batch *batch)
> > > >  {
> > > > +uint32_t flags = PAN_BO_ACCESS_SHARED | PAN_BO_ACCESS_WRITE |
> > > > + PAN_BO_ACCESS_VERTEX_TILER |
> > > > + PAN_BO_ACCESS_FRAGMENT;
> > > 
> > > I think we can drop VERTEX_TILER here...? The buffers are written right
> > > at the end of the FRAGMENT job, not touched before that.  
> > 
> > What about the read done when drawing the wallpaper? I guess it's also
> > only read by the fragment job, but I wasn't sure.  
> 
> As I stated before, I thought we should be adding the BO for
> wallpapering when we actually wallpaper, since that's a slow path. Not
> wallpapering is the default and ideally what most apps should do.

Wallpapering happens too late (when we are flushing the batch) to have
an impact on the dep graph, but we can probably know that wallpapering
will be needed before that. My question remains though, are
vertex/tiler supposed to touch the texture BO they're reading from, or
should we only flag the BO for FRAGMENT use.

> 
> > > If nothing else is broken, this should allow a nice perf boost with
> > > pipelining, so the vertex/tiler from frame n+1 can run in parallel with
> > > the fragment of frame n (rather than blocking on frame n finishing with
> > > the FBOs).  
> > 
> > Would require the kernel patches I posted earlier for that to
> > happen ;-). Right now all jobs touching the same BO are serialized
> > because of the implicit BO fences added by the kernel driver.  
> 
> Sure~ Maybe this sort of bug was the reason you weren't seeing
> improvement from those kernel patches?

Maybe.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/3] panfrost: Make sure a clear does not re-use a pre-existing batch

2019-09-22 Thread Boris Brezillon
On Sun, 22 Sep 2019 08:38:30 -0400
Alyssa Rosenzweig  wrote:

> > > To be clear, if we have a batch and do the following operations:
> > > 
> > >   clear red
> > >   draw 1
> > >   clear green
> > >   draw 2
> > >   flush
> > > 
> > > All we should see is #2 on a green background, which this patch handles
> > > by the second clear invalidating all the clears/draws that came before
> > > it (provided there is no flush in between). 
> > > 
> > > I might just be tripped up by the "freeze" name. That really means throw
> > > away / free here, I guess?  
> > 
> > Nope. Freeze means "stop queuing new draws to this batch". I guess we
> > could free the batch as well if the result of the previous draws/clear
> > are really overwritten by this new clear, but that implies checking the
> > new clear flags to make sure they target the same depth/stencil/color
> > combination. On the other hand, I'm wondering if it's really the
> > driver's job to try to optimize silly things the apps might do. I mean,
> > the sequence you describe does not look like a wise thing to do since
> > the "clear red+draw 1" end up being overwritten by "clear green + draw
> > 2".  
> 
> I'm quite confused how this patch works, then.
> 
> A few thoughts: if the app clears all buffers in the middle, then yes
> it's silly and yes we may as well optimize it out. (Should that be a thing GL
> drivers have to do? I mean, if the other drivers are too...)
> 
> If the sequence is more like:
> 
>   clear all buffers
>   draw 1
>   clear color buffer (preserve depth stencil)
>   draw 2
>   flush
> 
> That second clear should really be done by drawing a full screen quad,
> just like if we were wallpapering, except loading its colour from a
> uniform instead of a texture.
> 
> Similarly, a depth-only clear mid-frame can be emulated by drawing a
> full-screen quad with the gl_Position.zw components juryrigged to the
> desired depth components, and disabling colour draws by setting the
> colour mask to 0x0. That also means you can skip having any shader at
> all (literally set the shader pointer to 0x0) so that's faster.
> 
> Finally, a stencil-only clear can occur similarly playing tricks with
> the stencil test parameters.
> 
> I suspect u_blitter or mesa/st is capable of doing these sorts of tricks
> on our behalf, but I have not researched it extensively.

AFAIU, mesa/st does not transform the clear into a quad-draw for
depth/stencil only clears, it only does that for the color(s)-masked
case. That's certainly something we can do in Panfrost if we want to
avoid creating a new batch in such situations though.

> 
> In any case, for a partial clear mid-frame, we would much rather do:
> 
>   clear all buffers
>   draw 1
>   draw fullscreen quad (disable Z/S writes)
>   draw 2
>   flush
> 
> than what it sounds like this patch does
> 
>   clear all buffers
>   draw 1
>   flush
> 
>   clear all buffers
>   wallpaper
>   draw 2
>   flush
> 
> Please do correct me if I've misunderstood the logic here.


There's no flush introduced by this patch, it just adds a dep between
batch 2 and 1:

batch1: clear all buffers
draw 1

batch2: clear all buffers
wallpaper
draw 2

flush (actually flushes batch 1 and 2)

with batch2 --depends-on--> batch1
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 01/17] panfrost: Extend the panfrost_batch_add_bo() API to pass access flags

2019-09-22 Thread Alyssa Rosenzweig
> +your collabora address

Thank you

> > > > I think this batch_add_bo should probably dropped altogether? This loop
> > > > is dealing with constant buffers; the shaders themselves were added  
> > > 
> > > I'll double check. I couldn't find where BOs containing shader programs
> > > were added last time I looked.  
> > 
> > Masking a real bug :o
> > 
> > It should probably happen in panfrost_patch_shader_state?
> 
> Ok, I'll add it there, I wasn't sure this function was called for all
> shaders, but looking at the code a second time it seems to be the case.

I think so as well, yeah.

> > As I stated before, I thought we should be adding the BO for
> > wallpapering when we actually wallpaper, since that's a slow path. Not
> > wallpapering is the default and ideally what most apps should do.
> 
> Wallpapering happens too late (when we are flushing the batch) to have
> an impact on the dep graph, but we can probably know that wallpapering
> will be needed before that. My question remains though, are
> vertex/tiler supposed to touch the texture BO they're reading from, or
> should we only flag the BO for FRAGMENT use.

Vertex/tiler should not touch the texture BO, unless you're texturing
from the vertex shader (which we're not for wallpapering).


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 01/17] panfrost: Extend the panfrost_batch_add_bo() API to pass access flags

2019-09-22 Thread Boris Brezillon
On Sun, 22 Sep 2019 09:26:45 -0400
Alyssa Rosenzweig  wrote:

> > +your collabora address  
> 
> Thank you
> 
> > > > > I think this batch_add_bo should probably dropped altogether? This 
> > > > > loop
> > > > > is dealing with constant buffers; the shaders themselves were added   
> > > > >  
> > > > 
> > > > I'll double check. I couldn't find where BOs containing shader programs
> > > > were added last time I looked.
> > > 
> > > Masking a real bug :o
> > > 
> > > It should probably happen in panfrost_patch_shader_state?  
> > 
> > Ok, I'll add it there, I wasn't sure this function was called for all
> > shaders, but looking at the code a second time it seems to be the case.  
> 
> I think so as well, yeah.
> 
> > > As I stated before, I thought we should be adding the BO for
> > > wallpapering when we actually wallpaper, since that's a slow path. Not
> > > wallpapering is the default and ideally what most apps should do.  
> > 
> > Wallpapering happens too late (when we are flushing the batch) to have
> > an impact on the dep graph, but we can probably know that wallpapering
> > will be needed before that. My question remains though, are
> > vertex/tiler supposed to touch the texture BO they're reading from, or
> > should we only flag the BO for FRAGMENT use.  
> 
> Vertex/tiler should not touch the texture BO, unless you're texturing
> from the vertex shader (which we're not for wallpapering).

Okay, then adding the READ flag isn't really needed: a WRITE is more
constraining than a READ, and the FRAGMENT job already writes the FBO
BOs.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/3] panfrost: Make sure a clear does not re-use a pre-existing batch

2019-09-22 Thread Boris Brezillon
On Sun, 22 Sep 2019 15:24:10 +0200
Boris Brezillon  wrote:

> On Sun, 22 Sep 2019 08:38:30 -0400
> Alyssa Rosenzweig  wrote:
> 
> > > > To be clear, if we have a batch and do the following operations:
> > > > 
> > > > clear red
> > > > draw 1
> > > > clear green
> > > > draw 2
> > > > flush
> > > > 
> > > > All we should see is #2 on a green background, which this patch handles
> > > > by the second clear invalidating all the clears/draws that came before
> > > > it (provided there is no flush in between). 
> > > > 
> > > > I might just be tripped up by the "freeze" name. That really means throw
> > > > away / free here, I guess?
> > > 
> > > Nope. Freeze means "stop queuing new draws to this batch". I guess we
> > > could free the batch as well if the result of the previous draws/clear
> > > are really overwritten by this new clear, but that implies checking the
> > > new clear flags to make sure they target the same depth/stencil/color
> > > combination. On the other hand, I'm wondering if it's really the
> > > driver's job to try to optimize silly things the apps might do. I mean,
> > > the sequence you describe does not look like a wise thing to do since
> > > the "clear red+draw 1" end up being overwritten by "clear green + draw
> > > 2".
> > 
> > I'm quite confused how this patch works, then.
> > 
> > A few thoughts: if the app clears all buffers in the middle, then yes
> > it's silly and yes we may as well optimize it out. (Should that be a thing 
> > GL
> > drivers have to do? I mean, if the other drivers are too...)
> > 
> > If the sequence is more like:
> > 
> > clear all buffers
> > draw 1
> > clear color buffer (preserve depth stencil)
> > draw 2
> > flush
> > 
> > That second clear should really be done by drawing a full screen quad,
> > just like if we were wallpapering, except loading its colour from a
> > uniform instead of a texture.
> > 
> > Similarly, a depth-only clear mid-frame can be emulated by drawing a
> > full-screen quad with the gl_Position.zw components juryrigged to the
> > desired depth components, and disabling colour draws by setting the
> > colour mask to 0x0. That also means you can skip having any shader at
> > all (literally set the shader pointer to 0x0) so that's faster.
> > 
> > Finally, a stencil-only clear can occur similarly playing tricks with
> > the stencil test parameters.
> > 
> > I suspect u_blitter or mesa/st is capable of doing these sorts of tricks
> > on our behalf, but I have not researched it extensively.  
> 
> AFAIU, mesa/st does not transform the clear into a quad-draw for
> depth/stencil only clears, it only does that for the color(s)-masked
> case. That's certainly something we can do in Panfrost if we want to
> avoid creating a new batch in such situations though.

One more thing: optimization of the above scenario is probably
something we'll want to have at some point, but I think the current
patch is worth applying in the meantime. All this patch does is
enforcing ordering of clears/draws to make sure the end result matches
users expectation.

> 
> > 
> > In any case, for a partial clear mid-frame, we would much rather do:
> > 
> > clear all buffers
> > draw 1
> > draw fullscreen quad (disable Z/S writes)
> > draw 2
> > flush
> > 
> > than what it sounds like this patch does
> > 
> > clear all buffers
> > draw 1
> > flush
> > 
> > clear all buffers
> > wallpaper
> > draw 2
> > flush
> > 
> > Please do correct me if I've misunderstood the logic here.  
> 
> 
> There's no flush introduced by this patch, it just adds a dep between
> batch 2 and 1:
> 
> batch1:   clear all buffers
>   draw 1
> 
> batch2: clear all buffers
>   wallpaper
>   draw 2
> 
>   flush (actually flushes batch 1 and 2)
> 
> with batch2 --depends-on--> batch1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev