Re: [Mesa-dev] [PATCH 2/4] nv50ir: fix unnecessary parentheses warning
On Sat, 21 Sep 2019 at 04:27, Karol Herbst wrote: > Signed-off-by: Karol Herbst > Reviewed-by: Rhys Kidd > --- > src/gallium/drivers/nouveau/codegen/nv50_ir_util.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h > b/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h > index 307c23d5e03..b1766f48205 100644 > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h > @@ -145,7 +145,7 @@ public: > #define DLLIST_EMPTY(__list) ((__list)->next == (__list)) > > #define DLLIST_FOR_EACH(list, it) \ > - for (DLList::Iterator (it) = (list)->iterator(); !(it).end(); > (it).next()) > + for (DLList::Iterator it = (list)->iterator(); !(it).end(); > (it).next()) > > class DLList > { > -- > 2.21.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] android: aco: add support for libmesa_aco
Android building rules are added in src/amd/Android.compiler.mk libmesa_aco static library is built conditionally to radeonsi as done for vulkan.radv module This will prevent Android build errors for non x86 systems filter-out compiler/aco_instruction_selection_setup.cpp source, as already included by compiler/aco_instruction_selection.cpp and would cause several multiple definition linker errors NOTE: libLLVM requires AMDGPU Disassembler to build radv with aco Fixes: 93c8ebf ("aco: Initial commit of independent AMD compiler") Fixes: a70a998 ("radv/aco: Setup alternate path in RADV to support the experimental ACO compiler") Signed-off-by: Mauro Rossi --- src/amd/Android.compiler.mk | 93 + src/amd/Android.mk | 1 + src/amd/Makefile.sources| 32 + src/amd/vulkan/Android.mk | 4 +- 4 files changed, 129 insertions(+), 1 deletion(-) create mode 100644 src/amd/Android.compiler.mk diff --git a/src/amd/Android.compiler.mk b/src/amd/Android.compiler.mk new file mode 100644 index 00..a62c93d9d3 --- /dev/null +++ b/src/amd/Android.compiler.mk @@ -0,0 +1,93 @@ +# Copyright © 2018 Valve Corporation +# Copyright © 2019 Mauro Rossi issor.or...@gmail.com + +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the "Software"), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice (including the next +# paragraph) shall be included in all copies or substantial portions of the +# Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS +# IN THE SOFTWARE. + +LOCAL_PATH := $(call my-dir) + +include $(LOCAL_PATH)/Makefile.sources + +# --- +# Build libmesa_aco +# --- + +include $(CLEAR_VARS) + +LOCAL_MODULE := libmesa_aco + +# filter-out compiler/aco_instruction_selection_setup.cpp because +# it's already included by compiler/aco_instruction_selection.cpp +LOCAL_SRC_FILES := \ + $(filter-out compiler/aco_instruction_selection_setup.cpp, $(ACO_FILES)) + +LOCAL_CFLAGS += -DFORCE_BUILD_AMDGPU # instructs LLVM to declare LLVMInitializeAMDGPU* functions + +LOCAL_CPPFLAGS += -Wall -std=c++14 + +# generate sources +LOCAL_MODULE_CLASS := STATIC_LIBRARIES +intermediates := $(call local-generated-sources-dir) +LOCAL_GENERATED_SOURCES += $(addprefix $(intermediates)/, $(ACO_GENERATED_FILES)) + +ACO_OPCODES_H_SCRIPT := $(MESA_TOP)/src/amd/compiler/aco_opcodes_h.py +ACO_OPCODES_CPP_SCRIPT := $(MESA_TOP)/src/amd/compiler/aco_opcodes_cpp.py +ACO_BUILDER_H_SCRIPT := $(MESA_TOP)/src/amd/compiler/aco_builder_h.py + +ACO_DEPS := $(MESA_TOP)/src/amd/compiler/aco_opcodes.py + +$(intermediates)/compiler/aco_opcodes.h: $(ACO_OPCODES_H_SCRIPT) $(ACO_DEPS) + @mkdir -p $(dir $@) + @echo "Gen Header: $(PRIVATE_MODULE) <= $(notdir $(@))" + $(hide) $(MESA_PYTHON2) $(ACO_OPCODES_H_SCRIPT) > $@ || ($(RM) $@; false) + +$(intermediates)/compiler/aco_opcodes.cpp: $(ACO_OPCODES_CPP_SCRIPT) $(ACO_DEPS) + @mkdir -p $(dir $@) + @echo "Gen Header: $(PRIVATE_MODULE) <= $(notdir $(@))" + $(hide) $(MESA_PYTHON2) $(ACO_OPCODES_CPP_SCRIPT) > $@ || ($(RM) $@; false) + +$(intermediates)/compiler/aco_builder.h: $(ACO_BUILDER_H_SCRIPT) $(ACO_DEPS) + @mkdir -p $(dir $@) + @echo "Gen Header: $(PRIVATE_MODULE) <= $(notdir $(@))" + $(hide) $(MESA_PYTHON2) $(ACO_BUILDER_H_SCRIPT) > $@ || ($(RM) $@; false) + +LOCAL_C_INCLUDES := \ + $(MESA_TOP)/src/amd \ + $(MESA_TOP)/src/amd/common \ + $(MESA_TOP)/src/amd/compiler \ + $(MESA_TOP)/src/compiler/nir \ + $(MESA_TOP)/src/mapi \ + $(MESA_TOP)/src/mesa \ + $(intermediates)/compiler + +LOCAL_EXPORT_C_INCLUDE_DIRS := \ + $(MESA_TOP)/src/amd/compiler \ + $(intermediates)/compiler + +LOCAL_SHARED_LIBRARIES := \ + libdrm_amdgpu + +LOCAL_STATIC_LIBRARIES := \ + libmesa_amd_common \ + libmesa_nir + +$(call mesa-build-with-llvm) + +include $(MESA_COMMON_MK) +include $(BUILD_STATIC_LIBRARY) diff --git a/src/amd/Android.mk b/src/amd/Android.mk index e40e7da01b..c9dbeafde1 100644 --- a/src/amd/Android.mk +++ b/src/amd/Android.mk @@ -28,5 +28,6 @@ include $(LOCAL_PATH)/
[Mesa-dev] [PATCH 2/3] android: compiler/nir: build nir_divergence_analysis.c
Prerequisite to avoid following radv linking error happening with aco FAILED: out/target/product/x86_64/obj_x86/SHARED_LIBRARIES/vulkan.radv_intermediates/LINKED/vulkan.radv.so ... external/mesa/src/amd/compiler/aco_instruction_selection_setup.cpp:178: error: undefined reference to 'nir_divergence_analysis' clang.real: error: linker command failed with exit code 1 (use -v to see invocation) Fixes: df86c5f ("nir: add divergence analysis pass.") Signed-off-by: Mauro Rossi --- src/compiler/Makefile.sources | 1 + 1 file changed, 1 insertion(+) diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources index c4d2c2be7c..bc49e00525 100644 --- a/src/compiler/Makefile.sources +++ b/src/compiler/Makefile.sources @@ -215,6 +215,7 @@ NIR_FILES = \ nir/nir_control_flow_private.h \ nir/nir_deref.c \ nir/nir_deref.h \ + nir/nir_divergence_analysis.c \ nir/nir_dominance.c \ nir/nir_format_convert.h \ nir/nir_from_ssa.c \ -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/4] nv50ir/nir: comparison of integer expressions of different signedness warning
On Sat, 21 Sep 2019 at 04:27, Karol Herbst wrote: > Signed-off-by: Karol Herbst > Reviewed-by: Rhys Kidd > --- > src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp > b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp > index 4e86ab8f8cc..95b60d2c7d0 100644 > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp > @@ -1957,7 +1957,7 @@ Converter::visit(nir_intrinsic_instr *insn) > } > case Program::TYPE_GEOMETRY: > case Program::TYPE_VERTEX: { > -if (info->io.genUserClip > 0 && idx == clipVertexOutput) { > +if (info->io.genUserClip > 0 && idx == > (uint32_t)clipVertexOutput) { > mkMov(clipVtx[i], src); > src = clipVtx[i]; > } > -- > 2.21.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] panfrost: Make sure a clear does not re-use a pre-existing batch
On Fri, 20 Sep 2019 15:45:33 -0400 Alyssa Rosenzweig wrote: > To be clear, if we have a batch and do the following operations: > > clear red > draw 1 > clear green > draw 2 > flush > > All we should see is #2 on a green background, which this patch handles > by the second clear invalidating all the clears/draws that came before > it (provided there is no flush in between). > > I might just be tripped up by the "freeze" name. That really means throw > away / free here, I guess? Nope. Freeze means "stop queuing new draws to this batch". I guess we could free the batch as well if the result of the previous draws/clear are really overwritten by this new clear, but that implies checking the new clear flags to make sure they target the same depth/stencil/color combination. On the other hand, I'm wondering if it's really the driver's job to try to optimize silly things the apps might do. I mean, the sequence you describe does not look like a wise thing to do since the "clear red+draw 1" end up being overwritten by "clear green + draw 2". > > Provided that's the idea (and we're not somehow saving the original draw > 1), it's Reviewed-by A R > > On Fri, Sep 20, 2019 at 04:53:37PM +0200, Boris Brezillon wrote: > > glClear()s are expected to be the first thing GL apps do before drawing > > new things. If there's already an existing batch targetting the same > > FBO that has draws attached to it, we should make sure the new clear > > gets a new batch assigned to guaranteed that the FB content is actually > > cleared with the requested color/depth/stencil values. > > > > We create a panfrost_get_fresh_batch_for_fbo() helper for that and > > call it from panfrost_clear(). > > > > Signed-off-by: Boris Brezillon > > --- > > src/gallium/drivers/panfrost/pan_context.c | 2 +- > > src/gallium/drivers/panfrost/pan_job.c | 21 + > > src/gallium/drivers/panfrost/pan_job.h | 3 +++ > > 3 files changed, 25 insertions(+), 1 deletion(-) > > > > diff --git a/src/gallium/drivers/panfrost/pan_context.c > > b/src/gallium/drivers/panfrost/pan_context.c > > index ac01461a07fe..b2f2a9da7a51 100644 > > --- a/src/gallium/drivers/panfrost/pan_context.c > > +++ b/src/gallium/drivers/panfrost/pan_context.c > > @@ -162,7 +162,7 @@ panfrost_clear( > > double depth, unsigned stencil) > > { > > struct panfrost_context *ctx = pan_context(pipe); > > -struct panfrost_batch *batch = panfrost_get_batch_for_fbo(ctx); > > +struct panfrost_batch *batch = > > panfrost_get_fresh_batch_for_fbo(ctx); > > > > panfrost_batch_add_fbo_bos(batch); > > panfrost_batch_clear(batch, buffers, color, depth, stencil); > > diff --git a/src/gallium/drivers/panfrost/pan_job.c > > b/src/gallium/drivers/panfrost/pan_job.c > > index d8330bc133a6..4ec2aa0565d7 100644 > > --- a/src/gallium/drivers/panfrost/pan_job.c > > +++ b/src/gallium/drivers/panfrost/pan_job.c > > @@ -298,6 +298,27 @@ panfrost_get_batch_for_fbo(struct panfrost_context > > *ctx) > > return batch; > > } > > > > +struct panfrost_batch * > > +panfrost_get_fresh_batch_for_fbo(struct panfrost_context *ctx) > > +{ > > +struct panfrost_batch *batch; > > + > > +batch = panfrost_get_batch(ctx, &ctx->pipe_framebuffer); > > + > > +/* The batch has no draw/clear queued, let's return it directly. > > + * Note that it's perfectly fine to re-use a batch with an > > + * existing clear, we'll just update it with the new clear request. > > + */ > > +if (!batch->last_job.gpu) > > +return batch; > > + > > +/* Otherwise, we need to freeze the existing one and instantiate a > > new > > + * one. > > + */ > > +panfrost_freeze_batch(batch); > > +return panfrost_get_batch(ctx, &ctx->pipe_framebuffer); > > +} > > + > > static bool > > panfrost_batch_fence_is_signaled(struct panfrost_batch_fence *fence) > > { > > diff --git a/src/gallium/drivers/panfrost/pan_job.h > > b/src/gallium/drivers/panfrost/pan_job.h > > index e1b1f56a2e64..0bd78bba267a 100644 > > --- a/src/gallium/drivers/panfrost/pan_job.h > > +++ b/src/gallium/drivers/panfrost/pan_job.h > > @@ -172,6 +172,9 @@ panfrost_batch_fence_reference(struct > > panfrost_batch_fence *batch); > > struct panfrost_batch * > > panfrost_get_batch_for_fbo(struct panfrost_context *ctx); > > > > +struct panfrost_batch * > > +panfrost_get_fresh_batch_for_fbo(struct panfrost_context *ctx); > > + > > void > > panfrost_batch_init(struct panfrost_context *ctx); > > > > -- > > 2.21.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 01/17] panfrost: Extend the panfrost_batch_add_bo() API to pass access flags
On Fri, 20 Sep 2019 16:53:49 -0400 Alyssa Rosenzweig wrote: > > @@ -1121,7 +1134,11 @@ panfrost_emit_for_draw(struct panfrost_context *ctx, > > bool with_vertex_data) > > > > struct panfrost_shader_state *ss = > > &all->variants[all->active_variant]; > > > > -panfrost_batch_add_bo(batch, ss->bo); > > +panfrost_batch_add_bo(batch, ss->bo, > > + PAN_BO_ACCESS_PRIVATE | > > + PAN_BO_ACCESS_READ | > > > + PAN_BO_ACCESS_VERTEX_TILER | > > + PAN_BO_ACCESS_FRAGMENT); > > I believe this should be just the access for the stage `i` > > Although actually I am not at all sure what this batch_add_bo is doing > at all? > > I think this batch_add_bo should probably dropped altogether? This loop > is dealing with constant buffers; the shaders themselves were added I'll double check. I couldn't find where BOs containing shader programs were added last time I looked. > > > void panfrost_batch_add_fbo_bos(struct panfrost_batch *batch) > > { > > +uint32_t flags = PAN_BO_ACCESS_SHARED | PAN_BO_ACCESS_WRITE | > > + PAN_BO_ACCESS_VERTEX_TILER | > > + PAN_BO_ACCESS_FRAGMENT; > > I think we can drop VERTEX_TILER here...? The buffers are written right > at the end of the FRAGMENT job, not touched before that. What about the read done when drawing the wallpaper? I guess it's also only read by the fragment job, but I wasn't sure. > > If nothing else is broken, this should allow a nice perf boost with > pipelining, so the vertex/tiler from frame n+1 can run in parallel with > the fragment of frame n (rather than blocking on frame n finishing with > the FBOs). Would require the kernel patches I posted earlier for that to happen ;-). Right now all jobs touching the same BO are serialized because of the implicit BO fences added by the kernel driver. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] android: aco: fix undefined template 'std::__1::array' build errors
Fixes a few building errors similar to the following: In file included from external/mesa/src/amd/compiler/aco_instruction_selection.cpp:26: In file included from external/libcxx/include/algorithm:639: external/libcxx/include/utility:321:9: error: implicit instantiation of undefined template 'std::__1::array' _T2 second; ^ Fixes: 93c8ebf ("aco: Initial commit of independent AMD compiler") Signed-off-by: Mauro Rossi --- src/amd/compiler/aco_instruction_selection.cpp | 1 + src/amd/compiler/aco_instruction_selection_setup.cpp | 1 + src/amd/compiler/aco_print_asm.cpp | 2 +- src/amd/compiler/aco_register_allocation.cpp | 1 + src/amd/compiler/aco_validate.cpp| 1 + 5 files changed, 5 insertions(+), 1 deletion(-) diff --git a/src/amd/compiler/aco_instruction_selection.cpp b/src/amd/compiler/aco_instruction_selection.cpp index d52043f3c0..164ac8ed5d 100644 --- a/src/amd/compiler/aco_instruction_selection.cpp +++ b/src/amd/compiler/aco_instruction_selection.cpp @@ -24,6 +24,7 @@ */ #include +#include #include #include "aco_ir.h" diff --git a/src/amd/compiler/aco_instruction_selection_setup.cpp b/src/amd/compiler/aco_instruction_selection_setup.cpp index 6c4c408e65..e8522c3d12 100644 --- a/src/amd/compiler/aco_instruction_selection_setup.cpp +++ b/src/amd/compiler/aco_instruction_selection_setup.cpp @@ -22,6 +22,7 @@ * */ +#include #include #include "aco_ir.h" #include "nir.h" diff --git a/src/amd/compiler/aco_print_asm.cpp b/src/amd/compiler/aco_print_asm.cpp index 31079aa1c4..d3f4c3cb40 100644 --- a/src/amd/compiler/aco_print_asm.cpp +++ b/src/amd/compiler/aco_print_asm.cpp @@ -1,4 +1,4 @@ - +#include #include #include "aco_ir.h" #include "llvm-c/Disassembler.h" diff --git a/src/amd/compiler/aco_register_allocation.cpp b/src/amd/compiler/aco_register_allocation.cpp index d55f1febc6..47ea932f11 100644 --- a/src/amd/compiler/aco_register_allocation.cpp +++ b/src/amd/compiler/aco_register_allocation.cpp @@ -27,6 +27,7 @@ */ #include +#include #include #include #include diff --git a/src/amd/compiler/aco_validate.cpp b/src/amd/compiler/aco_validate.cpp index 0988d66df3..9919d0a585 100644 --- a/src/amd/compiler/aco_validate.cpp +++ b/src/amd/compiler/aco_validate.cpp @@ -24,6 +24,7 @@ #include "aco_ir.h" +#include #include namespace aco { -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/4] nv50ir: fix memset on non trivial types warning
On Sat, 21 Sep 2019 at 04:27, Karol Herbst wrote: > Signed-off-by: Karol Herbst > Reviewed-by: Rhys Kidd > --- > src/gallium/drivers/nouveau/codegen/nv50_ir.cpp| 4 +--- > src/gallium/drivers/nouveau/codegen/nv50_ir.h | 2 +- > src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp | 2 +- > 3 files changed, 3 insertions(+), 5 deletions(-) > > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp > b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp > index a181a13a3b1..45ee95bb103 100644 > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp > @@ -903,10 +903,8 @@ Instruction::isCommutationLegal(const Instruction *i) > const > } > > TexInstruction::TexInstruction(Function *fn, operation op) > - : Instruction(fn, op, TYPE_F32) > + : Instruction(fn, op, TYPE_F32), tex() > { > - memset(&tex, 0, sizeof(tex)); > - > tex.rIndirectSrc = -1; > tex.sIndirectSrc = -1; > > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h > b/src/gallium/drivers/nouveau/codegen/nv50_ir.h > index b19751ab372..5163e1a7ec2 100644 > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h > @@ -957,7 +957,7 @@ public: > class Target > { > public: > - Target(TexTarget targ = TEX_TARGET_2D) : target(targ) { } > + Target(TexTarget targ = TEX_TARGET_1D) : target(targ) { } > >const char *getName() const { return descTable[target].name; } >unsigned int getArgCount() const { return descTable[target].argc; } > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp > b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp > index 5c6d0570ae2..609e7b89290 100644 > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp > @@ -455,7 +455,7 @@ CodeEmitter::addInterp(int ipa, int reg, FixupApply > apply) >if (!fixupInfo) > return false; >if (n == 0) > - memset(fixupInfo, 0, sizeof(FixupInfo)); > + fixupInfo->count = 0; > } > ++fixupInfo->count; > > -- > 2.21.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 01/17] panfrost: Extend the panfrost_batch_add_bo() API to pass access flags
> > Although actually I am not at all sure what this batch_add_bo is doing > > at all? > > > > I think this batch_add_bo should probably dropped altogether? This loop > > is dealing with constant buffers; the shaders themselves were added > > I'll double check. I couldn't find where BOs containing shader programs > were added last time I looked. Masking a real bug :o It should probably happen in panfrost_patch_shader_state? > > > void panfrost_batch_add_fbo_bos(struct panfrost_batch *batch) > > > { > > > +uint32_t flags = PAN_BO_ACCESS_SHARED | PAN_BO_ACCESS_WRITE | > > > + PAN_BO_ACCESS_VERTEX_TILER | > > > + PAN_BO_ACCESS_FRAGMENT; > > > > I think we can drop VERTEX_TILER here...? The buffers are written right > > at the end of the FRAGMENT job, not touched before that. > > What about the read done when drawing the wallpaper? I guess it's also > only read by the fragment job, but I wasn't sure. As I stated before, I thought we should be adding the BO for wallpapering when we actually wallpaper, since that's a slow path. Not wallpapering is the default and ideally what most apps should do. > > If nothing else is broken, this should allow a nice perf boost with > > pipelining, so the vertex/tiler from frame n+1 can run in parallel with > > the fragment of frame n (rather than blocking on frame n finishing with > > the FBOs). > > Would require the kernel patches I posted earlier for that to > happen ;-). Right now all jobs touching the same BO are serialized > because of the implicit BO fences added by the kernel driver. Sure~ Maybe this sort of bug was the reason you weren't seeing improvement from those kernel patches? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] panfrost: Make sure a clear does not re-use a pre-existing batch
> > To be clear, if we have a batch and do the following operations: > > > > clear red > > draw 1 > > clear green > > draw 2 > > flush > > > > All we should see is #2 on a green background, which this patch handles > > by the second clear invalidating all the clears/draws that came before > > it (provided there is no flush in between). > > > > I might just be tripped up by the "freeze" name. That really means throw > > away / free here, I guess? > > Nope. Freeze means "stop queuing new draws to this batch". I guess we > could free the batch as well if the result of the previous draws/clear > are really overwritten by this new clear, but that implies checking the > new clear flags to make sure they target the same depth/stencil/color > combination. On the other hand, I'm wondering if it's really the > driver's job to try to optimize silly things the apps might do. I mean, > the sequence you describe does not look like a wise thing to do since > the "clear red+draw 1" end up being overwritten by "clear green + draw > 2". I'm quite confused how this patch works, then. A few thoughts: if the app clears all buffers in the middle, then yes it's silly and yes we may as well optimize it out. (Should that be a thing GL drivers have to do? I mean, if the other drivers are too...) If the sequence is more like: clear all buffers draw 1 clear color buffer (preserve depth stencil) draw 2 flush That second clear should really be done by drawing a full screen quad, just like if we were wallpapering, except loading its colour from a uniform instead of a texture. Similarly, a depth-only clear mid-frame can be emulated by drawing a full-screen quad with the gl_Position.zw components juryrigged to the desired depth components, and disabling colour draws by setting the colour mask to 0x0. That also means you can skip having any shader at all (literally set the shader pointer to 0x0) so that's faster. Finally, a stencil-only clear can occur similarly playing tricks with the stencil test parameters. I suspect u_blitter or mesa/st is capable of doing these sorts of tricks on our behalf, but I have not researched it extensively. In any case, for a partial clear mid-frame, we would much rather do: clear all buffers draw 1 draw fullscreen quad (disable Z/S writes) draw 2 flush than what it sounds like this patch does clear all buffers draw 1 flush clear all buffers wallpaper draw 2 flush Please do correct me if I've misunderstood the logic here. > > > > > Provided that's the idea (and we're not somehow saving the original draw > > 1), it's Reviewed-by A R > > > > On Fri, Sep 20, 2019 at 04:53:37PM +0200, Boris Brezillon wrote: > > > glClear()s are expected to be the first thing GL apps do before drawing > > > new things. If there's already an existing batch targetting the same > > > FBO that has draws attached to it, we should make sure the new clear > > > gets a new batch assigned to guaranteed that the FB content is actually > > > cleared with the requested color/depth/stencil values. > > > > > > We create a panfrost_get_fresh_batch_for_fbo() helper for that and > > > call it from panfrost_clear(). > > > > > > Signed-off-by: Boris Brezillon > > > --- > > > src/gallium/drivers/panfrost/pan_context.c | 2 +- > > > src/gallium/drivers/panfrost/pan_job.c | 21 + > > > src/gallium/drivers/panfrost/pan_job.h | 3 +++ > > > 3 files changed, 25 insertions(+), 1 deletion(-) > > > > > > diff --git a/src/gallium/drivers/panfrost/pan_context.c > > > b/src/gallium/drivers/panfrost/pan_context.c > > > index ac01461a07fe..b2f2a9da7a51 100644 > > > --- a/src/gallium/drivers/panfrost/pan_context.c > > > +++ b/src/gallium/drivers/panfrost/pan_context.c > > > @@ -162,7 +162,7 @@ panfrost_clear( > > > double depth, unsigned stencil) > > > { > > > struct panfrost_context *ctx = pan_context(pipe); > > > -struct panfrost_batch *batch = panfrost_get_batch_for_fbo(ctx); > > > +struct panfrost_batch *batch = > > > panfrost_get_fresh_batch_for_fbo(ctx); > > > > > > panfrost_batch_add_fbo_bos(batch); > > > panfrost_batch_clear(batch, buffers, color, depth, stencil); > > > diff --git a/src/gallium/drivers/panfrost/pan_job.c > > > b/src/gallium/drivers/panfrost/pan_job.c > > > index d8330bc133a6..4ec2aa0565d7 100644 > > > --- a/src/gallium/drivers/panfrost/pan_job.c > > > +++ b/src/gallium/drivers/panfrost/pan_job.c > > > @@ -298,6 +298,27 @@ panfrost_get_batch_for_fbo(struct panfrost_context > > > *ctx) > > > return batch; > > > } > > > > > > +struct panfrost_batch * > > > +panfrost_get_fresh_batch_for_fbo(struct panfrost_context *ctx) > > > +{ > > > +struct panfrost_batch *batch; > > > + > > > +batch = panfrost_get_batch(ctx, &ctx->pipe_framebuffer); > > > + > >
Re: [Mesa-dev] [PATCH v3 01/17] panfrost: Extend the panfrost_batch_add_bo() API to pass access flags
+your collabora address On Sun, 22 Sep 2019 08:31:40 -0400 Alyssa Rosenzweig wrote: > > > Although actually I am not at all sure what this batch_add_bo is doing > > > at all? > > > > > > I think this batch_add_bo should probably dropped altogether? This loop > > > is dealing with constant buffers; the shaders themselves were added > > > > I'll double check. I couldn't find where BOs containing shader programs > > were added last time I looked. > > Masking a real bug :o > > It should probably happen in panfrost_patch_shader_state? Ok, I'll add it there, I wasn't sure this function was called for all shaders, but looking at the code a second time it seems to be the case. > > > > > void panfrost_batch_add_fbo_bos(struct panfrost_batch *batch) > > > > { > > > > +uint32_t flags = PAN_BO_ACCESS_SHARED | PAN_BO_ACCESS_WRITE | > > > > + PAN_BO_ACCESS_VERTEX_TILER | > > > > + PAN_BO_ACCESS_FRAGMENT; > > > > > > I think we can drop VERTEX_TILER here...? The buffers are written right > > > at the end of the FRAGMENT job, not touched before that. > > > > What about the read done when drawing the wallpaper? I guess it's also > > only read by the fragment job, but I wasn't sure. > > As I stated before, I thought we should be adding the BO for > wallpapering when we actually wallpaper, since that's a slow path. Not > wallpapering is the default and ideally what most apps should do. Wallpapering happens too late (when we are flushing the batch) to have an impact on the dep graph, but we can probably know that wallpapering will be needed before that. My question remains though, are vertex/tiler supposed to touch the texture BO they're reading from, or should we only flag the BO for FRAGMENT use. > > > > If nothing else is broken, this should allow a nice perf boost with > > > pipelining, so the vertex/tiler from frame n+1 can run in parallel with > > > the fragment of frame n (rather than blocking on frame n finishing with > > > the FBOs). > > > > Would require the kernel patches I posted earlier for that to > > happen ;-). Right now all jobs touching the same BO are serialized > > because of the implicit BO fences added by the kernel driver. > > Sure~ Maybe this sort of bug was the reason you weren't seeing > improvement from those kernel patches? Maybe. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] panfrost: Make sure a clear does not re-use a pre-existing batch
On Sun, 22 Sep 2019 08:38:30 -0400 Alyssa Rosenzweig wrote: > > > To be clear, if we have a batch and do the following operations: > > > > > > clear red > > > draw 1 > > > clear green > > > draw 2 > > > flush > > > > > > All we should see is #2 on a green background, which this patch handles > > > by the second clear invalidating all the clears/draws that came before > > > it (provided there is no flush in between). > > > > > > I might just be tripped up by the "freeze" name. That really means throw > > > away / free here, I guess? > > > > Nope. Freeze means "stop queuing new draws to this batch". I guess we > > could free the batch as well if the result of the previous draws/clear > > are really overwritten by this new clear, but that implies checking the > > new clear flags to make sure they target the same depth/stencil/color > > combination. On the other hand, I'm wondering if it's really the > > driver's job to try to optimize silly things the apps might do. I mean, > > the sequence you describe does not look like a wise thing to do since > > the "clear red+draw 1" end up being overwritten by "clear green + draw > > 2". > > I'm quite confused how this patch works, then. > > A few thoughts: if the app clears all buffers in the middle, then yes > it's silly and yes we may as well optimize it out. (Should that be a thing GL > drivers have to do? I mean, if the other drivers are too...) > > If the sequence is more like: > > clear all buffers > draw 1 > clear color buffer (preserve depth stencil) > draw 2 > flush > > That second clear should really be done by drawing a full screen quad, > just like if we were wallpapering, except loading its colour from a > uniform instead of a texture. > > Similarly, a depth-only clear mid-frame can be emulated by drawing a > full-screen quad with the gl_Position.zw components juryrigged to the > desired depth components, and disabling colour draws by setting the > colour mask to 0x0. That also means you can skip having any shader at > all (literally set the shader pointer to 0x0) so that's faster. > > Finally, a stencil-only clear can occur similarly playing tricks with > the stencil test parameters. > > I suspect u_blitter or mesa/st is capable of doing these sorts of tricks > on our behalf, but I have not researched it extensively. AFAIU, mesa/st does not transform the clear into a quad-draw for depth/stencil only clears, it only does that for the color(s)-masked case. That's certainly something we can do in Panfrost if we want to avoid creating a new batch in such situations though. > > In any case, for a partial clear mid-frame, we would much rather do: > > clear all buffers > draw 1 > draw fullscreen quad (disable Z/S writes) > draw 2 > flush > > than what it sounds like this patch does > > clear all buffers > draw 1 > flush > > clear all buffers > wallpaper > draw 2 > flush > > Please do correct me if I've misunderstood the logic here. There's no flush introduced by this patch, it just adds a dep between batch 2 and 1: batch1: clear all buffers draw 1 batch2: clear all buffers wallpaper draw 2 flush (actually flushes batch 1 and 2) with batch2 --depends-on--> batch1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 01/17] panfrost: Extend the panfrost_batch_add_bo() API to pass access flags
> +your collabora address Thank you > > > > I think this batch_add_bo should probably dropped altogether? This loop > > > > is dealing with constant buffers; the shaders themselves were added > > > > > > I'll double check. I couldn't find where BOs containing shader programs > > > were added last time I looked. > > > > Masking a real bug :o > > > > It should probably happen in panfrost_patch_shader_state? > > Ok, I'll add it there, I wasn't sure this function was called for all > shaders, but looking at the code a second time it seems to be the case. I think so as well, yeah. > > As I stated before, I thought we should be adding the BO for > > wallpapering when we actually wallpaper, since that's a slow path. Not > > wallpapering is the default and ideally what most apps should do. > > Wallpapering happens too late (when we are flushing the batch) to have > an impact on the dep graph, but we can probably know that wallpapering > will be needed before that. My question remains though, are > vertex/tiler supposed to touch the texture BO they're reading from, or > should we only flag the BO for FRAGMENT use. Vertex/tiler should not touch the texture BO, unless you're texturing from the vertex shader (which we're not for wallpapering). signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 01/17] panfrost: Extend the panfrost_batch_add_bo() API to pass access flags
On Sun, 22 Sep 2019 09:26:45 -0400 Alyssa Rosenzweig wrote: > > +your collabora address > > Thank you > > > > > > I think this batch_add_bo should probably dropped altogether? This > > > > > loop > > > > > is dealing with constant buffers; the shaders themselves were added > > > > > > > > > > > > > I'll double check. I couldn't find where BOs containing shader programs > > > > were added last time I looked. > > > > > > Masking a real bug :o > > > > > > It should probably happen in panfrost_patch_shader_state? > > > > Ok, I'll add it there, I wasn't sure this function was called for all > > shaders, but looking at the code a second time it seems to be the case. > > I think so as well, yeah. > > > > As I stated before, I thought we should be adding the BO for > > > wallpapering when we actually wallpaper, since that's a slow path. Not > > > wallpapering is the default and ideally what most apps should do. > > > > Wallpapering happens too late (when we are flushing the batch) to have > > an impact on the dep graph, but we can probably know that wallpapering > > will be needed before that. My question remains though, are > > vertex/tiler supposed to touch the texture BO they're reading from, or > > should we only flag the BO for FRAGMENT use. > > Vertex/tiler should not touch the texture BO, unless you're texturing > from the vertex shader (which we're not for wallpapering). Okay, then adding the READ flag isn't really needed: a WRITE is more constraining than a READ, and the FRAGMENT job already writes the FBO BOs. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] panfrost: Make sure a clear does not re-use a pre-existing batch
On Sun, 22 Sep 2019 15:24:10 +0200 Boris Brezillon wrote: > On Sun, 22 Sep 2019 08:38:30 -0400 > Alyssa Rosenzweig wrote: > > > > > To be clear, if we have a batch and do the following operations: > > > > > > > > clear red > > > > draw 1 > > > > clear green > > > > draw 2 > > > > flush > > > > > > > > All we should see is #2 on a green background, which this patch handles > > > > by the second clear invalidating all the clears/draws that came before > > > > it (provided there is no flush in between). > > > > > > > > I might just be tripped up by the "freeze" name. That really means throw > > > > away / free here, I guess? > > > > > > Nope. Freeze means "stop queuing new draws to this batch". I guess we > > > could free the batch as well if the result of the previous draws/clear > > > are really overwritten by this new clear, but that implies checking the > > > new clear flags to make sure they target the same depth/stencil/color > > > combination. On the other hand, I'm wondering if it's really the > > > driver's job to try to optimize silly things the apps might do. I mean, > > > the sequence you describe does not look like a wise thing to do since > > > the "clear red+draw 1" end up being overwritten by "clear green + draw > > > 2". > > > > I'm quite confused how this patch works, then. > > > > A few thoughts: if the app clears all buffers in the middle, then yes > > it's silly and yes we may as well optimize it out. (Should that be a thing > > GL > > drivers have to do? I mean, if the other drivers are too...) > > > > If the sequence is more like: > > > > clear all buffers > > draw 1 > > clear color buffer (preserve depth stencil) > > draw 2 > > flush > > > > That second clear should really be done by drawing a full screen quad, > > just like if we were wallpapering, except loading its colour from a > > uniform instead of a texture. > > > > Similarly, a depth-only clear mid-frame can be emulated by drawing a > > full-screen quad with the gl_Position.zw components juryrigged to the > > desired depth components, and disabling colour draws by setting the > > colour mask to 0x0. That also means you can skip having any shader at > > all (literally set the shader pointer to 0x0) so that's faster. > > > > Finally, a stencil-only clear can occur similarly playing tricks with > > the stencil test parameters. > > > > I suspect u_blitter or mesa/st is capable of doing these sorts of tricks > > on our behalf, but I have not researched it extensively. > > AFAIU, mesa/st does not transform the clear into a quad-draw for > depth/stencil only clears, it only does that for the color(s)-masked > case. That's certainly something we can do in Panfrost if we want to > avoid creating a new batch in such situations though. One more thing: optimization of the above scenario is probably something we'll want to have at some point, but I think the current patch is worth applying in the meantime. All this patch does is enforcing ordering of clears/draws to make sure the end result matches users expectation. > > > > > In any case, for a partial clear mid-frame, we would much rather do: > > > > clear all buffers > > draw 1 > > draw fullscreen quad (disable Z/S writes) > > draw 2 > > flush > > > > than what it sounds like this patch does > > > > clear all buffers > > draw 1 > > flush > > > > clear all buffers > > wallpaper > > draw 2 > > flush > > > > Please do correct me if I've misunderstood the logic here. > > > There's no flush introduced by this patch, it just adds a dep between > batch 2 and 1: > > batch1: clear all buffers > draw 1 > > batch2: clear all buffers > wallpaper > draw 2 > > flush (actually flushes batch 1 and 2) > > with batch2 --depends-on--> batch1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev