Reviewed-by: Edward O'Callaghan
On 09/16/2016 11:57 PM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> ---
> src/gallium/drivers/radeon/r600_query.c | 6 +-
> 1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/src/gallium/drivers/radeon/r600_query.c
> b/src/gallium/drivers
While the current CFG code is valid in the case where a switch break also
happens to be a loop continue, it's a bit suboptimal. Since hardware is
capable of handling the continue as a direct jump, it's better to use a
continue instruction when we can than to bother with all of the nasty
switch bre
It is possible that the break block of a switch is actually the continue of
the loop containing the switch. In this case, we need to identify the
break block as a continue and break out of current level of CFG handling.
If we don't, the continue portion of the loop will get handled twice, once
by
On Fri, Sep 16, 2016 at 5:59 PM, Francisco Jerez
wrote:
> Jason Ekstrand writes:
>
> > On Sep 16, 2016 3:04 PM, "Francisco Jerez"
> wrote:
> >>
> >> Not intended for upstream. Should cause a GPU hang if some thread is
> >> executed with a non-contiguous dispatch mask breaking assumptions of
>
Jason Ekstrand writes:
> On Sep 16, 2016 3:04 PM, "Francisco Jerez" wrote:
>>
>> Not intended for upstream. Should cause a GPU hang if some thread is
>> executed with a non-contiguous dispatch mask breaking assumptions of
>> brw_stage_has_packed_dispatch(). Doesn't cause any CTS, DEQP or
>> Pi
Hi Francesco,
Where are you with the piglit tests? I just finished converting the
ARB_viewport_array tests, and was thinking of having a go at the
ARB_texture_view ones. However if you've made significant progress
there already, I have other things I can do too.
-ilia
On Wed, Aug 31, 2016 at 1
On Fri, Sep 16, 2016 at 5:36 PM, Connor Abbott wrote:
> On Fri, Sep 16, 2016 at 6:25 PM, Jason Ekstrand
> wrote:
> > On Thu, Sep 15, 2016 at 12:03 AM, Timothy Arceri
> > wrote:
> >>
> >> From: Thomas Helland
> >>
> >> This pass detects induction variables and calculates the
> >> trip count of
On Fri, Sep 16, 2016 at 6:25 PM, Jason Ekstrand wrote:
> On Thu, Sep 15, 2016 at 12:03 AM, Timothy Arceri
> wrote:
>>
>> From: Thomas Helland
>>
>> This pass detects induction variables and calculates the
>> trip count of loops to be used for loop unrolling.
>>
>> I've removed support for float
On Sat, 2016-09-17 at 09:40 +1000, Timothy Arceri wrote:
> On Fri, 2016-09-16 at 15:25 -0700, Jason Ekstrand wrote:
> > > > On Thu, Sep 15, 2016 at 12:03 AM, Timothy Arceri wrote:
> > > > > > From: Thomas Helland
> >
> >
snip
>
> >
As I said on patch 5, I would like to see some version of it merged at
least for fs. The vec4 back-end isn't as much of a problem since we've
verified it now and future hardware won't be using it.
Series is Reviewed-by: Jason Ekstrand
On Sep 16, 2016 3:04 PM, "Francisco Jerez" wrote:
> This a
On Sep 16, 2016 3:04 PM, "Francisco Jerez" wrote:
>
> Not intended for upstream. Should cause a GPU hang if some thread is
> executed with a non-contiguous dispatch mask breaking assumptions of
> brw_stage_has_packed_dispatch(). Doesn't cause any CTS, DEQP or
> Piglit regressions, while replacin
Jason Ekstrand writes:
> On Fri, Sep 16, 2016 at 3:03 PM, Francisco Jerez
> wrote:
>
>> The eliminate_find_live_channel optimization eliminates
>> FIND_LIVE_CHANNEL instructions in cases where control flow is known to
>> be uniform, and replaces them with 'MOV 0', which in turn unblocks
>> subse
On Fri, 2016-09-16 at 17:01 +0200, Erik Faye-Lund wrote:
> On Thu, Sep 15, 2016 at 9:03 AM, Timothy Arceri
> wrote:
> >
> > + const int bias[] = { -1, 1, 1 };
> > +
> > + for (unsigned i = 0; i < ARRAY_SIZE(bias); i++) {
> > + iter_int = iter_int + bias[i];
> > +
> > + switch (cond_
On Fri, 2016-09-16 at 16:52 +0200, Erik Faye-Lund wrote:
> On Thu, Sep 15, 2016 at 9:03 AM, Timothy Arceri
> wrote:
> >
> > This will be used by the loop unroll and lcssa passes.
> >
> > V2:
> > - Check instruction count is not too large for unrolling
> > - Add helper for complex loop unrolling
On Fri, Sep 16, 2016 at 3:03 PM, Francisco Jerez
wrote:
> The eliminate_find_live_channel optimization eliminates
> FIND_LIVE_CHANNEL instructions in cases where control flow is known to
> be uniform, and replaces them with 'MOV 0', which in turn unblocks
> subsequent elimination of the BROADCAST
On Thu, Sep 15, 2016 at 12:03 AM, Timothy Arceri <
timothy.arc...@collabora.com> wrote:
> From: Thomas Helland
>
> This pass detects induction variables and calculates the
> trip count of loops to be used for loop unrolling.
>
> I've removed support for float induction values for now, for the
> s
https://bugs.freedesktop.org/show_bug.cgi?id=97230
Chris Wilson changed:
What|Removed |Added
QA Contact|intel-gfx-bugs@lists.freede |mesa-dev@lists.freedesktop.
Not intended for upstream. Should cause a GPU hang if some thread is
executed with a non-contiguous dispatch mask breaking assumptions of
brw_stage_has_packed_dispatch(). Doesn't cause any CTS, DEQP or
Piglit regressions, while replacing brw_stage_has_packed_dispatch()
with a dummy implementation
From: Jason Ekstrand
On at least Sky Lake, ce0 does not contain the full story as far as enabled
channels goes. It is possible to have completely disabled channels where
the corresponding bits in ce0 are 1. In order to get the correct execution
mask, you have to mask off those channels which we
The eliminate_find_live_channel optimization eliminates
FIND_LIVE_CHANNEL instructions in cases where control flow is known to
be uniform, and replaces them with 'MOV 0', which in turn unblocks
subsequent elimination of the BROADCAST instruction frequently used on
the result of FIND_LIVE_CHANNEL.
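To illustrate why replacing the instruction with 'MOV 0' can be safe, here is a minimal C sketch. It is not the i965 code. It assumes that FIND_LIVE_CHANNEL yields the index of the lowest enabled channel in the execution mask, and that a "packed" mask means the enabled bits form one run starting at bit 0. The helpers find_live_channel() and mask_is_packed() are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: index of the lowest set bit in the execution mask,
 * or -1 if no channel is enabled. */
static int find_live_channel(uint32_t exec_mask)
{
    for (int i = 0; i < 32; i++) {
        if (exec_mask & (UINT32_C(1) << i))
            return i;
    }
    return -1;
}

/* Hypothetical check: true if the enabled bits form a single run starting
 * at bit 0 (0x1, 0x7, 0xff, ...), which is what "packed" is assumed to
 * mean in this sketch. */
static bool mask_is_packed(uint32_t mask)
{
    return mask != 0 && (mask & (mask + 1)) == 0;
}

/* Usage: with a packed mask the first live channel is always 0, so a
 * constant 0 is equivalent. With a non-packed mask it is not. */
void demo_find_live_channel(void)
{
    assert(mask_is_packed(0x00ff) && find_live_channel(0x00ff) == 0);
    assert(!mask_is_packed(0x000c) && find_live_channel(0x000c) == 2);
}
```

Under these assumptions the constant is only valid when channel 0 is guaranteed to be enabled, which is why the dispatch mask matters.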
This avoids emitting a few extra instructions required to take the
dispatch mask into account when it's known to be tightly packed.
---
src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 4 +++-
src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 8 ++--
2 files changed, 9 insertions(+), 3 dele
From: Jason Ekstrand
The state register sr0 is really a collection of dwords not a SIMD8
anything. It's much more convenient for brw_sr0_reg to return the
particular dword you're looking for rather than a giant blob you have to
massage into what you want.
Signed-off-by: Jason Ekstrand
[ Franci
On Fri, Sep 16, 2016 at 12:55 PM, Ilia Mirkin wrote:
> Signed-off-by: Ilia Mirkin
> ---
> src/mapi/glapi/gen/apiexec.py | 12
> src/mapi/glapi/gen/es_EXT.xml | 50
> +
> src/mesa/main/tests/dispatch_sanity.cpp | 11
> src/mes
On Fri, Sep 16, 2016 at 3:57 PM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> These functions extract the pipe state structure from the current
> descriptors, for state saving.
> ---
> src/gallium/drivers/radeonsi/si_descriptors.c | 46
> +++
> src/gallium/drivers/ra
This just up-converts them to doubles. Not great, but this is what all
the other variants also do.
Signed-off-by: Ilia Mirkin
---
src/mesa/main/viewport.c | 19 ++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/src/mesa/main/viewport.c b/src/mesa/main/viewport.c
i
This is needed for GL_OES_viewport_array.
Signed-off-by: Ilia Mirkin
---
src/mesa/main/get_hash_params.py | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 1f63dc3..716cb57 100644
--- a/src/m
Signed-off-by: Ilia Mirkin
---
docs/features.txt | 2 +-
docs/relnotes/12.1.0.html | 1 +
src/mesa/state_tracker/st_extensions.c | 5 +
3 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/docs/features.txt b/docs/features.txt
index df81f91..ed45e10
Signed-off-by: Ilia Mirkin
---
src/compiler/glsl/builtin_variables.cpp | 9 ++---
src/compiler/glsl/glsl_parser_extras.h | 2 ++
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/src/compiler/glsl/builtin_variables.cpp
b/src/compiler/glsl/builtin_variables.cpp
index f47daab..3b
The expectation is that drivers will set this based on
OES_geometry_shader and ARB_viewport_array support. This is a separate
enable on the same reasoning as for OES_texture_cube_map_array.
Signed-off-by: Ilia Mirkin
---
src/compiler/glsl/glsl_parser_extras.cpp | 1 +
src/mesa/main/extensions_ta
Signed-off-by: Ilia Mirkin
---
src/mapi/glapi/gen/apiexec.py | 12
src/mapi/glapi/gen/es_EXT.xml | 50 +
src/mesa/main/tests/dispatch_sanity.cpp | 11
src/mesa/main/viewport.c| 12
src/mesa/main/viewpor
We can't change how gallium is supposed to behave since other apis rely
on coverage-to-alpha working even if msaa is disabled.
Roland
Am 16.09.2016 um 18:58 schrieb Ilia Mirkin:
> FTR, the new piglit test passed as-is on NVIDIA hw (at least nv50 and
> nvc0). I'm not opposed to this new state depe
On 14 September 2016 at 19:06, Adam Jackson wrote:
> As this array was not actually sorted, FindGLXFunction's binary search
> would only sometimes work.
>
This commit message is a bit iffy, yet again most of this and
g_glxglvnddispatchfuncs.c is dead code.
Afaict the sole reason behind this file i
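The "only sometimes work" failure mode is easy to reproduce in isolation. The C sketch below is not the GLX dispatch code: find_name() and both tables are made up for illustration. It shows a plain binary search finding some entries of an unsorted table and missing others that are present.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Plain binary search over an array of C strings.
 * Returns the index of name, or -1 if the search does not reach it.
 * Only correct when the table is sorted in strcmp() order. */
static int find_name(const char *const *table, size_t n, const char *name)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int c = strcmp(name, table[mid]);

        if (c == 0)
            return (int)mid;
        if (c < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return -1;
}

void demo_unsorted_lookup(void)
{
    /* Made-up entries, deliberately out of order. */
    const char *const unsorted[] = { "delta", "alpha", "echo", "bravo", "charlie" };
    const char *const sorted[]   = { "alpha", "bravo", "charlie", "delta", "echo" };

    assert(find_name(unsorted, 5, "alpha") == 1);   /* found by luck */
    assert(find_name(unsorted, 5, "bravo") == -1);  /* present, but missed */
    assert(find_name(sorted, 5, "bravo") == 1);     /* sorted table works */
}
```

Sorting the table once (or keeping it sorted at the point where it is generated) restores the invariant the search relies on.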
On 14 September 2016 at 14:59, Adam Jackson wrote:
> From: Kyle Brenneman
>
> This decorates every EGL entrypoint with _EGL_FUNC_START, which records
> the function name and primary dispatch object label in the current
> thread state. It also adds debug report functions and calls them when
> appr
From: Emil Velikov
Analogous to previous commits - with an extra bonus.
Current code, apart from not attributing the lack of 'per visual'
and overall configs also overwrites the newly added config.
Namely if the dpy supports two or more of the supported formats
(XRGB, ARGB and RGB565) e
From: Emil Velikov
v2: Remove gratuitous newline/semicolon (Eric)
Signed-off-by: Emil Velikov
Reviewed-by: Eric Engestrom
---
src/egl/drivers/dri2/platform_drm.c | 21 +++--
1 file changed, 15 insertions(+), 6 deletions(-)
diff --git a/src/egl/drivers/dri2/platform_drm.c
b/s
From: Emil Velikov
... in dri2_x11_add_configs_for_visuals().
Currently the latter does not consider that, thus in such cases it adds
"empty" configs in the list.
Properly account for things and as we do that we can reuse count,
instead of calling _eglGetArraySize to determine if we've added any
From: Emil Velikov
Analogous to previous commit.
v2: Use correct comparison in loop conditional (Eric)
Use valid C initializer (Gurchetan)
Signed-off-by: Emil Velikov
Reviewed-by: Gurchetan Singh
---
src/egl/drivers/dri2/platform_surfaceless.c | 15 ---
1 file changed, 8 inse
From: Emil Velikov
Iterate over the driver_configs first in order to cut down the number of
getConfigAttrib() calls by a factor of 5.
While we're here, also drop the sentinel of the visuals array. We
already know its size so we can use that and save a few bytes.
v2: Use correct comparison in lo
From: Emil Velikov
Factor out and rework the existing code so that it prints a debug
message if we have zero configs for any visual.
As a nice side effect we now provide a correct (sequential ID) when
creating a config (via dri2_add_config).
v2: Use correct comparison in loop conditional (Eric)
From: Emil Velikov
Currently we print a debug message if the total configs is non-zero only
to do the same (at an error level) as we return from the function.
Rework the message to print if we're missing a config for the given
format.
Signed-off-by: Emil Velikov
Reviewed-by: Gurchetan Singh
-
On 09/16/2016 06:48 AM, Nicolai Hähnle wrote:
> Hi all,
>
> this is really Dave's work, with a few touch-ups from me that I think make
> sense. I've kept those separate with the intention to squash. I'd like to
> land these in master even before the main ARB_gpu_shader_int64 stuff lands
> (that is
On 09/16/2016 06:57 AM, Nicolai Hähnle wrote:
> Hi all,
>
> as the title says. The implementation uses a compute shader to summarize
> data from the query buffers. As long as only one query buffer is in flight
> (the normal case), that compute shader is launched exactly once, on a
> single thread.
On 25 August 2016 at 17:18, Emil Velikov wrote:
> Hi all,
>
> With the recent noise in the egl area I decided to do some of the long
> planned cleanup in the area. It spans across the following:
>
> - glapi missing glFlush and non-shared glapi are not an option
> - encapsulate/separate disp->Dri
From: Emil Velikov
Introduce a helper and use it throughout the platform code. This allows
us to reduce the amount of ifdef(s) and (potentially) use
kms_swrast_dri.so for !drm platforms (namely wayland and x11).
Note: in the future as other platforms (android, surfaceless) support
the extension
From: Emil Velikov
v2: Rebase.
Signed-off-by: Emil Velikov
---
src/egl/drivers/dri2/egl_dri2.c | 14 +++---
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index d2ae25a..75070da 100644
--- a/src/egl/driver
From: Emil Velikov
v2: dri2_bind_extensions() now takes optional as an argument.
Signed-off-by: Emil Velikov
---
src/egl/drivers/dri2/egl_dri2.c | 28 ++--
1 file changed, 10 insertions(+), 18 deletions(-)
diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/
From: Emil Velikov
Will allow us to reuse the function for optional extensions and fold a
bit of code.
v2: Make dri2_bind_extensions::optional flag an argument to
dri2_bind_extensions (Kristian).
Cc: Rob Clark
Signed-off-by: Emil Velikov
---
src/egl/drivers/dri2/egl_dri2.c | 24 +
From: Emil Velikov
Remove the error prone fixed size array.
While we're here also rename to loader_extensions like in the GLX code.
v2: Rebase. Keep image_loader_extension within the wayland_drm
dri2_loader_extensions list.
Signed-off-by: Emil Velikov
---
src/egl/drivers/dri2/egl_dri2.c
From: Emil Velikov
Analogous to the earlier android and wayland patches. As we're here we
can drop exposing the old version of the extension.
Any dri loader/driver interface uses lower bound checking, thus exposing
dri2 loader v3 to a v2 capable driver is perfectly normal.
v2: Preserve compat wit
FTR, the new piglit test passed as-is on NVIDIA hw (at least nv50 and
nvc0). I'm not opposed to this new state dependency if Marek isn't
(he's analyzed these things a whole lot more than I suspect anyone
else), but just wanted to point it out in case the preference is to
instead change how gallium
> Do you need someone to push it for you?
Yeah, I don't have push rights.
Thanks for the review, it didn't occur to me before to look at logs to
see what prefix is correct.
Martina
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.
A couple of forward-declarations were causing warnings in clang:
'value' defined as a class here but previously declared as a struct
[-Wmismatched-tags]
Signed-off-by: Martina Kollarova
Reviewed-by: Bas Nieuwenhuizen
---
src/gallium/drivers/r600/sb/sb_ir.h | 6 +-
1 file changed, 1
Am 16.09.2016 um 15:48 schrieb Nicolai Hähnle:
> From: Dave Airlie
>
> This just adds the basic support for 64-bit opcodes,
> and the new types.
>
> v2: add conversion opcodes.
> add documentation.
>
> Reviewed-by: Marek Olšák
> Reviewed-by: Nicolai Hähnle
> Signed-off-by: Dave Airlie
> ---
On 16.09.2016 15:57, Nicolai Hähnle wrote:
From: Nicolai Hähnle
---
docs/features.txt | 2 +-
docs/relnotes/12.1.0.html | 1 +
src/gallium/drivers/radeonsi/si_pipe.c | 2 +-
3 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/docs/features.txt b/doc
Changes make copy_propagation_elements pass faster, reducing link
time spent in test case of bug 94477. Does not fix the actual issue
but brings down the total time. No regressions seen in CI.
Signed-off-by: Tapani Pälli
---
For performance measurements, Martina reported in the bug 8x speedup
to
2016-09-16 15:57 GMT+02:00 Nicolai Hähnle :
> From: Nicolai Hähnle
>
> ---
> src/gallium/drivers/radeon/r600_pipe_common.c | 3 +
> src/gallium/drivers/radeon/r600_pipe_common.h | 2 +
> src/gallium/drivers/radeon/r600_query.c | 391
> +-
> src/gallium/drivers/r
I don't think the "gallium:" commit message prefix is correct here.
Looking at the logs it should be "r600g/sb:".
With that change:
Reviewed-by: Bas Nieuwenhuizen
Do you need someone to push it for you?
- Bas
On Fri, Sep 16, 2016 at 4:58 PM, Martina Kollarova
wrote:
> A couple of forward-dec
On Thu, Sep 15, 2016 at 9:03 AM, Timothy Arceri
wrote:
> + const int bias[] = { -1, 1, 1 };
> +
> + for (unsigned i = 0; i < ARRAY_SIZE(bias); i++) {
> + iter_int = iter_int + bias[i];
> +
> + switch (cond_op) {
> + case nir_op_ige:
> + case nir_op_ilt:
> + case nir_op
A couple of forward-declarations were causing warnings in clang:
'value' defined as a class here but previously declared as a struct
[-Wmismatched-tags]
Signed-off-by: Martina Kollarova
---
src/gallium/drivers/r600/sb/sb_ir.h | 6 +-
1 file changed, 1 insertion(+), 5 deletions(-)
di
On Thu, Sep 15, 2016 at 9:03 AM, Timothy Arceri
wrote:
> This will be used by the loop unroll and lcssa passes.
>
> V2:
> - Check instruction count is not too large for unrolling
> - Add helper for complex loop unrolling
> ---
> src/compiler/nir/nir.h | 31 +++
> 1 fil
On 09/16/2016 08:07 AM, Marek Olšák wrote:
On Thu, Sep 15, 2016 at 11:20 PM, Brian Paul wrote:
Regardless of whether GL_MULTISAMPLE is enabled (it's enabled by default)
we should not set the alpha_to_coverage or alpha_to_one flags if the
current drawing buffer does not do MSAA.
This fixes the
This Patch is Reviewed-by: Leo Liu
On 09/16/2016 08:51 AM, Nayan Deshmukh wrote:
In case of prime when rendering is done on GPU other than the
server GPU, use a separate linear buffer for each back buffer
which will be displayed using present extension.
v2: Use a separate linear buffer for each
On Thu, Sep 15, 2016 at 11:20 PM, Brian Paul wrote:
> Regardless of whether GL_MULTISAMPLE is enabled (it's enabled by default)
> we should not set the alpha_to_coverage or alpha_to_one flags if the
> current drawing buffer does not do MSAA.
>
> This fixes the new piglit gl-1.3-alpha_to_coverage_n
From: Nicolai Hähnle
For bottom-of-pipe fences inside the gfx command stream.
---
src/gallium/drivers/radeon/r600_pipe_common.c | 52 +++
src/gallium/drivers/radeon/r600_pipe_common.h | 5 +++
src/gallium/drivers/radeonsi/si_perfcounter.c | 41 ++---
3 fi
From: Nicolai Hähnle
There are driver-specific context flags for barriers that are not covered
by the Gallium barrier interfaces.
The R600 settings of these flags may not be optimal, but we're not going
to use them yet anyway.
---
src/gallium/drivers/r600/r600_pipe.c | 6 ++
src/g
From: Nicolai Hähnle
We will support the waiting option in ARB_query_buffer_object using
WAIT_REG_MEM on an appropriate fence-like dword. Some queries conveniently
write their results with the highest bit set, and we can just use that;
for others, we have to write a fence explicitly.
ZPASS_DONE
From: Nicolai Hähnle
To ensure that fences are properly initialized.
---
src/gallium/drivers/radeon/r600_query.c | 26 ++
src/gallium/drivers/radeon/r600_query.h | 2 +-
2 files changed, 11 insertions(+), 17 deletions(-)
diff --git a/src/gallium/drivers/radeon/r600_quer
From: Nicolai Hähnle
---
src/gallium/drivers/radeon/r600_query.c | 6 +-
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/src/gallium/drivers/radeon/r600_query.c
b/src/gallium/drivers/radeon/r600_query.c
index b9041eb..c1c3599 100644
--- a/src/gallium/drivers/radeon/r600_query.c
From: Nicolai Hähnle
---
docs/features.txt | 2 +-
docs/relnotes/12.1.0.html | 1 +
src/gallium/drivers/radeonsi/si_pipe.c | 2 +-
3 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/docs/features.txt b/docs/features.txt
index 9850a43..8b87b08 100644
Hi all,
as the title says. The implementation uses a compute shader to summarize
data from the query buffers. As long as only one query buffer is in flight
(the normal case), that compute shader is launched exactly once, on a
single thread. If multiple buffers were required, then one compute grid
From: Nicolai Hähnle
These functions extract the pipe state structure from the current
descriptors, for state saving.
---
src/gallium/drivers/radeonsi/si_descriptors.c | 46 +++
src/gallium/drivers/radeonsi/si_state.h | 5 +++
2 files changed, 51 insertions(+)
dif
From: Nicolai Hähnle
---
src/gallium/drivers/radeon/r600_pipe_common.c | 3 +
src/gallium/drivers/radeon/r600_pipe_common.h | 2 +
src/gallium/drivers/radeon/r600_query.c | 391 +-
src/gallium/drivers/radeon/r600_query.h | 7 +
4 files changed, 402 inser
From: Nicolai Hähnle
Save compute shader state that will be used for the ARB_query_buffer_object
implementation.
---
src/gallium/drivers/radeon/r600_pipe_common.h | 3 +++
src/gallium/drivers/radeon/r600_query.h | 7 +++
src/gallium/drivers/radeonsi/si_state.c | 12
Without this we will regress the max-samplers piglit test on Gen6
and lower when loop unrolling is done in NIR. There is a check
in the GLSL IR linker that errors when it finds indirects and
EmitNoIndirectSampler is set.
As far as I can tell there is no reason for not enabling this for
all gens re
From: Nicolai Hähnle
- PIPE_CAP_INT64 is not there yet
- emit DIV/MOD without the divide-by-zero workaround
---
src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 5 +
src/gallium/drivers/radeonsi/si_pipe.c | 1 -
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/
From: Dave Airlie
This adds support to TGSI for 64-bit integer immediates.
Reviewed-by: Marek Olšák
Reviewed-by: Nicolai Hähnle
Signed-off-by: Dave Airlie
---
src/gallium/auxiliary/tgsi/tgsi_dump.c | 14 ++
src/gallium/auxiliary/tgsi/tgsi_exec.c | 2 ++
src/gallium/auxiliary
From: Nicolai Hähnle
---
src/gallium/drivers/softpipe/sp_screen.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/src/gallium/drivers/softpipe/sp_screen.c
b/src/gallium/drivers/softpipe/sp_screen.c
index 01d7e8a..cd4269f 100644
--- a/src/gallium/drivers/softpipe/sp_screen.c
+++ b/src/galli
From: Dave Airlie
This adds all the opcodes to tgsi_exec for softpipe to use.
It also enables the cap.
v2: add conversion opcodes.
Reviewed-by: Nicolai Hähnle
Signed-off-by: Dave Airlie
---
src/gallium/auxiliary/tgsi/tgsi_exec.c | 673 +--
src/gallium/drivers/s
From: Dave Airlie
This enables 64-bit integer support in gallivm and
llvmpipe.
v2: add conversion opcodes.
Signed-off-by: Dave Airlie
---
src/gallium/auxiliary/gallivm/lp_bld_tgsi.c| 2 +
src/gallium/auxiliary/gallivm/lp_bld_tgsi.h| 4 +
src/gallium/auxiliary/gallivm/lp_bld
From: Dave Airlie
This passes all my current piglit tests except the variants on:
fs-op-div-int64_t-i64vec3
I'm guessing this is probably a backend bug.
[rfc: this needs more testing - just posting to show I've done
it]
Reviewed-by: Marek Olšák
Signed-off-by: Dave Airlie
---
.../drivers/rad
From: Dave Airlie
This just adds the basic support for 64-bit opcodes,
and the new types.
v2: add conversion opcodes.
add documentation.
Reviewed-by: Marek Olšák
Reviewed-by: Nicolai Hähnle
Signed-off-by: Dave Airlie
---
src/gallium/auxiliary/tgsi/tgsi_info.c | 92 +--
src/gall
From: Nicolai Hähnle
- PIPE_CAP_INT64 is not there yet
- restrict DIV/MOD defaults to the CPU, as for 32 bits
---
src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 17 -
src/gallium/drivers/llvmpipe/lp_screen.c | 1 -
2 files changed, 8 insertions(+), 10 deletions(-
From: Nicolai Hähnle
This should be analogous to 32-bit integers.
---
src/gallium/auxiliary/gallivm/lp_bld_tgsi.c | 4
1 file changed, 4 insertions(+)
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c
index b397261..68ac695 100644
--- a
Hi all,
this is really Dave's work, with a few touch-ups from me that I think make
sense. I've kept those separate with the intention to squash. I'd like to
land these in master even before the main ARB_gpu_shader_int64 stuff lands
(that is currently in Ian's court).
The reason is that radeonsi's
On Fri, Sep 16, 2016 at 4:03 AM, Christian König
wrote:
> Am 16.09.2016 um 09:50 schrieb Michel Dänzer:
>>
>> On 16/09/16 04:33 PM, Christian König wrote:
>>>
>>> Am 15.09.2016 um 21:43 schrieb Dave Airlie:
On 15 September 2016 at 17:43, Christian König
wrote:
>
> Am 15.09.
On Fri, Sep 16, 2016 at 10:03 AM, Christian König
wrote:
> Am 16.09.2016 um 09:50 schrieb Michel Dänzer:
>>
>> On 16/09/16 04:33 PM, Christian König wrote:
>>>
>>> Am 15.09.2016 um 21:43 schrieb Dave Airlie:
On 15 September 2016 at 17:43, Christian König
wrote:
>
> Am 15.09
Later we will pass compiler to nir_optimise to be used by the loop unroll
pass.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 10 --
src/mesa/drivers/dri/i965/brw_nir.c | 7 ---
src/mesa/drivers/dri/i965/brw_nir.h | 4 ++--
src/mesa/drivers/dri/i
V2:
- enable on all gens
---
src/compiler/glsl/glsl_parser_extras.cpp | 12 +++-
src/mesa/drivers/dri/i965/brw_compiler.c | 5 -
src/mesa/drivers/dri/i965/brw_nir.c | 23 ++-
3 files changed, 29 insertions(+), 11 deletions(-)
diff --git a/src/compiler/glsl/gl
On Thu, Sep 15, 2016 at 11:35 PM, Ilia Mirkin wrote:
> What about integer RTs? I had to add a hack in nouveau to make it
> disable those when RT0 is an integer. It'd be more convenient if they
> were turned off in the first place.
Deriving one hw state from 2 states isn't hacking. That's normal.
---
src/compiler/nir/nir.h | 2 ++
src/compiler/nir/nir_clone.c | 41 ++---
2 files changed, 36 insertions(+), 7 deletions(-)
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 29a6f45..d052cad 100644
--- a/src/compiler/nir/nir.h
+++ b/sr
V2:
- tidy ups suggested by Connor.
- tidy up cloning logic and handle copy propagation
based on a suggestion by Connor.
- use nir_ssa_def_rewrite_uses to fix up lcssa phis
suggested by Connor.
- add support for complex loop unrolling (two terminators)
- handle case where the ssa defs use outside t
This will be useful for fixing phi srcs when cloning a loop body
during loop unrolling.
---
src/compiler/nir/nir_clone.c | 36 +---
1 file changed, 21 insertions(+), 15 deletions(-)
diff --git a/src/compiler/nir/nir_clone.c b/src/compiler/nir/nir_clone.c
index 0e39
From: Thomas Helland
V2: Do a "depth first search" to convert to LCSSA
V3: Small comment fixup
V4: Rebase, adapt to removal of function overloads
V5: Rebase, adapt to relocation of nir to compiler/nir
Still need to adapt to potential if-uses
Work around nir_validate issue
V6 (Timothy)
From: Thomas Helland
This pass detects induction variables and calculates the
trip count of loops to be used for loop unrolling.
I've removed support for float induction values for now, for the
simple reason that they don't appear in my shader-db collection,
and so I don't see it as common enoug
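As a rough illustration of what a trip-count calculation involves, the C sketch below handles only the simplest case: an integer induction variable that starts at init, grows by a positive constant step each iteration, and leaves the loop when "i < limit" fails. It is not the NIR pass. The function trip_count_lt() and its restrictions are assumptions made for this example, and wraparound of the induction variable is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Number of times the body of
 *     for (i = init; i < limit; i += step)
 * executes, for step > 0. 64-bit arithmetic is used so that
 * limit - init cannot overflow. */
static int64_t trip_count_lt(int32_t init, int32_t limit, int32_t step)
{
    assert(step > 0);

    if (init >= limit)
        return 0;

    int64_t span = (int64_t)limit - (int64_t)init;
    return (span + step - 1) / step;   /* ceil(span / step) */
}

void demo_trip_count(void)
{
    assert(trip_count_lt(0, 4, 1) == 4);    /* i = 0, 1, 2, 3 */
    assert(trip_count_lt(0, 10, 3) == 4);   /* i = 0, 3, 6, 9 */
    assert(trip_count_lt(5, 5, 1) == 0);    /* body never runs */
}
```

A loop whose count comes out as a small compile-time constant is the kind of loop an unrolling pass can then duplicate that many times.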
This moves the nir_lower_indirect_derefs() call into
brw_preprocess_nir() so that it is called by both OpenGL and Vulkan
and removes that call to the old GLSL IR pass
lower_variable_index_to_cond_assign()
We want to do this pass in nir to be able to move loop unrolling
to nir.
There is an increase o
---
src/compiler/nir/nir_opt_remove_phis.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/compiler/nir/nir_opt_remove_phis.c
b/src/compiler/nir/nir_opt_remove_phis.c
index acaa6e1..d4344b0 100644
--- a/src/compiler/nir/nir_opt_remove_phis.c
+++ b/src/compiler/nir/nir_o
This will be used by the loop unroll and lcssa passes.
V2:
- Check instruction count is not too large for unrolling
- Add helper for complex loop unrolling
---
src/compiler/nir/nir.h | 31 +++
1 file changed, 31 insertions(+)
diff --git a/src/compiler/nir/nir.h b/src/
Sorry for the noise. Connor pointed out that some of my assumptions
for not enabling this on all gens were wrong; this led to finding
a subtle bug where loop analysis was being run when it shouldn't
due to an error with the loop analysis flag (0x10 vs 0x16).
This version enabled unrolling for all
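One way a value like 0x16 can misbehave where 0x10 was meant, assuming the flag is a single bit in a bitmask: 0x10 has one bit set, while 0x16 equals 0x10 | 0x04 | 0x02 and so overlaps two unrelated bits. The C sketch below shows the resulting false positive. The enum is hypothetical; its names and values are for illustration only and are not copied from nir.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical metadata flags; each is meant to be a distinct bit. */
enum meta_flags {
    META_A        = 0x01,
    META_B        = 0x02,
    META_C        = 0x04,
    META_D        = 0x08,
    META_LOOP_OK  = 0x10,  /* intended single-bit value */
    META_LOOP_BAD = 0x16   /* mistyped value: also covers META_B and META_C */
};

/* True if any bit of flag is present in the requested set. */
static bool wants(unsigned requested, unsigned flag)
{
    return (requested & flag) != 0;
}

void demo_flag_overlap(void)
{
    unsigned requested = META_B;               /* caller asked only for B */

    assert(!wants(requested, META_LOOP_OK));   /* correct: not requested */
    assert(wants(requested, META_LOOP_BAD));   /* false positive */
}
```

With the overlapping value, asking for an unrelated piece of metadata is enough to trigger the loop-related path.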
In case of prime when rendering is done on GPU other than the
server GPU, use a separate linear buffer for each back buffer
which will be displayed using present extension.
v2: Use a separate linear buffer for each back buffer (Michel)
v3: Change variable names and fix coding style (Leo and Emil)
v4
Hi Michel,
Thanks for the review.
On Fri, Sep 16, 2016 at 1:47 PM, Christian König
wrote:
> Am 16.09.2016 um 10:07 schrieb Michel Dänzer:
>
>> On 14/09/16 02:34 PM, Nayan Deshmukh wrote:
>>
>>> In case of prime when rendering is done on GPU other than the
>>> server GPU, use a separate linear bu