On 2017-08-03 22:26, Alex Deucher wrote:
IIRC, user_ptrs require page alignment.
Alex
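A minimal sketch of the alignment check implied above, assuming a POSIX system. The helper name is_page_aligned is hypothetical and not part of Mesa or any kernel interface; it only tests whether the start address is on a page boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: true if ptr starts on a page boundary.
 * Assumes the page size is a power of two. */
bool is_page_aligned(const void *ptr)
{
   long page_size = sysconf(_SC_PAGESIZE);

   assert(page_size > 0);
   return ((uintptr_t)ptr & ((uintptr_t)page_size - 1)) == 0;
}
```

A caller could run this on the host pointer first and fall back to a copy when it returns false.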
I didn't follow the whole discussion (sorry if I'm saying something
redundant), but AMD's older OpenCL Optimization Guide [1] has some notes
regarding the implementation of the USE_HOST_PTR flag.
It initi
Hi,
there is also a patch on the xorg-devel list that is needed to make this
work for Xorg, as well as a preliminary piglit test on the piglit list
to verify the functionality.
Grigori
On 2017-08-03 20:07, Grigori Goronzy wrote:
---
src/glx/dri2_glx.c | 12
src/glx/dri3_glx.c
---
src/gallium/state_trackers/glx/xlib/glx_api.c | 55 ---
src/gallium/state_trackers/glx/xlib/xm_api.c | 6 ++-
src/gallium/state_trackers/glx/xlib/xm_api.h | 4 +-
3 files changed, 57 insertions(+), 8 deletions(-)
diff --git a/src/gallium/state_trackers/glx/xlib/glx
---
src/glx/dri2_glx.c | 12
src/glx/dri3_glx.c | 8
src/glx/dri_common.c| 52 -
src/glx/dri_common.h| 5 +
src/glx/drisw_glx.c | 3 +++
src/glx/glxclient.h | 6 ++
src/glx/glxextensions.c |
On 2017-07-19 23:51, Grigori Goronzy wrote:
The check is too aggressive and might also fail if context flags
appear after the no-error attribute in the context attribute list.
Delay the check to after attribute parsing to fix this.
---
This was found by the piglit test I just sent to the piglit
On 2017-07-18 20:25, Ian Romanick wrote:
On 07/14/2017 04:10 PM, Kenneth Graunke wrote:
Grigori recently added EGL_KHR_create_context_no_error support,
which causes EGL to pass a new __DRI_CTX_FLAG_NO_ERROR flag to
drivers when requesting an appropriate context mode.
driContextSetFlags() will a
The check is too aggressive and might also fail if context flags
appear after the no-error attribute in the context attribute list.
Delay the check to after attribute parsing to fix this.
---
This was found by the piglit test I just sent to the piglit ML. I promise,
next time I'll write tests befo
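A rough sketch of the "parse first, validate afterwards" idea from the commit message above. The attribute names, the flag value and the rule that no-error may not be combined with a debug flag are assumptions made for illustration; this is not the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical attribute names and values, for illustration only. */
enum sketch_attr { ATTR_NONE = 0, ATTR_FLAGS = 1, ATTR_NO_ERROR = 2 };
#define SKETCH_FLAG_DEBUG 0x1

/* Parse an ATTR_NONE-terminated list of name/value pairs. Returns false
 * if no-error is combined with the debug flag, regardless of the order
 * in which the two attributes appear. */
bool parse_context_attribs(const int *attribs)
{
   int flags = 0;
   bool no_error = false;

   assert(attribs != NULL);

   /* Pass 1: only record values, no cross-attribute checks yet. */
   for (size_t i = 0; attribs[i] != ATTR_NONE; i += 2) {
      switch (attribs[i]) {
      case ATTR_FLAGS:
         flags = attribs[i + 1];
         break;
      case ATTR_NO_ERROR:
         no_error = attribs[i + 1] != 0;
         break;
      default:
         return false; /* unknown attribute */
      }
   }

   /* Pass 2: validate combinations once the whole list is known. */
   if (no_error && (flags & SKETCH_FLAG_DEBUG))
      return false;

   return true;
}
```

With the check in the second pass, a list that names the flags after the no-error attribute is rejected just like one that names them before it.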
On 2017-07-17 19:21, Emil Velikov wrote:
On 13 July 2017 at 12:09, Grigori Goronzy wrote:
On 2017-07-12 15:15, Emil Velikov wrote:
As mentioned in an earlier commit, no_error should be device agnostic.
Hence removing the st/dri bits and adding a DRI_CONF_MESA_NO_ERROR()
line next to
classic drivers all have code to explicitly balk at unknown flags. We
need to let it through or they'll fail to create a no_error context.
I can't test it, but LGTM, so:
Reviewed-by: Grigori Goronzy
---
src/mesa/drivers/dri/i915/intel_screen.c | 2 +-
src/mesa/driver
On 2017-07-14 23:30, Kenneth Graunke wrote:
This accidentally set __DRI_CTX_FLAG_NO_ERROR whenever any flags were
present. Just needs extra parentheses.
Fixes: 4909519a6655 (egl: Add EGL_KHR_create_context_no_error support)
Reviewed-by: Grigori Goronzy
Sorry for breaking so much stuff
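The kind of precedence slip described above can be sketched as follows. The function names and flag values are hypothetical and the actual expression in the EGL code is not shown here; the point is only that `|` binds tighter than `?:` in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define SKETCH_FLAG_DEBUG    0x1u
#define SKETCH_FLAG_NO_ERROR 0x8u

/* Written without parentheses, `flags | no_error ? X : 0` parses like
 * this: the whole OR becomes the condition and the other flags are lost. */
uint32_t flags_buggy(uint32_t flags, bool no_error)
{
   return (flags | no_error) ? SKETCH_FLAG_NO_ERROR : 0;
}

/* Intended meaning: OR the optional no-error bit into the other flags. */
uint32_t flags_fixed(uint32_t flags, bool no_error)
{
   /* The caller passes only the other flags; no-error is added here. */
   assert((flags & SKETCH_FLAG_NO_ERROR) == 0);
   return flags | (no_error ? SKETCH_FLAG_NO_ERROR : 0);
}
```

In the buggy form, any non-zero flags value selects the no-error bit even when no_error is false.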
This was broken by commit 1ad24faa.
---
src/mesa/main/marshal.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/mesa/main/marshal.h b/src/mesa/main/marshal.h
index f2dc842..63e0295 100644
--- a/src/mesa/main/marshal.h
+++ b/src/mesa/main/marshal.h
@@ -257,7 +257,7 @@
This basic extension allows usage of the __DRI_CTX_FLAG_NO_ERROR flag.
This includes support code for classic Mesa drivers to switch on the
no-error mode if the flag is set.
v2: Move to common DRI code.
---
include/GL/internal/dri_interface.h | 19 +++
src/gallium/state_
This only adds the EGL side, needs to be plumbed into Mesa frontend.
v2: Add check for extension availability.
---
src/egl/drivers/dri2/egl_dri2.c | 20 ++--
src/egl/drivers/dri2/egl_dri2.h | 1 +
src/egl/main/eglapi.c | 1 +
src/egl/main/eglcontext.c | 31 ++
Add a new context flag and plumb it through the various layers of the
context creation code to set up dispatch tables for the no-error mode.
---
src/gallium/include/state_tracker/st_api.h | 1 +
src/gallium/state_trackers/dri/dri_context.c | 3 +++
src/mesa/state_tracker/st_context.c
Allows applications to be whitelisted.
v2: Remove misguided DRI common part.
---
src/gallium/state_trackers/dri/dri_context.c| 3 +++
src/gallium/state_trackers/dri/dri_screen.c | 1 +
src/mesa/drivers/dri/common/xmlpool/t_options.h | 5 +
3 files changed, 9 insertions(+)
diff --git
On 2017-07-12 15:15, Emil Velikov wrote:
As mentioned in an earlier commit, no_error should be device agnostic.
Hence removing the st/dri bits and adding a DRI_CONF_MESA_NO_ERROR()
line next to DRI_CONF_VBLANK_MODE seems like the better solution.
Hm, driconf overrides are typically set per screen
On 2017-07-12 15:08, Emil Velikov wrote:
On 11 July 2017 at 23:26, Grigori Goronzy wrote:
Add a new context flag and plumb it through the various layers of the
context creation code to set up dispatch tables for the no-error mode.
---
src/gallium/include/state_tracker/st_api.h | 1 +
src
On 2017-07-12 15:16, Emil Velikov wrote:
On 11 July 2017 at 23:26, Grigori Goronzy wrote:
Hi,
this series implements support for the EGL_KHR_create_context_no_error
extension and the associated plumbing through the different
layers of Mesa - EGL, DRI, Gallium state tracker, Mesa frontend. It
On 2017-07-12 12:33, Eric Engestrom wrote:
+ case EGL_CONTEXT_OPENGL_NO_ERROR_KHR:
+ if (dpy->Version < 14) {
+err = EGL_BAD_ATTRIBUTE;
+break;
+ }
+
+ /* The KHR_no_error spec only applies against OpenGL 2.0+ and
+ * OpenGL ES 2.0+
This basic extension allows usage of the __DRI_CTX_FLAG_NO_ERROR flag.
This includes support code for classic Mesa drivers to switch on the
no-error mode if the flag is set.
---
include/GL/internal/dri_interface.h | 19 +++
src/gallium/state_trackers/dri/dri2.c| 6
This only adds the EGL side, needs to be plumbed into Mesa frontend.
---
src/egl/drivers/dri2/egl_dri2.c | 20 ++--
src/egl/drivers/dri2/egl_dri2.h | 1 +
src/egl/main/eglapi.c | 1 +
src/egl/main/eglcontext.c | 30 ++
src/egl/main/eglc
Hi,
this series implements support for the EGL_KHR_create_context_no_error
extension and the associated plumbing through the different
layers of Mesa - EGL, DRI, Gallium state tracker, Mesa frontend. It
took me a while to figure out how everything is connected together
and still it's somewhat conf
Allows applications to be whitelisted.
---
src/gallium/state_trackers/dri/dri_context.c| 3 +++
src/gallium/state_trackers/dri/dri_screen.c | 1 +
src/mesa/drivers/dri/common/dri_util.c | 3 +++
src/mesa/drivers/dri/common/xmlpool/t_options.h | 5 +
4 files changed, 12 inserti
The semantics are similar to glBufferData. Fixes a crash with VMware
Player.
Signed-off-by: Grigori Goronzy
---
src/mesa/main/marshal.c | 17 +
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/src/mesa/main/marshal.c b/src/mesa/main/marshal.c
index 8db4531..b801bdc
On 2017-06-26 15:51, Marc Dietrich wrote:
Am Montag, 26. Juni 2017, 15:35:15 CEST schrieb Grigori Goronzy:
On 2017-06-26 15:11, Marc Dietrich wrote:
> unfortunately, this change broke vmware/vmplayer here (bisected).
> Windows guest on Linux host. Sig 11 in SVGA driver. Al
On 2017-07-09 18:52, Matt Turner wrote:
+static inline size_t buffer_to_size(GLenum buffer)
+{
+ switch (buffer) {
+ case GL_COLOR:
+ return 4;
+ case GL_DEPTH_STENCIL:
+ return 2;
+ case GL_STENCIL:
+ case GL_DEPTH:
+ return 1;
+ default:
+ return 0;
+ }
+}
+
+s
Add async marshalling/unmarshalling for all glClearBuffer variants.
These entry points are commonly used in general, and Alien Isolation
specifically uses glClearBufferiv. This slightly reduces the number of
thread synchronizations with glthread in that game.
---
src/mapi/glapi/gen/GL3x.xml | 6 +-
sr
Extract clear buffer helper functions in preparation for adding
marshal/unmarshal functions for the various glClearBuffer variants.
---
src/mesa/main/marshal.c | 74 +++--
src/mesa/main/marshal.h | 5 ++--
2 files changed, 50 insertions(+), 29 deletions
turns the switch/case block into an
efficient jump table with the ID method, so using an array for function
lookup instead doesn't improve anything.
I didn't see any measurable benefit of the function pointer method
either.
Best regards
Grigori
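For readers unfamiliar with the two dispatch styles compared above, here is a small sketch. The command IDs and handlers are hypothetical and not taken from Mesa's generated marshalling code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command IDs and handlers, not taken from Mesa. */
enum cmd_id { CMD_ADD_ONE, CMD_DOUBLE, CMD_COUNT };

static int handle_add_one(int x) { return x + 1; }
static int handle_double(int x) { return x * 2; }

/* ID method: a compiler is free to lower a dense switch to a jump table. */
int dispatch_switch(enum cmd_id id, int x)
{
   switch (id) {
   case CMD_ADD_ONE: return handle_add_one(x);
   case CMD_DOUBLE:  return handle_double(x);
   default:          return x;
   }
}

/* Array method: an explicit table of function pointers indexed by ID. */
typedef int (*cmd_handler)(int);

static const cmd_handler handlers[CMD_COUNT] = {
   [CMD_ADD_ONE] = handle_add_one,
   [CMD_DOUBLE]  = handle_double,
};

int dispatch_table(enum cmd_id id, int x)
{
   assert((unsigned)id < CMD_COUNT);
   return handlers[id](x);
}
```

Both functions do the same work per command, which is consistent with the observation that swapping one for the other makes no measurable difference.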
On Fri, Jun 30, 2017 at 7:14 PM,
On 2017-06-30 15:27, Nicolai Hähnle wrote:
On 30.06.2017 02:29, Grigori Goronzy wrote:
Use function pointers to identify the unmarshalling function, which
is simpler and gets rid of a lot of generated code.
This removes an indirection and possibly results in a slight speedup
as well.
The fact
Use function pointers to identify the unmarshalling function, which
is simpler and gets rid of a lot of generated code.
This removes an indirection and possibly results in a slight speedup
as well.
---
src/mapi/glapi/gen/Makefile.am | 4 --
src/mapi/glapi/gen/gl_marshal.py | 36 ++
don't really get it, by the way. Isn't the SVGA driver for Linux
guests?
Best regards
Grigori
> Best regards
> Grigori
>
>> [1]
>> https://lists.freedesktop.org/archives/mesa-dev/2017-June/160329.html
>>
>> On 25/06/17 02:59, Grigori Goronzy wrot
On 2017-06-22 17:10, Marek Olšák wrote:
From: Marek Olšák
+2.3% better score on Fiji. It might be better without HBM.
Is this really useful? Superposition is a benchmark. It would make more
sense if this also targeted some actual games.
Optimizations specific to only benchmarks are considere
ow much. It wouldn't surprise me if it is in the
40-50% region with both, though.
Best regards
Grigori
[1]
https://lists.freedesktop.org/archives/mesa-dev/2017-June/160329.html
On 25/06/17 02:59, Grigori Goronzy wrote:
These entry points are used by Alien Isolation and caused
synchroni
These entry points are used by Alien Isolation and caused
synchronization with glthread. The async marshalling implementation
is similar to glBuffer(Sub)Data.
Results in an approximately 6x drop in glthread synchronizations and a
~30% FPS jump in Alien Isolation (Medium preset, Athlon 860K, RX 480
On 2017-06-23 13:48, Andy Furniss wrote:
Marek Olšák wrote:
From: Marek Olšák
The kernel sort of does the same thing with fences.
v2: do emit partial flushes on SI
Bugzilla seems to be down currently so replying here.
On R9 285 with current agd5f 4.13-wip kernel I get some slight artifacts
n the end,
BEST_SPEED might be a better compromise, particularly for systems with a
slow CPU.
Apart from that, consider the series
Reviewed-by: Grigori Goronzy
Best regards
Grigori
Am Donnerstag, 2. März 2017, 03:20:05 CET schrieb Matt Turner:
On Wed, Mar 1, 2017 at 2:19 PM, Timothy Arceri
wrot
On 2016-10-04 12:32, Emil Velikov wrote:
On 2 October 2016 at 14:17, Axel Davy wrote:
I'd prefer Oct 14 myself, because we have a lot of patches for nine, and
they deserve more cleaning and testing, but if it's Oct 7, we'll try to
be on time.
14th it is. As mentioned before: _don't_ wait for t
---
src/amd/vulkan/radv_descriptor_set.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/amd/vulkan/radv_descriptor_set.c
b/src/amd/vulkan/radv_descriptor_set.c
index d1d2b1f..ba8a002 100644
--- a/src/amd/vulkan/radv_descriptor_set.c
+++ b/src/amd/vulkan/radv_descriptor_set.c
@@ -113,6 +1
---
src/amd/vulkan/radv_pipeline_cache.c | 7 +--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/src/amd/vulkan/radv_pipeline_cache.c
b/src/amd/vulkan/radv_pipeline_cache.c
index 032a7e4..85a2b6d 100644
--- a/src/amd/vulkan/radv_pipeline_cache.c
+++ b/src/amd/vulkan/radv_pipeli
This gets rid of "may be used uninitialized" compiler warnings.
---
src/amd/vulkan/radv_formats.c | 2 +-
src/amd/vulkan/radv_pipeline.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/amd/vulkan/radv_formats.c b/src/amd/vulkan/radv_formats.c
index 90c140c..76d5fa1 1006
On 2016-06-28 11:25, Nayan Deshmukh wrote:
This is a shader-based bicubic interpolator which uses the cubic
Hermite spline algorithm.
v2: set dst_area and dst_clip during scaling (Christian)
v3: clear the render target before rendering
v4: initialize offsets while initializing shaders
use a const
On 2016-05-27 15:16, Emil Velikov wrote:
The odd thing is that VLC uses/used to? check that information before
feeding the video to the decoder, while other implementations (like
the original one in mplayer done by the Nvidia devs) do/did? not
bother.
Many files either have an incorrect leve
thout any calls into the kernel, right? The
winsys code makes that conditional and calls into the kernel when no
fence pointer is available.
Grigori
On 19.04.2016 18:13, Grigori Goronzy wrote:
Small IBs help to reduce stalls for workloads that require a lot of
synchronization. On the other han
Add missing break, add default case. Additionally initialize variables
to avoid compiler warnings.
---
src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
b/src/gallium/winsys/amdgpu/drm/amdgp
Small IBs help to reduce stalls for workloads that require a lot of
synchronization. On the other hand, if there is no notable
synchronization, we can use a large IB size to slightly improve
performance in some cases.
This introduces tuning of the IB size based on feedback on the average
buffer wa
Interesting, and thanks for poking at this issue. I've been thinking
about tuning IB sizes as well. I'd like for us to get this right, so I
wonder: What's your theory for _why_ your change helps?
See below. I think you discovered it yourself.
I'll be honest with you: Right now, I think your a
On 2016-04-15 20:30, Jakob Sinclair wrote:
In other places in radeonsi that require reinterpretation (e.g.
si_blit.c), the surface template is modified instead of changing the
surface after creation. I'm not sure if r600/radeonsi like it if the
format is changed late like here. Seems to be cleane
Hi,
apps that cause a lot of synchronization benefit from small IB
sizes. The current IB size is a bit on the large side for this class
of apps. On the other hand, if there isn't much synchronization going
on, increasing the IB size can slightly improve performance, too.
Here's a quick hack that
On 2016-04-15 18:38, Ilia Mirkin wrote:
+ } else {
+ union pipe_color_union color;
+ switch (util_format_get_blocksizebits(res->format)) {
+ case 128:
+ sf->format = PIPE_FORMAT_R32G32B32A32_UINT;
Just as an FYI... this is sa
e case: Only 1 viewport is active. */
- if (mask & 1 &&
- !si_get_vs_info(sctx)->writes_viewport_index) {
+ if (!si_get_vs_info(sctx)->writes_viewport_index) {
+ if (!(mask & 1))
+ return;
+
Reviewed-by: Grigori Goronzy
ssor & viewport code is deleted.
Thanks for implementing this properly.
Reviewed-by: Grigori Goronzy
Grigori
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
With the previous changes to handling of viewport clipping, it is
almost trivial to add proper support for guard band clipping. Select a
suitable integer clipping value to keep inside the rasterizer's guard
band range of [-32768, 32767] and program the hardware to use guard
band clipping.
Guard b
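One way to pick an integer clipping value as described above, sketched under assumptions: the viewport is given as a per-axis scale and translate, and the helper name guard_band_factor is hypothetical. How the actual patch selects the value is not shown in this excerpt.

```c
#include <assert.h>

#define GB_MIN (-32768.0f)
#define GB_MAX (32767.0f)

/* Hypothetical helper: largest integer factor k >= 1 such that
 * [translate - k*|scale|, translate + k*|scale|] stays inside the
 * guard band range along one axis. */
int guard_band_factor(float scale, float translate)
{
   float half = scale < 0.0f ? -scale : scale;

   assert(half > 0.0f);

   float room_hi = (GB_MAX - translate) / half;
   float room_lo = (translate - GB_MIN) / half;
   float room = room_hi < room_lo ? room_hi : room_lo;

   /* Keep the float-to-int conversion in range for tiny viewports. */
   if (room > 32768.0f)
      room = 32768.0f;

   /* Truncation equals floor for non-negative values; anything below 1
    * means no extra room, so fall back to the viewport itself. */
   int k = (int)room;
   return k < 1 ? 1 : k;
}
```

For a 1920 pixel wide viewport starting at x = 0 (scale 960, translate 960), this yields a factor of 33.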
From: Marek Olšák
In other words, vport scissors are derived from viewport states.
If the scissor test is enabled, the intersection of both is used.
The guard band will disable clipping, so we have to clip per-pixel.
v2: fix check for r600_draw_rectangle and other overflow conditions.
(Grigori)
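The intersection rule above can be sketched like this. The rectangle type and function names are hypothetical, and the viewport is assumed to have already been converted to an integer pixel rectangle.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical rectangle type: [minx, maxx) x [miny, maxy) in pixels. */
struct rect { int minx, miny, maxx, maxy; };

static int imin(int a, int b) { return a < b ? a : b; }
static int imax(int a, int b) { return a > b ? a : b; }

/* Scissor actually programmed for a viewport: the viewport rectangle
 * itself, or its intersection with the user scissor when the scissor
 * test is enabled. */
struct rect effective_scissor(struct rect vp, struct rect scissor,
                              bool scissor_enabled)
{
   assert(vp.minx <= vp.maxx && vp.miny <= vp.maxy);

   if (!scissor_enabled)
      return vp;

   struct rect r = {
      imax(vp.minx, scissor.minx), imax(vp.miny, scissor.miny),
      imin(vp.maxx, scissor.maxx), imin(vp.maxy, scissor.maxy),
   };

   /* Empty intersection: collapse to a zero-sized rectangle. */
   if (r.maxx < r.minx)
      r.maxx = r.minx;
   if (r.maxy < r.miny)
      r.maxy = r.miny;

   return r;
}
```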
On 2016-02-23 17:45, Marek Olšák wrote:
From: Marek Olšák
This can increase perf for shaders that kill pixels (kill, alpha-test,
alpha-to-coverage).
---
src/gallium/drivers/radeonsi/si_shader.h| 1 +
src/gallium/drivers/radeonsi/si_state.c | 6 +++---
src/gallium/drivers/rade
On 2016-02-24 12:47, Marek Olšák wrote:
On Wed, Feb 24, 2016 at 12:22 PM, Grigori Goronzy wrote:
S_00B32C_SCRATCH_EN(shader->config.scratch_bytes_per_wave > 0));
+
+ /* Prefer RE_Z if the shader is complex enough. */
+ if (info->num_memory_instructions >= 2 ||
+
Hi,
On 23.09.2015 10:11, Christian König wrote:
> From: Boyuan Zhang
>
> Signed-off-by: Boyuan Zhang
> Reviewed-by: Christian König
> ---
Thanks, nice to see this finally getting fixed, and it was a pretty
simple thing after all... well, not quite yet apparently. Sometimes
playback works corr
On 2015-06-09 22:52, Francisco Jerez wrote:
+
+ if (blocking)
+ hev().wait();
+
hard_event::wait() may fail, so this should probably be done before the
ret_object() call to avoid leaks.
Alright... C++ exceptions are a minefield. :)
Is there any reason you didn't make
the same change
On 2015-05-28 13:04, Grigori Goronzy wrote:
Work-group size should always be aligned to subgroup size; this is a
basic requirement, otherwise some work-items will be no-ops.
It might make sense to refine the value according to a kernel's
resource usage, but that's a possible op
On 28.05.2015 10:10, Grigori Goronzy wrote:
> Wrap MapBuffer and MapImage as hard_event actions, like other
> operations. This enables correct profiling. Also make sure to wait
> for events to finish when blocking is requested by the caller.
> ---
Ping?
> src/gallium/state_trac
On 28.05.2015 13:04, Grigori Goronzy wrote:
> We need this to implement OpenCL's
> CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE.
> ---
Ping?
> src/gallium/docs/source/screen.rst | 2 ++
> src/gallium/drivers/ilo/ilo_screen.c | 8
> src/
We need this to implement OpenCL's
CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE.
---
src/gallium/docs/source/screen.rst | 2 ++
src/gallium/drivers/ilo/ilo_screen.c | 8
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 4
src/gallium/drivers/radeon/r600_pipe_
Work-group size should always be aligned to subgroup size; this is a
basic requirement, otherwise some work-items will be no-ops.
It might make sense to refine the value according to a kernel's
resource usage, but that's a possible optimization for the future.
---
src/gallium/state_trackers
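A small sketch of the alignment requirement from the commit message above. The helper names are hypothetical and this is not the clover code; it only shows the round-up arithmetic and how many work-items would be idle padding.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of subgroup_size. */
size_t align_to_subgroup(size_t size, size_t subgroup_size)
{
   assert(subgroup_size > 0);
   return (size + subgroup_size - 1) / subgroup_size * subgroup_size;
}

/* Number of work-items that would be idle padding for a given size. */
size_t idle_work_items(size_t size, size_t subgroup_size)
{
   return align_to_subgroup(size, subgroup_size) - size;
}
```

With a subgroup size of 64, a work-group of 100 items occupies two subgroups and leaves 28 lanes idle, while 128 items leave none.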
Mapping can fail, and this should be handled. Return the proper error
code and abort the associated event in this case.
---
src/gallium/state_trackers/clover/api/transfer.cpp | 16 ++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/src/gallium/state_trackers/clover/api
Wrap MapBuffer and MapImage as hard_event actions, like other
operations. This enables correct profiling. Also make sure to wait
for events to finish when blocking is requested by the caller.
---
src/gallium/state_trackers/clover/api/transfer.cpp | 50 --
1 file changed, 46 ins
same issues as SI? We should really
try to figure out what's wrong with tiled DMA copies.
Anyway,
Reviewed-by: Grigori Goronzy
> Signed-off-by: Michel Dänzer
> ---
> src/gallium/drivers/radeonsi/Makefile.sources | 1 +
> src/gallium/drivers/radeonsi/cik_sdma.c | 364
On 23.05.2015 15:53, Francisco Jerez wrote:
>> diff --git a/src/gallium/state_trackers/clover/core/resource.cpp
>> b/src/gallium/state_trackers/clover/core/resource.cpp
>> index 8ed4c42..8e51b3c 100644
>> --- a/src/gallium/state_trackers/clover/core/resource.cpp
>> +++ b/src/gallium/state_trackers
According to spec, CL_MEM_USE_HOST_PTR should directly use host memory,
if possible. This is just what userptr is for, so use it.
In case the memory cannot be mapped, a fallback similar to
CL_MEM_COPY_HOST_PTR is used.
---
src/gallium/state_trackers/clover/core/memory.cpp | 2 +-
src/gallium/s
This flag is typically used to request pinned host memory, to avoid
any copies between GPU and CPU.
This improves throughput with an older OpenCL app which I unfortunately
can't publish due to its licensing.
---
src/gallium/state_trackers/clover/core/resource.cpp | 4
1 file changed, 4 inser
Am 2015-02-18 09:13, schrieb Michel Dänzer:
On 18.02.2015 16:52, Grigori Goronzy wrote:
Hi,
AFAIR not enabling this makes LLVM generate really slow code in some
common cases. Maybe this is just a bug in LLVM/R600 triggered by unsafe
FP math optimization or some optimization is too eager
Hi,
AFAIR not enabling this makes LLVM generate really slow code in some
common cases. Maybe this is just a bug in LLVM/R600 triggered by unsafe
FP math optimization or some optimization is too eager. Other drivers do
fine with these types of optimization.
What's the impact on performance with un
Reviewed-by: Grigori Goronzy
I've been using a similar patch to fix stability issues on my machine
for quite a while. Still, it's a pity we have to go that far to get
everything stable again.
On 13.11.2014 07:52, Michel Dänzer wrote:
> From: Michel Dänzer
>
> Using the asyn
On 30.09.2014 05:58, Michel Dänzer wrote:
> diff --git a/src/gallium/drivers/radeonsi/si_dma.c
> b/src/gallium/drivers/radeonsi/si_dma.c
> index ff64722..643ce3f 100644
> --- a/src/gallium/drivers/radeonsi/si_dma.c
> +++ b/src/gallium/drivers/radeonsi/si_dma.c
> @@ -251,7 +251,9 @@ void si_dma_cop
LGTM, but I have a few comments below.
Grigori
On 10.09.2014 10:54, Michel Dänzer wrote:
> From: Michel Dänzer
>
> Signed-off-by: Michel Dänzer
> ---
>
> This might help for investigating DMA related bugs.
>
> src/gallium/drivers/radeonsi/si_dma.c | 103
> ++
>
On 08.09.2014 21:07, Axel Davy wrote:
> On 08/09/2014 20:21, Grigori Goronzy wrote :
>> On 08.09.2014 14:50, Axel Davy wrote:
>>> Hi,
>>>
>>> When reading si_dma.c code, it looks like the requested width of the
>>> copy is ignored except for PIPE_BUFFER
On 08.09.2014 14:50, Axel Davy wrote:
> Hi,
>
> When reading si_dma.c code, it looks like the requested width of the
> copy is ignored except for PIPE_BUFFER.
> Perhaps that explains the bugs observed ?
>
It isn't ignored. Partial DMA copies (i.e. operations that do not copy
whole lines) are simp
On 29.08.2014 12:31, Andy Furniss wrote:
>> As for that 4:2:2 "doesn't work", AFAICT it absolutely does, but
>> there is no linear interpolation for chroma, so quality isn't ideal.
>> This seems to be a hardware restriction, unfortunately.
>
> Hmm, we may have to disagree on the definition of work
On 29.08.2014 10:19, Christian König wrote:
>
> That sounds like something doesn't work correctly.
>
> The resources are created with the subsampled formats R8G8_R8B8 or
> G8R8_B8R8, but since this can't be accessed by the CB we need to use
> R8G8B8A8 as surface format for writing to them.
>
> If
On 04.07.2014 01:24, Andy Furniss wrote:
> Maybe not 1/frame but anyway the first couple of a run have numbers
> rather than s
>
> [27977.386795] radeon :01:00.0: GPU fault detected: 146 0x0c035014
> [27977.386800] radeon :01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR
> 0x15E0
> [
Passes all piglit tests.
v2: rebased
---
src/gallium/drivers/radeonsi/si_state.c | 20
1 file changed, 20 insertions(+)
diff --git a/src/gallium/drivers/radeonsi/si_state.c
b/src/gallium/drivers/radeonsi/si_state.c
index 6e9a60a..4f7adea 100644
--- a/src/gallium/drivers/rad
Passes corrected piglit test and should also handle signed vs unsigned
float correctly.
---
src/gallium/drivers/radeonsi/si_state.c | 20
1 file changed, 20 insertions(+)
diff --git a/src/gallium/drivers/radeonsi/si_state.c
b/src/gallium/drivers/radeonsi/si_state.c
index 3de
On 17.07.2014 21:24, Tom Stellard wrote:
> On Thu, Jul 17, 2014 at 06:44:25PM +0200, Grigori Goronzy wrote:
>> Accuracy of some operations was recently improved in the R600 backend,
>> at the cost of slower code. This is required for compute shaders,
>> but not for graphics s
On 18.07.2014 13:45, Marek Olšák wrote:
> If the requirements of GL_MAP_COHERENT_BIT are satisfied, then the
> patch is okay.
>
Apart from correctness, I still wonder how this will affect performance,
most notably CPU reads. This change unconditionally uses write-combined,
uncached memory for MAP_
Use K&R and same indent as most other code. No functional change
intended.
---
src/gallium/drivers/radeon/radeon_llvm_emit.c | 24 ++--
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c
b/src/gallium/drivers/radeon/ra
Accuracy of some operations was recently improved in the R600 backend,
at the cost of slower code. This is required for compute shaders,
but not for graphics shaders. Add unsafe-fp-math hint to make LLVM
generate faster but possibly less accurate code.
Piglit didn't indicate any regressions.
---
On 17.07.2014 12:01, Michel Dänzer wrote:
> From: Michel Dänzer
>
> This is hopefully safe: The kernel makes sure writes to these mappings
> finish before the GPU might start reading from them, and the GPU caches
> are invalidated at the start of a command stream.
>
Aren't CPU reads from write-c
On 02.07.2014 22:18, Andy Furniss wrote:
>
> Before I knew how to get field sync to use my TVs deinterlacer I had to
> modify mesa so that I could use the vdpau de-interlacer(s), when I did
> this I noticed that 422 didn't work and looked the same as it does now
> this has gone in with my si.
>
A
> This looks good to me.
>>
>> Reviewed-by: Marek Olšák
>>
>> Marek
>>
>> On Wed, Jun 4, 2014 at 6:54 PM, Grigori Goronzy
>> wrote:
>>> This makes 4:2:2 video surfaces work in VDPAU.
>>> ---
>>> src/gal
Ping? I'm not sure if this is completely correct, but this code path is
only exercised by VDPAU and it seems to work fine on SI.
Grigori
On 04.06.2014 18:54, Grigori Goronzy wrote:
> This makes 4:2:2 video surfaces work in VDPAU.
> ---
> src/gallium/drivers/radeon/r600_texture.c
It's about as broken as on later UVD revisions.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66452
Cc: "10.1 10.2"
---
src/gallium/drivers/radeon/radeon_video.c | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/src/gallium/drivers/radeon/radeon_video.c
b/src/gall
This makes 4:2:2 video surfaces work in VDPAU.
---
src/gallium/drivers/radeon/r600_texture.c | 5 +-
src/gallium/drivers/radeonsi/si_blit.c| 91 ++-
src/gallium/drivers/radeonsi/si_state.c | 15 +
3 files changed, 71 insertions(+), 40 deletions(-)
diff --git
We need this for radeonsi, and it might be useful for other drivers,
too.
---
src/gallium/auxiliary/util/u_format.c | 11 +++
src/gallium/auxiliary/util/u_format.h | 3 +++
src/gallium/drivers/r600/r600_blit.c | 12 +---
3 files changed, 15 insertions(+), 11 deletions(-)
diff --
On 20.04.2014 03:02, Marek Olšák wrote:
It looks like the check is not needed with SB, because SB performs
register allocation. What happens if you comment out the conditional
which fails?
SB takes the machine code generated by the "classic" compiler as input,
so the check is still needed. Th
On 10.04.2014 11:23, Michel Dänzer wrote:
From: Michel Dänzer
---
This is just an RFC; if other developers approve of this approach, I can
make a more extensive patch removing the use_reusable_pool parameters.
The x11perf numbers below compare ShmGet/PutImage before and after this
change with
---
src/gallium/state_trackers/vdpau/mixer.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/gallium/state_trackers/vdpau/mixer.c
b/src/gallium/state_trackers/vdpau/mixer.c
index 996fd8e..e6bfb8c 100644
--- a/src/gallium/state_trackers/vdpau/mixer.c
+++ b/src/galli
The spec incorrectly used void as return type, when it should have
been GLboolean. This has now been fixed. According to Nvidia, their
implementation always used GLboolean.
---
include/GL/glext.h | 2 +-
src/mapi/glapi/gen/NV_vdpau_interop.xml | 1 +
src/mesa/main/vdpau.c