[Mesa-dev] [Bug 70410] egl-static/Makefile: linking fails with llvm >= 3.4
https://bugs.freedesktop.org/show_bug.cgi?id=70410

Mike Lothian changed:

What|Removed |Added
CC||m...@fireburn.co.uk

--- Comment #5 from Mike Lothian ---
The Gentoo bug is https://bugs.gentoo.org/show_bug.cgi?id=481316

I've re-added llvm--r1 to my overlay with --disable-terminfo until the problem is fixed.

I'm pretty sure this is a bug in LLVM (probably llvm-config), or maybe in the way we compile LLVM in Gentooland.

Either way, I think this bug should be closed.

--
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 70410] egl-static/Makefile: linking fails with llvm >= 3.4
https://bugs.freedesktop.org/show_bug.cgi?id=70410

--- Comment #6 from Mike Lothian ---
It seems that compiling Mesa with --with-llvm-shared-libs whenever LLVM is used, rather than only with OpenCL, also fixes the issue (at least for me).
Re: [Mesa-dev] [PATCH 1/2] gallium: add PIPE_CAP_MIXED_FRAMEBUFFER_SIZES
For the series: Reviewed-by: Marek Olšák Marek On Sun, Oct 13, 2013 at 3:43 AM, Ilia Mirkin wrote: > ping > > On Fri, Oct 4, 2013 at 4:32 AM, Ilia Mirkin wrote: >> This CAP will determine whether ARB_framebuffer_object can be enabled. >> The nv30 driver does not allow mixing swizzled and linear zsbuf/cbuf >> textures. >> >> Signed-off-by: Ilia Mirkin >> --- >> src/gallium/docs/source/screen.rst | 3 +++ >> src/gallium/drivers/freedreno/freedreno_screen.c | 1 + >> src/gallium/drivers/i915/i915_screen.c | 1 + >> src/gallium/drivers/ilo/ilo_screen.c | 1 + >> src/gallium/drivers/llvmpipe/lp_screen.c | 1 + >> src/gallium/drivers/nouveau/nv30/nv30_screen.c | 1 + >> src/gallium/drivers/nouveau/nv50/nv50_screen.c | 1 + >> src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 1 + >> src/gallium/drivers/r300/r300_screen.c | 1 + >> src/gallium/drivers/r600/r600_pipe.c | 1 + >> src/gallium/drivers/radeonsi/radeonsi_pipe.c | 1 + >> src/gallium/drivers/softpipe/sp_screen.c | 1 + >> src/gallium/drivers/svga/svga_screen.c | 1 + >> src/gallium/include/pipe/p_defines.h | 3 ++- >> 14 files changed, 17 insertions(+), 1 deletion(-) >> >> diff --git a/src/gallium/docs/source/screen.rst >> b/src/gallium/docs/source/screen.rst >> index d19cd1a..a01f548 100644 >> --- a/src/gallium/docs/source/screen.rst >> +++ b/src/gallium/docs/source/screen.rst >> @@ -173,6 +173,9 @@ The integer capabilities: >>viewport/scissor combination. >> * ''PIPE_CAP_ENDIANNESS``:: The endianness of the device. Either >>PIPE_ENDIAN_BIG or PIPE_ENDIAN_LITTLE. >> +* ``PIPE_CAP_MIXED_FRAMEBUFFER_SIZES``: Whether it is allowed to have >> + different sizes for fb color/zs attachments. This controls whether >> + ARB_framebuffer_object is provided. >> >> >> .. 
_pipe_capf: >> diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c >> b/src/gallium/drivers/freedreno/freedreno_screen.c >> index a038a77..7d0fb3b 100644 >> --- a/src/gallium/drivers/freedreno/freedreno_screen.c >> +++ b/src/gallium/drivers/freedreno/freedreno_screen.c >> @@ -140,6 +140,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum >> pipe_cap param) >> switch (param) { >> /* Supported features (boolean caps). */ >> case PIPE_CAP_NPOT_TEXTURES: >> + case PIPE_CAP_MIXED_FRAMEBUFFER_SIZES: >> case PIPE_CAP_TWO_SIDED_STENCIL: >> case PIPE_CAP_ANISOTROPIC_FILTER: >> case PIPE_CAP_POINT_SPRITE: >> diff --git a/src/gallium/drivers/i915/i915_screen.c >> b/src/gallium/drivers/i915/i915_screen.c >> index 556dda8..77607d0 100644 >> --- a/src/gallium/drivers/i915/i915_screen.c >> +++ b/src/gallium/drivers/i915/i915_screen.c >> @@ -172,6 +172,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap >> cap) >> /* Supported features (boolean caps). */ >> case PIPE_CAP_ANISOTROPIC_FILTER: >> case PIPE_CAP_NPOT_TEXTURES: >> + case PIPE_CAP_MIXED_FRAMEBUFFER_SIZES: >> case PIPE_CAP_POINT_SPRITE: >> case PIPE_CAP_PRIMITIVE_RESTART: /* draw module */ >> case PIPE_CAP_TEXTURE_SHADOW_MAP: >> diff --git a/src/gallium/drivers/ilo/ilo_screen.c >> b/src/gallium/drivers/ilo/ilo_screen.c >> index 3f8d431..ddf11ff 100644 >> --- a/src/gallium/drivers/ilo/ilo_screen.c >> +++ b/src/gallium/drivers/ilo/ilo_screen.c >> @@ -286,6 +286,7 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap >> param) >> >> switch (param) { >> case PIPE_CAP_NPOT_TEXTURES: >> + case PIPE_CAP_MIXED_FRAMEBUFFER_SIZES: >> case PIPE_CAP_TWO_SIDED_STENCIL: >>return true; >> case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS: >> diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c >> b/src/gallium/drivers/llvmpipe/lp_screen.c >> index b3cd77f..2bbc2c9 100644 >> --- a/src/gallium/drivers/llvmpipe/lp_screen.c >> +++ b/src/gallium/drivers/llvmpipe/lp_screen.c >> @@ -109,6 +109,7 @@ 
llvmpipe_get_param(struct pipe_screen *screen, enum >> pipe_cap param) >> case PIPE_CAP_MAX_COMBINED_SAMPLERS: >>return 2 * PIPE_MAX_SAMPLERS; /* VS + FS samplers */ >> case PIPE_CAP_NPOT_TEXTURES: >> + case PIPE_CAP_MIXED_FRAMEBUFFER_SIZES: >>return 1; >> case PIPE_CAP_TWO_SIDED_STENCIL: >>return 1; >> diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c >> b/src/gallium/drivers/nouveau/nv30/nv30_screen.c >> index 50ddfec..807100e 100644 >> --- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c >> +++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c >> @@ -125,6 +125,7 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum >> pipe_cap param) >> case PIPE_CAP_QUERY_PIPELINE_STATISTICS: >> case PIPE_CAP_TEXTURE_BORDER_COLOR_QUIRK: >> case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE: >> + case PIPE_CAP_MIXED_FRAMEBUFFER_SIZES: >>return 0; >> case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY: >> case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
[Mesa-dev] [Bug 70402] SIGSEGV when selecting polygons with i915 (libdricore9.2.0.so)
https://bugs.freedesktop.org/show_bug.cgi?id=70402

Igor Gnatenko changed:

What|Removed |Added
CC||i.gnatenko.br...@gmail.com
[Mesa-dev] [Bug 70410] egl-static/Makefile: linking fails with llvm >= 3.4
https://bugs.freedesktop.org/show_bug.cgi?id=70410

--- Comment #7 from David "okias" Heidelberger ---
Yesterday I added an llvm with configurable [ncurses,tinfo] to the ixit overlay. I'm using gcc-4.9-git, but it was the same with gcc-4.8.0.
[Mesa-dev] [PATCH 2/3] radeon: use staging for mapping linear textures
Textures that likely reside in VRAM, are mapped for reading and don't
require direct mapping should be staged into GTT, to avoid bad
performance. This fixes readback performance of VDPAU surfaces.
---
 src/gallium/drivers/radeon/r600_texture.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c
index ebb70906..9ba1e36 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -852,6 +852,12 @@ static void *r600_texture_transfer_map(struct pipe_context *ctx,
     if (rtex->surface.level[level].mode >= RADEON_SURF_MODE_1D)
         use_staging_texture = TRUE;
 
+    /* Untiled buffers in VRAM, which is slow for CPU reads */
+    if ((usage & PIPE_TRANSFER_READ) && !(usage & PIPE_TRANSFER_MAP_DIRECTLY) &&
+        (rtex->resource.domains == RADEON_DOMAIN_VRAM)) {
+        use_staging_texture = TRUE;
+    }
+
     /* Use a staging texture for uploads if the underlying BO is busy. */
     if (!(usage & PIPE_TRANSFER_READ) &&
         (r600_rings_is_buffer_referenced(rctx, rtex->resource.cs_buf, RADEON_USAGE_READWRITE) ||
--
1.8.1.2
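The condition this patch adds can be read on its own as a small predicate. The sketch below is a stand-alone illustration and not the driver code: the flag values, the `DOMAIN_*` constants and the `should_stage_for_read` helper are hypothetical stand-ins for `PIPE_TRANSFER_READ`, `PIPE_TRANSFER_MAP_DIRECTLY` and `RADEON_DOMAIN_VRAM`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the gallium/radeon flags; the values are made up. */
enum {
    XFER_READ         = 1 << 0,  /* plays the role of PIPE_TRANSFER_READ */
    XFER_WRITE        = 1 << 1,
    XFER_MAP_DIRECTLY = 1 << 2   /* plays the role of PIPE_TRANSFER_MAP_DIRECTLY */
};

enum {
    DOMAIN_GTT  = 1 << 0,
    DOMAIN_VRAM = 1 << 1         /* plays the role of RADEON_DOMAIN_VRAM */
};

/*
 * Returns true when a linear texture should be copied into a staging
 * texture before the CPU maps it: the mapping is a read, the caller did
 * not insist on a direct mapping, and the resource is placed in VRAM only.
 */
static bool should_stage_for_read(unsigned usage, unsigned domains)
{
    /* Every resource in this sketch lives in at least one domain. */
    assert(domains != 0);

    return (usage & XFER_READ) &&
           !(usage & XFER_MAP_DIRECTLY) &&
           domains == (unsigned)DOMAIN_VRAM;
}
```

Note that, as in the patch, the domain test is an equality check, so a resource that may also live in GTT is not forced through the staging path.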
[Mesa-dev] [PATCH 1/3] radeon/uvd: use PIPE_BIND_LINEAR for video surfaces
This new bind flag forces linear storage, but does not have other side effects like R600_RESOURCE_FLAG_TRANSFER. --- src/gallium/drivers/r600/r600_uvd.c | 6 +++--- src/gallium/drivers/radeonsi/radeonsi_uvd.c | 8 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/r600/r600_uvd.c b/src/gallium/drivers/r600/r600_uvd.c index 05d2ad0..300bccb 100644 --- a/src/gallium/drivers/r600/r600_uvd.c +++ b/src/gallium/drivers/r600/r600_uvd.c @@ -77,7 +77,7 @@ struct pipe_video_buffer *r600_video_buffer_create(struct pipe_context *pipe, vl_video_buffer_template(&templ, &template, resource_formats[0], 1, array_size, PIPE_USAGE_STATIC, 0); if (ctx->b.chip_class < EVERGREEN || tmpl->interlaced) - templ.flags = R600_RESOURCE_FLAG_TRANSFER; + templ.bind = PIPE_BIND_LINEAR; resources[0] = (struct r600_texture *) pipe->screen->resource_create(pipe->screen, &templ); if (!resources[0]) @@ -86,7 +86,7 @@ struct pipe_video_buffer *r600_video_buffer_create(struct pipe_context *pipe, if (resource_formats[1] != PIPE_FORMAT_NONE) { vl_video_buffer_template(&templ, &template, resource_formats[1], 1, array_size, PIPE_USAGE_STATIC, 1); if (ctx->b.chip_class < EVERGREEN || tmpl->interlaced) - templ.flags = R600_RESOURCE_FLAG_TRANSFER; + templ.bind = PIPE_BIND_LINEAR; resources[1] = (struct r600_texture *) pipe->screen->resource_create(pipe->screen, &templ); if (!resources[1]) @@ -96,7 +96,7 @@ struct pipe_video_buffer *r600_video_buffer_create(struct pipe_context *pipe, if (resource_formats[2] != PIPE_FORMAT_NONE) { vl_video_buffer_template(&templ, &template, resource_formats[2], 1, array_size, PIPE_USAGE_STATIC, 2); if (ctx->b.chip_class < EVERGREEN || tmpl->interlaced) - templ.flags = R600_RESOURCE_FLAG_TRANSFER; + templ.bind = PIPE_BIND_LINEAR; resources[2] = (struct r600_texture *) pipe->screen->resource_create(pipe->screen, &templ); if (!resources[2]) diff --git a/src/gallium/drivers/radeonsi/radeonsi_uvd.c b/src/gallium/drivers/radeonsi/radeonsi_uvd.c index 
1cb3be0..6ecb17c 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_uvd.c +++ b/src/gallium/drivers/radeonsi/radeonsi_uvd.c @@ -76,8 +76,8 @@ struct pipe_video_buffer *radeonsi_video_buffer_create(struct pipe_context *pipe template.height = align(tmpl->height / array_size, VL_MACROBLOCK_HEIGHT); vl_video_buffer_template(&templ, &template, resource_formats[0], 1, array_size, PIPE_USAGE_STATIC, 0); - /* TODO: Setting the transfer flag is only a workaround till we get tiling working */ - templ.flags = R600_RESOURCE_FLAG_TRANSFER; + /* TODO: get tiling working */ + templ.bind = PIPE_BIND_LINEAR; resources[0] = (struct r600_texture *) pipe->screen->resource_create(pipe->screen, &templ); if (!resources[0]) @@ -85,7 +85,7 @@ struct pipe_video_buffer *radeonsi_video_buffer_create(struct pipe_context *pipe if (resource_formats[1] != PIPE_FORMAT_NONE) { vl_video_buffer_template(&templ, &template, resource_formats[1], 1, array_size, PIPE_USAGE_STATIC, 1); - templ.flags = R600_RESOURCE_FLAG_TRANSFER; + templ.bind = PIPE_BIND_LINEAR; resources[1] = (struct r600_texture *) pipe->screen->resource_create(pipe->screen, &templ); if (!resources[1]) @@ -94,7 +94,7 @@ struct pipe_video_buffer *radeonsi_video_buffer_create(struct pipe_context *pipe if (resource_formats[2] != PIPE_FORMAT_NONE) { vl_video_buffer_template(&templ, &template, resource_formats[2], 1, array_size, PIPE_USAGE_STATIC, 2); - templ.flags = R600_RESOURCE_FLAG_TRANSFER; + templ.bind = PIPE_BIND_LINEAR; resources[2] = (struct r600_texture *) pipe->screen->resource_create(pipe->screen, &templ); if (!resources[2]) -- 1.8.1.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] st/vdpau: add format conversions for GetBitsYCbCr
Add simple plain C routines for NV12<->YV12 and YUYV<->UYVY conversions. The NV12->YV12 conversion is commonly used, for instance by VLC. --- src/gallium/state_trackers/vdpau/surface.c | 125 +++-- 1 file changed, 117 insertions(+), 8 deletions(-) diff --git a/src/gallium/state_trackers/vdpau/surface.c b/src/gallium/state_trackers/vdpau/surface.c index a5682e3..cd798a2 100644 --- a/src/gallium/state_trackers/vdpau/surface.c +++ b/src/gallium/state_trackers/vdpau/surface.c @@ -38,6 +38,13 @@ #include "vdpau_private.h" +enum getbits_conversion { + CONVERSION_NONE, + CONVERSION_NV12_TO_YV12, + CONVERSION_YV12_TO_NV12, + CONVERSION_SWAP_YUYV_UYVY, +}; + /** * Create a VdpVideoSurface. */ @@ -185,6 +192,80 @@ vlVdpVideoSurfaceSize(vlVdpSurface *p_surf, int component, *height /= 2; } +static void +vlVdpCopyNV12ToYV12(void *const *destination_data, +uint32_t const *destination_pitches, +int src_plane, int src_field, +int src_stride, int num_fields, +uint8_t const *src, +int width, int height) +{ + int x, y; + unsigned u_stride = destination_pitches[2] * num_fields; + unsigned v_stride = destination_pitches[1] * num_fields; + uint8_t *u_dst = (uint8_t *)destination_data[2] + destination_pitches[2] * src_field; + uint8_t *v_dst = (uint8_t *)destination_data[1] + destination_pitches[1] * src_field; + + /* TODO: SIMD */ + for (y = 0; y < height; y++) { + for (x = 0; x < width; x++) { + u_dst[x] = src[2*x]; + v_dst[x] = src[2*x+1]; + } + u_dst += u_stride; + v_dst += v_stride; + src += src_stride; + } +} + +static void +vlVdpCopyYV12ToNV12(void *const *destination_data, +uint32_t const *destination_pitches, +int src_plane, int src_field, +int src_stride, int num_fields, +uint8_t const *src, +int width, int height) +{ + int x, y; + unsigned offset = 2 - src_plane; + unsigned stride = destination_pitches[1] * num_fields; + uint8_t *dst = (uint8_t *)destination_data[1] + destination_pitches[1] * src_field; + + /* TODO: SIMD */ + for (y = 0; y < height; y++) { + for (x = 0; x < 2 * 
width; x += 2) { + dst[x+offset] = src[x>>1]; + } + dst += stride; + src += src_stride; + } +} + +static void +vlVdpCopySwap422Packed(void *const *destination_data, + uint32_t const *destination_pitches, + int src_plane, int src_field, + int src_stride, int num_fields, + uint8_t const *src, + int width, int height) +{ + int x, y; + unsigned stride = destination_pitches[0] * num_fields; + uint8_t *dst = (uint8_t *)destination_data[0] + destination_pitches[0] * src_field; + + /* TODO: SIMD */ + for (y = 0; y < height; y++) { + for (x = 0; x < 4 * width; x += 4) { + dst[x+0] = src[x+1]; + dst[x+1] = src[x+0]; + dst[x+2] = src[x+3]; + dst[x+3] = src[x+2]; + } + dst += stride; + src += src_stride; + } +} + /** * Copy image data from a VdpVideoSurface to application memory in a specified * YCbCr format. @@ -197,8 +278,9 @@ vlVdpVideoSurfaceGetBitsYCbCr(VdpVideoSurface surface, { vlVdpSurface *vlsurface; struct pipe_context *pipe; - enum pipe_format format; + enum pipe_format format, buffer_format; struct pipe_sampler_view **sampler_views; + enum getbits_conversion conversion = CONVERSION_NONE; unsigned i, j; vlsurface = vlGetDataHTAB(surface); @@ -211,10 +293,23 @@ vlVdpVideoSurfaceGetBitsYCbCr(VdpVideoSurface surface, format = FormatYCBCRToPipe(destination_ycbcr_format); if (format == PIPE_FORMAT_NONE) - return VDP_STATUS_INVALID_Y_CB_CR_FORMAT; - - if (vlsurface->video_buffer == NULL || format != vlsurface->video_buffer->buffer_format) - return VDP_STATUS_NO_IMPLEMENTATION; /* TODO We don't support conversion (yet) */ + return VDP_STATUS_INVALID_Y_CB_CR_FORMAT; + + if (vlsurface->video_buffer == NULL) + return VDP_STATUS_INVALID_VALUE; + + buffer_format = vlsurface->video_buffer->buffer_format; + if (format != buffer_format) { + if (format == PIPE_FORMAT_YV12 && buffer_format == PIPE_FORMAT_NV12) + conversion = CONVERSION_NV12_TO_YV12; + else if (format == PIPE_FORMAT_NV12 && buffer_format == PIPE_FORMAT_YV12) + conversion = CONVERSION_YV12_TO_NV12; + else if ((format 
== PIPE_FORMAT_YUYV && buffer_format == PIPE_FORMAT_UYVY) || + (format == PIPE_FORMAT_UYVY && buffer_format == PIPE_FORMAT_YUYV)) + conversion = CONVERSION_SWAP_YUYV_UYVY; + else + return VDP_STATUS_NO_IMPLEMENTATION; + } pipe_mutex_lock(vlsurface->device->mutex); sampler_views = vlsurface->video_buffer->get_s
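The NV12 to YV12 routine in the patch above comes down to de-interleaving one chroma plane into two. The following is a minimal stand-alone sketch of that step only, under simplified assumptions: `split_nv12_chroma` and its parameters are hypothetical names, it ignores the field and plane bookkeeping of the real `vlVdpCopyNV12ToYV12`, and it leaves the choice of which destination plane holds U or V to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Split one interleaved NV12 chroma plane (U0 V0 U1 V1 ...) into two
 * separate planes. width and height are counted in chroma samples,
 * all strides are in bytes.
 */
static void split_nv12_chroma(const uint8_t *src, size_t src_stride,
                              uint8_t *u_dst, size_t u_stride,
                              uint8_t *v_dst, size_t v_stride,
                              size_t width, size_t height)
{
    /* Each source row must hold at least one U/V byte pair per sample. */
    assert(src_stride >= 2 * width);

    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            u_dst[x] = src[2 * x];      /* even bytes carry U */
            v_dst[x] = src[2 * x + 1];  /* odd bytes carry V */
        }
        src   += src_stride;
        u_dst += u_stride;
        v_dst += v_stride;
    }
}
```

Like the loops in the patch, this is plain scalar C; the `/* TODO: SIMD */` remarks in the patch apply equally here.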
Re: [Mesa-dev] [PATCH 2/3] radeon: use staging for mapping linear textures
Reviewed-by: Marek Olšák

Marek

On Sun, Oct 13, 2013 at 6:08 PM, Grigori Goronzy wrote:
> Textures that likely reside in VRAM, are mapped for reading and
> don't require direct mapping should be staged into GTT, to avoid bad
> performance. This fixes readback performance of VDPAU surfaces.
> ---
>  src/gallium/drivers/radeon/r600_texture.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c
> index ebb70906..9ba1e36 100644
> --- a/src/gallium/drivers/radeon/r600_texture.c
> +++ b/src/gallium/drivers/radeon/r600_texture.c
> @@ -852,6 +852,12 @@ static void *r600_texture_transfer_map(struct pipe_context *ctx,
>      if (rtex->surface.level[level].mode >= RADEON_SURF_MODE_1D)
>          use_staging_texture = TRUE;
>
> +    /* Untiled buffers in VRAM, which is slow for CPU reads */
> +    if ((usage & PIPE_TRANSFER_READ) && !(usage & PIPE_TRANSFER_MAP_DIRECTLY) &&
> +        (rtex->resource.domains == RADEON_DOMAIN_VRAM)) {
> +        use_staging_texture = TRUE;
> +    }
> +
>      /* Use a staging texture for uploads if the underlying BO is busy. */
>      if (!(usage & PIPE_TRANSFER_READ) &&
>          (r600_rings_is_buffer_referenced(rctx, rtex->resource.cs_buf, RADEON_USAGE_READWRITE) ||
> --
> 1.8.1.2
[Mesa-dev] [PATCH] build: remove forced -fno-rtti
* As discussed on the mailing list, forced no-rtti breaks C++ public
  APIs such as the Haiku C++ libGL.so.
* -fno-rtti *can* still be set; however, instead of blindly forcing
  -fno-rtti, we can rely on the llvm-config --cppflags output.
  If the system llvm is built without rtti (the default), the no-rtti
  flag will be present in llvm-config --cppflags (which we pick up on).
  If llvm is built with rtti (REQUIRES_RTTI=1), then -fno-rtti is
  removed from llvm-config --cppflags.
* We could selectively add / remove rtti from various components;
  however, mixing rtti and non-rtti code is tricky and could
  introduce bugs.
* The impact of this change still needs to be tested.
---
 configure.ac                      | 1 -
 scons/llvm.py                     | 3 ---
 src/gallium/auxiliary/Makefile.am | 6 ------
 3 files changed, 10 deletions(-)

diff --git a/configure.ac b/configure.ac
index 0d082d2..3335575 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1943,7 +1943,6 @@ AM_CONDITIONAL(HAVE_LOADER_GALLIUM, test x$enable_gallium_loader = xyes)
 AM_CONDITIONAL(HAVE_DRM_LOADER_GALLIUM, test x$enable_gallium_drm_loader = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_COMPUTE, test x$enable_opencl = xyes)
 AM_CONDITIONAL(HAVE_MESA_LLVM, test x$MESA_LLVM = x1)
-AM_CONDITIONAL(LLVM_NEEDS_FNORTTI, test $LLVM_VERSION_INT -ge 302)
 
 AC_SUBST([ELF_LIB])
 
diff --git a/scons/llvm.py b/scons/llvm.py
index 7cd609c..c1c3736 100644
--- a/scons/llvm.py
+++ b/scons/llvm.py
@@ -195,9 +195,6 @@ def generate(env):
             if llvm_version >= distutils.version.LooseVersion('3.1'):
                 components.append('mcjit')
 
-            if llvm_version >= distutils.version.LooseVersion('3.2'):
-                env.Append(CXXFLAGS = ('-fno-rtti',))
-
             env.ParseConfig('llvm-config --libs ' + ' '.join(components))
             env.ParseConfig('llvm-config --ldflags')
         except OSError:
diff --git a/src/gallium/auxiliary/Makefile.am b/src/gallium/auxiliary/Makefile.am
index 670e124..2d2d8d4 100644
--- a/src/gallium/auxiliary/Makefile.am
+++ b/src/gallium/auxiliary/Makefile.am
@@ -25,12 +25,6 @@ AM_CXXFLAGS += \
 	$(GALLIUM_CFLAGS) \
 	$(LLVM_CXXFLAGS)
 
-if LLVM_NEEDS_FNORTTI
-
-AM_CXXFLAGS += -fno-rtti
-
-endif
-
 libgallium_la_SOURCES += \
 	$(GALLIVM_SOURCES) \
 	$(GALLIVM_CPP_SOURCES)
--
1.8.4
Re: [Mesa-dev] [PATCH 1/3] radeon/uvd: use PIPE_BIND_LINEAR for video surfaces
On 13.10.2013 18:08, Grigori Goronzy wrote:
> This new bind flag forces linear storage, but does not have other
> side effects like R600_RESOURCE_FLAG_TRANSFER.

Reviewed and committed the whole patch set.

Thanks for the help,
Christian.

> ---
>  src/gallium/drivers/r600/r600_uvd.c         | 6 +++---
>  src/gallium/drivers/radeonsi/radeonsi_uvd.c | 8 ++++----
>  2 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/r600_uvd.c b/src/gallium/drivers/r600/r600_uvd.c
> index 05d2ad0..300bccb 100644
> --- a/src/gallium/drivers/r600/r600_uvd.c
> +++ b/src/gallium/drivers/r600/r600_uvd.c
> @@ -77,7 +77,7 @@ struct pipe_video_buffer *r600_video_buffer_create(struct pipe_context *pipe,
>      vl_video_buffer_template(&templ, &template, resource_formats[0], 1, array_size, PIPE_USAGE_STATIC, 0);
>      if (ctx->b.chip_class < EVERGREEN || tmpl->interlaced)
> -        templ.flags = R600_RESOURCE_FLAG_TRANSFER;
> +        templ.bind = PIPE_BIND_LINEAR;
>      resources[0] = (struct r600_texture *)
>          pipe->screen->resource_create(pipe->screen, &templ);
>      if (!resources[0])
> @@ -86,7 +86,7 @@ struct pipe_video_buffer *r600_video_buffer_create(struct pipe_context *pipe,
>      if (resource_formats[1] != PIPE_FORMAT_NONE) {
>          vl_video_buffer_template(&templ, &template, resource_formats[1], 1, array_size, PIPE_USAGE_STATIC, 1);
>          if (ctx->b.chip_class < EVERGREEN || tmpl->interlaced)
> -            templ.flags = R600_RESOURCE_FLAG_TRANSFER;
> +            templ.bind = PIPE_BIND_LINEAR;
>          resources[1] = (struct r600_texture *)
>              pipe->screen->resource_create(pipe->screen, &templ);
>          if (!resources[1])
> @@ -96,7 +96,7 @@ struct pipe_video_buffer *r600_video_buffer_create(struct pipe_context *pipe,
>      if (resource_formats[2] != PIPE_FORMAT_NONE) {
>          vl_video_buffer_template(&templ, &template, resource_formats[2], 1, array_size, PIPE_USAGE_STATIC, 2);
>          if (ctx->b.chip_class < EVERGREEN || tmpl->interlaced)
> -            templ.flags = R600_RESOURCE_FLAG_TRANSFER;
> +            templ.bind = PIPE_BIND_LINEAR;
>          resources[2] = (struct r600_texture *)
>              pipe->screen->resource_create(pipe->screen, &templ);
>          if (!resources[2])
> diff --git a/src/gallium/drivers/radeonsi/radeonsi_uvd.c b/src/gallium/drivers/radeonsi/radeonsi_uvd.c
> index 1cb3be0..6ecb17c 100644
> --- a/src/gallium/drivers/radeonsi/radeonsi_uvd.c
> +++ b/src/gallium/drivers/radeonsi/radeonsi_uvd.c
> @@ -76,8 +76,8 @@ struct pipe_video_buffer *radeonsi_video_buffer_create(struct pipe_context *pipe
>      template.height = align(tmpl->height / array_size, VL_MACROBLOCK_HEIGHT);
>      vl_video_buffer_template(&templ, &template, resource_formats[0], 1, array_size, PIPE_USAGE_STATIC, 0);
> -    /* TODO: Setting the transfer flag is only a workaround till we get tiling working */
> -    templ.flags = R600_RESOURCE_FLAG_TRANSFER;
> +    /* TODO: get tiling working */
> +    templ.bind = PIPE_BIND_LINEAR;
>      resources[0] = (struct r600_texture *)
>          pipe->screen->resource_create(pipe->screen, &templ);
>      if (!resources[0])
> @@ -85,7 +85,7 @@ struct pipe_video_buffer *radeonsi_video_buffer_create(struct pipe_context *pipe
>      if (resource_formats[1] != PIPE_FORMAT_NONE) {
>          vl_video_buffer_template(&templ, &template, resource_formats[1], 1, array_size, PIPE_USAGE_STATIC, 1);
> -        templ.flags = R600_RESOURCE_FLAG_TRANSFER;
> +        templ.bind = PIPE_BIND_LINEAR;
>          resources[1] = (struct r600_texture *)
>              pipe->screen->resource_create(pipe->screen, &templ);
>          if (!resources[1])
> @@ -94,7 +94,7 @@ struct pipe_video_buffer *radeonsi_video_buffer_create(struct pipe_context *pipe
>      if (resource_formats[2] != PIPE_FORMAT_NONE) {
>          vl_video_buffer_template(&templ, &template, resource_formats[2], 1, array_size, PIPE_USAGE_STATIC, 2);
> -        templ.flags = R600_RESOURCE_FLAG_TRANSFER;
> +        templ.bind = PIPE_BIND_LINEAR;
>          resources[2] = (struct r600_texture *)
>              pipe->screen->resource_create(pipe->screen, &templ);
>          if (!resources[2])
[Mesa-dev] [Bug 70432] New: Handling of S3TC with backends not supporting s3tc (like i915g)
https://bugs.freedesktop.org/show_bug.cgi?id=70432

Priority: medium
Bug ID: 70432
Assignee: mesa-dev@lists.freedesktop.org
Summary: Handling of S3TC with backends not supporting s3tc (like i915g)
Severity: minor
Classification: Unclassified
OS: Linux (All)
Reporter: freedesktop-bugzi...@mkarcher.dialup.fu-berlin.de
Hardware: x86-64 (AMD64)
Status: NEW
Version: git
Component: Mesa core
Product: Mesa

Portal for Windows (in Steam) tries to set up compressed textures like this (as seen in apitrace):

glTexImage2D(target = GL_TEXTURE_2D, level = 0, internalformat = GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT1_EXT, width = 1024, height = 1024, border = 0, format = GL_RGBA, type = GL_UNSIGNED_BYTE, pixels = NULL)
glCompressedTexSubImage2DARB(target = GL_TEXTURE_2D, level = 0, xoffset = 0, yoffset = 0, width = 1024, height = 1024, format = GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT1_EXT, imageSize = 524288, data = [binary data, size = 512 kb])

The glTexImage2D call would hit an assertion failure on debug builds, and the glCompressedTexSubImage2DARB call causes a division by zero on release builds.

The generic Mesa core decides that S3TC compression is OK when libtxc_dxtn is installed (I have the s2tc version installed), so texture_error_check succeeds on the first call. Nevertheless, _mesa_choose_texture_format fails, as S3TC is not supported on i915g. If assertions were enabled, the assertion texFormat != MESA_FORMAT_NONE would trigger, but on release builds that condition goes unchecked. The block size check in glCompressedTexSubImage2DARB ("(xoffset % bw != 0) || (yoffset % bh != 0)") then crashes for MESA_FORMAT_NONE, as bw and bh are zero in that case.

Uninstalling libtxc-dxtn-s2tc avoids the problem, hence the "minor" severity: that library is mostly pointless on a system without s3tc support anyway.
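To make the failure mode concrete, here is a small stand-alone sketch of a block-alignment check that refuses a zero block size instead of using it as a divisor. It is not Mesa code and not a proposed fix: `offsets_block_aligned` is a hypothetical helper, and the only facts taken from the report are the shape of the check and that bw and bh are zero for MESA_FORMAT_NONE.

```c
#include <assert.h>   /* for callers that want to assert on the result */
#include <stdbool.h>

/*
 * Hypothetical helper: returns true when (xoffset, yoffset) is aligned to
 * a compressed block of bw x bh texels. A zero block dimension, which the
 * report describes for MESA_FORMAT_NONE, is rejected up front so that the
 * modulo below can never divide by zero.
 */
static bool offsets_block_aligned(unsigned xoffset, unsigned yoffset,
                                  unsigned bw, unsigned bh)
{
    if (bw == 0 || bh == 0)
        return false;  /* no usable format was chosen */

    return (xoffset % bw == 0) && (yoffset % bh == 0);
}
```

For a DXT1-style 4x4 block, offsets (8, 4) pass and (2, 0) fail, while a 0x0 block is reported as unaligned rather than crashing.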
[Mesa-dev] [Bug 70432] Handling of S3TC with backends not supporting s3tc (like i915g)
https://bugs.freedesktop.org/show_bug.cgi?id=70432

John Paul Adrian Glaubitz changed:

What|Removed |Added
CC||glaub...@physik.fu-berlin.de
Re: [Mesa-dev] [PATCH] build: remove forced -fno-rtti
On 10/13/2013 10:52 AM, Alexander von Gluck IV wrote:
> * As discussed on the mailing list,
>   forced no-rtti breaks C++ public
>   API's such as the Haiku C++ libGL.so
> * -fno-rtti *can* be still set however
>   instead of blindly forcing -fno-rtti,
>   we can rely on the llvm-config
>   --cppflags output.

Does this mean that builds that don't need LLVM will have RTTI (i.e.,
-fno-rtti will not be used)?

It seems like if Haiku needs RTTI, we should enable RTTI only on Haiku.
Am I missing something?

>   If the system llvm is built without
>   rtti (default), the no-rtti flag will be
>   present in llvm-config --cppflags
>   (which we pick up on)
>   If llvm is built with rtti
>   (REQUIRES_RTTI=1), then -fno-rtti is
>   removed from llvm-config --cppflags.
> * We could selectively add / remove rtti
>   from various components, however mixing
>   rtti and non-rtti code is tricky and
>   could introduce bugs.
> * This needs impact tested.
> ---
>  configure.ac                      | 1 -
>  scons/llvm.py                     | 3 ---
>  src/gallium/auxiliary/Makefile.am | 6 ------
>  3 files changed, 10 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index 0d082d2..3335575 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1943,7 +1943,6 @@ AM_CONDITIONAL(HAVE_LOADER_GALLIUM, test x$enable_gallium_loader = xyes)
>  AM_CONDITIONAL(HAVE_DRM_LOADER_GALLIUM, test x$enable_gallium_drm_loader = xyes)
>  AM_CONDITIONAL(HAVE_GALLIUM_COMPUTE, test x$enable_opencl = xyes)
>  AM_CONDITIONAL(HAVE_MESA_LLVM, test x$MESA_LLVM = x1)
> -AM_CONDITIONAL(LLVM_NEEDS_FNORTTI, test $LLVM_VERSION_INT -ge 302)
>
>  AC_SUBST([ELF_LIB])
>
> diff --git a/scons/llvm.py b/scons/llvm.py
> index 7cd609c..c1c3736 100644
> --- a/scons/llvm.py
> +++ b/scons/llvm.py
> @@ -195,9 +195,6 @@ def generate(env):
>              if llvm_version >= distutils.version.LooseVersion('3.1'):
>                  components.append('mcjit')
>
> -            if llvm_version >= distutils.version.LooseVersion('3.2'):
> -                env.Append(CXXFLAGS = ('-fno-rtti',))
> -
>              env.ParseConfig('llvm-config --libs ' + ' '.join(components))
>              env.ParseConfig('llvm-config --ldflags')
>          except OSError:
> diff --git a/src/gallium/auxiliary/Makefile.am b/src/gallium/auxiliary/Makefile.am
> index 670e124..2d2d8d4 100644
> --- a/src/gallium/auxiliary/Makefile.am
> +++ b/src/gallium/auxiliary/Makefile.am
> @@ -25,12 +25,6 @@ AM_CXXFLAGS += \
>  	$(GALLIUM_CFLAGS) \
>  	$(LLVM_CXXFLAGS)
>
> -if LLVM_NEEDS_FNORTTI
> -
> -AM_CXXFLAGS += -fno-rtti
> -
> -endif
> -
>  libgallium_la_SOURCES += \
>  	$(GALLIVM_SOURCES) \
>  	$(GALLIVM_CPP_SOURCES)
>
Re: [Mesa-dev] [PATCH] i965: add XRGB to fast texture upload
On 10/11/2013 10:16 AM, Courtney Goeltzenleuchter wrote:
> MESA_FORMAT_XRGB is equivalent to MESA_FORMAT_ARGB in terms
> of storage on the device, so okay to use this optimized copy routine.
> ---
>  src/mesa/drivers/dri/i965/intel_tex_subimage.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_tex_subimage.c b/src/mesa/drivers/dri/i965/intel_tex_subimage.c
> index 5cfdbd9..4aec05d 100644
> --- a/src/mesa/drivers/dri/i965/intel_tex_subimage.c
> +++ b/src/mesa/drivers/dri/i965/intel_tex_subimage.c
> @@ -564,7 +564,8 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
>         (texImage->TexFormat == MESA_FORMAT_A8 && format == GL_ALPHA)) {
>        cpp = 1;
>        mem_copy = memcpy;
> -   } else if (texImage->TexFormat == MESA_FORMAT_ARGB) {
> +   } else if ((texImage->TexFormat == MESA_FORMAT_ARGB) ||
> +       (texImage->TexFormat == MESA_FORMAT_XRGB)) {
>        cpp = 4;
>        if (format == GL_BGRA) {
>           mem_copy = memcpy;
>

Are there appropriate test cases that hit this path? I don't think
piglit exercises a lot of different texture formats. I just want to be
sure that some test now hits this fast path that didn't hit it before.
Re: [Mesa-dev] [PATCH] i965: Fast texture upload now supports all levels
Same question about tests here as on the XRGB patch.

On 10/11/2013 10:17 AM, Courtney Goeltzenleuchter wrote:
> Support all levels of a supported texture format.
> ---
>  src/mesa/drivers/dri/i965/intel_tex_subimage.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_tex_subimage.c b/src/mesa/drivers/dri/i965/intel_tex_subimage.c
> index 4aec05d..5e46760 100644
> --- a/src/mesa/drivers/dri/i965/intel_tex_subimage.c
> +++ b/src/mesa/drivers/dri/i965/intel_tex_subimage.c
> @@ -541,14 +541,13 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
>     uint32_t cpp;
>     mem_copy_fn mem_copy = NULL;
>
> -   /* This fastpath is restricted to specific texture types: level 0 of
> +   /* This fastpath is restricted to specific texture types:
>      * a 2D BGRA, RGBA, L8 or A8 texture. It could be generalized to support
>      * more types.
>      */
>     if (!brw->has_llc ||
>         type != GL_UNSIGNED_BYTE ||
>         texImage->TexObject->Target != GL_TEXTURE_2D ||
> -       texImage->Level != 0 ||
>         pixels == NULL ||
>         _mesa_is_bufferobj(packing->BufferObj) ||
>         packing->Alignment > 4 ||
> @@ -616,6 +615,16 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
>     DBG("%s: level=%d offset=(%d,%d) (w,h)=(%d,%d)\n",
>         __FUNCTION__, texImage->Level, xoffset, yoffset, width, height);
>
> +   /* Adjust x and y offset based on miplevel
> +    */
> +   if (texImage->Level) {
> +      GLuint xlevel, ylevel;
> +      intel_miptree_get_image_offset(image->mt, texImage->Level, 0,
> +                                     &xlevel, &ylevel);
> +      xoffset += xlevel;
> +      yoffset += ylevel;
> +   }
> +
>     linear_to_tiled(
>        xoffset * cpp, (xoffset + width) * cpp,
>        yoffset, yoffset + height,
>
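The offset adjustment in the quoted patch can be illustrated in isolation. The sketch below assumes a deliberately simplified layout in which every mip level has a fixed pixel position inside one surface; `struct level_offset`, `struct copy_rect` and `subimage_rect` are hypothetical names, and the table lookup merely stands in for what `intel_miptree_get_image_offset` provides in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: where a mip level starts inside the surface, in pixels. */
struct level_offset {
    uint32_t x, y;
};

/* Byte range in x and row range in y, in the shape passed to the copy. */
struct copy_rect {
    uint32_t x0_bytes, x1_bytes;
    uint32_t y0, y1;
};

/*
 * Translate a sub-image rectangle given relative to one mip level into
 * surface coordinates, then scale x by cpp (bytes per pixel).
 */
static struct copy_rect
subimage_rect(const struct level_offset *levels, unsigned num_levels,
              unsigned level,
              uint32_t xoffset, uint32_t yoffset,
              uint32_t width, uint32_t height, uint32_t cpp)
{
    struct copy_rect r;

    assert(level < num_levels);

    /* Level 0 sits at (0, 0) in this sketch, so no special case is needed. */
    xoffset += levels[level].x;
    yoffset += levels[level].y;

    r.x0_bytes = xoffset * cpp;
    r.x1_bytes = (xoffset + width) * cpp;
    r.y0 = yoffset;
    r.y1 = yoffset + height;
    return r;
}
```

With a level placed at (32, 64), an 8x8 update at offset (4, 2) and cpp = 4 maps to bytes 144..176 in x and rows 66..74 in y.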
Re: [Mesa-dev] [PATCH] build: remove forced -fno-rtti
On Sun, 2013-10-13 at 12:36 -0700, Ian Romanick wrote:
> On 10/13/2013 10:52 AM, Alexander von Gluck IV wrote:
> > * As discussed on the mailing list,
> >   forced no-rtti breaks C++ public
> >   API's such as the Haiku C++ libGL.so
> > * -fno-rtti *can* be still set however
> >   instead of blindly forcing -fno-rtti,
> >   we can rely on the llvm-config
> >   --cppflags output.
>
> Does this means that builds that don't need LLVM will have RTTI (i.e.,
> -fno-rtti will not be used)?

This is actually how it currently operates. Mesa only gets a forced -fno-rtti when LLVM >= 3.2 is installed (which further shows how odd the current design is).

Current design:
* LLVM not installed: RTTI enabled
* LLVM >= 3.2 installed: RTTI always disabled (regardless of whether LLVM had RTTI enabled)
* LLVM < 3.2 installed: Mesa mimics LLVM's RTTI support status through llvm-config --cppflags

New design after patch:
* LLVM not installed: RTTI enabled
* LLVM installed: Mesa mimics LLVM's RTTI support status through llvm-config --cppflags

Adding an extra bit of code to force -fno-rtti on non-public C++ ABIs (aka everything else) could be done, but then we are once again taking on the risk of mixing RTTI and non-RTTI code. (I think the LLVM 3.2 check came from the fact that LLVM 3.2 was the first release that dropped the need for RTTI.)

> It seems like if Haiku needs RTTI, we should enable RTTI only on Haiku.
> Am I missing something?

We could, but what is the point? As previously mentioned, mixing RTTI and non-RTTI code could result in broken binaries on some platforms, and the performance impact is almost nil on current systems. Matching the LLVM RTTI status seems like the most logical solution.

If we did go the route of leaving RTTI enabled only for C++-facing ABIs, how would we detect and remove the -fno-rtti flag from llvm-config --cppflags in scons and automake? (I'm honestly not sure here.)

-- Alex
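The two designs described above can be written down as a pair of small decision functions, which makes the difference for LLVM >= 3.2 easy to see. This is only a sketch of the rules as stated in this mail, not of the actual configure logic. `struct llvm_env` and both function names are hypothetical, and `built_with_rtti` is assumed to mean "llvm-config --cppflags does not contain -fno-rtti".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of the build environment; none of these
 * names come from Mesa's build system. */
struct llvm_env {
   bool installed;        /* is LLVM present at configure time? */
   int major, minor;      /* LLVM version, ignored when not installed */
   bool built_with_rtti;  /* assumed: --cppflags lacks -fno-rtti */
};

static bool
llvm_at_least(const struct llvm_env *e, int major, int minor)
{
   return e->major > major || (e->major == major && e->minor >= minor);
}

/* Current design: RTTI is forced off whenever LLVM >= 3.2 is found. */
static bool
mesa_rtti_current(const struct llvm_env *e)
{
   assert(e != 0);
   if (!e->installed)
      return true;
   if (llvm_at_least(e, 3, 2))
      return false;
   return e->built_with_rtti;
}

/* Design after the patch: always follow LLVM's own RTTI setting. */
static bool
mesa_rtti_patched(const struct llvm_env *e)
{
   assert(e != 0);
   if (!e->installed)
      return true;
   return e->built_with_rtti;
}
```

The only case where the two functions disagree is an installed LLVM >= 3.2 that was itself built with RTTI, which is the case the patch is meant to fix.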
[Mesa-dev] [Bug 70435] glTexSubImage corrupted rendering
https://bugs.freedesktop.org/show_bug.cgi?id=70435

U. Artie Eoff changed:

What|Removed|Added
CC| |mesa-dev@lists.freedesktop.org

-- You are receiving this mail because: You are on the CC list for the bug.
Re: [Mesa-dev] [PATCH] i965: add XRGB to fast texture upload
On Sun, Oct 13, 2013 at 1:41 PM, Ian Romanick wrote:
> On 10/11/2013 10:16 AM, Courtney Goeltzenleuchter wrote:
> > MESA_FORMAT_XRGB is equivalent to MESA_FORMAT_ARGB in terms
> > of storage on the device, so okay to use this optimized copy routine.
> > ---
> > src/mesa/drivers/dri/i965/intel_tex_subimage.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/intel_tex_subimage.c b/src/mesa/drivers/dri/i965/intel_tex_subimage.c
> > index 5cfdbd9..4aec05d 100644
> > --- a/src/mesa/drivers/dri/i965/intel_tex_subimage.c
> > +++ b/src/mesa/drivers/dri/i965/intel_tex_subimage.c
> > @@ -564,7 +564,8 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
> >         (texImage->TexFormat == MESA_FORMAT_A8 && format == GL_ALPHA)) {
> >        cpp = 1;
> >        mem_copy = memcpy;
> > -   } else if (texImage->TexFormat == MESA_FORMAT_ARGB) {
> > +   } else if ((texImage->TexFormat == MESA_FORMAT_ARGB) ||
> > +              (texImage->TexFormat == MESA_FORMAT_XRGB)) {
> >        cpp = 4;
> >        if (format == GL_BGRA) {
> >           mem_copy = memcpy;
> >
>
> Are there appropriate test cases that hit this path? I don't think
> piglit exercises a lot of different texture formats. I just want to be
> sure that some test now hits this fast path that didn't hit it before.
>

This path is exercised by the Smokin' Guns benchmark during its initialization, but not during the measured part of the run. I've also verified that the piglit test

glean -o -v -v -v -t +pixelFormats --quick

exercises this path.

As for other benchmarks, I'd welcome suggestions. I couldn't find anything beyond the Mesa demo that Frank used to test his patch.

On Smokin' Guns, with these patches combined (Frank's, XRGB and all-levels), using valgrind I measured a reduction in the instruction count of intelTexImage of 47.3% (from 2344404725 to 1235584087). Without all three patches, the driver was falling back to a Mesa copy routine that copied one byte at a time.

Does that help?
Courtney

-- Courtney Goeltzenleuchter LunarG
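For anyone checking the arithmetic, the 47.3% figure follows directly from the two instruction counts quoted above. A minimal sketch of the calculation, with `percent_reduction` being a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Relative reduction, in percent, when a count drops from `before`
 * to `after`. */
static double
percent_reduction(uint64_t before, uint64_t after)
{
   assert(before > 0 && after <= before);
   return 100.0 * (double)(before - after) / (double)before;
}
```

Calling `percent_reduction(2344404725ULL, 1235584087ULL)` gives roughly 47.3, matching the number reported from valgrind.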
Re: [Mesa-dev] [PATCH] i965: Fast texture upload now supports all levels
On Sun, Oct 13, 2013 at 2:50 PM, Frank Henigman wrote: > On Fri, Oct 11, 2013 at 10:00 PM, Chad Versace > wrote: > > On 10/11/2013 10:17 AM, Courtney Goeltzenleuchter wrote: > >> > >> Support all levels of a supported texture format. > >> --- > >> src/mesa/drivers/dri/i965/intel_tex_subimage.c | 13 +++-- > >> 1 file changed, 11 insertions(+), 2 deletions(-) > >> > >> diff --git a/src/mesa/drivers/dri/i965/intel_tex_subimage.c > >> b/src/mesa/drivers/dri/i965/intel_tex_subimage.c > >> index 4aec05d..5e46760 100644 > >> --- a/src/mesa/drivers/dri/i965/intel_tex_subimage.c > >> +++ b/src/mesa/drivers/dri/i965/intel_tex_subimage.c > >> @@ -541,14 +541,13 @@ intel_texsubimage_tiled_memcpy(struct gl_context * > >> ctx, > >> uint32_t cpp; > >> mem_copy_fn mem_copy = NULL; > >> > >> - /* This fastpath is restricted to specific texture types: level 0 of > >> + /* This fastpath is restricted to specific texture types: > >> * a 2D BGRA, RGBA, L8 or A8 texture. It could be generalized to > >> support > >> * more types. 
> >> */ > >> if (!brw->has_llc || > >> type != GL_UNSIGNED_BYTE || > >> texImage->TexObject->Target != GL_TEXTURE_2D || > >> - texImage->Level != 0 || > >> pixels == NULL || > >> _mesa_is_bufferobj(packing->BufferObj) || > >> packing->Alignment > 4 || > >> @@ -616,6 +615,16 @@ intel_texsubimage_tiled_memcpy(struct gl_context * > >> ctx, > >> DBG("%s: level=%d offset=(%d,%d) (w,h)=(%d,%d)\n", > >> __FUNCTION__, texImage->Level, xoffset, yoffset, width, > height); > >> > >> + /* Adjust x and y offset based on miplevel > >> +*/ > >> + if (texImage->Level) { > >> + GLuint xlevel, ylevel; > >> + intel_miptree_get_image_offset(image->mt, texImage->Level, 0, > >> + &xlevel, &ylevel); > >> + xoffset += xlevel; > >> + yoffset += ylevel; > >> + } > >> + > >> linear_to_tiled( > >> xoffset * cpp, (xoffset + width) * cpp, > >> yoffset, yoffset + height, > >> > > > > Usually when we commit performance patches like this, we state in the > > commit message what the observed relative performance gain. > > > > What gain did you see? Hardware? Benchmark? Kernel version? How many > > runs? > > We could quote from my patch, as this is just opening more paths into that > code. > Or do you think this calls for different testing? > Smokin' Guns goes down this path (and when it's wrong you can see it :-). I'll check piglit. -- Courtney Goeltzenleuchter LunarG ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Fast texture upload now supports all levels
On Sun, Oct 13, 2013 at 4:39 PM, Courtney Goeltzenleuchter < court...@lunarg.com> wrote: > > On Sun, Oct 13, 2013 at 2:50 PM, Frank Henigman wrote: > >> On Fri, Oct 11, 2013 at 10:00 PM, Chad Versace >> wrote: >> > On 10/11/2013 10:17 AM, Courtney Goeltzenleuchter wrote: >> >> >> >> Support all levels of a supported texture format. >> >> --- >> >> src/mesa/drivers/dri/i965/intel_tex_subimage.c | 13 +++-- >> >> 1 file changed, 11 insertions(+), 2 deletions(-) >> >> >> >> diff --git a/src/mesa/drivers/dri/i965/intel_tex_subimage.c >> >> b/src/mesa/drivers/dri/i965/intel_tex_subimage.c >> >> index 4aec05d..5e46760 100644 >> >> --- a/src/mesa/drivers/dri/i965/intel_tex_subimage.c >> >> +++ b/src/mesa/drivers/dri/i965/intel_tex_subimage.c >> >> @@ -541,14 +541,13 @@ intel_texsubimage_tiled_memcpy(struct gl_context >> * >> >> ctx, >> >> uint32_t cpp; >> >> mem_copy_fn mem_copy = NULL; >> >> >> >> - /* This fastpath is restricted to specific texture types: level 0 >> of >> >> + /* This fastpath is restricted to specific texture types: >> >> * a 2D BGRA, RGBA, L8 or A8 texture. It could be generalized to >> >> support >> >> * more types. 
>> >> */ >> >> if (!brw->has_llc || >> >> type != GL_UNSIGNED_BYTE || >> >> texImage->TexObject->Target != GL_TEXTURE_2D || >> >> - texImage->Level != 0 || >> >> pixels == NULL || >> >> _mesa_is_bufferobj(packing->BufferObj) || >> >> packing->Alignment > 4 || >> >> @@ -616,6 +615,16 @@ intel_texsubimage_tiled_memcpy(struct gl_context * >> >> ctx, >> >> DBG("%s: level=%d offset=(%d,%d) (w,h)=(%d,%d)\n", >> >> __FUNCTION__, texImage->Level, xoffset, yoffset, width, >> height); >> >> >> >> + /* Adjust x and y offset based on miplevel >> >> +*/ >> >> + if (texImage->Level) { >> >> + GLuint xlevel, ylevel; >> >> + intel_miptree_get_image_offset(image->mt, texImage->Level, 0, >> >> + &xlevel, &ylevel); >> >> + xoffset += xlevel; >> >> + yoffset += ylevel; >> >> + } >> >> + >> >> linear_to_tiled( >> >> xoffset * cpp, (xoffset + width) * cpp, >> >> yoffset, yoffset + height, >> >> >> > >> > Usually when we commit performance patches like this, we state in the >> > commit message what the observed relative performance gain. >> > >> > What gain did you see? Hardware? Benchmark? Kernel version? How many >> > runs? >> >> We could quote from my patch, as this is just opening more paths into >> that code. >> Or do you think this calls for different testing? >> > > Smokin' Guns goes down this path (and when it's wrong you can see it :-). > > I'll check piglit. > All the piglit glsl1 (bin/glean -o -v -v -v -t +glsl1 --quick) and a bunch of the ARB_ES3_compatibility tests go through this path as well as a handful of other tests. > > -- > Courtney Goeltzenleuchter > LunarG > > -- Courtney Goeltzenleuchter LunarG ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Fast texture upload now supports all levels
On 10/13/2013 01:50 PM, Frank Henigman wrote: > On Fri, Oct 11, 2013 at 10:00 PM, Chad Versace > wrote: >> On 10/11/2013 10:17 AM, Courtney Goeltzenleuchter wrote: >>> >>> Support all levels of a supported texture format. >>> --- >>> src/mesa/drivers/dri/i965/intel_tex_subimage.c | 13 +++-- >>> 1 file changed, 11 insertions(+), 2 deletions(-) >>> >>> diff --git a/src/mesa/drivers/dri/i965/intel_tex_subimage.c >>> b/src/mesa/drivers/dri/i965/intel_tex_subimage.c >>> index 4aec05d..5e46760 100644 >>> --- a/src/mesa/drivers/dri/i965/intel_tex_subimage.c >>> +++ b/src/mesa/drivers/dri/i965/intel_tex_subimage.c >>> @@ -541,14 +541,13 @@ intel_texsubimage_tiled_memcpy(struct gl_context * >>> ctx, >>> uint32_t cpp; >>> mem_copy_fn mem_copy = NULL; >>> >>> - /* This fastpath is restricted to specific texture types: level 0 of >>> + /* This fastpath is restricted to specific texture types: >>> * a 2D BGRA, RGBA, L8 or A8 texture. It could be generalized to >>> support >>> * more types. >>> */ >>> if (!brw->has_llc || >>> type != GL_UNSIGNED_BYTE || >>> texImage->TexObject->Target != GL_TEXTURE_2D || >>> - texImage->Level != 0 || >>> pixels == NULL || >>> _mesa_is_bufferobj(packing->BufferObj) || >>> packing->Alignment > 4 || >>> @@ -616,6 +615,16 @@ intel_texsubimage_tiled_memcpy(struct gl_context * >>> ctx, >>> DBG("%s: level=%d offset=(%d,%d) (w,h)=(%d,%d)\n", >>> __FUNCTION__, texImage->Level, xoffset, yoffset, width, height); >>> >>> + /* Adjust x and y offset based on miplevel >>> +*/ >>> + if (texImage->Level) { >>> + GLuint xlevel, ylevel; >>> + intel_miptree_get_image_offset(image->mt, texImage->Level, 0, >>> + &xlevel, &ylevel); >>> + xoffset += xlevel; >>> + yoffset += ylevel; >>> + } >>> + >>> linear_to_tiled( >>> xoffset * cpp, (xoffset + width) * cpp, >>> yoffset, yoffset + height, >>> >> >> Usually when we commit performance patches like this, we state in the >> commit message what the observed relative performance gain. >> >> What gain did you see? Hardware? 
Benchmark? Kernel version? How many
>> runs?
>
> We could quote from my patch, as this is just opening more paths into that code.
> Or do you think this calls for different testing?

I think what Chad is asking is whether there's some information like "Improves load time of application XYZ 12.3±4.5%" or similar. In the past, we've had problems with patches that just make vague claims of "improves performance" when we later find critical bugs in those patches... can we just revert the code, or is it going to ruin the performance of... something?

For reference, see commit 329cd6a9b and this thread from mesa-dev:

http://lists.freedesktop.org/archives/mesa-dev/2013-June/040811.html
Re: [Mesa-dev] Mesa (master): i965/fs: Convert gen7 to using GRFs for texture messages.
On Sat, Oct 12, 2013 at 3:18 AM, Eric Anholt wrote:
> Chia-I Wu writes:
>
>> Hi Eric,
>> The frame rate of Unigine Tropics (with low shader quality) dropped
>> from 40.8 to 23.5 after this change.
>
> Thanks for the note. I see the regression as well, and I see a shader
> that's started spilling. It looks like we can drop the regs_written <=
> 1 check on gen7+'s pre-regalloc scheduling to fix the problem (the MRF
> setup thing is no longer an issue, and its presence is now making us
> pessimize instead of optimize in general in the pre-regalloc
> scheduling). I'll want to run a few more tests to make sure that this
> doesn't regress something else.
>
> This shader is also in bad shape now that we don't have the redundant
> MRF move optimization, and we need to look into grf_size > 1 CSE. That
> would probably also have avoided the problem on this shader, though the
> scheduling problem is more general than this one shader.

The last shader_time output[1] for the fragment shader in question gives

BEFORE  fs8 glsl 465:   959.12 Gcycles   1.1%
AFTER   fs8 glsl 465: 13336.14 Gcycles   9.6%

Comparing with the total cycles, those extra cycles should bring down the fps to ~35. What's odd is this shader:

BEFORE  vs  glsl 264: 16127.47 Gcycles  17.7%
AFTER   vs  glsl 264: 56543.36 Gcycles  40.8%

The generated code for this vertex shader is not affected by the commit, but it runs significantly slower, bringing the fps down to ~24. I suspect it is context-switched away frequently and for a good while, but I do not have a theory as to why. Do you have a better idea?

[1] Formatted for this mail. Also, the demo is time-based. To make sure the same frames are rendered, I have to use apitrace.

-- o...@lunarg.com
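The ~35 and ~24 fps estimates above can be reproduced with a back-of-the-envelope model. The sketch below assumes, purely for illustration, that the frame rate is inversely proportional to the total shader cycles and that the total "before" can be recovered from one row's cycle count and percentage. Neither assumption is stated in the mail, and both helper names are hypothetical.

```c
#include <assert.h>

/* Recover the total cycle count from one shader's cycles and its
 * reported share of the total (in percent). The share is rounded in
 * the shader_time output, so the result is only approximate. */
static double
total_from_share(double gcycles, double percent)
{
   assert(percent > 0.0);
   return gcycles * 100.0 / percent;
}

/* Assumed model: fps scales inversely with total shader cycles. */
static double
scaled_fps(double fps_before, double total_before, double extra_gcycles)
{
   assert(total_before > 0.0 && extra_gcycles >= 0.0);
   return fps_before * total_before / (total_before + extra_gcycles);
}
```

Taking the total from the fs8 row (959.12 Gcycles at 1.1%) and a starting rate of 40.8 fps, adding only the fragment shader's extra cycles lands a little under 36 fps, and adding the vertex shader's extra cycles as well lands around 25 fps. That is in the same range as the ~35 and ~24 figures quoted above, given the rounding in the percentages.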