From: Roland Scheidegger
These versions still need a wrapper but already have both success and
failure ordering.
(Compile tested on llvm 3.7, llvm 3.8.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=02
---
src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 16 +++-
1 file ch
From: Roland Scheidegger
LLVM 7.0 ditched the pmulu intrinsics.
This is only a trivial patch to use the fallback code instead.
It'll likely produce atrocious code since the pattern doesn't match what
llvm itself uses in its autoupgrade paths, hence the pattern won't be
recognized.
Should fix htt
From: Roland Scheidegger
Should fix some issues we're seeing. And use REALLOC instead of realloc.
---
src/gallium/drivers/llvmpipe/lp_cs_tpool.c | 6 +++---
src/gallium/drivers/llvmpipe/lp_state_cs.c | 3 ++-
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/src/gallium/drivers/llvm
From: Roland Scheidegger
The 1GB limit was arbitrary, increase this to 2GB (which is the max
possible without code changes).
---
src/gallium/drivers/llvmpipe/lp_limits.h | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/src/gallium/drivers/llvmpipe/lp_limits.h
b/src/galli
From: Roland Scheidegger
LLVM 8 did remove both the signed and unsigned sse2/avx intrinsics in
the end, and provides arch-independent llvm intrinsics instead.
Fixes a crash when using snorm framebuffers (tested with piglit
arb_color_buffer_float-render GL_RGBA8_SNORM -auto).
CC:
---
src/gallium
From: Roland Scheidegger
Braces mismatch (flagged by CI, untested).
Fixes: 385d13f26d2 "util/atomic: Add a _return variant of p_atomic_add"
---
src/util/u_atomic.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/util/u_atomic.h b/src/util/u_atomic.h
index 9cbc6dd1eaa
From: Roland Scheidegger
0 is a valid value as max index, and the code handles it fine. This isn't
commonly seen, as it will only happen with array declarations of size 1.
The assert was introduced with a3c898dc97ec5f0e0b93b2ee180bdf8ca3bab14c.
Fixes piglit tests/shaders/complex-loop-analysis-bu
From: Roland Scheidegger
llvm 8 removed saturated unsigned add / sub x86 sse2 intrinsics, and
now llvm 9 removed the signed versions as well - they were proposed for
removal earlier, but the pattern to recognize those was very complex,
so it wasn't done then. However, instead of these arch-specif
From: Roland Scheidegger
Brian noticed there was an uninitialized var for the 8-wide case and 128
bit blocks, which made it always crash. Likewise, the 64bit block case
had another crash bug due to type mismatch.
Color decode (used for all s3tc formats) also had a bogus shuffle for
this case, lea
From: Roland Scheidegger
transform feedback draws get the number of vertices from the transform
feedback object. In draw, we'll figure this out with the number of bytes
written divided by the stride. However, it is apparently possible we end
up with a stride of 0 there (not entirely sure it could
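
A minimal sketch of the vertex count computation described above, with a guard for the zero-stride case. The helper name `sketch_so_vertex_count` is hypothetical, and returning 0 for a zero stride is an assumption made for illustration; the snippet is truncated before it says how the actual patch handles that case.

```c
#include <stddef.h>

/* Hypothetical helper, not the actual draw module code.
 * Derives a vertex count from the number of bytes written to a
 * transform feedback buffer and the per-vertex stride. */
static unsigned
sketch_so_vertex_count(size_t bytes_written, unsigned stride)
{
   /* Assumption: a zero stride means there is nothing sensible to
    * draw, so return 0 instead of dividing by zero. */
   if (stride == 0)
      return 0;

   /* Integer division drops any partially written vertex. */
   return (unsigned)(bytes_written / stride);
}
```

With these definitions, 48 bytes at a stride of 16 give 3 vertices, and any byte count with a stride of 0 gives 0.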
From: Roland Scheidegger
The default null_output really needs to be static, otherwise the values
we'll eventually get later are doubly random (they are not initialized,
and even if they were it's a pointer to a local stack variable).
VMware bug 2349556.
---
src/gallium/auxiliary/gallivm/lp_bld_t
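
The storage-duration point made above can be illustrated with a small sketch. The struct and function names are hypothetical and only mirror the idea, not the gallivm code: a default object handed out by pointer has to be static, since a local would be both uninitialized and gone once the function returns.

```c
/* Hypothetical type standing in for whatever the default output holds. */
struct sketch_output {
   float value[4];
};

static const struct sketch_output *
sketch_get_default_output(void)
{
   /* static storage duration: the object is zero-initialized and
    * outlives this call, so the returned pointer stays valid.
    * A plain local here would leave the caller with a dangling
    * pointer to uninitialized stack memory. */
   static const struct sketch_output null_output = { { 0.0f } };

   return &null_output;
}
```

Every call returns the same address, and the values read through it are always zero.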
From: Roland Scheidegger
The x86asmprinter component is gone, and things seem to work by just
removing it.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110707
---
scons/llvm.py | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/scons/llvm.py b/scons/llvm.py
index a
From: Roland Scheidegger
The border clamping code is unnecessary, since we don't care if a wrapped
coord value is -1 or <-1 (same for length vs. >length), in either case the
border handling code will mask out the offset and replace the texel value with
the border color.
Note that technically this
From: Roland Scheidegger
we need to rely on util code for fetching those, just like before
9f06061d50f90bf425a5337cea1b0adb94a46d25.
Fixes bugs 57699 and 57756.
---
src/gallium/auxiliary/gallivm/lp_bld_format_aos.c |3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/gall
From: Roland Scheidegger
Since we don't call lp_build_sample_common() in the texel fetch path, we missed
the layer fixup code. If someone had tried to do texelFetch with array
textures it would have crashed for sure.
Not really tested (no overlap of texelFetch and array texture tests in pig
From: Roland Scheidegger
a460aea3f14222af46f88d1bc686f82180b8a872 wasn't entirely correct,
since all coords are already ints and hence the iround needs to be skipped.
Passes piglit texelFetch with sampler1DArray/sampler2DArray.
---
src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |5 +++--
1 file chan
From: Roland Scheidegger
Since the idea is to just expand or shrink the bit width but not otherwise do
conversion we also need to adjust the sign bit according to src, otherwise
the conversion code will incorrectly clamp the values. (Since this only works
for casting to ordinary floats the norm a
From: Roland Scheidegger
Change the texel type to int/uint instead of float throughout the sampling
code which makes it easier to catch errors (as llvm will complain about wrong
types if we mistakenly treat these values as real floats somewhere).
This should also get things like e.g. sampler swiz
From: Roland Scheidegger
Need to bitcast the float border color (luckily we already get
the color as int just disguised as float).
Fixes piglit texwrap GL_EXT_texture_integer bordercolor.
---
src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |5 +
1 file changed, 5 insertions(+)
diff --
From: Roland Scheidegger
We get int/uint clear color value in this case, and util_pack_color can't
handle these formats at all (even if it could, float input color isn't what
we want).
Pass through the color union appropriately and handle the packing ourselves
(as I couldn't think of a good gener
From: Roland Scheidegger
We were passing in the rt index, however this was always 0 for the non-independent
blend case. (The format was only actually used to decide if the color mask
covered all channels so this went unnoticed and was discovered by accident.)
(Also do some trivial cleanup.)
---
src/g
From: Roland Scheidegger
---
src/gallium/drivers/llvmpipe/lp_state_fs.c | 26 --
1 file changed, 12 insertions(+), 14 deletions(-)
diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c
b/src/gallium/drivers/llvmpipe/lp_state_fs.c
index 3eae162..83b902d 100644
--- a/
From: Roland Scheidegger
We were passing in the rt index, however this was always 0 for the non-independent
blend case. (The format was only actually used to decide if the color mask
covered all channels so this went unnoticed and was discovered by accident.)
Additionally, there was a second problem b
From: Roland Scheidegger
Cast back the fake floats to ints, and make sure we don't try to do scaling
in format conversion (which only makes sense with normalized values).
Also need to disable blending and alpha test (as per spec) for such buffers.
This makes fbo-blending from the piglit ext_textu
From: Roland Scheidegger
Now that things mostly seem to work enable those formats.
Some formats cause crashes (notably RGB8 variants) so switch these off
(these crashes are not specific to INT/UINT variants but the state tracker
doesn't use them for UNORM etc. formats so it went unnoticed so far)
From: Roland Scheidegger
Make it obvious what "unit" this is (no change in functionality).
draw still uses "unit" in places where it changes the shader by adding
texture sampling itself - it seems like this can't work with shaders
using dx10-style sample opcodes (can't mix gl-style and dx10-style
From: Roland Scheidegger
The struct padding got broken by c789b981b244333cfc903bcd1e2fefc010500013.
This caused serious performance regression because part of the key was
uninitialized and hence the shader always recompiled (at least on release
builds...).
While here also fix key size calculation w
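
A sketch of why uninitialized padding in a key matters when keys are compared bytewise. All names here are hypothetical and the layout is invented for illustration. The sketch zeroes the whole key before filling it in, which is one common way to make the padding deterministic; it relies on the compiler not touching padding bytes on member stores, which holds in practice but is not guaranteed by the C standard.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical variant key. On most ABIs there are padding bytes
 * between 'flag' and 'format'. */
struct sketch_key {
   unsigned char flag;
   unsigned format;
};

static void
sketch_key_init(struct sketch_key *key, unsigned char flag, unsigned format)
{
   /* Zero everything first, padding included, so two keys with the
    * same fields also have the same bytes. */
   memset(key, 0, sizeof *key);
   key->flag = flag;
   key->format = format;
}

static bool
sketch_key_equal(const struct sketch_key *a, const struct sketch_key *b)
{
   /* Bytewise compare: any garbage in the padding would make equal
    * keys look different and force a needless recompile. */
   return memcmp(a, b, sizeof *a) == 0;
}
```

Without the `memset`, a cache lookup based on `sketch_key_equal` could miss on every call even though the meaningful fields match.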
From: Roland Scheidegger
The emulation of these if there's no rounding instruction available
is a bit more complicated than what the code did.
In particular, doing fp-to-int/int-to-fp will not work if the exponent
is large enough (and with NaNs, Infs). Hence such values need to be filtered
out an
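
The filtering idea above can be sketched in scalar C, assuming the operation in question is a floor-style rounding (the snippet does not say which opcodes are meant). The function name is hypothetical. Values whose magnitude is at least 2^23 have no fractional bits in single precision, so they are returned unchanged, and the same test lets NaNs and Infs bypass the int conversion.

```c
/* Hypothetical scalar model of floor() emulated with
 * fp-to-int / int-to-fp conversions. */
static float
sketch_floor(float x)
{
   const float limit = 8388608.0f; /* 2^23 */

   /* The negated range check is false for NaN comparisons, so NaN,
    * +-Inf and large finite values are all passed through untouched.
    * Converting them to int would be undefined or wrong. */
   if (!(x > -limit && x < limit))
      return x;

   int i = (int)x;       /* truncates toward zero, safe in this range */
   float t = (float)i;

   /* Truncation rounds negative fractions up, so step down by one. */
   return (t > x) ? t - 1.0f : t;
}
```

One difference from a real floor remains in this sketch: an input of -0.0 comes back as +0.0, because the sign is lost in the integer round trip.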
From: Roland Scheidegger
They are similar to old-style tex opcodes but with separate sampler and
texture units (and other arguments in different places).
Also adjust the debug tgsi dump code.
---
src/gallium/auxiliary/gallivm/lp_bld_tgsi.h |6 +-
src/gallium/auxiliary/gallivm/lp_bld_tgs
From: Roland Scheidegger
Need to calculate the number of mip levels (if it were worthwhile we could
store it in dynamic state).
Also, it looks like without modifiers this opcode should return floats
so handle that as well.
While here, the query code also used chan 2 for the lod value.
This worked
From: Roland Scheidegger
Need to calculate the number of mip levels (if it were worthwhile we could
store it in dynamic state).
While here, the query code also used chan 2 for the lod value.
This worked with mesa state tracker but it seems safer to use chan 0.
Still passes piglit textureSize (wit
From: Roland Scheidegger
There were several bugs in how this was handled, most opcodes wouldn't even
have fetched the right arguments.
Also, the tex "target" is coming from the sampler view, hence it cannot
have information about shadow comparisons - fortunately this is not only
sampler state but al
From: Roland Scheidegger
None of the filters used it (why would they). Maybe that param
was just there because some of the lines were considered to be
too short...
---
src/gallium/drivers/softpipe/sp_tex_sample.c | 76 +-
src/gallium/drivers/softpipe/sp_tex_sample.h |
From: Roland Scheidegger
This optimized filter (when using repeat wrap modes,
linear min/mag/mip filters, pot textures) only applies to 2d textures,
but nothing prevented it from being used for other textures (likely
leading to very bogus sample results).
Note: This is a candidate for the 9.0 br
From: Roland Scheidegger
This should handle the new lod_zero modifier more correctly.
The runtime-conditional is a bit more complex however we now also do
scalar lod computation when appropriate which should more than make up for it.
The refactoring should also fix an issue with explicit lods
(lo
From: Roland Scheidegger
This adds support for the additional blending factors to the blend function
itself, and also enables testing of it in lp_test_blend (which passes).
Still need to add the glue code of linking fs shader outputs to blend inputs
in llvmpipe, and probably need to add special ha
From: Roland Scheidegger
link up the fs outputs and blend inputs, and make sure the second blend source
is correctly loaded and converted (which is quite complex).
There's a slight refactoring of the monster generate_unswizzled_blend()
function where it makes sense to factor out alpha conversion
From: Roland Scheidegger
There can be other per-thread data than just vis_counter, so pass a struct
around instead (some of our non-public code uses this already and this
difference is a major cause of merge pain).
---
src/gallium/drivers/llvmpipe/lp_jit.c | 19 +++
src/g
From: Roland Scheidegger
It looks like using coord.w as explicit lod value is a mistake, most likely
because some dx10 docs had it specified that way. Seems this was changed though:
http://msdn.microsoft.com/en-us/library/windows/desktop/hh447229%28v=vs.85%29.aspx
- let's just hope it doesn't dep