On 24.02.2014 09:33, Dave Airlie wrote:
> On Wed, Feb 12, 2014 at 9:10 AM, Roland Scheidegger <[email protected]>
> wrote:
>> On 11.02.2014 22:58, Dave Airlie wrote:
>>>>>   dst.z = texture_depth(unit, lod)
>>>>>
>>>>> +.. opcode:: TG4 - Texture Gather (as per ARB_texture_gather)
>>>>> +  Gathers the four texels to be used in a bi-linear
>>>>> +  filtering operation and packs them into a single register.
>>>>> +  Only works with 2D, 2D array, cubemaps, and cubemaps arrays.
>>>>> +  For 2D textures, only the addressing modes of the sampler and
>>>>> +  the top level of any mip pyramid are used. Set W to zero.
>>>>> +  It behaves like the TEX instruction, but a filtered
>>>>> +  sample is not generated. The four samples that contribute
>>>>> +  to filtering are placed into xyzw in clockwise order,
>>>>> +  starting with the (u,v) texture coordinate delta at the
>>>>> +  following locations (-, +), (+, +), (+, -), (-, -), where
>>>>> +  the magnitude of the deltas are half a texel.
>>>>> +
>>>>> +  PIPE_CAP_TEXTURE_SM5 enhances this instruction to support
>>>>> +  shadow per-sample depth compares, single component selection,
>>>>> +  and a non-constant offset. It doesn't allow support for the
>>>>> +  GL independent offset to get i0,j0. This would require another
>>>>> +  CAP if hw can do it natively. For now we lower that before
>>>>> +  TGSI.
>>>>> +
>>>>> +.. math::
>>>>> +
>>>>> +  coord = src0
>>>>> +
>>>>> +  component = src1
>>>>> +
>>>>> +  dst = texture_gather4 (unit, coord, component)
>>>>> +
>>>>> +(with SM5 - cube array shadow)
>>>>> +
>>>>> +  coord = src0
>>>>> +
>>>>> +  compare = src1
>>>>> +
>>>>> +  dst = texture_gather (unit, coord, compare)
>>>>> +
>>>> So how does component selection work with the latter version?
>>>> I think it would be nice if you wouldn't really need two versions (so if
>>>> you don't support comparisons, the src would just be unused).
>>>
>>> That's docs not being clear enough if you read it like that. The
>>> second version is only for cube array shadow compares, which have no
>>> components. The first version is the same for non-shadow compares.
>> Ah right that works, I forgot you don't need the channel select with
>> shadow comparisons (not that I'm a big fan of such "overloaded" sources
>> but that's nothing new really).
>>
>>>
>>>> Also, FWIW for llvmpipe you'd probably wanted a native 4 offsets
>>>> versions, I don't think llvm could eliminate the huge amount of
>>>> duplicated code completely if you generate 4 texture lookups. Of course,
>>>> someone would need to implement it first (shouldn't be too difficult).
>>>
>>> Yeah llvmpipe might be in the category for using the extra CAP, I'm
>>> really hoping nvidia hw does do this, but the interface is kinda
>>> arbitrary and maybe we should consider another opcode,
>>>
>>> Since we have for SM5 nonconstant ones something like,
>>>
>>> TG4 TEMP[1], TEMP[1], SAMP[0], TEMP[2].xyz
>>> which will sample around temp[1] i0,j0 - i1, j1 at the offset in temp[2]
>>>
>>> and
>>> TG4 TEMP[1], TEMP[1], SAMP[0], TEMP[2].xyz, TEMP[3].xyz, TEMP[4].xyz,
>>> TEMP[5].xyz
>>> which will sample i0,j0 from TEMP[1] and the respective offsets.
>>>
>>
>> Yes since the offsets are in separate offset structure and the amount of
>> offsets is indicated I think it should just work actually if a driver
>> wants to implement multiple offsets natively.
>
> So you okay with this version? I think it covers everything, and we can
> add a CAP if/when someone works out hw/llvmpipe for the 4 offset case.
>
> Dave
>
Yes, looks good to me.

Roland
_______________________________________________
mesa-dev mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
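
For readers following the quoted TG4 text, the non-shadow form (coord = src0, component = src1) can be sketched roughly as below. This is only an illustration of the sample ordering described in the quoted docs, not the Mesa or llvmpipe implementation. The function name follows the pseudo-function in the docs, but the texture layout (rows of RGBA tuples, v growing with the row index), the clamp-to-edge addressing, and the omission of offsets and shadow compares are all assumptions made for the example.

```python
import math


def texture_gather4(texture, coord, component=0):
    """Gather one channel from the four texels of the bilinear footprint.

    texture   -- top mip level as rows of RGBA tuples (hypothetical layout)
    coord     -- normalized (u, v) texture coordinate
    component -- channel index to gather, 0..3
    """
    height = len(texture)
    width = len(texture[0])
    u, v = coord

    def fetch(du, dv):
        # Move half a texel away from the coordinate, then take the texel
        # that contains the shifted point.
        x = math.floor(u * width + du)
        y = math.floor(v * height + dv)
        # Assumption: clamp-to-edge stands in for the sampler's
        # addressing mode.
        x = min(max(x, 0), width - 1)
        y = min(max(y, 0), height - 1)
        return texture[y][x][component]

    # Delta order taken from the quoted docs: (-, +), (+, +), (+, -), (-, -).
    deltas = [(-0.5, +0.5), (+0.5, +0.5), (+0.5, -0.5), (-0.5, -0.5)]
    return tuple(fetch(du, dv) for du, dv in deltas)
```

The four results correspond to dst.x, dst.y, dst.z and dst.w. In the SM5 cube array shadow form, src1 would carry the compare value instead of a channel index, which this sketch does not cover.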
