Re: [Mesa-dev] Choosing texture internal format in GLES
Hi,

> No, I said it would be better to use st_choose_matching_format in
> st_ChooseTextureFormat,
> because st_choose_matching_format does exactly what you're trying to do.
>
> I have gone ahead and implemented what I had in mind. See the attached patch.
>
> Marek

One thing to make sure of in the tables used by st_choose_matching_format in GLES1/2: HALF_FLOAT and FLOAT should make the texture a 16-bit and a 32-bit texture respectively (when half-float and/or full-float textures are supported).

From a developer's point of view, the external format and type parameters of glTexImage in GLES1/2 essentially determine the format of the texture. The idea was, and is, that GLES1/2 glTex[Sub]Image calls are not supposed to do format conversions, so the developer selects the texture format through the external format and external type parameters.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
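To make the request above concrete, here is a minimal sketch of the rule "the external type picks the texture precision, no conversion". The enum and function names are hypothetical and are not taken from st_choose_matching_format or its tables; the two capability flags are likewise assumptions standing in for the driver's extension checks.

```c
#include <assert.h>

/* Hypothetical stand-ins for the GLES "type" argument of glTexImage2D. */
enum ext_type { EXT_UNSIGNED_BYTE, EXT_HALF_FLOAT, EXT_FLOAT };

/* Bits per channel the texture should get so that the upload needs no
 * format conversion.  Returns 0 when the requested float flavour is not
 * supported by the (hypothetical) driver. */
static int bits_per_channel(enum ext_type type, int have_half, int have_float)
{
    switch (type) {
    case EXT_UNSIGNED_BYTE: return 8;
    case EXT_HALF_FLOAT:    return have_half ? 16 : 0;
    case EXT_FLOAT:         return have_float ? 32 : 0;
    }
    assert(!"unknown external type");
    return 0;
}
```

A real table would of course also key on the external format (RGBA, LUMINANCE, ...); the sketch only covers the type-to-width part that the message asks about.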
[Mesa-dev] [r600g] Mesa CVS 4e9aa67: vdpau has only MPEG1/2 on RV730
Hello Christian,

After the latest git pull I have only MPEG1, MPEG2_SIMPLE and MPEG2_MAIN with my RV730 (AGP). My nice videos no longer play.

-Dieter

BTW I'm not on Mesa Devel, so please CC me.

/opt/mesa> vdpauinfo
display: :0   screen: 0
API version: 1
Information string: G3DVL VDPAU Driver Shared Library version 1.0

Video surface:

name   width  height  types
-------------------------------------------
420     8192    8192  NV12 YV12
422     8192    8192  UYVY YUYV
444     8192    8192  Y8U8V8A8 V8U8Y8A8

Decoder capabilities:

name            level   macbs  width  height
-------------------------------------------
MPEG1               0  262144   8192    8192
MPEG2_SIMPLE        3  262144   8192    8192
MPEG2_MAIN          3  262144   8192    8192

Output surface:

name          width  height  nat  types
----------------------------------------------------
B8G8R8A8       8192    8192    y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
R8G8B8A8       8192    8192    y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
R10G10B10A2    8192    8192    y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
B10G10R10A2    8192    8192    y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8

Bitmap surface:

name          width  height
------------------------------
B8G8R8A8       8192    8192
R8G8B8A8       8192    8192
R10G10B10A2    8192    8192
B10G10R10A2    8192    8192
A8             8192    8192

Video mixer:

feature name                   sup
------------------------------------
DEINTERLACE_TEMPORAL            -
DEINTERLACE_TEMPORAL_SPATIAL    -
INVERSE_TELECINE                -
NOISE_REDUCTION                 y
SHARPNESS                       y
LUMA_KEY                        -
HIGH QUALITY SCALING - L1       -
HIGH QUALITY SCALING - L2       -
HIGH QUALITY SCALING - L3       -
HIGH QUALITY SCALING - L4       -
HIGH QUALITY SCALING - L5       -
HIGH QUALITY SCALING - L6       -
HIGH QUALITY SCALING - L7       -
HIGH QUALITY SCALING - L8       -
HIGH QUALITY SCALING - L9       -

parameter name                 sup    min    max
-----------------------------------------------------
VIDEO_SURFACE_WIDTH             y      48   8192
VIDEO_SURFACE_HEIGHT            y      48   8192
CHROMA_TYPE                     y
LAYERS                          y       0      4

attribute name                 sup    min    max
-----------------------------------------------------
BACKGROUND_COLOR                y
CSC_MATRIX                      y
NOISE_REDUCTION_LEVEL           y    0.00   1.00
SHARPNESS_LEVEL                 y   -1.00   1.00
LUMA_KEY_MIN_LUMA               y
LUMA_KEY_MAX_LUMA               y

Inconsistency detected by ld.so: dl-close.c: 765: _dl_close: Assertion `map->l_init_called' failed!

(I only have libvdpau1-0.6, not 0.7, on openSUSE 12.3.)
[Mesa-dev] [PATCH] i965/blorp: Use passed in framebuffer rather than ctx->DrawBuffer
We have the destination framebuffer object passed in; there's no need to go digging around in the context.

Signed-off-by: Chris Forbes
---
 src/mesa/drivers/dri/i965/brw_blorp_clear.cpp | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
index f26f39d..4ff776f 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
@@ -451,16 +451,16 @@ brw_blorp_clear_color(struct brw_context *brw, struct gl_framebuffer *fb,
     * see if any require fallback, and fall back for all if any of them need
     * to.
     */
-   for (unsigned buf = 0; buf < ctx->DrawBuffer->_NumColorDrawBuffers; buf++) {
-      struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[buf];
+   for (unsigned buf = 0; buf < fb->_NumColorDrawBuffers; buf++) {
+      struct gl_renderbuffer *rb = fb->_ColorDrawBuffers[buf];
       struct intel_renderbuffer *irb = intel_renderbuffer(rb);

       if (irb && irb->mt->msaa_layout != INTEL_MSAA_LAYOUT_NONE)
          return false;
    }

-   for (unsigned buf = 0; buf < ctx->DrawBuffer->_NumColorDrawBuffers; buf++) {
-      struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[buf];
+   for (unsigned buf = 0; buf < fb->_NumColorDrawBuffers; buf++) {
+      struct gl_renderbuffer *rb = fb->_ColorDrawBuffers[buf];
       struct intel_renderbuffer *irb = intel_renderbuffer(rb);

       /* If this is an ES2 context or GL_ARB_ES2_compatibility is supported,
--
1.8.4
Re: [Mesa-dev] [r600g] Mesa CVS 4e9aa67: vdpau has only MPEG1/2 on RV730
On Sun, 2013-09-29 at 22:34 +0200, Dieter Nützel wrote:
>
> after latest git pull I've only MPEG1, MPEG2_SIMPLE and MPEG2_MAIN with
> my RV730 (AGP).

That probably means you lost UVD support for some reason. Assuming UVD is still enabled in the kernel, can you bisect which Mesa change caused the problem for you?

--
Earthling Michel Dänzer           |                   http://www.amd.com
Libre software enthusiast         |          Mesa and X developer
[Mesa-dev] [PATCH RFC 0/6] i965: emulate SIMD16 sample_d with dual SIMD8 ones
From: Chia-I Wu

Hi,

This series of patches implements the emulation of SIMD16 sample_d with dual SIMD8 sample_d. Before these changes, the compiler would fail to generate SIMD16 code for fragment shaders that use textureGrad, which hurts performance.

The first four patches prepare the compiler for supporting SIMD8 sampler messages in SIMD16 mode. The last two patches implement the emulation.

For some changes, there is more than one way to achieve the same goal. That is why this series is marked RFC.
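The basic idea of the series, running a 16-wide operation as two 8-wide ones, can be sketched outside the compiler as follows. This is only an illustration: op_simd8 is a hypothetical stand-in for an operation the hardware offers at width 8 only, and plain float arrays stand in for registers.

```c
#include <assert.h>
#include <stddef.h>

enum { SIMD8 = 8, SIMD16 = 16 };

/* Hypothetical operation that only exists in an 8-channel flavour. */
static void op_simd8(const float *src, float *dst)
{
    for (size_t ch = 0; ch < SIMD8; ch++)
        dst[ch] = src[ch] * 2.0f;
}

/* Emulate the 16-channel operation: run the 8-wide one on the first half
 * of the channels, then again on the second half. */
static void op_simd16(const float *src, float *dst)
{
    assert(src != NULL && dst != NULL);
    op_simd8(src, dst);                 /* channels 0..7  */
    op_simd8(src + SIMD8, dst + SIMD8); /* channels 8..15 */
}
```

In the real compiler the two halves are selected with the uncompressed and SecHalf controls rather than pointer offsets, which is what the preparatory patches deal with.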
[Mesa-dev] [PATCH RFC 1/6] i965: make BRW_COMPRESSION_2NDHALF valid for brw_SAMPLE
From: Chia-I Wu

SIMD8 sampler messages are allowed in SIMD16 mode, and they could not work without BRW_COMPRESSION_2NDHALF. Later PRMs (gen5 and later) do not explicitly state whether BRW_COMPRESSION_2NDHALF is allowed, but they do have examples using send with SecHalf. It should be safe to assume SecHalf is valid.

Signed-off-by: Chia-I Wu
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 7ed3df0..12515ec 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -2195,7 +2195,22 @@ void brw_SAMPLE(struct brw_compile *p,

    insn = next_insn(p, BRW_OPCODE_SEND);
    insn->header.predicate_control = 0; /* XXX */
-   insn->header.compression_control = BRW_COMPRESSION_NONE;
+
+   /* From the 965 PRM (volume 4, part 1, section 14.2.41):
+    *
+    *     "Instruction compression is not allowed for this instruction (that
+    *      is, send). The hardware behavior is undefined if this instruction is
+    *      set as compressed. However, compress control can be set to "SecHalf"
+    *      to affect the EMask generation."
+    *
+    * No similar wording is found in later PRMs, but there are examples
+    * utilizing send with SecHalf. More importantly, SIMD8 sampler messages
+    * are allowed in SIMD16 mode and they could not work without SecHalf. For
+    * these reasons, we allow BRW_COMPRESSION_2NDHALF here.
+    */
+   if (insn->header.compression_control != BRW_COMPRESSION_2NDHALF)
+      insn->header.compression_control = BRW_COMPRESSION_NONE;
+
    if (brw->gen < 6)
       insn->header.destreg__conditionalmod = msg_reg_nr;

--
1.8.3.1
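The rule the patch introduces can be restated as a tiny pure function. The enum below is a hypothetical mirror of the three compression-control states, not the real BRW_COMPRESSION_* definitions, and the function name is made up for the illustration.

```c
#include <assert.h>

/* Hypothetical mirror of the compression-control states discussed above. */
enum compression {
    COMPRESSION_NONE,
    COMPRESSION_2NDHALF,
    COMPRESSION_COMPRESSED
};

/* A send must never be compressed, but an already selected SecHalf is
 * kept so that the second eight channels can be addressed. */
static enum compression send_compression(enum compression current)
{
    assert(current >= COMPRESSION_NONE && current <= COMPRESSION_COMPRESSED);
    return current == COMPRESSION_2NDHALF ? COMPRESSION_2NDHALF
                                          : COMPRESSION_NONE;
}
```

Before the patch the equivalent function would have returned COMPRESSION_NONE unconditionally.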
[Mesa-dev] [PATCH RFC 2/6] i965: allow SIMD8 sampler messages in SIMD16 mode
From: Chia-I Wu

When the instruction that sends the sampler message is forced uncompressed or sechalf, send a SIMD8 message even in SIMD16 mode.

Signed-off-by: Chia-I Wu
---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 4475058..9406f7b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -381,7 +381,8 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src
       break;
    }

-   if (dispatch_width == 16)
+   if (dispatch_width == 16 &&
+       !inst->force_uncompressed && !inst->force_sechalf)
       simd_mode = BRW_SAMPLER_SIMD_MODE_SIMD16;

    if (brw->gen >= 5) {
--
1.8.3.1
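As a rough sketch of the selection logic after this change: the types and names below are hypothetical simplifications, and the sketch assumes that SIMD8 is the mode chosen when the SIMD16 condition does not hold.

```c
#include <assert.h>
#include <stdbool.h>

enum simd_mode { MODE_SIMD8, MODE_SIMD16 };

/* Hypothetical subset of the per-instruction flags. */
struct inst_flags {
    bool force_uncompressed;
    bool force_sechalf;
};

/* Pick the sampler message width: a 16-wide shader still sends an 8-wide
 * message when the instruction is pinned to one half of the channels. */
static enum simd_mode pick_simd_mode(int dispatch_width, struct inst_flags f)
{
    assert(dispatch_width == 8 || dispatch_width == 16);
    if (dispatch_width == 16 && !f.force_uncompressed && !f.force_sechalf)
        return MODE_SIMD16;
    return MODE_SIMD8;
}
```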
[Mesa-dev] [PATCH RFC 3/6] i965: add FS_OPCODE_OVERWRITE_DST
From: Chia-I Wu

FS_OPCODE_OVERWRITE_DST is used to indicate that the destination register is (completely) overwritten. No code is emitted, but the liveness analysis can use it as a hint to add the destination register to the DEF bitset.

This is needed because it is hard to figure out during liveness analysis whether several partial writes combined constitute a complete write, while it is easier for the FS visitor to know if that is the case.

Signed-off-by: Chia-I Wu
---
 src/mesa/drivers/dri/i965/brw_defines.h                 | 1 +
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp          | 4 ++++
 src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp     | 5 +++--
 src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp | 3 ++-
 src/mesa/drivers/dri/i965/brw_shader.cpp                | 3 +++
 5 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
index b14c346..2618180 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -789,6 +789,7 @@ enum opcode {
    FS_OPCODE_UNPACK_HALF_2x16_SPLIT_X,
    FS_OPCODE_UNPACK_HALF_2x16_SPLIT_Y,
    FS_OPCODE_PLACEHOLDER_HALT,
+   FS_OPCODE_OVERWRITE_DST,

    VS_OPCODE_URB_WRITE,
    VS_OPCODE_SCRATCH_READ,
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 9406f7b..2b179b6 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1484,6 +1484,10 @@ fs_generator::generate_code(exec_list *instructions)
          patch_discard_jumps_to_fb_writes();
          break;

+      case FS_OPCODE_OVERWRITE_DST:
+         /* This is to help liveness analysis. */
+         break;
+
       default:
          if (inst->opcode < (int) ARRAY_SIZE(opcode_descs)) {
             _mesa_problem(ctx, "Unsupported opcode `%s' in FS",
diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
index f5daab2..13891f8 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
@@ -77,8 +77,9 @@ fs_live_variables::setup_def_use()
           * variable, and thus qualify for being in def[].
           */
          if (inst->dst.file == GRF &&
-             inst->regs_written == v->virtual_grf_sizes[inst->dst.reg] &&
-             !inst->is_partial_write()) {
+             (inst->opcode == FS_OPCODE_OVERWRITE_DST ||
+              (inst->regs_written == v->virtual_grf_sizes[inst->dst.reg] &&
+               !inst->is_partial_write()))) {
             int reg = inst->dst.reg;
             if (!BITSET_TEST(bd[b].use, reg))
                BITSET_SET(bd[b].def, reg);
diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index 5530683..4e59a10 100644
--- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
@@ -562,7 +562,8 @@ fs_instruction_scheduler::calculate_deps()
       schedule_node *n = (schedule_node *)node;
       fs_inst *inst = (fs_inst *)n->inst;

-      if (inst->opcode == FS_OPCODE_PLACEHOLDER_HALT)
+      if (inst->opcode == FS_OPCODE_PLACEHOLDER_HALT ||
+          inst->opcode == FS_OPCODE_OVERWRITE_DST)
          add_barrier_deps(n);

       /* read-after-write deps. */
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
index a558d36..78029e2 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -485,6 +485,9 @@ brw_instruction_name(enum opcode op)
    case FS_OPCODE_PLACEHOLDER_HALT:
       return "placeholder_halt";

+   case FS_OPCODE_OVERWRITE_DST:
+      return "overwrite_dst";
+
    case VS_OPCODE_URB_WRITE:
       return "vs_urb_write";
    case VS_OPCODE_SCRATCH_READ:
--
1.8.3.1
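The effect of the hint on the DEF/USE computation can be sketched with a simplified, hypothetical instruction model: one destination register per instruction, 32 registers at most, and 32-bit masks instead of Mesa's BITSET macros. None of the names below come from brw_fs_live_variables.cpp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, much simplified instruction. */
struct inst {
    int dst_reg;          /* destination register, 0..31 */
    bool full_write;      /* writes the whole register on its own */
    bool overwrite_hint;  /* plays the role of FS_OPCODE_OVERWRITE_DST */
    uint32_t reads;       /* bitmask of registers read */
};

struct block_sets {
    uint32_t def;  /* registers fully defined before any use in the block */
    uint32_t use;  /* registers read before being defined in the block */
};

static struct block_sets setup_def_use(const struct inst *insts, int count)
{
    struct block_sets s = { 0, 0 };

    for (int i = 0; i < count; i++) {
        assert(insts[i].dst_reg >= 0 && insts[i].dst_reg < 32);

        /* A read that is not covered by an earlier def is a use. */
        s.use |= insts[i].reads & ~s.def;

        /* Only complete writes, or the explicit hint, may enter def. */
        if (insts[i].overwrite_hint || insts[i].full_write) {
            uint32_t bit = UINT32_C(1) << insts[i].dst_reg;
            if (!(s.use & bit))
                s.def |= bit;
        }
    }
    return s;
}
```

With two partial writes and no hint, the register never reaches def, so it would be treated as live into the block; placing the hint in front of them records the definition.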
[Mesa-dev] [PATCH RFC 4/6] i965: keep SecHalf flag after register coalescing
From: Chia-I Wu

Copy sechalf to the new register; otherwise we would read the wrong HW registers.

Signed-off-by: Chia-I Wu
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 2ebadc8..8991ee8 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2217,6 +2217,7 @@ fs_visitor::register_coalesce()
                   new_src.abs = 1;
                }
                new_src.negate ^= scan_inst->src[i].negate;
+               new_src.sechalf = scan_inst->src[i].sechalf;
                scan_inst->src[i] = new_src;
             }
          }
--
1.8.3.1
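A minimal sketch of the source rewrite during coalescing, assuming a hypothetical src_reg struct with just the fields that matter here. It follows the lines visible in the hunk above: the abs and negate modifiers of the use are folded into the replacement, and the half selection of the use is carried over.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced source operand. */
struct src_reg {
    int reg;
    bool negate;
    bool abs;
    bool sechalf;  /* operand refers to the second eight channels */
};

/* Replace a use of the coalesced register by the MOV's source while
 * keeping the modifiers and the half selection of the original use. */
static struct src_reg coalesce_src(struct src_reg mov_src, struct src_reg use)
{
    struct src_reg new_src = mov_src;

    assert(mov_src.reg >= 0);
    if (use.abs) {
        new_src.negate = false;
        new_src.abs = true;
    }
    new_src.negate = (new_src.negate != use.negate);
    new_src.sechalf = use.sechalf;  /* the step added by the patch */
    return new_src;
}
```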
[Mesa-dev] [PATCH RFC 5/6] i965: refactor texture instruction emission
From: Chia-I Wu

Add fs_visitor::emit_texture, which is used to emit the texture instruction after the message payload has been set up.

Signed-off-by: Chia-I Wu
---
 src/mesa/drivers/dri/i965/brw_fs.h | 10 ++-
 src/mesa/drivers/dri/i965/brw_fs_fp.cpp | 13 ++-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 129 ---
 3 files changed, 70 insertions(+), 82 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index b2aa041..c161e7d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -327,13 +327,17 @@ public:
    fs_reg rescale_texcoord(ir_texture *ir, fs_reg coordinate,
                            bool is_rect, int sampler, int texunit);
    fs_inst *emit_texture_gen4(ir_texture *ir, fs_reg dst, fs_reg coordinate,
-                              fs_reg shadow_comp, fs_reg lod, fs_reg lod2);
+                              fs_reg shadow_comp, fs_reg lod, fs_reg lod2,
+                              int sampler);
    fs_inst *emit_texture_gen5(ir_texture *ir, fs_reg dst, fs_reg coordinate,
                               fs_reg shadow_comp, fs_reg lod, fs_reg lod2,
-                              fs_reg sample_index);
+                              fs_reg sample_index, int sampler);
    fs_inst *emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
                               fs_reg shadow_comp, fs_reg lod, fs_reg lod2,
-                              fs_reg sample_index);
+                              fs_reg sample_index, int sampler);
+   fs_inst *emit_texture(ir_texture *ir, fs_reg dst, int base_mrf, int mlen,
+                         bool header_present, int regs_written, int sampler);
+
    fs_reg fix_math_operand(fs_reg src);
    fs_inst *emit_math(enum opcode op, fs_reg dst, fs_reg src0);
    fs_inst *emit_math(enum opcode op, fs_reg dst, fs_reg src0, fs_reg src1);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_fp.cpp b/src/mesa/drivers/dri/i965/brw_fs_fp.cpp
index 0594948..46ff03d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_fp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_fp.cpp
@@ -499,18 +499,17 @@ fs_visitor::emit_fragment_program_code()
                                        fpi->TexSrcTarget == TEXTURE_RECT_INDEX,
                                        fpi->TexSrcUnit, fpi->TexSrcUnit);

-         fs_inst *inst;
          if (brw->gen >= 7) {
-            inst = emit_texture_gen7(ir, dst, coordinate, shadow_c, lod, dpdy, sample_index);
+            emit_texture_gen7(ir, dst, coordinate, shadow_c, lod, dpdy,
+                              sample_index, fpi->TexSrcUnit);
          } else if (brw->gen >= 5) {
-            inst = emit_texture_gen5(ir, dst, coordinate, shadow_c, lod, dpdy, sample_index);
+            emit_texture_gen5(ir, dst, coordinate, shadow_c, lod, dpdy,
+                              sample_index, fpi->TexSrcUnit);
          } else {
-            inst = emit_texture_gen4(ir, dst, coordinate, shadow_c, lod, dpdy);
+            emit_texture_gen4(ir, dst, coordinate, shadow_c, lod, dpdy,
+                              fpi->TexSrcUnit);
          }

-         inst->sampler = fpi->TexSrcUnit;
-         inst->shadow_compare = fpi->TexShadow;
-
          /* Reuse the GLSL swizzle_result() handler. */
          swizzle_result(ir, dst, fpi->TexSrcUnit);
          dst = this->result;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 72c379a..6435a17 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -870,8 +870,46 @@ fs_visitor::visit(ir_assignment *ir)
 }

 fs_inst *
+fs_visitor::emit_texture(ir_texture *ir, fs_reg dst, int base_mrf, int mlen,
+                         bool header_present, int regs_written, int sampler)
+{
+   fs_inst *inst;
+
+   switch (ir->op) {
+   case ir_tex: inst = emit(SHADER_OPCODE_TEX, dst); break;
+   case ir_txb: inst = emit(FS_OPCODE_TXB, dst); break;
+   case ir_txl: inst = emit(SHADER_OPCODE_TXL, dst); break;
+   case ir_txd: inst = emit(SHADER_OPCODE_TXD, dst); break;
+   case ir_txf: inst = emit(SHADER_OPCODE_TXF, dst); break;
+   case ir_txf_ms: inst = emit(SHADER_OPCODE_TXF_MS, dst); break;
+   case ir_txs: inst = emit(SHADER_OPCODE_TXS, dst); break;
+   case ir_lod: inst = emit(SHADER_OPCODE_LOD, dst); break;
+   default: return NULL;
+   }
+
+   inst->base_mrf = base_mrf;
+   inst->mlen = mlen;
+   inst->header_present = header_present;
+   inst->regs_written = regs_written;
+
+   /* The header is set up by generate_tex() when necessary. */
+   inst->src[0] = reg_undef;
+
+   if (ir->offset != NULL && ir->op != ir_txf)
+      inst->texture_offset = brw_texture_offset(ir->offset->as_constant());
+
+   inst->sampler = sampler;
+
+   if (ir->shadow_comparitor)
+      inst->shadow_compare = true;
+
+   return inst;
+}
+
+fs_inst *
 fs_visitor::emit_texture_gen4(ir_texture *ir, fs_reg dst, fs_reg
[Mesa-dev] [PATCH RFC 6/6] i965/gen7: emulate SIMD16 sample_d with dual SIMD8 sample_d
From: Chia-I Wu

Add fs_visitor::emit_dual_texture_gen7, which emulates SIMD16 sample_d with dual SIMD8 sample_d on gen7+. Fix fs_generator::generate_tex to send SIMD8 messages when force_uncompressed or force_sechalf is set.

No piglit quick.tests regression on Ivy Bridge and Haswell. With this change, I am seeing 6.76479% +/- 0.619064% (at 95.0% confidence) improvement on Xonotic with Ultra effects.

Signed-off-by: Chia-I Wu
---
 src/mesa/drivers/dri/i965/brw_fs.h | 3 +
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 137 ++-
 2 files changed, 138 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index c161e7d..82a0a7d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -335,6 +335,9 @@ public:
    fs_inst *emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
                               fs_reg shadow_comp, fs_reg lod, fs_reg lod2,
                               fs_reg sample_index, int sampler);
+   void emit_dual_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
+                               fs_reg shadow_comp, fs_reg lod, fs_reg lod2,
+                               fs_reg sample_index, int sampler);
    fs_inst *emit_texture(ir_texture *ir, fs_reg dst, int base_mrf, int mlen,
                          bool header_present, int regs_written, int sampler);

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 6435a17..b9f97b6 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1334,6 +1334,133 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
    return emit_texture(ir, dst, base_mrf, mlen, header_present, 4, sampler);
 }

+/* Emulate a SIMD16 sampler message with dual SIMD8 sampler messages. For
+ * now, and for practical reasons, only ir_txd is supported.
+ */
+void
+fs_visitor::emit_dual_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
+                                   fs_reg shadow_c, fs_reg lod, fs_reg lod2,
+                                   fs_reg sample_index, int sampler)
+{
+   /* no need to emit dual SIMD8 messages */
+   if (dispatch_width != 16 || ir->op != ir_txd) {
+      emit_texture_gen7(ir, dst, coordinate, shadow_c,
+                        lod, lod2, sample_index, sampler);
+      return;
+   }
+
+   const int reg_width = 1;
+   int mlen = 0;
+   int base_mrf = 2;
+   bool header_present = false;
+   fs_reg temp = fs_reg(GRF, virtual_grf_alloc(4),
+                        brw_type_for_base_type(ir->type));
+
+   emit(FS_OPCODE_OVERWRITE_DST, dst);
+   emit(FS_OPCODE_OVERWRITE_DST, temp);
+
+   for (int msg = 0; msg < 2; msg++) {
+      if (msg == 0)
+         push_force_uncompressed();
+      else
+         push_force_sechalf();
+
+      /* only txd is supported for now */
+      assert(ir->op == ir_txd);
+
+      if (ir->offset) {
+         /* The offsets set up by the ir_texture visitor are in the
+          * m1 header, so we can't go headerless.
+          */
+         header_present = true;
+         mlen++;
+         base_mrf--;
+      }
+
+      if (ir->shadow_comparitor) {
+         emit(MOV(fs_reg(MRF, base_mrf + mlen), shadow_c));
+         mlen += reg_width;
+      }
+
+      /* Load dPdx and the coordinate together:
+       * [hdr], [ref], x, dPdx.x, dPdy.x, y, dPdx.y, dPdy.y, z, dPdx.z, dPdy.z
+       */
+      fs_reg coord = coordinate, ddx = lod, ddy = lod2;
+      for (int i = 0; i < ir->coordinate->type->vector_elements; i++) {
+         emit(MOV(fs_reg(MRF, base_mrf + mlen), coord));
+         coord.reg_offset++;
+         mlen += reg_width;
+
+         /* For cube map array, the coordinate is (u,v,r,ai) but there are
+          * only derivatives for (u, v, r).
+          */
+         if (i < ir->lod_info.grad.dPdx->type->vector_elements) {
+            emit(MOV(fs_reg(MRF, base_mrf + mlen), ddx));
+            ddx.reg_offset++;
+            mlen += reg_width;
+
+            emit(MOV(fs_reg(MRF, base_mrf + mlen), ddy));
+            ddy.reg_offset++;
+            mlen += reg_width;
+         }
+      }
+
+      if (mlen > 11) {
+         fail("Message length >11 disallowed by hardware\n");
+         break;
+      }
+
+      /* response length is 4, which are 2 vgrf */
+      emit_texture(ir, temp, base_mrf, mlen, header_present, 2, sampler);
+
+      if (msg == 0) {
+         /* move from temp to dst */
+         for (int i = 0; i < 4; i++) {
+            fs_reg d = dst;
+            d.reg_offset += i;
+
+            fs_reg s = temp;
+            s.reg_offset += i / 2;
+            s.sechalf = (i % 2);
+
+            emit(MOV(d, s));
+         }
+
+         pop_force_uncompressed();
+
+         /* use non-overlapping MRF range if possible */
+         if (base_mrf + mlen * 2 < BRW_MAX_MRF)
+            base_mrf += mlen;
+
+         mlen = 0;
+
+
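The "move from temp to dst" step relies on how a SIMD8 response is packed: four components of eight channels each fill two 16-channel virtual registers, so component i sits at register i / 2, half i % 2. The sketch below models that with flat float arrays. It is an illustration under assumptions: the function name and array layout are hypothetical, and the handling of the second message (half == 1) is a guess by symmetry, since the corresponding part of the patch is cut off above.

```c
#include <assert.h>
#include <string.h>

enum { HALF = 8, FULL = 16, COMPONENTS = 4 };

/* temp: 2 registers x 16 floats, holding 4 components x 8 channels packed.
 * dst:  4 registers x 16 floats, one register per component.
 * half: 0 fills channels 0..7 of each dst register, 1 fills 8..15. */
static void unpack_simd8_result(const float *temp, float *dst, int half)
{
    assert(half == 0 || half == 1);
    for (int i = 0; i < COMPONENTS; i++) {
        /* component i lives in temp register i / 2, half i % 2 */
        const float *s = temp + (i / 2) * FULL + (i % 2) * HALF;
        float *d = dst + i * FULL + half * HALF;
        memcpy(d, s, HALF * sizeof(float));
    }
}
```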
Re: [Mesa-dev] [r600g] Mesa CVS 4e9aa67: vdpau has only MPEG1/2 on RV730
On 30.09.2013 10:06, Michel Dänzer wrote:
> On Sun, 2013-09-29 at 22:34 +0200, Dieter Nützel wrote:
>> after latest git pull I've only MPEG1, MPEG2_SIMPLE and MPEG2_MAIN with
>> my RV730 (AGP).

Same problem on PALM. Bisection shows that it is caused by commit 68f6dec32. The initialization order seems to be wrong: the check for UVD is done too early. I'll send a patch in a minute.

Best regards
Grigori

> That probably means you lost UVD support for some reason. Assuming UVD is
> still enabled in the kernel, can you bisect which Mesa change caused the
> problem for you?
[Mesa-dev] [PATCH] r600g: fix UVD detection
UVD was checked before the info fields were initialized. Introduced by commit 68f6dec32.
---
 src/gallium/drivers/r600/r600_pipe.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c
index 097a6b8..32df2a3 100644
--- a/src/gallium/drivers/r600/r600_pipe.c
+++ b/src/gallium/drivers/r600/r600_pipe.c
@@ -1037,6 +1037,13 @@ struct pipe_screen *r600_screen_create(struct radeon_winsys *ws)
         rscreen->b.b.fence_signalled = r600_fence_signalled;
         rscreen->b.b.fence_finish = r600_fence_finish;
         rscreen->b.b.get_driver_query_info = r600_get_driver_query_info;
+        r600_init_screen_resource_functions(&rscreen->b.b);
+
+        if (!r600_common_screen_init(&rscreen->b, ws)) {
+                FREE(rscreen);
+                return NULL;
+        }
+
         if (rscreen->b.info.has_uvd) {
                 rscreen->b.b.get_video_param = ruvd_get_video_param;
                 rscreen->b.b.is_video_format_supported = ruvd_is_format_supported;
@@ -1044,12 +1051,6 @@ struct pipe_screen *r600_screen_create(struct radeon_winsys *ws)
                 rscreen->b.b.get_video_param = r600_get_video_param;
                 rscreen->b.b.is_video_format_supported = vl_video_buffer_is_format_supported;
         }
-        r600_init_screen_resource_functions(&rscreen->b.b);
-
-        if (!r600_common_screen_init(&rscreen->b, ws)) {
-                FREE(rscreen);
-                return NULL;
-        }

         rscreen->b.debug_flags |= debug_get_flags_option("R600_DEBUG", r600_debug_options, 0);
         if (debug_get_bool_option("R600_DEBUG_COMPUTE", FALSE))
--
1.8.1.2
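The ordering problem can be reproduced in miniature. Everything in this sketch is hypothetical (struct, field and function names); it only mirrors the shape of the bug: a capability field is read before the common init routine has filled it in, so the fallback path is always chosen.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, reduced screen object. */
struct screen {
    bool info_has_uvd;   /* filled in by common_init() */
    bool use_uvd_hooks;  /* decided while creating the screen */
};

/* Stand-in for the common init that queries the hardware info. */
static bool common_init(struct screen *s, bool hw_has_uvd)
{
    s->info_has_uvd = hw_has_uvd;
    return true;
}

/* Old order: the capability is checked while it is still zeroed. */
static void create_buggy(struct screen *s, bool hw_has_uvd)
{
    memset(s, 0, sizeof(*s));
    s->use_uvd_hooks = s->info_has_uvd;  /* always false here */
    common_init(s, hw_has_uvd);
}

/* Fixed order: initialize first, then look at the capability. */
static bool create_fixed(struct screen *s, bool hw_has_uvd)
{
    memset(s, 0, sizeof(*s));
    if (!common_init(s, hw_has_uvd))
        return false;
    s->use_uvd_hooks = s->info_has_uvd;
    return true;
}
```

This matches the symptom reported earlier in the thread: with the old order only the non-UVD video paths are hooked up, even on hardware that has UVD.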
Re: [Mesa-dev] context sharing of framebuffer objects
On 30 September 2013 02:18, Dave Airlie wrote:

> So this led me to look at the spec and the mesa code, and I noticed it
> appears at some point maybe around 3.1 that FBOs are no longer
> considered shared objects at least in core profile, but mesa always
> seems to share them, just wondering is someone can confirm I'm reading
> things correctly, and if so I might try and do a piglit test and a
> patch.
>

AFAIK the only FBOs that can be shared are ones created through EXT_fbo. (Specifically, see issue 10 in the ARB_fbo spec, and Appendix D in the GL 3.0 spec.)
[Mesa-dev] [PATCH] i965: fix bogus swizzle in brw_cubemap_normalize
When used with a cube array in the VS, this failed an assertion in ir_validate:

Assignment count of LHS write mask channels enabled not matching RHS vector size (3 LHS, 4 RHS).

To fix this, swizzle the RHS correctly for the writemask.

This showed up in the ARB_texture_gather tests, which exercise cube arrays in the VS.

Signed-off-by: Chris Forbes
Cc: "9.2"
---
 src/mesa/drivers/dri/i965/brw_cubemap_normalize.cpp | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cubemap_normalize.cpp b/src/mesa/drivers/dri/i965/brw_cubemap_normalize.cpp
index 46155fb..949414c 100644
--- a/src/mesa/drivers/dri/i965/brw_cubemap_normalize.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cubemap_normalize.cpp
@@ -92,10 +92,12 @@ brw_cubemap_normalize_visitor::visit_leave(ir_texture *ir)
    /* coordinate.xyz *= expr */
    assign = new(mem_ctx) ir_assignment(
       new(mem_ctx) ir_dereference_variable(var),
-      new(mem_ctx) ir_expression(ir_binop_mul,
-                                 ir->coordinate->type,
-                                 new(mem_ctx) ir_dereference_variable(var),
-                                 expr));
+      new(mem_ctx) ir_swizzle(
+         new(mem_ctx) ir_expression(ir_binop_mul,
+                                    ir->coordinate->type,
+                                    new(mem_ctx) ir_dereference_variable(var),
+                                    expr),
+         0, 1, 2, 0, 3));
    assign->write_mask = WRITEMASK_XYZ;
    base_ir->insert_before(assign);
    ir->coordinate = new(mem_ctx) ir_dereference_variable(var);
--
1.8.4
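The validation rule behind the assertion, and why the swizzle satisfies it, can be sketched with a hypothetical vector type. The names below are made up for the illustration and do not correspond to the GLSL IR classes.

```c
#include <assert.h>
#include <stdbool.h>

enum { MASK_X = 1, MASK_Y = 2, MASK_Z = 4, MASK_W = 8, MASK_XYZ = 7 };

/* Hypothetical value: up to four components plus the vector size. */
struct vec {
    float c[4];
    int size;
};

/* Number of channels enabled in a write mask. */
static int mask_count(unsigned mask)
{
    int n = 0;
    for (unsigned bit = 1u; bit <= 8u; bit <<= 1)
        if (mask & bit)
            n++;
    return n;
}

/* The rule from the assertion: enabled LHS channels == RHS vector size. */
static bool assignment_is_valid(unsigned mask, struct vec rhs)
{
    return mask_count(mask) == rhs.size;
}

/* Select `count` components of v, like a swizzle of that length. */
static struct vec swizzle(struct vec v, const int *idx, int count)
{
    struct vec out = { { 0.0f, 0.0f, 0.0f, 0.0f }, count };

    assert(count >= 1 && count <= 4);
    for (int i = 0; i < count; i++) {
        assert(idx[i] >= 0 && idx[i] < v.size);
        out.c[i] = v.c[idx[i]];
    }
    return out;
}
```

A cube array coordinate is a vec4, so the product is a vec4 as well; assigning it under an XYZ mask fails the check until it is narrowed to its first three components.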
[Mesa-dev] [PATCH V4 00/13] ARB_texture_gather
This series adds support for ARB_texture_gather in core mesa and in i965 for Gen7+.

Notable changes from V3:
- Only emit extra surface state, recompiles, etc. if the shader actually uses gather4.
- Use SCS to accomplish the workaround on Haswell [will need testing]

Cc: Kenneth Graunke
[Mesa-dev] [PATCH V4 01/13] mesa: add texture gather changes
From: Maxence Le Dore

Reviewed-by: Kenneth Graunke
---
 src/mapi/glapi/gen/ARB_texture_gather.xml | 14 ++++++++++++++
 src/mapi/glapi/gen/gl_API.xml             |  2 +-
 src/mesa/main/context.c                   |  4 ++++
 src/mesa/main/extensions.c                |  1 +
 src/mesa/main/get.c                       |  1 +
 src/mesa/main/get_hash_params.py          |  6 ++++++
 src/mesa/main/mtypes.h                    |  6 ++++++
 src/mesa/main/tests/enum_strings.cpp      |  3 +++
 8 files changed, 36 insertions(+), 1 deletion(-)
 create mode 100644 src/mapi/glapi/gen/ARB_texture_gather.xml

diff --git a/src/mapi/glapi/gen/ARB_texture_gather.xml b/src/mapi/glapi/gen/ARB_texture_gather.xml
new file mode 100644
index 0000000..cd331ac
--- /dev/null
+++ b/src/mapi/glapi/gen/ARB_texture_gather.xml
@@ -0,0 +1,14 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
\ No newline at end of file
diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index f6511e9..3ffa817 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -8189,7 +8189,7 @@ http://www.w3.org/2001/XInclude"/> - +http://www.w3.org/2001/XInclude"/>
diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 310518c..0d1f71c 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -652,6 +652,10 @@ _mesa_init_constants(struct gl_context *ctx)
    ctx->Const.MinProgramTexelOffset = -8;
    ctx->Const.MaxProgramTexelOffset = 7;

+   /* GL_ARB_texture_gather */
+   ctx->Const.MinProgramTextureGatherOffset = -8;
+   ctx->Const.MaxProgramTextureGatherOffset = 7;
+
    /* GL_ARB_robustness */
    ctx->Const.ResetStrategy = GL_NO_RESET_NOTIFICATION_ARB;

diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index eb93620..2f2430e 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -142,6 +142,7 @@ static const struct extension extension_table[] = {
    { "GL_ARB_texture_env_crossbar",      o(ARB_texture_env_crossbar),      GLL, 2001 },
    { "GL_ARB_texture_env_dot3",          o(ARB_texture_env_dot3),          GLL, 2001 },
    { "GL_ARB_texture_float",             o(ARB_texture_float),             GL,  2004 },
+   { "GL_ARB_texture_gather",            o(ARB_texture_gather),            GL,  2009 },
    { "GL_ARB_texture_mirrored_repeat",   o(dummy_true),                    GLL, 2001 },
    { "GL_ARB_texture_multisample",       o(ARB_texture_multisample),       GL,  2009 },
    { "GL_ARB_texture_non_power_of_two",  o(ARB_texture_non_power_of_two),  GL,  2003 },
diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 7af5f55..89b3bf0 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -366,6 +366,7 @@ EXTRA_EXT(ARB_map_buffer_alignment);
 EXTRA_EXT(ARB_texture_cube_map_array);
 EXTRA_EXT(ARB_texture_buffer_range);
 EXTRA_EXT(ARB_texture_multisample);
+EXTRA_EXT(ARB_texture_gather);

 static const int
 extra_ARB_color_buffer_float_or_glcore[] = {
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index fb321a3..a896751 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -721,6 +721,12 @@ descriptor=[

 # GL_ARB_texture_cube_map_array
   [ "TEXTURE_BINDING_CUBE_MAP_ARRAY_ARB", "LOC_CUSTOM, TYPE_INT, TEXTURE_CUBE_ARRAY_INDEX, extra_ARB_texture_cube_map_array" ],
+
+# GL_ARB_texture_gather
+  [ "MIN_PROGRAM_TEXTURE_GATHER_OFFSET_ARB", "CONTEXT_INT(Const.MinProgramTextureGatherOffset), extra_ARB_texture_gather"],
+  [ "MAX_PROGRAM_TEXTURE_GATHER_OFFSET_ARB", "CONTEXT_INT(Const.MaxProgramTextureGatherOffset), extra_ARB_texture_gather"],
+  [ "MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB", "CONTEXT_INT(Const.MaxProgramTextureGatherComponents), extra_ARB_texture_gather"],
+
 ]},

 # Enums restricted to OpenGL Core profile
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index d82672d..3d414d5 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3081,6 +3081,11 @@ struct gl_constants
    /** GL_EXT_gpu_shader4 */
    GLint MinProgramTexelOffset, MaxProgramTexelOffset;

+   /** GL_ARB_texture_gather */
+   GLuint MinProgramTextureGatherOffset;
+   GLuint MaxProgramTextureGatherOffset;
+   GLuint MaxProgramTextureGatherComponents;
+
    /* GL_ARB_robustness */
    GLenum ResetStrategy;

@@ -3210,6 +3215,7 @@ struct gl_extensions
    GLboolean ARB_texture_env_crossbar;
    GLboolean ARB_texture_env_dot3;
    GLboolean ARB_texture_float;
+   GLboolean ARB_texture_gather;
    GLboolean ARB_texture_multisample;
    GLboolean ARB_texture_non_power_of_two;
    GLboolean ARB_texture_query_lod;
diff --git a/src/mesa/main/tests/enum_strings.cpp b/src/mesa/main/tests/enum_strings.cpp
index c8df819..6994f79 100644
--- a/src/mesa/main/tests/enum
[Mesa-dev] [PATCH V4 02/13] glsl: add texture gather changes
From: Maxence Le Dore

V2 [Chris Forbes]:
- Add new pattern, fixup parameter reading.

V3: Rebase onto new builtins machinery

Reviewed-by: Kenneth Graunke
---
 src/glsl/builtin_functions.cpp      | 35 +++++++++++++++++++++++++++++++++++
 src/glsl/glcpp/glcpp-parse.y        |  3 +++
 src/glsl/glsl_parser_extras.cpp     |  1 +
 src/glsl/glsl_parser_extras.h       |  2 ++
 src/glsl/ir.cpp                     |  2 +-
 src/glsl/ir.h                       |  4 +++-
 src/glsl/ir_clone.cpp               |  1 +
 src/glsl/ir_hv_accept.cpp           |  1 +
 src/glsl/ir_print_visitor.cpp       |  3 ++-
 src/glsl/ir_reader.cpp              |  6 +++++-
 src/glsl/ir_rvalue_visitor.cpp      |  1 +
 src/glsl/opt_tree_grafting.cpp      |  1 +
 src/glsl/standalone_scaffolding.cpp |  1 +
 src/mesa/program/ir_to_mesa.cpp     |  5 +++++
 14 files changed, 62 insertions(+), 4 deletions(-)

diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
index 72054e0..df735ef 100644
--- a/src/glsl/builtin_functions.cpp
+++ b/src/glsl/builtin_functions.cpp
@@ -262,6 +262,13 @@ texture_query_lod(const _mesa_glsl_parse_state *state)
           state->ARB_texture_query_lod_enable;
 }

+static bool
+texture_gather(const _mesa_glsl_parse_state *state)
+{
+   return state->is_version(400, 0) ||
+          state->ARB_texture_gather_enable;
+}
+
 /* Desktop GL or OES_standard_derivatives + fragment shader only */
 static bool
 fs_oes_derivatives(const _mesa_glsl_parse_state *state)
@@ -1816,6 +1823,34 @@ builtin_builder::create_builtins()
                 _texture(ir_txd, shader_texture_lod_and_rect, glsl_type::vec4_type, glsl_type::sampler2DRectShadow_type, glsl_type::vec4_type, TEX_PROJECT),
                 NULL);

+   add_function("textureGather",
+                _texture(ir_tg4, texture_gather, glsl_type::vec4_type, glsl_type::sampler2D_type, glsl_type::vec2_type),
+                _texture(ir_tg4, texture_gather, glsl_type::ivec4_type, glsl_type::isampler2D_type, glsl_type::vec2_type),
+                _texture(ir_tg4, texture_gather, glsl_type::uvec4_type, glsl_type::usampler2D_type, glsl_type::vec2_type),
+
+                _texture(ir_tg4, texture_gather, glsl_type::vec4_type, glsl_type::sampler2DArray_type, glsl_type::vec3_type),
+                _texture(ir_tg4, texture_gather, glsl_type::ivec4_type, glsl_type::isampler2DArray_type, glsl_type::vec3_type),
+                _texture(ir_tg4, texture_gather, glsl_type::uvec4_type, glsl_type::usampler2DArray_type, glsl_type::vec3_type),
+
+                _texture(ir_tg4, texture_gather, glsl_type::vec4_type, glsl_type::samplerCube_type, glsl_type::vec3_type),
+                _texture(ir_tg4, texture_gather, glsl_type::ivec4_type, glsl_type::isamplerCube_type, glsl_type::vec3_type),
+                _texture(ir_tg4, texture_gather, glsl_type::uvec4_type, glsl_type::usamplerCube_type, glsl_type::vec3_type),
+
+                _texture(ir_tg4, texture_gather, glsl_type::vec4_type, glsl_type::samplerCubeArray_type, glsl_type::vec4_type),
+                _texture(ir_tg4, texture_gather, glsl_type::ivec4_type, glsl_type::isamplerCubeArray_type, glsl_type::vec4_type),
+                _texture(ir_tg4, texture_gather, glsl_type::uvec4_type, glsl_type::usamplerCubeArray_type, glsl_type::vec4_type),
+                NULL);
+
+   add_function("textureGatherOffset",
+                _texture(ir_tg4, texture_gather, glsl_type::vec4_type, glsl_type::sampler2D_type, glsl_type::vec2_type, TEX_OFFSET),
+                _texture(ir_tg4, texture_gather, glsl_type::ivec4_type, glsl_type::isampler2D_type, glsl_type::vec2_type, TEX_OFFSET),
+                _texture(ir_tg4, texture_gather, glsl_type::uvec4_type, glsl_type::usampler2D_type, glsl_type::vec2_type, TEX_OFFSET),
+
+                _texture(ir_tg4, texture_gather, glsl_type::vec4_type, glsl_type::sampler2DArray_type, glsl_type::vec3_type, TEX_OFFSET),
+                _texture(ir_tg4, texture_gather, glsl_type::ivec4_type, glsl_type::isampler2DArray_type, glsl_type::vec3_type, TEX_OFFSET),
+                _texture(ir_tg4, texture_gather, glsl_type::uvec4_type, glsl_type::usampler2DArray_type, glsl_type::vec3_type, TEX_OFFSET),
+                NULL);
+
    F(dFdx)
    F(dFdy)
    F(fwidth)
diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
index 6eaa5f9..c7ad3e9 100644
--- a/src/glsl/glcpp/glcpp-parse.y
+++ b/src/glsl/glcpp/glcpp-parse.y
@@ -1248,6 +1248,9 @@ glcpp_parser_create (const struct gl_extensions *extensions, int api)

          if (extensions->EXT_shader_integer_mix)
             add_builtin_define(parser, "GL_EXT_shader_integer_mix", 1);
+
+         if (extensions->ARB_texture_gather)
+            add_builtin_define(parser, "GL_ARB_texture_gather", 1);
       }
    }

diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index e9922fc..813db6f 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parse
[Mesa-dev] [PATCH V4 03/13] i965: add SHADER_OPCODE_TG4
Adds the Gen7 message IDs, a new SHADER_OPCODE_TG4 pseudo-op, and low-level support for emitting it via generate_tex(). V3: Updated for changes in master. Signed-off-by: Chris Forbes Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_defines.h | 3 +++ src/mesa/drivers/dri/i965/brw_fs.cpp | 1 + src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 5 + src/mesa/drivers/dri/i965/brw_shader.cpp | 3 ++- src/mesa/drivers/dri/i965/brw_vec4.cpp | 1 + src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 6 +- 6 files changed, 17 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index b14c346..ae2839a 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -767,6 +767,7 @@ enum opcode { FS_OPCODE_TXB, SHADER_OPCODE_TXF_MS, SHADER_OPCODE_LOD, + SHADER_OPCODE_TG4, SHADER_OPCODE_SHADER_TIME_ADD, @@ -1042,8 +1043,10 @@ enum brw_message_target { #define GEN5_SAMPLER_MESSAGE_SAMPLE_BIAS_COMPARE 5 #define GEN5_SAMPLER_MESSAGE_SAMPLE_LOD_COMPARE 6 #define GEN5_SAMPLER_MESSAGE_SAMPLE_LD 7 +#define GEN7_SAMPLER_MESSAGE_SAMPLE_GATHER4 8 #define GEN5_SAMPLER_MESSAGE_LOD 9 #define GEN5_SAMPLER_MESSAGE_SAMPLE_RESINFO 10 +#define GEN7_SAMPLER_MESSAGE_SAMPLE_GATHER4_PO 17 #define HSW_SAMPLER_MESSAGE_SAMPLE_DERIV_COMPARE 20 #define GEN7_SAMPLER_MESSAGE_SAMPLE_LD_MCS 29 #define GEN7_SAMPLER_MESSAGE_SAMPLE_LD2DMS 30 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 2ebadc8..3f64434 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -725,6 +725,7 @@ fs_visitor::implied_mrf_writes(fs_inst *inst) case SHADER_OPCODE_TXD: case SHADER_OPCODE_TXF: case SHADER_OPCODE_TXF_MS: + case SHADER_OPCODE_TG4: case SHADER_OPCODE_TXL: case SHADER_OPCODE_TXS: case SHADER_OPCODE_LOD: diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index 
7ce42c4..2d59d1a 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -431,6 +431,10 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src case SHADER_OPCODE_LOD: msg_type = GEN5_SAMPLER_MESSAGE_LOD; break; + case SHADER_OPCODE_TG4: + assert(brw->gen >= 6); + msg_type = GEN7_SAMPLER_MESSAGE_SAMPLE_GATHER4; + break; default: assert(!"not reached"); break; @@ -1386,6 +1390,7 @@ fs_generator::generate_code(exec_list *instructions) case SHADER_OPCODE_TXL: case SHADER_OPCODE_TXS: case SHADER_OPCODE_LOD: + case SHADER_OPCODE_TG4: generate_tex(inst, dst, src[0]); break; case FS_OPCODE_DDX: diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp index a558d36..61c4bf5 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp @@ -532,7 +532,8 @@ backend_instruction::is_tex() opcode == SHADER_OPCODE_TXF_MS || opcode == SHADER_OPCODE_TXL || opcode == SHADER_OPCODE_TXS || - opcode == SHADER_OPCODE_LOD); + opcode == SHADER_OPCODE_LOD || + opcode == SHADER_OPCODE_TG4); } bool diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 2c1f541..75c3d34 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -270,6 +270,7 @@ vec4_visitor::implied_mrf_writes(vec4_instruction *inst) case SHADER_OPCODE_TXF: case SHADER_OPCODE_TXF_MS: case SHADER_OPCODE_TXS: + case SHADER_OPCODE_TG4: return inst->header_present ? 
1 : 0; default: assert(!"not reached"); diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp index 6916134..6bdffb3 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp @@ -308,6 +308,9 @@ vec4_generator::generate_tex(vec4_instruction *inst, case SHADER_OPCODE_TXS: msg_type = GEN5_SAMPLER_MESSAGE_SAMPLE_RESINFO; break; + case SHADER_OPCODE_TG4: + msg_type = GEN7_SAMPLER_MESSAGE_SAMPLE_GATHER4; + break; default: assert(!"should not get here: invalid VS texture opcode"); break; @@ -361,7 +364,7 @@ vec4_generator::generate_tex(vec4_instruction *inst, brw_MOV(p, retype(brw_vec1_reg(BRW_MESSAGE_REGISTER_FILE, inst->base_mrf, 2), BRW_REGISTER_TYPE_UD), - brw_imm_uw(inst->texture_offset)); + brw_imm_ud(inst->texture_offset)); brw_pop_insn_state(p);
[Mesa-dev] [PATCH V4 04/13] i965/fs: Add support for ir_tg4
Lowers ir_tg4 (from textureGather and textureGatherOffset builtins) to SHADER_OPCODE_TG4. The usual post-sampling swizzle workaround can't work for ir_tg4, so avoid doing that: * For R/G/B/A swizzles use the hardware channel select (lives in the same dword in the header as the texel offset), and then don't do anything afterward in the shader. * For 0/1 swizzles blast the appropriate constant over all the output channels instead of sampling. V2: Avoid duplicating header enabling block V3: Avoid sampling at all, for degenerate swizzles. Signed-off-by: Chris Forbes --- src/mesa/drivers/dri/i965/brw_fs.h | 1 + src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 62 ++-- 2 files changed, 60 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index b2aa041..7eaf387 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -213,6 +213,7 @@ public: void visit(ir_emit_vertex *); void visit(ir_end_primitive *); + uint32_t gather_channel(ir_texture *ir, int sampler); void swizzle_result(ir_texture *ir, fs_reg orig_val, int sampler); bool can_do_source_mods(fs_inst *inst); diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 72c379a..27b300b 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -1167,6 +1167,12 @@ fs_visitor::emit_texture_gen5(ir_texture *ir, fs_reg dst, fs_reg coordinate, case ir_lod: inst = emit(SHADER_OPCODE_LOD, dst); break; + case ir_tg4: + inst = emit(SHADER_OPCODE_TG4, dst); + break; + default: + fail("unrecognized texture opcode"); + break; } inst->base_mrf = base_mrf; inst->mlen = mlen; @@ -1191,9 +1197,12 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate, bool header_present = false; int offsets[3]; - if (ir->offset && ir->op != ir_txf) { - /* The offsets set up by the ir_texture visitor are in the + if (ir->op == 
ir_tg4 || (ir->offset && ir->op != ir_txf)) { + /* * The offsets set up by the ir_texture visitor are in the * m1 header, so we can't go headerless. + * + * * ir4_tg4 needs to place its channel select in the header, + * for interaction with ARB_texture_swizzle */ header_present = true; mlen++; @@ -1209,6 +1218,7 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate, switch (ir->op) { case ir_tex: case ir_lod: + case ir_tg4: break; case ir_txb: emit(MOV(fs_reg(MRF, base_mrf + mlen), lod)); @@ -1323,6 +1333,7 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate, case ir_txf_ms: inst = emit(SHADER_OPCODE_TXF_MS, dst); break; case ir_txs: inst = emit(SHADER_OPCODE_TXS, dst); break; case ir_lod: inst = emit(SHADER_OPCODE_LOD, dst); break; + case ir_tg4: inst = emit(SHADER_OPCODE_TG4, dst); break; } inst->base_mrf = base_mrf; inst->mlen = mlen; @@ -1450,6 +1461,24 @@ fs_visitor::visit(ir_texture *ir) */ int texunit = fp->Base.SamplerUnits[sampler]; + if (ir->op == ir_tg4) { + /* When tg4 is used with the degenerate ZERO/ONE swizzles, don't bother + * emitting anything other than setting up the constant result. + */ + int swiz = GET_SWZ(c->key.tex.swizzles[sampler], 0); + if (swiz == SWIZZLE_ZERO || swiz == SWIZZLE_ONE) { + + fs_reg res = fs_reg(this, glsl_type::vec4_type); + this->result = res; + + for (int i=0; i<4; i++) { +emit(MOV(res, fs_reg(swiz == SWIZZLE_ZERO ? 
0.0f : 1.0f))); +res.reg_offset++; + } + return; + } + } + /* Should be lowered by do_lower_texture_projection */ assert(!ir->projector); @@ -1477,6 +1506,7 @@ fs_visitor::visit(ir_texture *ir) switch (ir->op) { case ir_tex: case ir_lod: + case ir_tg4: break; case ir_txb: ir->lod_info.bias->accept(this); @@ -1499,6 +1529,8 @@ fs_visitor::visit(ir_texture *ir) ir->lod_info.sample_index->accept(this); sample_index = this->result; break; + default: + assert(!"Unrecognized texture opcode"); }; /* Writemasking doesn't eliminate channels on SIMD8 texture @@ -1523,6 +1555,9 @@ fs_visitor::visit(ir_texture *ir) if (ir->offset != NULL && ir->op != ir_txf) inst->texture_offset = brw_texture_offset(ir->offset->as_constant()); + if (ir->op == ir_tg4) + inst->texture_offset |= gather_channel(ir, sampler) << 16; // M0.2:16-17 + inst->sampler = sampler; if (ir->shadow_comparitor) @@ -1543,6 +1578,24 @@ fs_visitor::visit(ir_texture *ir) } /** + * Set up the gather channel based on the swizzle, for gather4. + */ +uint32_t +fs_visitor::gather_channel(ir_texture *ir, int sampler)
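The channel-select packing used above (`gather_channel(ir, sampler) << 16`, landing in M0.2 bits 16-17) can be sketched in isolation. This is a minimal illustration, not the driver code: the helper names `pack_gather_header` and `unpack_gather_channel` are hypothetical, and the assumption that the texel offset only occupies the low 16 bits of the dword is mine. It is presumably also why patch 03 switches the header immediate from `brw_imm_uw` to `brw_imm_ud`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: OR a gather4 channel select (0..3 = R/G/B/A) into
// bits 16-17 of the header dword that already carries the texel offset.
// Assumption (not from the patch): the offset fits in the low 16 bits.
static uint32_t pack_gather_header(uint32_t texture_offset, uint32_t channel)
{
   assert(channel < 4);
   assert((texture_offset >> 16) == 0);
   return texture_offset | (channel << 16);
}

// Hypothetical inverse: read the channel select back out of bits 16-17.
static uint32_t unpack_gather_channel(uint32_t header_dword)
{
   return (header_dword >> 16) & 0x3;
}
```

With a 16-bit immediate the channel bits would be truncated away, so the value has to be carried as a full dword once it is packed like this.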
[Mesa-dev] [PATCH V4 05/13] i965/vs: Add support for ir_tg4
Pretty much the same as the FS case. Channel select goes in the header, V2: Less mangling. V3: Avoid sampling at all, for degenerate swizzles. Signed-off-by: Chris Forbes --- src/mesa/drivers/dri/i965/brw_vec4.h | 1 + src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 46 -- 2 files changed, 45 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index 689040b..fc3d1f7 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -470,6 +470,7 @@ public: void emit_pack_half_2x16(dst_reg dst, src_reg src0); void emit_unpack_half_2x16(dst_reg dst, src_reg src0); + uint32_t gather_channel(ir_texture *ir, int sampler); void swizzle_result(ir_texture *ir, src_reg orig_val, int sampler); void emit_ndc_computation(); diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index 3ff6a61..f095a77 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp @@ -2131,6 +2131,19 @@ vec4_visitor::visit(ir_texture *ir) int sampler = _mesa_get_sampler_uniform_value(ir->sampler, shader_prog, prog); + /* When tg4 is used with the degenerate ZERO/ONE swizzles, don't bother +* emitting anything other than setting up the constant result. +*/ + if (ir->op == ir_tg4) { + int swiz = GET_SWZ(key->tex.swizzles[sampler], 0); + if (swiz == SWIZZLE_ZERO || swiz == SWIZZLE_ONE) { + dst_reg result(this, ir->type); + this->result = src_reg(result); + emit(MOV(result, src_reg(swiz == SWIZZLE_ONE ? 
1.0f : 0.0f))); + return; + } + } + /* Should be lowered by do_lower_texture_projection */ assert(!ir->projector); @@ -2180,6 +2193,7 @@ vec4_visitor::visit(ir_texture *ir) break; case ir_txb: case ir_lod: + case ir_tg4: break; } @@ -2201,18 +2215,23 @@ vec4_visitor::visit(ir_texture *ir) case ir_txs: inst = new(mem_ctx) vec4_instruction(this, SHADER_OPCODE_TXS); break; + case ir_tg4: + inst = new(mem_ctx) vec4_instruction(this, SHADER_OPCODE_TG4); + break; case ir_txb: assert(!"TXB is not valid for vertex shaders."); break; case ir_lod: assert(!"LOD is not valid for vertex shaders."); break; + default: + assert(!"Unrecognized tex op"); } bool use_texture_offset = ir->offset != NULL && ir->op != ir_txf; /* Texel offsets go in the message header; Gen4 also requires headers. */ - inst->header_present = use_texture_offset || brw->gen < 5; + inst->header_present = use_texture_offset || brw->gen < 5 || ir->op == ir_tg4; inst->base_mrf = 2; inst->mlen = inst->header_present + 1; /* always at least one */ inst->sampler = sampler; @@ -2223,6 +2242,10 @@ vec4_visitor::visit(ir_texture *ir) if (use_texture_offset) inst->texture_offset = brw_texture_offset(ir->offset->as_constant()); + /* Stuff the channel select bits in the top of the texture offset */ + if (ir->op == ir_tg4) + inst->texture_offset |= gather_channel(ir, sampler)<<16; + /* MRF for the first parameter */ int param_base = inst->base_mrf + inst->header_present; @@ -2347,6 +2370,24 @@ vec4_visitor::visit(ir_texture *ir) swizzle_result(ir, src_reg(inst->dst), sampler); } +/** + * Set up the gather channel based on the swizzle, for gather4. 
+ */ +uint32_t +vec4_visitor::gather_channel(ir_texture *ir, int sampler) +{ + int swiz = GET_SWZ(key->tex.swizzles[sampler], 0 /* red */); + switch (swiz) { + case SWIZZLE_X: return 0; + case SWIZZLE_Y: return 1; + case SWIZZLE_Z: return 2; + case SWIZZLE_W: return 3; + default: + assert(!"Not reached"); /* zero, one swizzles handled already */ + return 0; + } +} + void vec4_visitor::swizzle_result(ir_texture *ir, src_reg orig_val, int sampler) { @@ -2356,11 +2397,12 @@ vec4_visitor::swizzle_result(ir_texture *ir, src_reg orig_val, int sampler) dst_reg swizzled_result(this->result); if (ir->op == ir_txs || ir->type == glsl_type::float_type - || s == SWIZZLE_NOOP) { + || s == SWIZZLE_NOOP || ir->op == ir_tg4) { emit(MOV(swizzled_result, orig_val)); return; } + int zero_mask = 0, one_mask = 0, copy_mask = 0; int swizzle[4] = {0}; -- 1.8.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
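The two swizzle cases handled by the FS and VS patches (constant result for ZERO/ONE, hardware channel select for X/Y/Z/W) can be summarized in one standalone sketch. Everything named here (`Swizzle`, `GatherPlan`, `plan_gather`) is hypothetical and only mirrors the logic described in the commit messages; the enum values are illustrative, not Mesa's definitions.

```cpp
#include <cstdint>

// Hypothetical stand-ins for Mesa's SWIZZLE_* values (values are assumptions).
enum Swizzle { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W, SWZ_ZERO, SWZ_ONE };

struct GatherPlan {
   bool sample;       // true: a gather4 message is emitted
   uint32_t channel;  // hardware channel select, meaningful when sample is true
   float constant;    // value written to all four outputs when sample is false
};

// Decide how a textureGather call is handled, given the swizzle that
// ARB_texture_swizzle applies to the red channel of the bound texture.
static GatherPlan plan_gather(Swizzle red_swizzle)
{
   switch (red_swizzle) {
   case SWZ_ZERO: return {false, 0, 0.0f};  // degenerate: no sampling at all
   case SWZ_ONE:  return {false, 0, 1.0f};  // degenerate: no sampling at all
   case SWZ_X:    return {true, 0, 0.0f};
   case SWZ_Y:    return {true, 1, 0.0f};
   case SWZ_Z:    return {true, 2, 0.0f};
   case SWZ_W:    return {true, 3, 0.0f};
   }
   return {true, 0, 0.0f};  // not reached for valid input
}
```

Because the swizzle is resolved before the message is sent, no post-sampling swizzle is needed, which matches the `ir->op == ir_tg4` early-out added to `swizzle_result()`.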
[Mesa-dev] [PATCH V4 06/13] glsl: flag shaders which use gather4 at all
---
 src/glsl/ir_set_program_inouts.cpp | 9 +
 src/mesa/main/mtypes.h | 2 ++
 2 files changed, 11 insertions(+)

diff --git a/src/glsl/ir_set_program_inouts.cpp b/src/glsl/ir_set_program_inouts.cpp
index 1267d6d..ab23538 100644
--- a/src/glsl/ir_set_program_inouts.cpp
+++ b/src/glsl/ir_set_program_inouts.cpp
@@ -59,6 +59,7 @@ public:
    virtual ir_visitor_status visit_enter(ir_function_signature *);
    virtual ir_visitor_status visit_enter(ir_expression *);
    virtual ir_visitor_status visit_enter(ir_discard *);
+   virtual ir_visitor_status visit_enter(ir_texture *);
    virtual ir_visitor_status visit(ir_dereference_variable *);
 
 private:
@@ -319,6 +320,14 @@ ir_set_program_inouts_visitor::visit_enter(ir_discard *)
    return visit_continue;
 }
 
+ir_visitor_status
+ir_set_program_inouts_visitor::visit_enter(ir_texture *ir)
+{
+   if (ir->op == ir_tg4)
+      prog->UsesGather = true;
+   return visit_continue;
+}
+
 void
 do_set_program_inouts(exec_list *instructions, struct gl_program *prog,
                       GLenum shader_type)
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 3d414d5..514f810 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1963,6 +1963,8 @@ struct gl_program
    GLbitfield SamplersUsed;   /**< Bitfield of which samplers are used */
    GLbitfield ShadowSamplers; /**< Texture units used for shadow sampling. */
 
+   GLboolean UsesGather; /**< Does this program use gather4 at all? */
+
    /** Named parameters, constants, etc. from program text */
    struct gl_program_parameter_list *Parameters;
--
1.8.4
[Mesa-dev] [PATCH V4 07/13] i965: w/a for gather4 green RG32F
V4: Only flag quirks if there are any uses of gather in the shader, to avoid spurious recompiles just because someone happened to use RG32F. Signed-off-by: Chris Forbes --- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 4 src/mesa/drivers/dri/i965/brw_program.h| 5 + src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 4 src/mesa/drivers/dri/i965/brw_wm.c | 9 + 4 files changed, 22 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 27b300b..0f05607 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -1584,6 +1584,10 @@ uint32_t fs_visitor::gather_channel(ir_texture *ir, int sampler) { int swiz = GET_SWZ(c->key.tex.swizzles[sampler], 0 /* red */); + if (c->key.tex.gather_channel_quirk_mask & (1tex.gather_channel_quirk_mask & (1 yuvtex_mask); found |= key_debug(brw, "GL_MESA_ycbcr UV swapping\n", old_key->yuvtex_swap_mask, key->yuvtex_swap_mask); + found |= key_debug(brw, "gather channel quirk on any texture unit", + old_key->gather_channel_quirk_mask, key->gather_channel_quirk_mask); return found; } @@ -342,6 +345,12 @@ brw_populate_sampler_prog_key_data(struct gl_context *ctx, if (sampler->WrapR == GL_CLAMP) key->gl_clamp_mask[2] |= 1 << s; } + + /* gather4's channel select for green from RG32F is broken */ + if (brw->gen >= 7 && prog->UsesGather) { +if (img->InternalFormat == GL_RG32F && GET_SWZ(t->_Swizzle, 0) == 1) + key->gather_channel_quirk_mask |= 1 << s; + } } } } -- 1.8.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH V4 08/13] i965: Add BRW_SURFACEFORMAT_R32G32_FLOAT_LD, required for IVB gather4 w/a
gather4 GREEN channel against a surface with format R32G32_FLOAT
doesn't work correctly on IVB.

w/a from bspec:
- use R32G32_FLOAT_LD = 0x97 instead, for gather4 only.
- select BLUE channel to read GREEN

Signed-off-by: Chris Forbes
---
 src/mesa/drivers/dri/i965/brw_defines.h | 1 +
 src/mesa/drivers/dri/i965/brw_surface_formats.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
index ae2839a..7dfb2b9 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -309,6 +309,7 @@
 #define BRW_SURFACEFORMAT_R16G16B16A16_USCALED 0x094
 #define BRW_SURFACEFORMAT_R32G32_SSCALED 0x095
 #define BRW_SURFACEFORMAT_R32G32_USCALED 0x096
+#define BRW_SURFACEFORMAT_R32G32_FLOAT_LD 0x097
 #define BRW_SURFACEFORMAT_R32G32_SFIXED 0x0A0
 #define BRW_SURFACEFORMAT_R64_PASSTHRU 0x0A1
 #define BRW_SURFACEFORMAT_B8G8R8A8_UNORM 0x0C0
diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c b/src/mesa/drivers/dri/i965/brw_surface_formats.c
index 0d8d805..8666336 100644
--- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
+++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
@@ -110,6 +110,7 @@ const struct surface_format_info surface_formats[] = {
    SF( Y, x, x, x, Y, x, Y, x, x, BRW_SURFACEFORMAT_R16G16B16A16_UINT)
    SF( Y, Y, x, x, Y, Y, Y, x, x, BRW_SURFACEFORMAT_R16G16B16A16_FLOAT)
    SF( Y, 50, x, x, Y, Y, Y, Y, x, BRW_SURFACEFORMAT_R32G32_FLOAT)
+   SF( Y, 70, x, x, Y, Y, Y, Y, x, BRW_SURFACEFORMAT_R32G32_FLOAT_LD)
    SF( Y, x, x, x, Y, x, Y, Y, x, BRW_SURFACEFORMAT_R32G32_SINT)
    SF( Y, x, x, x, Y, x, Y, Y, x, BRW_SURFACEFORMAT_R32G32_UINT)
    SF( Y, 50, Y, x, x, x, x, x, x, BRW_SURFACEFORMAT_R32_FLOAT_X8X24_TYPELESS)
--
1.8.4
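A rough sketch of how the "select BLUE to read GREEN" workaround could combine with the per-sampler quirk mask from patch 07. The shader-side hunk of patch 07 is damaged in this archive, so this is my reading of the commit messages rather than a reconstruction of that code; `mark_gather_quirk` and `gather_channel_with_quirk` are hypothetical names, and only the `1 << s` mask layout is taken from the patches.

```cpp
#include <cstdint>

// Hypothetical: set the quirk bit for sampler s, as the program key setup
// does with "gather_channel_quirk_mask |= 1 << s".
static uint32_t mark_gather_quirk(uint32_t quirk_mask, int s)
{
   return quirk_mask | (1u << s);
}

// Hypothetical: pick the hardware channel select for a gather4 message.
// When the sampler is flagged (RG32F texture, green requested), ask the
// hardware for BLUE (2), which reads GREEN on the R32G32_FLOAT_LD format.
static uint32_t gather_channel_with_quirk(uint32_t quirk_mask, int sampler,
                                          uint32_t swizzled_channel)
{
   if (quirk_mask & (1u << sampler))
      return 2;
   return swizzled_channel;
}
```

Keeping the decision in a bitmask on the program key means only shaders that actually gather from an affected texture unit see a different key, which is the point of the V4 note in patch 07.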
[Mesa-dev] [PATCH V4 09/13] i965: make room in the binding table for a full alternate set of surface_states
Worst-case is that *every* texunit uses a format that needs overriding. V4: Place the gather slots last, so shaders which don't use gather don't get penalized by having a huge binding table. Signed-off-by: Chris Forbes --- src/mesa/drivers/dri/i965/brw_context.h | 20 ++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 0f88bad..3f2f4ea 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -660,6 +660,13 @@ struct brw_gs_prog_data *| . | . | *| : | : | *| 36 | UBO 11 | + *|-|-| + *| 37 | Shader time buffer | + *|-|-| + *| 38 | Gather texture 0| + *| . | . | + *| : | : | + *| 53 | Gather texture 15 | *+---+ * * Our VS (and Gen7 GS) binding tables are programmed as follows: @@ -676,6 +683,13 @@ struct brw_gs_prog_data *| . | . | *| : | : | *| 28 | UBO 11 | + *|-|-| + *| 29 | Shader time buffer | + *|-|-| + *| 30 | Gather texture 0| + *| . | . | + *| : | : | + *| 45 | Gather texture 15 | *+---+ * * Our (gen6) GS binding tables are programmed as follows: @@ -692,14 +706,16 @@ struct brw_gs_prog_data #define SURF_INDEX_TEXTURE(t)(BRW_MAX_DRAW_BUFFERS + 2 + (t)) #define SURF_INDEX_WM_UBO(u) (SURF_INDEX_TEXTURE(BRW_MAX_TEX_UNIT) + u) #define SURF_INDEX_WM_SHADER_TIME(SURF_INDEX_WM_UBO(12)) +#define SURF_INDEX_GATHER_TEXTURE(t) (SURF_INDEX_WM_SHADER_TIME + 1 + (t)) /** Maximum size of the binding table. 
 */
-#define BRW_MAX_WM_SURFACES (SURF_INDEX_WM_SHADER_TIME + 1)
+#define BRW_MAX_WM_SURFACES (SURF_INDEX_GATHER_TEXTURE(BRW_MAX_TEX_UNIT))
 #define SURF_INDEX_VEC4_CONST_BUFFER (0)
 #define SURF_INDEX_VEC4_TEXTURE(t) (SURF_INDEX_VEC4_CONST_BUFFER + 1 + (t))
 #define SURF_INDEX_VEC4_UBO(u) (SURF_INDEX_VEC4_TEXTURE(BRW_MAX_TEX_UNIT) + u)
 #define SURF_INDEX_VEC4_SHADER_TIME (SURF_INDEX_VEC4_UBO(12))
-#define BRW_MAX_VEC4_SURFACES (SURF_INDEX_VEC4_SHADER_TIME + 1)
+#define SURF_INDEX_VEC4_GATHER_TEXTURE(t) (SURF_INDEX_VEC4_SHADER_TIME + 1 + (t))
+#define BRW_MAX_VEC4_SURFACES (SURF_INDEX_VEC4_GATHER_TEXTURE(BRW_MAX_TEX_UNIT))
 #define SURF_INDEX_GEN6_SOL_BINDING(t) (t)
 #define BRW_MAX_GEN6_GS_SURFACES SURF_INDEX_GEN6_SOL_BINDING(BRW_MAX_SOL_BINDINGS)
--
1.8.4
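To see how the WM slot macros stack up, here is a small sketch that mirrors them as `constexpr` functions. The constants `kMaxDrawBuffers = 8` and `kMaxTexUnit = 16` are assumptions chosen for illustration (the patch does not show the real values of `BRW_MAX_DRAW_BUFFERS` and `BRW_MAX_TEX_UNIT`), so the absolute slot numbers need not match the comment tables in the patch; only the relative layout is the point.

```cpp
// Assumed values, for illustration only; the real ones live in brw_context.h.
constexpr int kMaxDrawBuffers = 8;
constexpr int kMaxTexUnit = 16;

// Mirrors of the SURF_INDEX_* macros for the WM (fragment) binding table.
constexpr int surf_index_texture(int t)        { return kMaxDrawBuffers + 2 + t; }
constexpr int surf_index_wm_ubo(int u)         { return surf_index_texture(kMaxTexUnit) + u; }
constexpr int surf_index_wm_shader_time()      { return surf_index_wm_ubo(12); }
constexpr int surf_index_gather_texture(int t) { return surf_index_wm_shader_time() + 1 + t; }
constexpr int max_wm_surfaces()                { return surf_index_gather_texture(kMaxTexUnit); }

// The gather slots come after every other slot, one per texture unit.
static_assert(surf_index_gather_texture(0) == surf_index_wm_shader_time() + 1,
              "gather slots follow the shader-time slot");
static_assert(max_wm_surfaces() - surf_index_gather_texture(0) == kMaxTexUnit,
              "one gather slot per texture unit");
```

Since the gather block is last, a shader that never uses gather4 never marks any of those slots as used, which is the "don't get penalized" property the V4 note describes.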
[Mesa-dev] [PATCH V4 10/13] i965: Emit a second set of SURFACE_STATE for gather4 from textures.
This allows us to use a different surface format for gather4, which is required for R32G32_FLOAT to work on Gen7. V4: - Only emit alternate surface state for shaders which will actually use it. - Pass a simple 'for_gather' flag rather than a function pointer. The callee can decide what w/a to apply. Signed-off-by: Chris Forbes --- src/mesa/drivers/dri/i965/brw_context.h | 3 +- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 38 +++ src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 6 +++- 3 files changed, 39 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 3f2f4ea..6e2edc9 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -928,7 +928,8 @@ struct brw_context void (*update_texture_surface)(struct gl_context *ctx, unsigned unit, - uint32_t *surf_offset); + uint32_t *surf_offset, + bool for_gather); void (*update_renderbuffer_surface)(struct brw_context *brw, struct gl_renderbuffer *rb, bool layered, diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index 471fd03..89827c4 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -251,7 +251,8 @@ brw_update_buffer_texture_surface(struct gl_context *ctx, static void brw_update_texture_surface(struct gl_context *ctx, unsigned unit, - uint32_t *surf_offset) + uint32_t *surf_offset, + bool for_gather) { struct brw_context *brw = brw_context(ctx); struct gl_texture_object *tObj = ctx->Texture.Unit[unit]._Current; @@ -270,6 +271,8 @@ brw_update_texture_surface(struct gl_context *ctx, surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE, 6 * 4, 32, surf_offset); + (void) for_gather; /* no w/a to apply for this gen */ + surf[0] = (translate_tex_target(tObj->Target) << BRW_SURFACE_TYPE_SHIFT | BRW_SURFACE_MIPMAPLAYOUT_BELOW << BRW_SURFACE_MIPLAYOUT_SHIFT | 
BRW_SURFACE_CUBEFACE_ENABLES | @@ -713,7 +716,8 @@ const struct brw_tracked_state gen6_renderbuffer_surfaces = { static void update_stage_texture_surfaces(struct brw_context *brw, const struct gl_program *prog, - uint32_t *surf_offset) + uint32_t *surf_offset, + bool for_gather) { if (!prog) return; @@ -730,7 +734,7 @@ update_stage_texture_surfaces(struct brw_context *brw, /* _NEW_TEXTURE */ if (ctx->Texture.Unit[unit]._ReallyEnabled) { -brw->vtbl.update_texture_surface(ctx, unit, surf_offset + s); +brw->vtbl.update_texture_surface(ctx, unit, surf_offset + s, for_gather); } } } @@ -755,13 +759,35 @@ brw_update_texture_surfaces(struct brw_context *brw) /* _NEW_TEXTURE */ update_stage_texture_surfaces(brw, vs, brw->vs.base.surf_offset + - SURF_INDEX_VEC4_TEXTURE(0)); + SURF_INDEX_VEC4_TEXTURE(0), + false); update_stage_texture_surfaces(brw, gs, brw->gs.base.surf_offset + - SURF_INDEX_VEC4_TEXTURE(0)); + SURF_INDEX_VEC4_TEXTURE(0), + false); update_stage_texture_surfaces(brw, fs, brw->wm.base.surf_offset + - SURF_INDEX_TEXTURE(0)); + SURF_INDEX_TEXTURE(0), + false); + + /* emit alternate set of surface state for gather. this +* allows the surface format to be overriden for only the +* gather4 messages. */ + if (vs && vs->UsesGather) + update_stage_texture_surfaces(brw, vs, +brw->vs.base.surf_offset + +SURF_INDEX_VEC4_GATHER_TEXTURE(0), +true); + if (gs && gs->UsesGather) + update_stage_texture_surfaces(brw, gs, +brw->gs.base.surf_offset + +SURF_INDEX_VEC4_GATHER_TEXTURE(0), +true); + if (fs && fs->UsesGather) + update_stage_texture_surfaces(brw, fs, +brw->wm.base.surf_offset + +
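The emission policy described in the commit message (regular surface state always, the alternate gather set only for stages that use it) can be sketched per stage as follows. This is a simplified, hypothetical model: `StageProgram`, `SurfaceEmit` and `plan_texture_surfaces` are not Mesa names, and the real code walks the VS, GS and FS in sequence rather than returning a list.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a shader stage's gl_program.
struct StageProgram {
   bool present;      // the stage has a program bound
   bool uses_gather;  // mirrors prog->UsesGather from patch 06
};

struct SurfaceEmit {
   uint32_t first_slot;  // binding-table slot used for texture unit 0
   bool for_gather;      // lets the callee apply the gather4 format w/a
};

// Decide which sets of texture SURFACE_STATE one stage needs.
static std::vector<SurfaceEmit>
plan_texture_surfaces(const StageProgram &stage,
                      uint32_t texture_base, uint32_t gather_base)
{
   std::vector<SurfaceEmit> emits;
   if (!stage.present)
      return emits;
   emits.push_back({texture_base, false});    // regular set, always emitted
   if (stage.uses_gather)
      emits.push_back({gather_base, true});   // alternate set, gather4 only
   return emits;
}
```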
[Mesa-dev] [PATCH V4 12/13] i965: Enable ARB_texture_gather on Gen7
Signed-off-by: Chris Forbes
---
 src/mesa/drivers/dri/i965/brw_context.c | 1 +
 src/mesa/drivers/dri/i965/intel_extensions.c | 4
 2 files changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 5f58a29..0d677aa 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -176,6 +176,7 @@ brw_initialize_context_constants(struct brw_context *brw)
       ctx->Const.MaxColorTextureSamples = 8;
       ctx->Const.MaxDepthTextureSamples = 8;
       ctx->Const.MaxIntegerSamples = 8;
+      ctx->Const.MaxProgramTextureGatherComponents = 4;
    }
 
    ctx->Const.MinLineWidth = 1.0;
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index 0fc5aad..0502a48 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -163,6 +163,10 @@ intelInitExtensions(struct gl_context *ctx)
    if (brw->gen == 5)
       ctx->Extensions.AMD_performance_monitor = true;
 
+   if (brw->gen >= 7) {
+      ctx->Extensions.ARB_texture_gather = true;
+   }
+
    if (ctx->API == API_OPENGL_CORE)
       ctx->Extensions.ARB_base_instance = true;
    if (ctx->API != API_OPENGL_CORE)
--
1.8.4
[Mesa-dev] [PATCH V4 11/13] i965: use gather slots in the binding table for gather4.
Signed-off-by: Chris Forbes --- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 8 ++-- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 8 ++-- 2 files changed, 12 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index 2d59d1a..ed4ee51 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -522,11 +522,15 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src src = retype(brw_vec8_grf(0, 0), BRW_REGISTER_TYPE_UW); } + uint32_t surface_index = inst->opcode == SHADER_OPCODE_TG4 + ? SURF_INDEX_GATHER_TEXTURE(inst->sampler) + : SURF_INDEX_TEXTURE(inst->sampler); + brw_SAMPLE(p, retype(dst, BRW_REGISTER_TYPE_UW), inst->base_mrf, src, - SURF_INDEX_TEXTURE(inst->sampler), + surface_index, inst->sampler, msg_type, rlen, @@ -535,7 +539,7 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src simd_mode, return_format); - mark_surface_used(SURF_INDEX_TEXTURE(inst->sampler)); + mark_surface_used(surface_index); } diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp index 6bdffb3..00efb10 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp @@ -385,11 +385,15 @@ vec4_generator::generate_tex(vec4_instruction *inst, break; } + uint32_t surface_index = inst->opcode == SHADER_OPCODE_TG4 + ? 
SURF_INDEX_VEC4_GATHER_TEXTURE(inst->sampler)
+      : SURF_INDEX_VEC4_TEXTURE(inst->sampler);
+
    brw_SAMPLE(p,
               dst,
               inst->base_mrf,
               src,
-              SURF_INDEX_VEC4_TEXTURE(inst->sampler),
+              surface_index,
               inst->sampler,
               msg_type,
               1, /* response length */
@@ -398,7 +402,7 @@ vec4_generator::generate_tex(vec4_instruction *inst,
               BRW_SAMPLER_SIMD_MODE_SIMD4X2,
               return_format);
 
-   mark_surface_used(SURF_INDEX_VEC4_TEXTURE(inst->sampler));
+   mark_surface_used(surface_index);
 }
 
 void
--
1.8.4
[Mesa-dev] [PATCH V4 13/13] i965/hsw: Apply gather4 RG32F w/a using SCS instead of shader.
The new surface channel select bits allow us to avoid having to recompile the shader for this workaround. Signed-off-by: Chris Forbes --- src/mesa/drivers/dri/i965/brw_wm.c| 5 +++-- src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 14 -- 2 files changed, 11 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c index 0c2eae9..213a5bf 100644 --- a/src/mesa/drivers/dri/i965/brw_wm.c +++ b/src/mesa/drivers/dri/i965/brw_wm.c @@ -346,8 +346,9 @@ brw_populate_sampler_prog_key_data(struct gl_context *ctx, key->gl_clamp_mask[2] |= 1 << s; } - /* gather4's channel select for green from RG32F is broken */ - if (brw->gen >= 7 && prog->UsesGather) { + /* gather4's channel select for green from RG32F is broken; + * requires a shader w/a on IVB; fixable with just SCS on HSW. */ + if (brw->gen >= 7 && !brw->is_haswell && prog->UsesGather) { if (img->InternalFormat == GL_RG32F && GET_SWZ(t->_Swizzle, 0) == 1) key->gather_channel_quirk_mask |= 1 << s; } diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c index fc1613f..4090041 100644 --- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c @@ -42,13 +42,13 @@ * "Shader Channel Select" enumerations (i.e. HSW_SCS_RED) */ static unsigned -swizzle_to_scs(GLenum swizzle) +swizzle_to_scs(GLenum swizzle, bool need_green_to_blue) { switch (swizzle) { case SWIZZLE_X: return HSW_SCS_RED; case SWIZZLE_Y: - return HSW_SCS_GREEN; + return need_green_to_blue ? HSW_SCS_BLUE : HSW_SCS_GREEN; case SWIZZLE_Z: return HSW_SCS_BLUE; case SWIZZLE_W: @@ -368,11 +368,13 @@ gen7_update_texture_surface(struct gl_context *ctx, const int swizzle = unlikely(alpha_depth) ? 
SWIZZLE_XYZW : brw_get_texture_swizzle(ctx, tObj); + const bool need_scs_green_to_blue = for_gather && tex_format == BRW_SURFACEFORMAT_R32G32_FLOAT_LD; + surf[7] = - SET_FIELD(swizzle_to_scs(GET_SWZ(swizzle, 0)), GEN7_SURFACE_SCS_R) | - SET_FIELD(swizzle_to_scs(GET_SWZ(swizzle, 1)), GEN7_SURFACE_SCS_G) | - SET_FIELD(swizzle_to_scs(GET_SWZ(swizzle, 2)), GEN7_SURFACE_SCS_B) | - SET_FIELD(swizzle_to_scs(GET_SWZ(swizzle, 3)), GEN7_SURFACE_SCS_A); + SET_FIELD(swizzle_to_scs(GET_SWZ(swizzle, 0), need_scs_green_to_blue), GEN7_SURFACE_SCS_R) | + SET_FIELD(swizzle_to_scs(GET_SWZ(swizzle, 1), need_scs_green_to_blue), GEN7_SURFACE_SCS_G) | + SET_FIELD(swizzle_to_scs(GET_SWZ(swizzle, 2), need_scs_green_to_blue), GEN7_SURFACE_SCS_B) | + SET_FIELD(swizzle_to_scs(GET_SWZ(swizzle, 3), need_scs_green_to_blue), GEN7_SURFACE_SCS_A); } /* Emit relocation to surface contents */ -- 1.8.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
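The remapping done by the patch above can be sketched in isolation. This is a minimal stand-alone sketch, not the driver code: the enum names and values are hypothetical stand-ins for Mesa's SWIZZLE_* and HSW_SCS_* definitions, and the ZERO/ONE cases are an assumption, since the quoted hunk is cut off after SWIZZLE_W.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for SWIZZLE_* and HSW_SCS_*; values are illustrative. */
enum swz { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W, SWZ_ZERO, SWZ_ONE };
enum scs { SCS_ZERO, SCS_ONE, SCS_RED, SCS_GREEN, SCS_BLUE, SCS_ALPHA };

/* Map a texture swizzle component to a shader channel select.  When
 * need_green_to_blue is set (gather4 on an RG32F surface in the patch),
 * a request for green is redirected to the blue channel select. */
static enum scs
swizzle_to_scs_sketch(enum swz swizzle, bool need_green_to_blue)
{
   switch (swizzle) {
   case SWZ_X:    return SCS_RED;
   case SWZ_Y:    return need_green_to_blue ? SCS_BLUE : SCS_GREEN;
   case SWZ_Z:    return SCS_BLUE;
   case SWZ_W:    return SCS_ALPHA;
   case SWZ_ZERO: return SCS_ZERO;   /* assumed case, not shown in the hunk */
   case SWZ_ONE:  return SCS_ONE;    /* assumed case, not shown in the hunk */
   }
   return SCS_ZERO;
}
```

Only the green entry changes with the flag; every other component maps the same way in both modes, which is why the workaround needs no shader recompile.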
Re: [Mesa-dev] Choosing texture internal format in GLES
It will indeed select RGBA32F if you supply GL_RGBA and GL_FLOAT, etc.

Marek

On Mon, Sep 30, 2013 at 9:36 AM, Rogovin, Kevin wrote:
> Hi,
>
>> No, I said it would be better to use st_choose_matching_format in
>> st_ChooseTextureFormat,
>> because st_choose_matching_format does exactly what you're trying to do.
>>
>> I have gone ahead and implemented what I had in mind. See the attached patch.
>>
>> Marek
>
> One thing to make sure of in the tables used by st_choose_matching_format in
> GLES1/2: that HALF_FLOAT and FLOAT make the texture a 16-bit and a 32-bit
> texture, respectively (for when half float and/or full float textures are
> supported). From a developer's point of view, the GLES1/2 glTexImage
> parameters of external format and type essentially determine the format of
> the texture; the idea was/is that in GLES1/2 glTex[Sub]Image calls were not
> supposed to do format conversions, so the developer selected the format with
> the external format and external type parameters.
[Mesa-dev] [Bug 55951] [regression] Torchlight exits with BadDrawable (invalid Pixmap or Window parameter)
https://bugs.freedesktop.org/show_bug.cgi?id=55951 Tim Allen changed: What|Removed |Added CC||screwt...@froup.com -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600g: fix UVD detection
Hi,

I have committed a simpler fix as 7b25f52a95fe13bfc86c4d421328f6df7690876c.

Marek

On Mon, Sep 30, 2013 at 10:53 AM, Grigori Goronzy wrote:
> UVD was checked before the info fields were initialized. Introduced
> by commit 68f6dec32.
> ---
>  src/gallium/drivers/r600/r600_pipe.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c
> index 097a6b8..32df2a3 100644
> --- a/src/gallium/drivers/r600/r600_pipe.c
> +++ b/src/gallium/drivers/r600/r600_pipe.c
> @@ -1037,6 +1037,13 @@ struct pipe_screen *r600_screen_create(struct radeon_winsys *ws)
>      rscreen->b.b.fence_signalled = r600_fence_signalled;
>      rscreen->b.b.fence_finish = r600_fence_finish;
>      rscreen->b.b.get_driver_query_info = r600_get_driver_query_info;
> +    r600_init_screen_resource_functions(&rscreen->b.b);
> +
> +    if (!r600_common_screen_init(&rscreen->b, ws)) {
> +        FREE(rscreen);
> +        return NULL;
> +    }
> +
>      if (rscreen->b.info.has_uvd) {
>          rscreen->b.b.get_video_param = ruvd_get_video_param;
>          rscreen->b.b.is_video_format_supported = ruvd_is_format_supported;
> @@ -1044,12 +1051,6 @@ struct pipe_screen *r600_screen_create(struct radeon_winsys *ws)
>          rscreen->b.b.get_video_param = r600_get_video_param;
>          rscreen->b.b.is_video_format_supported = vl_video_buffer_is_format_supported;
>      }
> -    r600_init_screen_resource_functions(&rscreen->b.b);
> -
> -    if (!r600_common_screen_init(&rscreen->b, ws)) {
> -        FREE(rscreen);
> -        return NULL;
> -    }
>
>      rscreen->b.debug_flags |= debug_get_flags_option("R600_DEBUG", r600_debug_options, 0);
>      if (debug_get_bool_option("R600_DEBUG_COMPUTE", FALSE))
> --
> 1.8.1.2
[Mesa-dev] Dispatch table question: VBO
Hi all,

I've been reading through src/mesa/vbo and tracking down how the dispatch
tables relate to the code there. I see how the function entries in
vbo_context#exec and vbo_context#save are filled (essentially by macros
defined in src/mesa/vbo/vbo_attrib_tmp.h interacting with macros defined in
src/mesa/vbo/vbo_save_api.c for save and src/mesa/vbo/vbo_exec_api.c for
exec). I also see the functions in src/mesa/main:

   _mesa_install_exec_vtxfmt and _mesa_install_save_vtxfmt

The save one is very simple: it just inserts the functions from
vbo_context#save into the GL dispatch table gl_context#Save. What is
trickier for me to follow is what is happening on exec. The function
_mesa_install_exec_vtxfmt sets both the gl_context#BeginEnd and
gl_context#Exec table values to those in vbo_context#exec, but leaves
gl_context#OutsideBeginEnd as is, i.e. nothing but no-op functions.

_A_ dispatch table is initialized with no-ops for all functions and is then
filled by _mesa_initialize_dispatch_tables(), which uses
_mesa_initialize_exec_table() and _mesa_initialize_save_table() to populate
gl_context#Exec and gl_context#Save respectively. However, I do not see
anything that populates gl_context#OutsideBeginEnd (except for its
initialization with all no-op functions). This would be okay, but...

My confusion starts in vbo_exec_Begin():

   ctx->Exec = ctx->BeginEnd;
   /* We may have been called from a display list, in which case we should
    * leave dlist.c's dispatch table in place.
    */
   if (ctx->CurrentDispatch == ctx->OutsideBeginEnd) {
      ctx->CurrentDispatch = ctx->BeginEnd;
      _glapi_set_dispatch(ctx->CurrentDispatch);
   } else {
      assert(ctx->CurrentDispatch == ctx->Save);
   }

and this block in vbo_exec_End():

   ctx->Exec = ctx->OutsideBeginEnd;
   if (ctx->CurrentDispatch == ctx->BeginEnd) {
      ctx->CurrentDispatch = ctx->OutsideBeginEnd;
      _glapi_set_dispatch(ctx->CurrentDispatch);
   }

For i965, there is a chain of calls so that _mesa_install_exec_vtxfmt() is
called at context creation, which sets both Exec and BeginEnd (for
compatibility profiles) to the values found in vbo_context#exec. The initial
value of CurrentDispatch is OutsideBeginEnd, at least afaik, set in
_mesa_initialize_context(). Thus I see that after a glBegin(),
CurrentDispatch is set to BeginEnd, and after glEnd(), CurrentDispatch is
set to OutsideBeginEnd, and I cannot track down where that table is
populated with something besides no-op functions... what am I missing?

-Kevin
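The pointer swapping described in the message above can be modelled with a toy context. This is only a sketch of the two quoted blocks: the struct and function names are hypothetical, the tables carry no functions, and the real code additionally calls _glapi_set_dispatch(). It does not answer where OutsideBeginEnd gets populated; it only shows which pointer is current after each call.

```c
/* Hypothetical stand-in for a GL dispatch table. */
struct dispatch_table { const char *name; };

/* Only the fields discussed in the message; this is not Mesa's gl_context. */
struct toy_context {
   struct dispatch_table *Exec;
   struct dispatch_table *BeginEnd;
   struct dispatch_table *OutsideBeginEnd;
   struct dispatch_table *Save;
   struct dispatch_table *CurrentDispatch;
};

/* Mirrors the quoted vbo_exec_Begin() block: switch to the BeginEnd table
 * unless a display list is being compiled (CurrentDispatch == Save). */
static void
toy_begin(struct toy_context *ctx)
{
   ctx->Exec = ctx->BeginEnd;
   if (ctx->CurrentDispatch == ctx->OutsideBeginEnd)
      ctx->CurrentDispatch = ctx->BeginEnd;
}

/* Mirrors the quoted vbo_exec_End() block: switch back only if the
 * BeginEnd table is the one currently installed. */
static void
toy_end(struct toy_context *ctx)
{
   ctx->Exec = ctx->OutsideBeginEnd;
   if (ctx->CurrentDispatch == ctx->BeginEnd)
      ctx->CurrentDispatch = ctx->OutsideBeginEnd;
}
```

In this model a Begin/End pair leaves CurrentDispatch where it started, and a context compiling a display list keeps the Save table installed throughout.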
Re: [Mesa-dev] Updates to 9.2 branch
On 11-09-13 21:11, Ian Romanick wrote:
> Just an FYI...
>
> The 9.2 branch is falling a bit behind. I'm going to trickle out
> patches to the stable branch over the next few days / week. My plan is
> to do 9.2.1 during the week of XDC.
>
> If your favorite patch hasn't made it out but is listed by
> get-pick-list, don't worry. It will make it out.

Waiting for mesa 9.2.1. :D
Re: [Mesa-dev] context sharing of framebuffer objects
On Mon, Sep 30, 2013 at 11:01 AM, Henri Verbeet wrote:
> On 30 September 2013 02:18, Dave Airlie wrote:
>> So this led me to look at the spec and the mesa code, and I noticed it
>> appears at some point maybe around 3.1 that FBOs are no longer
>> considered shared objects at least in core profile, but mesa always
>> seems to share them, just wondering if someone can confirm I'm reading
>> things correctly, and if so I might try and do a piglit test and a
>> patch.
>>
> AFAIK the only FBOs that can be shared are ones created through
> EXT_fbo. (Specifically, see issue 10 in the ARB_fbo spec, and Appendix
> D in the GL 3.0 spec.)

This matches my reading of the spec as well, and kind of makes the world
horrible. From the ARB_framebuffer_object spec:

    "Dependencies on EXT_framebuffer_object

    Framebuffer objects created with the commands defined by the
    GL_EXT_framebuffer_object extension are defined to be shared, while
    FBOs created with commands defined by the OpenGL core or
    GL_ARB_framebuffer_object extension are defined *not* to be shared.
    Undefined behavior results when using FBOs created by EXT commands
    through non-EXT interfaces, or vice-versa."

Yuck. Also see issue #10 in the spec:

    "(10) Can ARB framebuffer objects be shared between contexts?

    ARB_framebuffer_object is supposed to be compatible with
    EXT_framebuffer_object, but also a subset of OpenGL 3.0.
    EXT_framebuffer_object (rev. 120) explicitly allows sharing in
    issue 76, but the 3.0 spec explicitly disallows it in Appendix D.

    Resolved: No. ARB_framebuffer_object is intended to capture the
    functionality that went into GL 3.0. Furthermore, given that the
    entry points and tokens in this extension and the core are
    identical there is no way that an implementation could
    differentiate FBOs created with this extension from those created
    by core GL.

    ADDITIONAL COMMENTS:

    Undefined behavior results when using FBOs created by EXT commands
    through non-EXT FBO interfaces, or vice-versa. See the
    "Dependencies on EXT_framebuffer_object" section above."

It basically says the same thing, only with a bit more explanation.
Re: [Mesa-dev] [r600g] Mesa CVS 4e9aa67: vdpau has only MPEG1/2 on RV730
On 30.09.2013 10:47, Grigori Goronzy wrote:
> On 30.09.2013 10:06, Michel Dänzer wrote:
>> On Sun, 2013-09-29 at 22:34 +0200, Dieter Nützel wrote:
>>> after latest git pull I've only MPEG1, MPEG2_SIMPLE and MPEG2_MAIN
>>> with my RV730 (AGP).
>
> Same problem on PALM. Bisection shows that it is caused by commit
> 68f6dec32. The initialization order seems to be wrong, the check for UVD
> is done too early. I'll send a patch in a minute.
>
> Best regards
> Grigori

Thank you Grigori and Marek (AMD!) ;-)

-Dieter

>> That probably means you lost UVD support for some reason. Assuming UVD
>> is still enabled in the kernel, can you bisect which Mesa change caused
>> the problem for you?
Re: [Mesa-dev] [PATCH] i965/blorp: retype destination register for texture SEND instruction to UW.
I re-ran piglit with my resource streamer v2 implementation + this patch, and
it actually fixed sporadic lockups that I've been struggling with. As
discussed at the F2F with Chad and Paul, we need this for RS. I'll be posting
the RS v2 quite soon.

-abdiel

On Friday, September 27, 2013 01:08:45 PM Paul Berry wrote:
> From the bspec documentation of the SEND instruction:
>
>     "destination region cannot cross the 256-bit register boundary."
>
> To avoid violating this restriction when executing SIMD16 texturing
> operations (such as those used by blorp), we need to ensure that the
> destination of the SEND instruction doesn't exceed 256 bits in size.
> An easy way to do this is to set the type of the destination register
> to UW (unsigned word), since 16 unsigned words can fit inside a
> 256-bit register. Fortunately, this has no effect on the sampling
> operation, since the sampler always infers the destination data type
> from the sampler message rather than from the type of the instruction
> operand.
>
> Previously, we did this for texturing operations issued by the vec4
> and fs back-ends, but not for blorp. This patch makes blorp use the
> same trick.
>
> I haven't observed any behavioural difference on actual hardware due
> to this patch, but it avoids a warning from the simulator so it seems
> like the right thing to do.
> ---
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> index f07d39f..027c72e 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> @@ -1928,7 +1928,7 @@ brw_blorp_blit_program::texture_lookup(struct brw_reg dst,
>     }
>
>     brw_SAMPLE(&func,
> -              retype(dst, BRW_REGISTER_TYPE_F) /* dest */,
> +              retype(dst, BRW_REGISTER_TYPE_UW) /* dest */,
>                base_mrf /* msg_reg_nr */,
>                brw_message_reg(base_mrf) /* src0 */,
>                BRW_BLORP_TEXTURE_BINDING_TABLE_INDEX,
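The size argument in the quoted commit message is plain arithmetic, sketched here. The helper names are hypothetical, and the sketch assumes the destination region starts on a register boundary, so that "fits in 256 bits" and "does not cross the boundary" coincide.

```c
/* Size in bits of a destination region covering `channels` SIMD channels
 * of `bytes_per_element` bytes each. */
static unsigned
dst_region_bits(unsigned channels, unsigned bytes_per_element)
{
   return channels * bytes_per_element * 8u;
}

/* Returns 1 if such a region fits inside a single 256-bit register. */
static int
fits_in_one_register(unsigned channels, unsigned bytes_per_element)
{
   return dst_region_bits(channels, bytes_per_element) <= 256u;
}
```

With 16 channels, a 4-byte float (F) destination spans 512 bits, while a 2-byte unsigned word (UW) destination spans exactly 256 bits, which is the case the patch relies on.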
Re: [Mesa-dev] [PATCH] draw: Add a null check for draw.
On 09/27/2013 10:51 PM, Vinson Lee wrote: There is an earlier null check for draw so draw could be null here as well. Fixes "Dereference after null check" defect reported by Coverity. Signed-off-by: Vinson Lee --- src/gallium/auxiliary/draw/draw_pipe_unfilled.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/auxiliary/draw/draw_pipe_unfilled.c b/src/gallium/auxiliary/draw/draw_pipe_unfilled.c index 7a88ce0..8cba07c 100644 --- a/src/gallium/auxiliary/draw/draw_pipe_unfilled.c +++ b/src/gallium/auxiliary/draw/draw_pipe_unfilled.c @@ -237,7 +237,7 @@ draw_unfilled_prepare_outputs( struct draw_context *draw, boolean is_unfilled = (rast && (rast->fill_front != PIPE_POLYGON_MODE_FILL || rast->fill_back != PIPE_POLYGON_MODE_FILL)); - const struct draw_fragment_shader *fs = draw->fs.fragment_shader; + const struct draw_fragment_shader *fs = draw ? draw->fs.fragment_shader : 0; if (is_unfilled && fs && fs->info.uses_frontface) { unfilled->face_slot = draw_alloc_extra_vertex_attrib( Reviewed-by: Brian Paul ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util/u_format: Assert that format block size is at least 1 byte.
On 09/27/2013 11:52 PM, Vinson Lee wrote: The block size for all formats is currently at least 1 byte. Add an assertion for this. This should silence several Coverity "Division or modulo by zero" defects. Signed-off-by: Vinson Lee --- src/gallium/auxiliary/util/u_format.h | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/src/gallium/auxiliary/util/u_format.h b/src/gallium/auxiliary/util/u_format.h index 28527f5..84f16d5 100644 --- a/src/gallium/auxiliary/util/u_format.h +++ b/src/gallium/auxiliary/util/u_format.h @@ -716,10 +716,15 @@ static INLINE uint util_format_get_blocksize(enum pipe_format format) { uint bits = util_format_get_blocksizebits(format); + uint bytes = bits / 8; assert(bits % 8 == 0); + assert(bytes > 0); + if (bytes == 0) { + bytes = 1; + } - return bits / 8; + return bytes; } static INLINE uint Reviewed-by: Brian Paul ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] llvmpipe: Remove unnecessary null check of shader.
On 09/27/2013 10:30 PM, Vinson Lee wrote: shader has already been dereferenced earlier so cannot be null here. Fixes "Dereference before null check" defect reported by Coverity. Signed-off-by: Vinson Lee --- src/gallium/drivers/llvmpipe/lp_state_fs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c b/src/gallium/drivers/llvmpipe/lp_state_fs.c index 875a3cf..8223d2a 100644 --- a/src/gallium/drivers/llvmpipe/lp_state_fs.c +++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c @@ -2435,7 +2435,7 @@ generate_variant(struct llvmpipe_context *lp, !shader->info.base.uses_kill ? TRUE : FALSE; - if ((!shader || shader->info.base.num_tokens <= 1) && + if ((shader->info.base.num_tokens <= 1) && !key->depth.enabled && !key->stencil[0].enabled) { variant->ps_inv_multiplier = 0; } else { Reviewed-by: Brian Paul ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/7] st/xorg: remove unnecessary headers
On 09/28/2013 08:46 AM, Emil Velikov wrote: v2: Remove xf86PciInfo.h, all drivers provide their own PCI ID list Signed-off-by: Emil Velikov --- src/gallium/state_trackers/xorg/xorg_driver.c | 1 - src/gallium/state_trackers/xorg/xorg_output.c | 7 --- 2 files changed, 8 deletions(-) diff --git a/src/gallium/state_trackers/xorg/xorg_driver.c b/src/gallium/state_trackers/xorg/xorg_driver.c index 9d7713c..dd243bc 100644 --- a/src/gallium/state_trackers/xorg/xorg_driver.c +++ b/src/gallium/state_trackers/xorg/xorg_driver.c @@ -33,7 +33,6 @@ #include "xf86.h" #include "xf86_OSproc.h" #include "compiler.h" -#include "xf86PciInfo.h" #include "xf86Pci.h" #include "mipointer.h" #include "micmap.h" diff --git a/src/gallium/state_trackers/xorg/xorg_output.c b/src/gallium/state_trackers/xorg/xorg_output.c index b183cdf..dffc28e 100644 --- a/src/gallium/state_trackers/xorg/xorg_output.c +++ b/src/gallium/state_trackers/xorg/xorg_output.c @@ -43,13 +43,6 @@ #include #include -#ifdef HAVE_XEXTPROTO_71 -#include -#else -#define DPMS_SERVER -#include -#endif - #include "xorg_tracker.h" struct output_private Reviewed-by: Brian Paul ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/7] st/xorg: add sanity checks after malloc
On 09/28/2013 08:46 AM, Emil Velikov wrote: Signed-off-by: Emil Velikov --- src/gallium/state_trackers/xorg/xorg_driver.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/gallium/state_trackers/xorg/xorg_driver.c b/src/gallium/state_trackers/xorg/xorg_driver.c index dd243bc..097c354 100644 --- a/src/gallium/state_trackers/xorg/xorg_driver.c +++ b/src/gallium/state_trackers/xorg/xorg_driver.c @@ -124,6 +124,9 @@ Bool xorg_tracker_have_modesetting(ScrnInfoPtr pScrn, struct pci_device *device) { char *BusID = malloc(64); + +if (!BusID) + return FALSE; sprintf(BusID, "pci:%04x:%02x:%02x.%d", device->domain, device->bus, device->dev, device->func); @@ -276,6 +279,9 @@ drv_init_drm(ScrnInfoPtr pScrn) char *BusID; BusID = malloc(64); + if (!BusID) + return FALSE; + sprintf(BusID, "PCI:%d:%d:%d", ((ms->PciInfo->domain << 8) | ms->PciInfo->bus), ms->PciInfo->dev, ms->PciInfo->func Reviewed-by: Brian Paul ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
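The allocate, check, then format pattern from the patch above can be sketched as a small helper. This is an illustration only: format_bus_id is a hypothetical name, and the sketch swaps sprintf for snprintf so the write is bounded by the buffer size, which the patch itself does not do.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns a newly allocated "pci:dddd:bb:dd.f" string, or NULL if the
 * allocation fails.  The caller owns the result and must free() it. */
static char *
format_bus_id(unsigned domain, unsigned bus, unsigned dev, unsigned func)
{
   const size_t len = 64;
   char *bus_id = malloc(len);

   if (!bus_id)
      return NULL;   /* bail out before touching the buffer */

   snprintf(bus_id, len, "pci:%04x:%02x:%02x.%u", domain, bus, dev, func);
   return bus_id;
}
```

Callers would treat a NULL return the same way the patched functions treat a failed malloc, by reporting failure instead of formatting into a null pointer.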
Re: [Mesa-dev] [PATCH 3/7] st/xorg: drop set but unsused variables dxo, dyo
On 09/28/2013 08:46 AM, Emil Velikov wrote: Commit a9f8baf00b264 removed the first and only use of the variables but forgot to remove them. Signed-off-by: Emil Velikov --- src/gallium/state_trackers/xorg/xorg_xv.c | 4 1 file changed, 4 deletions(-) diff --git a/src/gallium/state_trackers/xorg/xorg_xv.c b/src/gallium/state_trackers/xorg/xorg_xv.c index 3097d00..f0de3d2 100644 --- a/src/gallium/state_trackers/xorg/xorg_xv.c +++ b/src/gallium/state_trackers/xorg/xorg_xv.c @@ -490,7 +490,6 @@ display_video(ScrnInfoPtr pScrn, struct xorg_xv_port_priv *pPriv, int id, modesettingPtr ms = modesettingPTR(pScrn); BoxPtr pbox; int nbox; - int dxo, dyo; Bool hdtv; int x, y, w, h; struct exa_pixmap_priv *dst; @@ -518,9 +517,6 @@ display_video(ScrnInfoPtr pScrn, struct xorg_xv_port_priv *pPriv, int id, -pPixmap->screen_y); #endif - dxo = dstRegion->extents.x1; - dyo = dstRegion->extents.y1; - pbox = REGION_RECTS(dstRegion); nbox = REGION_NUM_RECTS(dstRegion); Reviewed-by: Brian Paul ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 6/7] configure: use PKG_CONFIG variable over hardcoded pkg-config
On 09/28/2013 08:46 AM, Emil Velikov wrote: Already available and used in other places of configure.ac. Signed-off-by: Emil Velikov --- configure.ac | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/configure.ac b/configure.ac index 1f0a646..1dd0087 100644 --- a/configure.ac +++ b/configure.ac @@ -1361,8 +1361,8 @@ if test "x$enable_opencl" = xyes; then PKG_CONFIG_PATH environment variable. By default libclc.pc is installed to /usr/local/share/pkgconfig/]) else -LIBCLC_INCLUDEDIR=`pkg-config --variable=includedir libclc` -LIBCLC_LIBEXECDIR=`pkg-config --variable=libexecdir libclc` +LIBCLC_INCLUDEDIR=`$PKG_CONFIG --variable=includedir libclc` +LIBCLC_LIBEXECDIR=`$PKG_CONFIG --variable=libexecdir libclc` AC_SUBST([LIBCLC_INCLUDEDIR]) AC_SUBST([LIBCLC_LIBEXECDIR]) fi Reviewed-by: Brian Paul ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 05/29] softpipe: consolidate C sources list into Makefile.sources
On Sat, Sep 28, 2013 at 03:01:15PM +0100, Emil Velikov wrote:
> On 28/09/13 04:48, Tom Stellard wrote:
> > On Sun, Sep 22, 2013 at 09:29:28PM +0100, Emil Velikov wrote:
> >> Signed-off-by: Emil Velikov
> >
> > As long as you have build tested these with both automake and scons and
> > are prepared to deal with any fallout once they are committed.
> >
> > Patches 5 through 29 are:
> >
> > Reviewed-by: Tom Stellard
> >
> Big thanks for the review Tom.
>
> I've build tested both automake and scons, and am comfortable with
> resolving any issues that this series may cause.
>
> The annotated series, rebased on top of master (had to resolve minor
> conflict in the first patch) can be found in the makefile.sources-v4
> branch over at https://github.com/evelikov/Mesa/
>

Thanks for preparing this branch, I will push it on Tuesday morning (UTC-7)
if there are no objections. Once I push the patches, keep an eye on
IRC/Mailing Lists/Bugzilla for potential broken build reports.

-Tom
[Mesa-dev] [PATCH] gallium: include u_surface.h instead of u_rect.h
u_rect.h was including u_surface.h just to avoid touching a bunch of other source files after some functions were moved from u_rect.h to u_surface.h. This patch cleans up that hack. --- src/gallium/auxiliary/util/u_format.c|2 +- src/gallium/auxiliary/util/u_rect.h |6 -- src/gallium/auxiliary/util/u_tile.c |2 +- src/gallium/auxiliary/vl/vl_mpeg12_decoder.c |2 +- 4 files changed, 3 insertions(+), 9 deletions(-) diff --git a/src/gallium/auxiliary/util/u_format.c b/src/gallium/auxiliary/util/u_format.c index 08ef6ab..a8aa571 100644 --- a/src/gallium/auxiliary/util/u_format.c +++ b/src/gallium/auxiliary/util/u_format.c @@ -34,9 +34,9 @@ #include "u_math.h" #include "u_memory.h" -#include "u_rect.h" #include "u_format.h" #include "u_format_s3tc.h" +#include "u_surface.h" #include "pipe/p_defines.h" diff --git a/src/gallium/auxiliary/util/u_rect.h b/src/gallium/auxiliary/util/u_rect.h index 10909b2..c141550 100644 --- a/src/gallium/auxiliary/util/u_rect.h +++ b/src/gallium/auxiliary/util/u_rect.h @@ -83,10 +83,4 @@ u_rect_possible_intersection(const struct u_rect *a, } #endif - -/* Include pipe copy/fill rect helpers declarations for backwards compatibility - */ -#include "util/u_surface.h" - - #endif /* U_RECT_H */ diff --git a/src/gallium/auxiliary/util/u_tile.c b/src/gallium/auxiliary/util/u_tile.c index 62298cd..fb80aec 100644 --- a/src/gallium/auxiliary/util/u_tile.c +++ b/src/gallium/auxiliary/util/u_tile.c @@ -37,7 +37,7 @@ #include "util/u_format.h" #include "util/u_math.h" #include "util/u_memory.h" -#include "util/u_rect.h" +#include "util/u_surface.h" #include "util/u_tile.h" diff --git a/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c b/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c index f838e74..f91f90b 100644 --- a/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c +++ b/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c @@ -29,8 +29,8 @@ #include #include "util/u_memory.h" -#include "util/u_rect.h" #include "util/u_sampler.h" +#include "util/u_surface.h" #include 
"util/u_video.h" #include "vl_mpeg12_decoder.h" -- 1.7.10.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] context sharing of framebuffer objects
On 09/29/2013 05:18 PM, Dave Airlie wrote:
> Hey,
>
> So for virgl I was praying I could avoid having to keep a GL context
> on the host per guest context but it appears to do
> NV_conditional_render properly I either need the ability to
> suspend/resume queries (maybe I could write an extension) or I need
> contexts :-(
>
> So this led me to look at the spec and the mesa code, and I noticed it
> appears at some point maybe around 3.1 that FBOs are no longer
> considered shared objects at least in core profile, but mesa always
> seems to share them, just wondering if someone can confirm I'm reading
> things correctly, and if so I might try and do a piglit test and a
> patch.

Correct on both counts. This was part of the reason for the
glBindFramebuffer / glBindFramebufferEXT separation a few months ago.
That enabled fixing the problem, but actually fixing it is a fair amount
of work.

> Dave.
[Mesa-dev] [PATCH] gallivm: ignore rho approximation for cube maps
From: Roland Scheidegger There's two reasons for this: 1) even when ignoring rho approximation for cube maps, the result is still not correct, but it's better as the max error at edges is now sqrt(2) instead of 2 (which was a full mip level), same as it is for ordinary 2d maps when doing rho approximations (so the error actually goes from factor 2 at edges and sqrt(2) completely inside a face to sqrt(2) at edges and 0 inside a face). 2) I want to repurpose rho_no_approx for cubemaps for fully correct cubemap derivatives (so don't need yet another debug var). --- src/gallium/auxiliary/gallivm/lp_bld_sample.c | 34 + 1 file changed, 12 insertions(+), 22 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.c b/src/gallium/auxiliary/gallivm/lp_bld_sample.c index c775382..ea6bec7 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.c @@ -269,10 +269,8 @@ lp_build_rho(struct lp_build_sample_context *bld, /* Could optimize this for single quad just skip the broadcast */ cubesize = lp_build_extract_broadcast(gallivm, bld->float_size_in_type, rho_bld->type, float_size, index0); - if (no_rho_opt) { - /* skipping sqrt hence returning rho squared */ - cubesize = lp_build_mul(rho_bld, cubesize, cubesize); - } + /* skipping sqrt hence returning rho squared */ + cubesize = lp_build_mul(rho_bld, cubesize, cubesize); rho = lp_build_mul(rho_bld, cubesize, rho); } else if (derivs && !(bld->static_texture_state->target == PIPE_TEXTURE_CUBE)) { @@ -757,8 +755,8 @@ lp_build_lod_selector(struct lp_build_sample_context *bld, } else { LLVMValueRef rho; - boolean rho_squared = (gallivm_debug & GALLIVM_DEBUG_NO_RHO_APPROX) && - (bld->dims > 1); + boolean rho_squared = ((gallivm_debug & GALLIVM_DEBUG_NO_RHO_APPROX) && +(bld->dims > 1)) || cube_rho; rho = lp_build_rho(bld, texture_unit, s, t, r, cube_rho, derivs); @@ -1602,31 +1600,23 @@ lp_build_cube_lookup(struct lp_build_sample_context *bld, * know the texture is square 
which simplifies things (we can omit the * size mul which happens very early completely here and do it at the * very end). + * Also always do calculations according to GALLIVM_DEBUG_NO_RHO_APPROX + * since the error can get quite big otherwise at edges. + * (With no_rho_approx max error is sqrt(2) at edges, same as it is + * without no_rho_approx for 2d textures, otherwise it would be factor 2.) */ ddx_ddy[0] = lp_build_packed_ddx_ddy_twocoord(coord_bld, s, t); ddx_ddy[1] = lp_build_packed_ddx_ddy_onecoord(coord_bld, r); - if (gallivm_debug & GALLIVM_DEBUG_NO_RHO_APPROX) { -ddx_ddy[0] = lp_build_mul(coord_bld, ddx_ddy[0], ddx_ddy[0]); -ddx_ddy[1] = lp_build_mul(coord_bld, ddx_ddy[1], ddx_ddy[1]); - } - else { -ddx_ddy[0] = lp_build_abs(coord_bld, ddx_ddy[0]); -ddx_ddy[1] = lp_build_abs(coord_bld, ddx_ddy[1]); - } + ddx_ddy[0] = lp_build_mul(coord_bld, ddx_ddy[0], ddx_ddy[0]); + ddx_ddy[1] = lp_build_mul(coord_bld, ddx_ddy[1], ddx_ddy[1]); tmp[0] = lp_build_swizzle_aos(coord_bld, ddx_ddy[0], swizzle01); tmp[1] = lp_build_swizzle_aos(coord_bld, ddx_ddy[0], swizzle23); tmp[2] = lp_build_swizzle_aos(coord_bld, ddx_ddy[1], swizzle02); - if (gallivm_debug & GALLIVM_DEBUG_NO_RHO_APPROX) { -rho_vec = lp_build_add(coord_bld, tmp[0], tmp[1]); -rho_vec = lp_build_add(coord_bld, rho_vec, tmp[2]); - } - else { -rho_vec = lp_build_max(coord_bld, tmp[0], tmp[1]); -rho_vec = lp_build_max(coord_bld, rho_vec, tmp[2]); - } + rho_vec = lp_build_add(coord_bld, tmp[0], tmp[1]); + rho_vec = lp_build_add(coord_bld, rho_vec, tmp[2]); tmp[0] = lp_build_swizzle_aos(coord_bld, rho_vec, swizzle0); tmp[1] = lp_build_swizzle_aos(coord_bld, rho_vec, swizzle1); -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
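The sqrt(2) figure in the message above can be checked with a simplified two-component sketch. The function names are hypothetical and this is not the gallivm code: it compares the squared length of a derivative vector (the sum of squares, as in the no_rho_approx path, which skips the sqrt and works with rho squared) against the squared max-of-absolute-values approximation.

```c
/* Squared rho without approximation: sum of squared derivatives. */
static float
rho_sq_exact(float ds, float dt)
{
   return ds * ds + dt * dt;
}

/* Squared rho with the approximation: square of the larger absolute value. */
static float
rho_sq_approx(float ds, float dt)
{
   float a = ds < 0.0f ? -ds : ds;
   float b = dt < 0.0f ? -dt : dt;
   float m = a > b ? a : b;
   return m * m;
}
```

For ds = dt = 1 the exact squared value is 2 and the approximate one is 1, so the unsquared values differ by a factor of sqrt(2). For an axis-aligned derivative such as (1, 0) the two agree.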
Re: [Mesa-dev] [PATCH] i965/blorp: retype destination register for texture SEND instruction to UW.
Paul Berry writes:

> From the bspec documentation of the SEND instruction:
>
>     "destination region cannot cross the 256-bit register boundary."
>
> To avoid violating this restriction when executing SIMD16 texturing
> operations (such as those used by blorp), we need to ensure that the
> destination of the SEND instruction doesn't exceed 256 bits in size.
> An easy way to do this is to set the type of the destination register
> to UW (unsigned word), since 16 unsigned words can fit inside a
> 256-bit register. Fortunately, this has no effect on the sampling
> operation, since the sampler always infers the destination data type
> from the sampler message rather than from the type of the instruction
> operand.
>
> Previously, we did this for texturing operations issued by the vec4
> and fs back-ends, but not for blorp. This patch makes blorp use the
> same trick.
>
> I haven't observed any behavioural difference on actual hardware due
> to this patch, but it avoids a warning from the simulator so it seems
> like the right thing to do.

Reviewed-by: Eric Anholt
Re: [Mesa-dev] [PATCH] i915: Fix memory leak in do_blit_readpixels.
Vinson Lee writes:

> Fixes "Resource leak" defect reported by Coverity.
>
> Signed-off-by: Vinson Lee

Reviewed-by: Eric Anholt
Re: [Mesa-dev] [PATCH] XXXgallium/common_dri: fix the scons build
Emil Velikov writes:

> On 28/09/13 01:41, Eric Anholt wrote:
>> Emil Velikov writes:
>>
>>> * clone the drienv to driswenv and adjust appropriately
>>> * export driswenv and use it in dri-swrast
>>> * ensure __NOT_HAVE_DRM_H is defined for drisw, similar
>>>   to all other common_drisw users
>>
>> I'm confused where __NOT_HAVE_DRM_H comes from. I don't see any
>> references to it in the tree until your patch.
>>
> Yes that one did my head in a bit, here is what I've gathered
>
> If you decide to omit it, build will fail due to missing drm.h in the
> include path, coming from dri_interface.h.
>
> Take a look at dri_interface.h, it has a very interesting heuristics -
> #if def __APPLE__ || __CYGWIN__ || __GNU__
> #ifndef __NOT_HAVE_DRM_H
> #define __NOT_HAVE_DRM_H
> #endif
>
> Thus the obvious question, why did it work before and not after - I'm
> assuming that scons plays "nicely" with the __GNU__ define.
>
> The last one is only speculation as I've run out of patience at that
> moment :\ Although the following info wrt __NOT_HAVE_DRM_H is quite
> interesting.
>
> Three out of four automake swrast providers define it
> * src/mesa/drivers/dri/swrast/Makefile.am
> * src/gallium/targets/dri-swrast/Makefile.am
> * src/gallium/state_trackers/dri/sw/Makefile.am (swrast libGL.so)
>
> and only one Scons target provides it
> * src/gallium/state_trackers/dri/sw/SConscript
> env.Append(CPPDEFINES = [('__NOT_HAVE_DRM_H', '1')])
>
> Note #ifdef __NOT_HAVE_DRM_H vs #if __NOT_HAVE_DRM_H

Not sure what I did before to fail at grep, but yeah, it's obviously in
the tree. I've squashed your patch into mine -- sound good?
Re: [Mesa-dev] [PATCH] i965/blorp: retype destination register for texture SEND instruction to UW.
I don't understand all the details, but I did confirm that it pacified the simulator.

Acked-by: Chad Versace
Re: [Mesa-dev] [PATCH] XXXgallium/common_dri: fix the scons build
On 30/09/13 17:06, Eric Anholt wrote:
> Emil Velikov writes:
>
>> On 28/09/13 01:41, Eric Anholt wrote:
>>> Emil Velikov writes:
>>>
>>>> * clone the drienv to driswenv and adjust appropriately
>>>> * export driswenv and use it in dri-swrast
>>>> * ensure __NOT_HAVE_DRM_H is defined for drisw, similar
>>>>   to all other common_drisw users
>>>
>>> I'm confused where __NOT_HAVE_DRM_H comes from. I don't see any
>>> references to it in the tree until your patch.
>>>
>> Yes that one did my head in a bit, here is what I've gathered
>>
>> If you decide to omit it, build will fail due to missing drm.h in the
>> include path, coming from dri_interface.h.
>>
>> Take a look at dri_interface.h, it has a very interesting heuristic -
>> #if defined(__APPLE__) || defined(__CYGWIN__) || defined(__GNU__)
>> #ifndef __NOT_HAVE_DRM_H
>> #define __NOT_HAVE_DRM_H
>> #endif
>> #endif
>>
>> Thus the obvious question, why did it work before and not after - I'm
>> assuming that scons plays "nicely" with the __GNU__ define.
>>
>> The last one is only speculation as I'd run out of patience at that
>> moment :\ Although the following info wrt __NOT_HAVE_DRM_H is quite
>> interesting.
>>
>> Three out of four automake swrast providers define it
>> * src/mesa/drivers/dri/swrast/Makefile.am
>> * src/gallium/targets/dri-swrast/Makefile.am
>> * src/gallium/state_trackers/dri/sw/Makefile.am (swrast libGL.so)
>>
>> and only one Scons target provides it
>> * src/gallium/state_trackers/dri/sw/SConscript
>> env.Append(CPPDEFINES = [('__NOT_HAVE_DRM_H', '1')])
>>
>> Note #ifdef __NOT_HAVE_DRM_H vs #if __NOT_HAVE_DRM_H
>
> Not sure what I did before to fail at grep, but yeah, it's obviously in
> the tree. I've squashed your patch into mine -- sound good?
>
Hold the presses. I've found a silly thinko in the scons build which, in combination with this patch, will result in a broken (failed) build; a patch will follow shortly. You're of course more than welcome to squash it if it looks OK to you.

~Emil
[Mesa-dev] [PATCH] XXXscons: drm is not a dri thinko :)
Strictly speaking, xmlpool does not require drm, although its main (and up until recently only) users, the HW drivers, did. With the dri patch(es) swrast now requires xmlpool, thus changing that to dri seems like the only sane option.

In other words, without this patch and with HW drivers disabled, swrast will fail to build.

Drop the !dri check in SConscript.dri, as it's called only when the variable is set.

Signed-off-by: Emil Velikov
---
 src/gallium/targets/SConscript.dri | 3 ---
 src/mesa/drivers/SConscript        | 2 +-
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/src/gallium/targets/SConscript.dri b/src/gallium/targets/SConscript.dri
index c5aab7c..eb1354a 100644
--- a/src/gallium/targets/SConscript.dri
+++ b/src/gallium/targets/SConscript.dri
@@ -3,9 +3,6 @@
 
 Import('*')
 
-if not env['dri']:
-    Return()
-
 drienv = env.Clone()
 
 drienv.Replace(CPPPATH = [
diff --git a/src/mesa/drivers/SConscript b/src/mesa/drivers/SConscript
index 6dcc506..355e680 100644
--- a/src/mesa/drivers/SConscript
+++ b/src/mesa/drivers/SConscript
@@ -5,7 +5,7 @@ SConscript('osmesa/SConscript')
 if env['x11']:
     SConscript('x11/SConscript')
 
-if env['drm']:
+if env['dri']:
     SConscript('dri/common/xmlpool/SConscript')
 
 if env['platform'] == 'windows':
-- 
1.8.4
Re: [Mesa-dev] [PATCH] i965/blorp: Use passed in framebuffer rather than ctx->DrawBuffer
On 09/30/2013 12:48 AM, Chris Forbes wrote:
> We have the destination framebuffer object passed in; there's no need to
> go digging around in the context.
>
> Signed-off-by: Chris Forbes
> ---
>  src/mesa/drivers/dri/i965/brw_blorp_clear.cpp | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)

Reviewed-by: Chad Versace
Re: [Mesa-dev] [PATCH] Use -Bsymbolic when linking libEGL.so
On 09/28/2013 12:38 AM, Eric Anholt wrote:
> Carl Worth writes:
>
>> For some reason that I don't yet fully understand, Glaze does not work
>> with libEGL unless libEGL is linked with -Bsymbolic.[*]
>>
>> Beyond that specific reason, all of the reasons for which libGL.so is
>> linked with -Bsymbolic, (see the commit history), should also apply
>> here.
>>
>> [*] The specific behavior I am seeing is that when Glaze calls dlopen
>> for libEGL.so, ifunc resolvers within Glaze for EGL functions are
>> called before the dlopen returns. These resolvers cannot succeed, as
>> they need the return value from dlopen in order to find the functions
>> to resolve to. I don't know what's causing these resolvers to be
>> called, but I have verified that linking libEGL with -Bsymbolic causes
>> this problematic behavior to stop.
>
> Could you print which thing is trying to get resolved early? I see a
> few egl* calls within main/egl*.c (eglQueryAPI(), eglGetDisplay(),
> eglWaitClient()), and I'm wondering if not having Bsymbolic on them is
> causing an RTLD_NOW (perhaps by the glaze-find-libgl.c?) to try to call
> the ifuncs early.
>
> We should certainly be using Bsymbolic, and I'd like to see this go to
> stable.
>
> Reviewed-by: Eric Anholt

Ditto. Please put in stable.

Reviewed-by: Chad Versace

Having given my r-b though... we need to alert the Wayland folks to this change. Recent libEGL symbol changes have had unforeseen consequences on the Wayland stack.

ALERT! Kristian and Joe, we will start building libEGL with -Bsymbolic. Does this break anything for you?
[Mesa-dev] [Bug 69874] Automake throws a lot of "[...] option 'subdir-objects' is disabled"
https://bugs.freedesktop.org/show_bug.cgi?id=69874

--- Comment #5 from Emil Velikov ---
(In reply to comment #3)
> Yeah. And subdir-objects is broken for the way we have our build set up at
> the moment (libdricore built from a separate directory, on the same .c files
> as core). Once we megadrivers, we can revisit subdir-objects, though I seem
> to recall there being a problem with dependencies when moving files, as well.
>
Yes libdricore is "fun" :)

With libdricore aside I've managed to clear out the build systems a bit, and silence most of the automake warnings. Some of it you can check out in the subdir-objects (often rebased) branch at https://github.com/evelikov/Mesa

I'm planning to revisit it once the megadriver work has landed.

P.S. Here is a list of the odd bits left
src/gallium/drivers/r300/
src/glsl/
src/mapi/
src/mapi/
src/mapi/*api/
src/mesa/
Re: [Mesa-dev] [PATCH V4 02/13] glsl: add texture gather changes
On 09/30/2013 03:08 AM, Chris Forbes wrote: > From: Maxence Le Dore > > V2 [Chris Forbes]: >- Add new pattern, fixup parameter reading. > > V3: Rebase onto new builtins machinery > > Reviewed-by: Kenneth Graunke > --- > src/glsl/builtin_functions.cpp | 35 +++ > src/glsl/glcpp/glcpp-parse.y| 3 +++ > src/glsl/glsl_parser_extras.cpp | 1 + > src/glsl/glsl_parser_extras.h | 2 ++ > src/glsl/ir.cpp | 2 +- > src/glsl/ir.h | 4 +++- > src/glsl/ir_clone.cpp | 1 + > src/glsl/ir_hv_accept.cpp | 1 + > src/glsl/ir_print_visitor.cpp | 3 ++- > src/glsl/ir_reader.cpp | 6 +- > src/glsl/ir_rvalue_visitor.cpp | 1 + > src/glsl/opt_tree_grafting.cpp | 1 + > src/glsl/standalone_scaffolding.cpp | 1 + > src/mesa/program/ir_to_mesa.cpp | 5 + > 14 files changed, 62 insertions(+), 4 deletions(-) > > diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp > index 72054e0..df735ef 100644 > --- a/src/glsl/builtin_functions.cpp > +++ b/src/glsl/builtin_functions.cpp > @@ -262,6 +262,13 @@ texture_query_lod(const _mesa_glsl_parse_state *state) >state->ARB_texture_query_lod_enable; > } > > +static bool > +texture_gather(const _mesa_glsl_parse_state *state) > +{ > + return state->is_version(400, 0) || > + state->ARB_texture_gather_enable; > +} > + This should be glsl_parser_state::has_texture_gather, like in Ken's f91475d... though it looks like some of the rest of this file could be modified to use that pattern. Hrm... 
> /* Desktop GL or OES_standard_derivatives + fragment shader only */ > static bool > fs_oes_derivatives(const _mesa_glsl_parse_state *state) > @@ -1816,6 +1823,34 @@ builtin_builder::create_builtins() > _texture(ir_txd, shader_texture_lod_and_rect, > glsl_type::vec4_type, glsl_type::sampler2DRectShadow_type, > glsl_type::vec4_type, TEX_PROJECT), > NULL); > > + add_function("textureGather", > +_texture(ir_tg4, texture_gather, glsl_type::vec4_type, > glsl_type::sampler2D_type, glsl_type::vec2_type), > +_texture(ir_tg4, texture_gather, glsl_type::ivec4_type, > glsl_type::isampler2D_type, glsl_type::vec2_type), > +_texture(ir_tg4, texture_gather, glsl_type::uvec4_type, > glsl_type::usampler2D_type, glsl_type::vec2_type), > + > +_texture(ir_tg4, texture_gather, glsl_type::vec4_type, > glsl_type::sampler2DArray_type, glsl_type::vec3_type), > +_texture(ir_tg4, texture_gather, glsl_type::ivec4_type, > glsl_type::isampler2DArray_type, glsl_type::vec3_type), > +_texture(ir_tg4, texture_gather, glsl_type::uvec4_type, > glsl_type::usampler2DArray_type, glsl_type::vec3_type), > + > +_texture(ir_tg4, texture_gather, glsl_type::vec4_type, > glsl_type::samplerCube_type, glsl_type::vec3_type), > +_texture(ir_tg4, texture_gather, glsl_type::ivec4_type, > glsl_type::isamplerCube_type, glsl_type::vec3_type), > +_texture(ir_tg4, texture_gather, glsl_type::uvec4_type, > glsl_type::usamplerCube_type, glsl_type::vec3_type), > + > +_texture(ir_tg4, texture_gather, glsl_type::vec4_type, > glsl_type::samplerCubeArray_type, glsl_type::vec4_type), > +_texture(ir_tg4, texture_gather, glsl_type::ivec4_type, > glsl_type::isamplerCubeArray_type, glsl_type::vec4_type), > +_texture(ir_tg4, texture_gather, glsl_type::uvec4_type, > glsl_type::usamplerCubeArray_type, glsl_type::vec4_type), > +NULL); > + > + add_function("textureGatherOffset", > +_texture(ir_tg4, texture_gather, glsl_type::vec4_type, > glsl_type::sampler2D_type, glsl_type::vec2_type, TEX_OFFSET), > +_texture(ir_tg4, texture_gather, 
glsl_type::ivec4_type, > glsl_type::isampler2D_type, glsl_type::vec2_type, TEX_OFFSET), > +_texture(ir_tg4, texture_gather, glsl_type::uvec4_type, > glsl_type::usampler2D_type, glsl_type::vec2_type, TEX_OFFSET), > + > +_texture(ir_tg4, texture_gather, glsl_type::vec4_type, > glsl_type::sampler2DArray_type, glsl_type::vec3_type, TEX_OFFSET), > +_texture(ir_tg4, texture_gather, glsl_type::ivec4_type, > glsl_type::isampler2DArray_type, glsl_type::vec3_type, TEX_OFFSET), > +_texture(ir_tg4, texture_gather, glsl_type::uvec4_type, > glsl_type::usampler2DArray_type, glsl_type::vec3_type, TEX_OFFSET), > +NULL); > + > F(dFdx) > F(dFdy) > F(fwidth) > diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y > index 6eaa5f9..c7ad3e9 100644 > --- a/src/glsl/glcpp/glcpp-parse.y > +++ b/src/glsl/glcpp/glcpp-parse.y > @@ -1248,6 +1248,9 @@ glcpp_parser_create (const struct gl_extensions > *extensions, int api) > > if (ex
Re: [Mesa-dev] [PATCH 21/24] i965/gen7: Handle atomic instructions from the FS back-end.
Paul Berry writes: > On 15 September 2013 00:19, Francisco Jerez wrote: > >>[...] >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >> index 762832a..412d27a 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >> @@ -2112,8 +2112,91 @@ fs_visitor::visit(ir_end_primitive *) >> } >> >> void >> +fs_visitor::emit_untyped_atomic(unsigned atomic_op, unsigned surf_index, >> +unsigned offset, fs_reg dst, fs_reg src0, >> +fs_reg src1) >> +{ >> + const unsigned operand_len = dispatch_width / 8; >> + unsigned mlen = 0; >> + >> + /* Initialize the sample mask in the message header. */ >> + emit(MOV(brw_uvec_mrf(8, mlen, 0), brw_imm_ud(0))) >> + ->force_writemask_all = true; >> + emit(MOV(brw_uvec_mrf(1, mlen, 7), >> +retype(brw_vec1_grf(1, 7), BRW_REGISTER_TYPE_UD))) >> + ->force_writemask_all = true; >> > > For fragment shaders that don't use discard (fp->UsesKill == false) this > will do the right thing. > > For fragment shaders that do use discard, we store the current pixel mask > in the f0.1 register, so you'll need to do something like this: > > emit(MOV(brw_uvec_mrf(1, mlen, 7), brw_flag_reg(0, 1))->force_writemask_all > = true; > > Otherwise pixels that have been discarded will erroneously get the atomic > operation applied to them. > Ugh. Thanks for pointing this out, it should be fixed now. :) > Note: if all the pixels within a 2x2 subspan get discarded, we disable that > subspan in the execution mask. So in order to test this effectively, > you'll probably need to write a piglit test that discards only some of the > pixels within a subspan, and not others. > > >> + mlen++; >> + >> + /* Set the atomic operation offset. */ >> + emit(MOV(brw_uvec_mrf(dispatch_width, mlen, 0), brw_imm_ud(offset))); >> + mlen += operand_len; >> + >> + /* Set the atomic operation arguments. 
*/ >> + if (src0.file != BAD_FILE) { >> + emit(MOV(brw_uvec_mrf(dispatch_width, mlen, 0), src0)); >> + mlen += operand_len; >> + } >> + >> + if (src1.file != BAD_FILE) { >> + emit(MOV(brw_uvec_mrf(dispatch_width, mlen, 0), src1)); >> + mlen += operand_len; >> + } >> > > src0 is address and src1 is write data, right? It would be nice to have > that in a comment so that readers don't have to cross reference to the > bspec. > Nope, both are arguments as the comment says, the address is set by the MRF write right before those. > >> + >> + /* Emit the instruction. */ >> + fs_inst inst(SHADER_OPCODE_UNTYPED_ATOMIC, dst, >> +fs_reg(atomic_op), fs_reg(surf_index)); >> + inst.base_mrf = 0; >> + inst.mlen = mlen; >> + emit(inst); >> +} >> + >> +void >> +fs_visitor::emit_untyped_surface_read(unsigned surf_index, unsigned >> offset, >> + fs_reg dst) >> +{ >> + const unsigned operand_len = dispatch_width / 8; >> + unsigned mlen = 0; >> + >> + /* Initialize the sample mask in the message header. */ >> + emit(MOV(brw_uvec_mrf(8, mlen, 0), brw_imm_ud(0))) >> + ->force_writemask_all = true; >> + emit(MOV(brw_uvec_mrf(1, mlen, 7), >> +retype(brw_vec1_grf(1, 7), BRW_REGISTER_TYPE_UD))) >> + ->force_writemask_all = true; >> > > Same comment about discard applies here, although it's less critical > because this is a read operation (I suspect the only effect will be a tiny > performance penalty). > Fixed, thanks. > >> + mlen++; >> + >> + /* Set the surface read offset. */ >> + emit(MOV(brw_uvec_mrf(dispatch_width, mlen, 0), brw_imm_ud(offset))); >> + mlen += operand_len; >> + >> + /* Emit the instruction. 
*/ >> + fs_inst inst(SHADER_OPCODE_UNTYPED_SURFACE_READ, dst, >> fs_reg(surf_index)); >> + inst.base_mrf = 0; >> + inst.mlen = mlen; >> + emit(inst); >> +} >> + >> +void >> fs_visitor::visit(ir_atomic *ir) >> { >> + ir_variable *loc = ir->location->variable_referenced(); >> + unsigned surf_index = SURF_INDEX_WM_ABO(loc->atomic.buffer_index); >> + >> + result = fs_reg(this, ir->type); >> + >> + switch (ir->op) { >> + case ir_atomic_read: >> + emit_untyped_surface_read(surf_index, loc->atomic.offset, result); >> + break; >> + case ir_atomic_inc: >> + emit_untyped_atomic(BRW_AOP_INC, surf_index, loc->atomic.offset, >> + result, fs_reg(), fs_reg()); >> + break; >> + case ir_atomic_dec: >> + emit_untyped_atomic(BRW_AOP_PREDEC, surf_index, loc->atomic.offset, >> + result, fs_reg(), fs_reg()); >> > > These calls to fs_reg() don't look right to me. Don't we need to pass the > address and the increment/decrement amount to emit_untyped_atomic()? > The address *is* passed through the 'loc->atomic.offset' argument. Both registers are empty because the increment/decrement ops take no arguments, the increment a
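For readers following the mlen bookkeeping in the quoted emit_untyped_atomic() and emit_untyped_surface_read() code, the arithmetic can be sketched on its own: one register for the message header, dispatch_width / 8 registers for the offset, and the same again for each source operand that is present. The helper name untyped_message_mlen below is hypothetical and is not part of the patch; it only mirrors the increments visible in the quoted diff.

```cpp
#include <cassert>

// Hypothetical helper (not in the patch): message length, in registers,
// accumulated the same way as in the quoted emit_untyped_* functions.
unsigned untyped_message_mlen(unsigned dispatch_width, unsigned num_args)
{
    // One register holds 8 channels, so a SIMD16 operand needs two.
    const unsigned operand_len = dispatch_width / 8;

    unsigned mlen = 1;               // message header (sample mask)
    mlen += operand_len;             // atomic operation / surface read offset
    mlen += num_args * operand_len;  // src0 and src1, when they are present
    return mlen;
}
```

Under these assumptions an increment or pre-decrement, which passes two empty registers, costs 2 registers in SIMD8 and 3 in SIMD16, while an operation that used both sources in SIMD16 would need 7.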
Re: [Mesa-dev] [PATCH 18/24] i965: Add a 'has_side_effects' back-end instruction predicate.
Paul Berry writes: > On 15 September 2013 00:10, Francisco Jerez wrote: > >> Analogous to the GLSL IR predicate with the same name. This patch >> fixes the three dead code elimination passes and the VEC4/FS >> instruction scheduling passes so they leave instructions with side >> effects alone. >> >> At some point it might be interesting to have the instruction >> scheduler calculate the exact memory dependencies between atomic ops, >> but they're rare enough that it seems unlikely that it will make any >> practical difference. >> > > Does ARB_shader_atomic_counters guarantee that order is properly preserved > between atomic read operations? In other words, if I do this in a shader: > > atomic_uint counter1; > atomic_uint counter2; > > void main() { > ... > uint a = atomicCounter(counter1); > uint b = atomicCounter(counter2); > } > > can I be guaranteed that the read from counter2 will happen after the read > from counter1? I can't tell from reading the spec but I'm inclined to > think we should assume this is guaranteed, just to be on the safe side. > > If we make this assumption, then I believe the has_side_effects() predicate > is not enough to guarantee the proper ordering. We would need the > scheduling code to use a stronger predicate, requires_exact_ordering(), > which returns true for both SHADER_OPCODE_UNTYPED_ATOMIC and > SHADER_OPCODE_UNTYPED_SURFACE_READ, to ensure that atomic counter reads > don't get reordered with respect to each other. > The ARB_shader_atomic_counters extension is very unspecific in that regard. AFAICT the implementation is allowed to do whatever it wants as long as the uniqueness guarantee is preserved. The ARB_shader_image_load_store is much more specific and it doesn't require the implementation to preserve any particular ordering between read operations, so I think it would make sense to have the same behavior for both extensions. > >[...] 
>> @@ -1943,31 +1943,26 @@ fs_visitor::dead_code_eliminate_local() >> get_dead_code_hash_entry(ht, inst->dst.reg, >> inst->dst.reg_offset); >> >> -if (inst->is_partial_write()) { >> - /* For a partial write, we can't remove any previous dead >> code >> -* candidate, since we're just modifying their result, but >> we can >> -* be dead code eliminiated ourselves. >> -*/ >> - if (entry) { >> - entry->data = inst; >> +if (entry) { >> + if (inst->is_partial_write()) { >> + /* For a partial write, we can't remove any previous >> dead code >> + * candidate, since we're just modifying their result. >> + */ >> > > I'm not terribly familiar with this code, so this may be a stupid question, > but: > > Previous to this patch, if entry was non-NULL and inst->is_partial_write(), > we would set entry->data = inst. With your rewrite, that doesn't happen > anymore. That seems like a problem. > >[...] > > Previously, we would only remove the entry from the hashtable if entry was > non-NULL and !inst->is_partial_write(). Now we remove it whenever entry is > non-NULL, regardless of whether inst->is_partial_write(). This also seems > like a problem. > >>[...] > > Preveiously, we wouldn't insert the instruction in the dead code hash if > entry was non-NULL and inst->is_partial_write(). We no longer do that > check--was that an intentional change? > All these changes were intentional. The old code did the following: - For partial writes with a matching hash table entry for the destination register, the existing entry was replaced with the current instruction. - For partial writes with no matching hash table entry a new entry was created for the current instruction. - For full writes with a matching hash table entry, the previous instruction was dead code-eliminated, and the hash table entry was replaced with the current instruction. - For full writes with no matching hash table entry a new entry was created for the current instruction. 
The four conditions are preserved with the new code, with the difference that we skip the insertion step for instructions with side effects because they can never be dead code-eliminated. > >> -} >> } >>} >> } >> diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp >> b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp >> index 5530683..a688336 100644 >> --- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp >> @@ -562,7 +562,8 @@ fs_instruction_scheduler::calculate_deps() >>schedule_node *n = (schedule_node *)node; >>fs_inst *inst = (fs_inst *)n->inst; >> >> - if (inst->opcode == FS_OPCODE_PLACEHOLDER_HALT) >> + if (inst->opcode == FS_OPCODE_PLACEHOLDER_HALT || >> + inst->has_side_ef
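The four cases described in this reply can be modelled in a few lines. This is a toy sketch only, not the i965 code: the Inst struct, the integer register numbers, and the use of std::unordered_map in place of the driver's hash table are all assumptions made for illustration. It also assumes that reading a register drops its dead-code candidate, which the reply does not spell out.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Toy instruction: all fields are illustrative, not the real fs_inst.
struct Inst {
    int dst;                 // destination register number
    std::vector<int> srcs;   // registers read by this instruction
    bool partial_write;      // writes only part of dst
    bool has_side_effects;   // e.g. an atomic operation
    bool removed;            // set when dead code-eliminated
};

// Returns how many instructions were eliminated in this basic block.
int dead_code_eliminate_local(std::vector<Inst> &insts)
{
    // Maps a register to the last unread write, i.e. the removal candidate.
    std::unordered_map<int, Inst *> candidates;
    int eliminated = 0;

    for (Inst &inst : insts) {
        // Assumption: a register that gets read is no longer a candidate.
        for (int src : inst.srcs)
            candidates.erase(src);

        auto entry = candidates.find(inst.dst);
        if (entry != candidates.end()) {
            // A full write makes the previous unread write dead; a partial
            // write only modifies its result, so the previous one stays.
            if (!inst.partial_write) {
                entry->second->removed = true;
                ++eliminated;
            }
            candidates.erase(entry);
        }

        // Instructions with side effects never become candidates, because
        // they can never be dead code-eliminated.
        if (!inst.has_side_effects)
            candidates[inst.dst] = &inst;
    }
    return eliminated;
}
```

With this model, an atomic operation whose result is overwritten without being read is still kept, while an ordinary write in the same position is removed.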
Re: [Mesa-dev] [PATCH V4 02/13] glsl: add texture gather changes
On 09/30/2013 11:03 AM, Ian Romanick wrote: > On 09/30/2013 03:08 AM, Chris Forbes wrote: >> From: Maxence Le Dore >> >> V2 [Chris Forbes]: >>- Add new pattern, fixup parameter reading. >> >> V3: Rebase onto new builtins machinery >> >> Reviewed-by: Kenneth Graunke >> --- >> src/glsl/builtin_functions.cpp | 35 +++ >> src/glsl/glcpp/glcpp-parse.y| 3 +++ >> src/glsl/glsl_parser_extras.cpp | 1 + >> src/glsl/glsl_parser_extras.h | 2 ++ >> src/glsl/ir.cpp | 2 +- >> src/glsl/ir.h | 4 +++- >> src/glsl/ir_clone.cpp | 1 + >> src/glsl/ir_hv_accept.cpp | 1 + >> src/glsl/ir_print_visitor.cpp | 3 ++- >> src/glsl/ir_reader.cpp | 6 +- >> src/glsl/ir_rvalue_visitor.cpp | 1 + >> src/glsl/opt_tree_grafting.cpp | 1 + >> src/glsl/standalone_scaffolding.cpp | 1 + >> src/mesa/program/ir_to_mesa.cpp | 5 + >> 14 files changed, 62 insertions(+), 4 deletions(-) >> >> diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp >> index 72054e0..df735ef 100644 >> --- a/src/glsl/builtin_functions.cpp >> +++ b/src/glsl/builtin_functions.cpp >> @@ -262,6 +262,13 @@ texture_query_lod(const _mesa_glsl_parse_state *state) >>state->ARB_texture_query_lod_enable; >> } >> >> +static bool >> +texture_gather(const _mesa_glsl_parse_state *state) >> +{ >> + return state->is_version(400, 0) || >> + state->ARB_texture_gather_enable; >> +} >> + > > This should be glsl_parser_state::has_texture_gather, like in Ken's > f91475d... though it looks like some of the rest of this file could be > modified to use that pattern. Hrm... Ian, These are accessed via function pointers (in ir_function_signature): /** * A function that returns whether a built-in function is available in the * current shading language (based on version, ES or desktop, and extensions). */ typedef bool (*builtin_available_predicate)(const _mesa_glsl_parse_state *); I don't believe that you can mix pointers to ordinary functions and pointers to class methods. 
We could move /all/ of these predicates to be members of _mesa_glsl_parse_state, and change the typedef, but...I don't think we get to mix both styles. I would prefer to leave Chris's patch as is, and resolve this as a follow-up series.

--Ken
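To illustrate the point about not mixing the two styles: in C++ a pointer to a free function and a pointer to a member function are distinct, incompatible types, so a member such as has_texture_gather() cannot be stored in a builtin_available_predicate directly. The sketch below uses a stand-in ParseState struct rather than the real _mesa_glsl_parse_state, and the thin free-function wrapper is only one possible way to bridge the two, not something proposed in this thread.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for _mesa_glsl_parse_state; the fields are illustrative only.
struct ParseState {
    unsigned version;
    bool ARB_texture_gather_enable;

    // Member-style predicate, along the lines Ian suggests.
    bool has_texture_gather() const
    {
        return version >= 400 || ARB_texture_gather_enable;
    }
};

// Same shape as the typedef quoted above: a pointer to a free function.
typedef bool (*builtin_available_predicate)(const ParseState *);

// A pointer to a const member function has a different type altogether.
typedef bool (ParseState::*member_predicate)() const;

static_assert(!std::is_same<builtin_available_predicate,
                            member_predicate>::value,
              "free-function and member-function pointers are distinct types");

// builtin_available_predicate p = &ParseState::has_texture_gather;
// would not compile, so a free function has to forward to the member.
static bool texture_gather(const ParseState *state)
{
    return state->has_texture_gather();
}
```

The wrapper keeps the existing typedef usable while the logic lives in the member; changing the typedef itself would mean converting every predicate at once, which is the follow-up Ken describes.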
[Mesa-dev] [PATCH 1/2] [v2] i965: Extract region use from hiz depth buffer
Starting with Ivybridge, the hierarchical depth buffer had relaxed requirements for its allocation. Following a "simple" formula in the bspec was all you needed to satisfy the requirement. To prepare the code for this, extract all places where the miptree was used, when we really only needed the region. This allows an upcoming patch to simply allocate the region, and not the whole miptree. v2: Don't use intel_region. Instead use bo + stride. We actually do store the stride in libdrm, but it is inaccessible in the current libdrm version. CC: Chad Versace Signed-off-by: Ben Widawsky --- src/mesa/drivers/dri/i965/brw_misc_state.c| 11 +--- src/mesa/drivers/dri/i965/gen6_blorp.cpp | 20 +-- src/mesa/drivers/dri/i965/gen7_blorp.cpp | 6 ++--- src/mesa/drivers/dri/i965/gen7_misc_state.c | 5 ++-- src/mesa/drivers/dri/i965/intel_fbo.c | 4 +-- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 36 +++ src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 6 - 7 files changed, 52 insertions(+), 36 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c b/src/mesa/drivers/dri/i965/brw_misc_state.c index 7f4cd6f..23ffeab 100644 --- a/src/mesa/drivers/dri/i965/brw_misc_state.c +++ b/src/mesa/drivers/dri/i965/brw_misc_state.c @@ -210,8 +210,12 @@ brw_get_depthstencil_tile_masks(struct intel_mipmap_tree *depth_mt, &tile_mask_x, &tile_mask_y, false); if (intel_miptree_slice_has_hiz(depth_mt, depth_level, depth_layer)) { +uint32_t tmp; uint32_t hiz_tile_mask_x, hiz_tile_mask_y; - intel_region_get_tile_masks(depth_mt->hiz_mt->region, +struct intel_region region = { .cpp = depth_mt->cpp }; + +drm_intel_bo_get_tiling(depth_mt->hiz_buffer.bo, &region.tiling, &tmp); + intel_region_get_tile_masks(&region, &hiz_tile_mask_x, &hiz_tile_mask_y, false); /* Each HiZ row represents 2 rows of pixels */ @@ -667,11 +671,10 @@ brw_emit_depth_stencil_hiz(struct brw_context *brw, /* Emit hiz buffer. 
*/ if (hiz) { - struct intel_mipmap_tree *hiz_mt = depth_mt->hiz_mt; BEGIN_BATCH(3); OUT_BATCH((_3DSTATE_HIER_DEPTH_BUFFER << 16) | (3 - 2)); -OUT_BATCH(hiz_mt->region->pitch - 1); -OUT_RELOC(hiz_mt->region->bo, +OUT_BATCH(depth_mt->hiz_buffer.stride - 1); +OUT_RELOC(depth_mt->hiz_buffer.bo, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, brw->depthstencil.hiz_offset); ADVANCE_BATCH(); diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.cpp b/src/mesa/drivers/dri/i965/gen6_blorp.cpp index da523e5..fc3a331 100644 --- a/src/mesa/drivers/dri/i965/gen6_blorp.cpp +++ b/src/mesa/drivers/dri/i965/gen6_blorp.cpp @@ -887,16 +887,22 @@ gen6_blorp_emit_depth_stencil_config(struct brw_context *brw, /* 3DSTATE_HIER_DEPTH_BUFFER */ { - struct intel_region *hiz_region = params->depth.mt->hiz_mt->region; - uint32_t hiz_offset = - intel_region_get_aligned_offset(hiz_region, - draw_x & ~tile_mask_x, - (draw_y & ~tile_mask_y) / 2, false); + uint32_t hiz_offset, tmp; + struct intel_mipmap_tree *depth_mt = params->depth.mt; + struct intel_region hiz_region; + + hiz_region.cpp = depth_mt->cpp; + hiz_region.pitch = depth_mt->hiz_buffer.stride; + drm_intel_bo_get_tiling(depth_mt->hiz_buffer.bo, &hiz_region.tiling, &tmp); + + hiz_offset = intel_region_get_aligned_offset(&hiz_region, + draw_x & ~tile_mask_x, + (draw_y & ~tile_mask_y) / 2, false); BEGIN_BATCH(3); OUT_BATCH((_3DSTATE_HIER_DEPTH_BUFFER << 16) | (3 - 2)); - OUT_BATCH(hiz_region->pitch - 1); - OUT_RELOC(hiz_region->bo, + OUT_BATCH(hiz_region.pitch - 1); + OUT_RELOC(depth_mt->hiz_buffer.bo, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, hiz_offset); ADVANCE_BATCH(); diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.cpp b/src/mesa/drivers/dri/i965/gen7_blorp.cpp index 9df3d92..379e8ee 100644 --- a/src/mesa/drivers/dri/i965/gen7_blorp.cpp +++ b/src/mesa/drivers/dri/i965/gen7_blorp.cpp @@ -737,13 +737,13 @@ gen7_blorp_emit_depth_stencil_config(struct brw_context *brw, /* 3DSTATE_HIER_DEPTH_BUFFER */ { - struct intel_region 
*hiz_region = params->depth.mt->hiz_mt->region; + struct intel_mipmap_tree *depth_mt = params->depth.mt; BEGIN_BATCH(3); OUT_BATCH((GEN7_3DSTATE_HIER_DEPTH_BUFFER << 16) | (3 - 2)); OUT_BATCH((mocs << 25) | -(hiz_region->pitch - 1)); - OUT_RELOC(hiz_region->bo, +(depth_mt->hiz_buffer.stride - 1)); + OUT_RELOC(depth_mt->hiz_buffer.bo, I915_GEM_DOMAIN_R
[Mesa-dev] [PATCH 2/2] [v3] i965: Use IVB specific formula for depthbuffer
After the last patch, we can replace the region allocated in the miptree creation with a more straightforward (and hopefully smaller) buffer based on the bspec's allocation formula.

Since I am relatively new to this part of the bspec, I would very much appreciate scrutiny during review of this. There were some ambiguities to me which are likely obvious to others.

To prove the reduced [GPU] memory usage I created a simple script which polls the memory usage of the process through debugfs every 0.1 seconds. The following results show the memory usage difference over 5 runs of xonotic-glx with ultra settings. The data suggests a 10MB savings on average. I've not measured the savings on the CPU side, but I imagine some amount of savings would be present there as well.

x master/mem_usage.txt
+ mine/mem_usage.txt
    N           Min           Max        Median           Avg        Stddev
x 17121      98959360 7.3394995e+08 7.2782234e+08 7.2209615e+08      43633222
+ 17166 1.2538266e+08 7.2241562e+08   7.16288e+08 7.1071472e+08      42964578

Below is the FPS data over those same 5 tests. I'm not sure if the decrease is statistically significant to y'all. I don't have any theories about it.

x master/xonotic.fps
+ mine/xonotic.fps
    N           Min           Max        Median           Avg        Stddev
x   5     27.430746     27.524985      27.50568     27.487017   0.039439874
+   5     27.409173     27.461715     27.441207     27.440883   0.021086805

NOTE: There were a couple of places in the arithmetic where I could have taken some shortcuts. In order to make the code match the spec as much as possible, I've decided not to do this.

One shortcut I did make was the tiling type. Digging through the code, it looks like you always want Y-tiled, except when it won't fit, in which case you want X-tiled. I wasn't a fan of the existing helper function since it has a few parameters that are irrelevant for this operation. I suspect people reviewing this might ask me to change this, which is fine; I just wanted to explain the motivation.

v2: copy-paste fix where I used I915_TILING_Y where I meant _X.
(Topi) v3: Updated to directly use the bo/stride instead of intel_region. (Ken, Chad) Fix the reference count leak on the hiz buffer (Chad) Don't allow fallback to old mt allocation. It should never happen. (Ben) Break out hz_depth/width calculation to separete functions. (Ben) Use cpp = 1, since the calculation takes cpp into account (Ben) x head/xonotic + mine/xonotic N Min MaxMedian Avg Stddev x 5 25.683336 25.898164 25.872499 25.842426 0.089829019 + 5 25.841368 25.934931 25.869051 25.877494 0.039885576 x head/memusage + mine/memusage N Min MaxMedian Avg Stddev x 18036 89432064 8.6380954e+08 7.9515648e+08 7.930405e+08 42774265 + 18030 86548480 8.6262989e+08 7.8178714e+08 7.7978462e+08 42099587 CC: Chad Versace Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67564 Signed-off-by: Ben Widawsky --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 194 +++--- src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 2 +- 2 files changed, 176 insertions(+), 20 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index e1da9de..9fc4e97 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -793,8 +793,12 @@ intel_miptree_release(struct intel_mipmap_tree **mt) intel_region_release(&((*mt)->region)); intel_miptree_release(&(*mt)->stencil_mt); - intel_miptree_release(&(*mt)->hiz_buffer.mt); - (*mt)->hiz_buffer.bo = NULL; + if (&(*mt)->hiz_buffer.mt) + intel_miptree_release(&(*mt)->hiz_buffer.mt); + else { + drm_intel_bo_unreference((*mt)->hiz_buffer.bo); +(*mt)->hiz_buffer.bo = NULL; + } intel_miptree_release(&(*mt)->mcs_mt); intel_miptree_release(&(*mt)->singlesample_mt); intel_resolve_map_clear(&(*mt)->hiz_map); @@ -1271,30 +1275,182 @@ intel_miptree_slice_enable_hiz(struct brw_context *brw, return true; } +#define level(x, l) ((x) >> (l) > 0 ? 
(x) >> (l) : 1) +static unsigned int caclulate_z_height(const struct intel_mipmap_tree *mt, + const int level) +{ + unsigned int height = level(mt->logical_height0, level); + + /* [DevIVB+]: If the surface is multisampled and it is a depth or stencil +* surface or Multisampled Surface StorageFormat in SURFACE_STATE is +* MSFMT_DEPTH_STENCIL, WL and HL must be adjusted as follows before +* proceeding: */ + switch (mt->num_samples) { + case 4: + height = CEILING(height, 2) * 4; + break; + case 8: + height = CEILING(height, 2) * 4 * mt->level[level].depth;; + break; + case 16: + height = CEILING(height, 2) * 8; +
[Mesa-dev] [PATCH 01/14] glx: Move the driver extension-loading to a helper function.
I'm planning on doing driver extension parsing from 3 places, and making
the extension loading step a bit longer.
---
 src/glx/dri2_glx.c   |  6 ++----
 src/glx/dri_common.c | 14 ++++++++++++++
 src/glx/dri_common.h |  2 ++
 3 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index 07138fb..123c87c 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -1183,11 +1183,9 @@ dri2CreateScreen(int screen, struct glx_display * priv)
       goto handle_error;
    }
 
-   extensions = dlsym(psc->driver, __DRI_DRIVER_EXTENSIONS);
-   if (extensions == NULL) {
-      ErrorMessageF("driver exports no extensions (%s)\n", dlerror());
+   extensions = driGetDriverExtensions(psc->driver);
+   if (extensions == NULL)
       goto handle_error;
-   }
 
    for (i = 0; extensions[i]; i++) {
       if (strcmp(extensions[i]->name, __DRI_CORE) == 0)
diff --git a/src/glx/dri_common.c b/src/glx/dri_common.c
index 5f199e9..f1d1164 100644
--- a/src/glx/dri_common.c
+++ b/src/glx/dri_common.c
@@ -187,6 +187,20 @@ driOpenDriver(const char *driverName)
    return handle;
 }
 
+_X_HIDDEN const __DRIextension **
+driGetDriverExtensions(void *handle)
+{
+   const __DRIextension **extensions = NULL;
+
+   extensions = dlsym(handle, __DRI_DRIVER_EXTENSIONS);
+   if (extensions == NULL) {
+      ErrorMessageF("driver exports no extensions (%s)\n", dlerror());
+      return NULL;
+   }
+
+   return extensions;
+}
+
 static GLboolean
 __driGetMSCRate(__DRIdrawable *draw,
                 int32_t * numerator, int32_t * denominator,
diff --git a/src/glx/dri_common.h b/src/glx/dri_common.h
index 2bbffa9..2ebcb81 100644
--- a/src/glx/dri_common.h
+++ b/src/glx/dri_common.h
@@ -69,6 +69,8 @@ extern void CriticalErrorMessageF(const char *f, ...);
 
 extern void *driOpenDriver(const char *driverName);
 
+extern const __DRIextension **driGetDriverExtensions(void *handle);
+
 extern bool
 dri2_convert_glx_attribs(unsigned num_attribs, const uint32_t *attribs,
                          unsigned *major_ver, unsigned *minor_ver,
--
1.8.4.rc3
[Mesa-dev] megadrivers series
Here are the megadrivers changes, after the prep series I posted earlier.
A few tiny updates to the prep series are available in my tree as
"megadriver-prep" and this series is available as "megadrivers-5".

FPS improvement on GLB2.7 with INTEL_NO_HW=1:
2.61061% +/- 1.16957% (n=50)

One question I have is whether the hardlinks are going to cause problems
for packaging. I noticed that when I went and stripped the binaries trying
to do a space comparison, I of course got brand new inodes each taking up
their own set of disk space. I do really like how hardlinks end up for
installing on my test systems -- a single binary I can move around however
I need.

Video from the talk I gave at XDC:
http://www.youtube.com/watch?v=0fJq-2haT3Y

I think Emil has been looking at doing the gallium side of things, so I
haven't pushed forward with that.

Note that the megadriver build does require an updated loader. I've done
EGL and GLX, but the xserver still needs updating. If I get some acks on
the ABI I chose, I'll go do that.
[Mesa-dev] [PATCH 02/14] dri: Allow config options to be passed to the loader through extensions.
Turns out we already have this nice mechanism for providing optional
things from the driver to the loader, and I was going to have to rename
the public global symbol to avoid conflicts when doing megadrivers.

While the former __driConfigOptions is technically loader interface, this
is the only loader that made use of that symbol. Continue paying attention
to it if we can't find the new option, to retain compatibility with old
drivers.
---
 include/GL/internal/dri_interface.h | 20 ++++++++++++++------
 src/glx/dri_glx.c                   | 17 ++++++++++++++---
 2 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/include/GL/internal/dri_interface.h b/include/GL/internal/dri_interface.h
index 709fece..5c53d6e 100644
--- a/include/GL/internal/dri_interface.h
+++ b/include/GL/internal/dri_interface.h
@@ -330,12 +330,6 @@ struct __DRI2throttleExtensionRec {
                     enum __DRI2throttleReason reason);
 };
 
-/**
- * XML document describing the configuration options supported by the
- * driver.
- */
-extern const char __driConfigOptions[];
-
 /*@}*/
 
 /**
@@ -1224,4 +1218,18 @@ struct __DRIrobustnessExtensionRec {
    __DRIextension base;
 };
 
+/**
+ * DRI config options extension.
+ *
+ * This extension provides the XML string containing driver options for use by
+ * the loader in supporting the driconf application.
+ */
+#define __DRI_CONFIG_OPTIONS "DRI_ConfigOptions"
+#define __DRI_CONFIG_OPTIONS_VERSION 1
+
+typedef struct __DRIconfigOptionsExtensionRec {
+   __DRIextension base;
+   const char *xml;
+} __DRIconfigOptionsExtension;
+
 #endif
diff --git a/src/glx/dri_glx.c b/src/glx/dri_glx.c
index faed9d0..a1475b0 100644
--- a/src/glx/dri_glx.c
+++ b/src/glx/dri_glx.c
@@ -184,10 +184,21 @@ _X_EXPORT const char *
 glXGetDriverConfig(const char *driverName)
 {
    void *handle = driOpenDriver(driverName);
-   if (handle)
-      return dlsym(handle, "__driConfigOptions");
-   else
+   const __DRIextension **extensions;
+
+   if (!handle)
       return NULL;
+
+   extensions = driGetDriverExtensions(handle);
+   if (extensions) {
+      for (int i = 0; extensions[i]; i++) {
+         if (strcmp(extensions[i]->name, __DRI_CONFIG_OPTIONS) == 0)
+            return ((__DRIconfigOptionsExtension *)extensions[i])->xml;
+      }
+   }
+
+   /* Fall back to the old method */
+   return dlsym(handle, "__driConfigOptions");
 }
 
 #ifdef XDAMAGE_1_1_INTERFACE
--
1.8.4.rc3
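For readers skimming the archive, the lookup-with-fallback that this patch adds to glXGetDriverConfig() can be tried in isolation. The following is only a minimal sketch: every example_* name is a hypothetical stand-in rather than a type from dri_interface.h, the legacy string is passed in instead of being fetched with dlsym(), and the minimum-version check is an addition of the sketch that the patch itself does not perform.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the extension types in dri_interface.h. */
typedef struct {
   const char *name;
   int version;
} example_extension;

typedef struct {
   example_extension base;   /* first member, so the cast below is valid */
   const char *xml;
} example_config_options_extension;

#define EXAMPLE_CONFIG_OPTIONS "DRI_ConfigOptions"

/* Walk a NULL-terminated extension list and return the entry whose name
 * matches and whose version is at least min_version, or NULL. */
static const example_extension *
example_find_extension(const example_extension **extensions,
                       const char *name, int min_version)
{
   if (extensions == NULL)
      return NULL;

   for (size_t i = 0; extensions[i] != NULL; i++) {
      if (strcmp(extensions[i]->name, name) == 0 &&
          extensions[i]->version >= min_version)
         return extensions[i];
   }
   return NULL;
}

/* Return the XML carried by the config-options extension, or legacy_xml
 * (standing in for an old driver's global string) when it is absent. */
static const char *
example_get_config_xml(const example_extension **extensions,
                       const char *legacy_xml)
{
   const example_extension *ext =
      example_find_extension(extensions, EXAMPLE_CONFIG_OPTIONS, 1);

   if (ext != NULL)
      return ((const example_config_options_extension *)ext)->xml;
   return legacy_xml;
}
```

Keeping the fallback in one helper means each caller gets the same old-driver behaviour without repeating the loop.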
[Mesa-dev] [PATCH 04/14] glx: Add an optional function call for getting the DRI driver interface.
The previous interface relied on a static struct, which meant that the
driver didn't get a chance to edit the struct before the struct got used.
For megadrivers, I want to return a variable struct based on what driver
is getting loaded.
---
 include/GL/internal/dri_interface.h | 13 +++++++++++++
 src/glx/dri2_glx.c                  |  2 +-
 src/glx/dri_common.c                | 18 +++++++++++++++++-
 src/glx/dri_common.h                |  3 ++-
 src/glx/dri_glx.c                   |  2 +-
 src/glx/drisw_glx.c                 |  6 ++----
 6 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/include/GL/internal/dri_interface.h b/include/GL/internal/dri_interface.h
index 5c53d6e..93b6c0b 100644
--- a/include/GL/internal/dri_interface.h
+++ b/include/GL/internal/dri_interface.h
@@ -488,6 +488,19 @@ struct __DRIuseInvalidateExtensionRec {
 #define __DRI_DRIVER_EXTENSIONS "__driDriverExtensions"
 
 /**
+ * This symbol replaces the __DRI_DRIVER_EXTENSIONS symbol, and will be
+ * suffixed by "_drivername", allowing multiple drivers to be built into one
+ * library, and also giving the driver the chance to return a variable driver
+ * extensions struct depending on the driver name being loaded or any other
+ * system state.
+ *
+ * The function prototype is:
+ *
+ * const __DRIextension **__driDriverGetExtensions(const char *name);
+ */
+#define __DRI_DRIVER_GET_EXTENSIONS "__driDriverGetExtensions"
+
+/**
  * Tokens for __DRIconfig attribs.  A number of attributes defined by
  * GLX or EGL standards are not in the table, as they must be provided
  * by the loader.  For example, FBConfig ID or visual ID, drawable type.
diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index 123c87c..7e22906 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -1183,7 +1183,7 @@ dri2CreateScreen(int screen, struct glx_display * priv)
       goto handle_error;
    }
 
-   extensions = driGetDriverExtensions(psc->driver);
+   extensions = driGetDriverExtensions(psc->driver, driverName);
    if (extensions == NULL)
       goto handle_error;
 
diff --git a/src/glx/dri_common.c b/src/glx/dri_common.c
index f1d1164..16f820f 100644
--- a/src/glx/dri_common.c
+++ b/src/glx/dri_common.c
@@ -188,9 +188,25 @@ driOpenDriver(const char *driverName)
 }
 
 _X_HIDDEN const __DRIextension **
-driGetDriverExtensions(void *handle)
+driGetDriverExtensions(void *handle, const char *driver_name)
 {
    const __DRIextension **extensions = NULL;
+   const __DRIextension **(*get_extensions)(void);
+   char *get_extensions_name;
+
+   asprintf(&get_extensions_name, "%s_%s",
+            __DRI_DRIVER_GET_EXTENSIONS, driver_name);
+   if (get_extensions_name) {
+      get_extensions = dlsym(handle, get_extensions_name);
+      if (get_extensions) {
+         free(get_extensions_name);
+         return get_extensions();
+      } else {
+         InfoMessageF("driver does not expose %s(): %s\n",
+                      get_extensions_name, dlerror());
+         free(get_extensions_name);
+      }
+   }
 
    extensions = dlsym(handle, __DRI_DRIVER_EXTENSIONS);
    if (extensions == NULL) {
diff --git a/src/glx/dri_common.h b/src/glx/dri_common.h
index 2ebcb81..4fe0d3f 100644
--- a/src/glx/dri_common.h
+++ b/src/glx/dri_common.h
@@ -69,7 +69,8 @@ extern void CriticalErrorMessageF(const char *f, ...);
 
 extern void *driOpenDriver(const char *driverName);
 
-extern const __DRIextension **driGetDriverExtensions(void *handle);
+extern const __DRIextension **
+driGetDriverExtensions(void *handle, const char *driver_name);
 
 extern bool
 dri2_convert_glx_attribs(unsigned num_attribs, const uint32_t *attribs,
diff --git a/src/glx/dri_glx.c b/src/glx/dri_glx.c
index a1475b0..0b89e3e 100644
--- a/src/glx/dri_glx.c
+++ b/src/glx/dri_glx.c
@@ -189,7 +189,7 @@ glXGetDriverConfig(const char *driverName)
    if (!handle)
       return NULL;
 
-   extensions = driGetDriverExtensions(handle);
+   extensions = driGetDriverExtensions(handle, driverName);
    if (extensions) {
       for (int i = 0; extensions[i]; i++) {
          if (strcmp(extensions[i]->name, __DRI_CONFIG_OPTIONS) == 0)
diff --git a/src/glx/drisw_glx.c b/src/glx/drisw_glx.c
index 393be20..a7d0843 100644
--- a/src/glx/drisw_glx.c
+++ b/src/glx/drisw_glx.c
@@ -664,11 +664,9 @@ driswCreateScreen(int screen, struct glx_display *priv)
    if (psc->driver == NULL)
       goto handle_error;
 
-   extensions = dlsym(psc->driver, __DRI_DRIVER_EXTENSIONS);
-   if (extensions == NULL) {
-      ErrorMessageF("driver exports no extensions (%s)\n", dlerror());
+   extensions = driGetDriverExtensions(psc->driver, SWRAST_DRIVER_NAME);
+   if (extensions == NULL)
       goto handle_error;
-   }
 
    for (i = 0; extensions[i]; i++) {
       if (strcmp(extensions[i]->name, __DRI_CORE) == 0)
--
1.8.4.rc3
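The "try the suffixed getter, then fall back to the static symbol" order used by driGetDriverExtensions() in this patch can be sketched without a real driver. This is an illustration under stated assumptions, not the loader code: all example_* names are hypothetical, the resolver is passed in as a callback (dlsym() has the same shape) so the logic can run without dlopen(), and the name is built with snprintf() so that a formatting or allocation failure yields NULL. The last point is a deliberate difference from the patch, which tests the pointer written by asprintf() rather than its return value.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define EXAMPLE_GET_EXTENSIONS "__driDriverGetExtensions"
#define EXAMPLE_EXTENSIONS     "__driDriverExtensions"

/* Shape of a symbol resolver; dlsym() could be passed here. */
typedef void *(*example_lookup_fn)(void *handle, const char *symbol);

/* Build "<prefix>_<driver_name>" in freshly allocated memory.
 * Returns NULL on failure; the caller frees the result. */
static char *
example_symbol_name(const char *prefix, const char *driver_name)
{
   int len = snprintf(NULL, 0, "%s_%s", prefix, driver_name);
   char *name;

   if (len < 0)
      return NULL;

   name = malloc((size_t)len + 1);
   if (name != NULL)
      snprintf(name, (size_t)len + 1, "%s_%s", prefix, driver_name);
   return name;
}

/* Try the per-driver getter first, then the static symbol.  *is_getter
 * tells the caller whether the result must be called or used directly. */
static void *
example_resolve_extensions(example_lookup_fn lookup, void *handle,
                           const char *driver_name, int *is_getter)
{
   char *name = example_symbol_name(EXAMPLE_GET_EXTENSIONS, driver_name);
   void *sym = NULL;

   *is_getter = 0;
   if (name != NULL) {
      sym = lookup(handle, name);
      free(name);
      if (sym != NULL) {
         *is_getter = 1;
         return sym;
      }
   }
   return lookup(handle, EXAMPLE_EXTENSIONS);
}

/* Toy resolver for experimenting: handle is a NULL-terminated array of
 * "exported" symbol names, and a hit returns the matching name. */
static void *
example_fake_lookup(void *handle, const char *symbol)
{
   const char **names = handle;

   for (size_t i = 0; names[i] != NULL; i++) {
      if (strcmp(names[i], symbol) == 0)
         return (void *)names[i];
   }
   return NULL;
}
```

With the toy resolver, a handle that exports only "__driDriverExtensions" takes the fallback path, which is the old-driver case the patch keeps working.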
[Mesa-dev] [PATCH 03/14] dri: Move driver config options to dri driver extensions.
This way they aren't all sitting in the global namespace (with the same name per driver). --- src/gallium/state_trackers/dri/common/dri_screen.c | 10 +++--- src/gallium/state_trackers/dri/common/dri_screen.h | 2 ++ src/gallium/state_trackers/dri/drm/dri2.c | 1 + src/gallium/state_trackers/dri/sw/drisw.c | 1 + src/mesa/drivers/dri/i915/intel_screen.c | 13 + src/mesa/drivers/dri/i965/intel_screen.c | 12 src/mesa/drivers/dri/radeon/radeon_screen.c| 19 --- 7 files changed, 40 insertions(+), 18 deletions(-) diff --git a/src/gallium/state_trackers/dri/common/dri_screen.c b/src/gallium/state_trackers/dri/common/dri_screen.c index 92cac73..7410dbe 100644 --- a/src/gallium/state_trackers/dri/common/dri_screen.c +++ b/src/gallium/state_trackers/dri/common/dri_screen.c @@ -47,7 +47,10 @@ #undef false -PUBLIC const char __driConfigOptions[] = +const __DRIconfigOptionsExtension gallium_config_options = { + .base = { __DRI_CONFIG_OPTIONS, 1 }, + .xml = + DRI_CONF_BEGIN DRI_CONF_SECTION_QUALITY DRI_CONF_FORCE_S3TC_ENABLE("false") @@ -70,7 +73,8 @@ PUBLIC const char __driConfigOptions[] = DRI_CONF_SECTION_MISCELLANEOUS DRI_CONF_ALWAYS_HAVE_DEPTH_BUFFER("false") DRI_CONF_SECTION_END - DRI_CONF_END; + DRI_CONF_END +}; #define false 0 @@ -415,7 +419,7 @@ dri_init_screen_helper(struct dri_screen *screen, else screen->target = PIPE_TEXTURE_RECT; - driParseOptionInfo(&screen->optionCacheDefaults, __driConfigOptions); + driParseOptionInfo(&screen->optionCacheDefaults, gallium_config_options.xml); driParseConfigFiles(&screen->optionCache, &screen->optionCacheDefaults, diff --git a/src/gallium/state_trackers/dri/common/dri_screen.h b/src/gallium/state_trackers/dri/common/dri_screen.h index 18ede86..f263a90 100644 --- a/src/gallium/state_trackers/dri/common/dri_screen.h +++ b/src/gallium/state_trackers/dri/common/dri_screen.h @@ -133,6 +133,8 @@ dri_destroy_screen_helper(struct dri_screen * screen); void dri_destroy_screen(__DRIscreen * sPriv); +extern const __DRIconfigOptionsExtension 
gallium_config_options; + #endif /* vim: set sw=3 ts=8 sts=3 expandtab: */ diff --git a/src/gallium/state_trackers/dri/drm/dri2.c b/src/gallium/state_trackers/dri/drm/dri2.c index 5647968..868cd25 100644 --- a/src/gallium/state_trackers/dri/drm/dri2.c +++ b/src/gallium/state_trackers/dri/drm/dri2.c @@ -950,6 +950,7 @@ const struct __DriverAPIRec driDriverAPI = { PUBLIC const __DRIextension *__driDriverExtensions[] = { &driCoreExtension.base, &driDRI2Extension.base, +&gallium_config_options.base, NULL }; diff --git a/src/gallium/state_trackers/dri/sw/drisw.c b/src/gallium/state_trackers/dri/sw/drisw.c index 121a205..9f00a53 100644 --- a/src/gallium/state_trackers/dri/sw/drisw.c +++ b/src/gallium/state_trackers/dri/sw/drisw.c @@ -365,6 +365,7 @@ const struct __DriverAPIRec driDriverAPI = { PUBLIC const __DRIextension *__driDriverExtensions[] = { &driCoreExtension.base, &driSWRastExtension.base, +&gallium_config_options.base, NULL }; diff --git a/src/mesa/drivers/dri/i915/intel_screen.c b/src/mesa/drivers/dri/i915/intel_screen.c index 4f8c342..49bae5d 100644 --- a/src/mesa/drivers/dri/i915/intel_screen.c +++ b/src/mesa/drivers/dri/i915/intel_screen.c @@ -40,8 +40,11 @@ #include "utils.h" #include "xmlpool.h" -PUBLIC const char __driConfigOptions[] = - DRI_CONF_BEGIN +static const __DRIconfigOptionsExtension i915_config_options = { + .base = { __DRI_CONFIG_OPTIONS, 1 }, + .xml = + +DRI_CONF_BEGIN DRI_CONF_SECTION_PERFORMANCE DRI_CONF_VBLANK_MODE(DRI_CONF_VBLANK_ALWAYS_SYNC) /* Options correspond to DRI_CONF_BO_REUSE_DISABLED, @@ -75,7 +78,8 @@ PUBLIC const char __driConfigOptions[] = DRI_CONF_DESC(en, "Perform code generation at shader link time.") DRI_CONF_OPT_END DRI_CONF_SECTION_END -DRI_CONF_END; +DRI_CONF_END +}; #include "intel_batchbuffer.h" #include "intel_buffers.h" @@ -1109,7 +1113,7 @@ __DRIconfig **intelInitScreen2(__DRIscreen *psp) return false; } /* parse information in __driConfigOptions */ - driParseOptionInfo(&intelScreen->optionCache, 
__driConfigOptions); + driParseOptionInfo(&intelScreen->optionCache, i915_config_options.xml); intelScreen->driScrnPriv = psp; psp->driverPrivate = (void *) intelScreen; @@ -1203,5 +1207,6 @@ const struct __DriverAPIRec driDriverAPI = { PUBLIC const __DRIextension *__driDriverExtensions[] = { &driCoreExtension.base, &driDRI2Extension.base, +&i915_config_options.base, NULL }; diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index df9edb7..7019008 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -40,8 +40,10 @@ #include "utils.h" #includ
[Mesa-dev] [PATCH 05/14] egl: Add an optional function call for getting the DRI driver interface.
---
 src/egl/drivers/dri2/egl_dri2.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 04ab564..7c07fd6 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -377,8 +377,10 @@ static const __DRIextension **
 dri2_open_driver(_EGLDisplay *disp)
 {
    struct dri2_egl_display *dri2_dpy = disp->DriverData;
-   const __DRIextension **extensions;
+   const __DRIextension **extensions = NULL;
    char path[PATH_MAX], *search_paths, *p, *next, *end;
+   char *get_extensions_name;
+   const __DRIextension **(*get_extensions)(void);
 
    search_paths = NULL;
    if (geteuid() == getuid()) {
@@ -419,7 +421,22 @@ dri2_open_driver(_EGLDisplay *disp)
    }
 
    _eglLog(_EGL_DEBUG, "DRI2: dlopen(%s)", path);
-   extensions = dlsym(dri2_dpy->driver, __DRI_DRIVER_EXTENSIONS);
+
+   asprintf(&get_extensions_name, "%s_%s",
+            __DRI_DRIVER_GET_EXTENSIONS, dri2_dpy->driver_name);
+   if (get_extensions_name) {
+      get_extensions = dlsym(dri2_dpy->driver, get_extensions_name);
+      if (get_extensions) {
+         extensions = get_extensions();
+      } else {
+         _eglLog(_EGL_DEBUG, "driver does not expose %s(): %s\n",
+                 get_extensions_name, dlerror());
+      }
+      free(get_extensions_name);
+   }
+
+   if (!extensions)
+      extensions = dlsym(dri2_dpy->driver, __DRI_DRIVER_EXTENSIONS);
    if (extensions == NULL) {
       _eglLog(_EGL_WARNING,
               "DRI2: driver exports no extensions (%s)", dlerror());
--
1.8.4.rc3
[Mesa-dev] [PATCH 13/14] swrast: Build the driver into the shared mesa_dri_drivers.so.
--- configure.ac | 26 +++--- src/mesa/drivers/dri/Makefile.am | 2 ++ src/mesa/drivers/dri/swrast/Makefile.am | 18 +++--- src/mesa/drivers/dri/swrast/Makefile.sources | 6 -- src/mesa/drivers/dri/swrast/swrast.c | 17 ++--- 5 files changed, 30 insertions(+), 39 deletions(-) diff --git a/configure.ac b/configure.ac index 5649aec..81abdf9 100644 --- a/configure.ac +++ b/configure.ac @@ -1016,10 +1016,14 @@ if test "x$enable_dri" = xyes; then LIBS="$save_LIBS" # If we are building any DRI driver other than swrast. -if test -n "$DRI_DIRS" -a x"$DRI_DIRS" != xswrast; then -# ... libdrm is required -if test "x$have_libdrm" != xyes; then -AC_MSG_ERROR([DRI drivers requires libdrm >= $LIBDRM_REQUIRED]) +if test -n "$DRI_DIRS"; then +if test -a x"$DRI_DIRS" != xswrast; then +# ... libdrm is required +if test "x$have_libdrm" != xyes; then +AC_MSG_ERROR([DRI drivers requires libdrm >= $LIBDRM_REQUIRED]) +fi +else +CFLAGS="$CFLAGS -DSWRAST_NO_DRM" fi # ... and build dricommon HAVE_COMMON_DRI=yes @@ -1033,14 +1037,6 @@ if test "x$enable_dri" = xyes; then fi enable_dricore=no -enable_megadriver=no -for driver in $DRI_DIRS; do -if test $driver = "swrast"; then -enable_dricore=yes -else -enable_megadriver=yes -fi -done # megadriver wants to use libmesa.la, while non-megadrivers want to # automatically get libdricore. 
Some day hopefully we'll transition @@ -1049,10 +1045,10 @@ MEGADRIVER_DRI_LIB_DEPS=$DRI_LIB_DEPS DRI_LIB_DEPS="\$(top_builddir)/src/mesa/libdricore/libdricore${VERSION}.la $DRI_LIB_DEPS" AM_CONDITIONAL(NEED_LIBDRICORE, test "x$enable_dricore" = xyes) -AM_CONDITIONAL(NEED_MEGADRIVER, test "x$enable_megadriver" = xyes) +AM_CONDITIONAL(NEED_MEGADRIVER, test -n "$DRI_DIRS") AM_CONDITIONAL(NEED_LIBMESA, test "x$enable_xlib_glx" = xyes -o \ "x$enable_osmesa" = xyes -o \ - "x$enable_megadriver" = xyes) + -n "$DRI_DIRS") AC_SUBST([EXPAT_INCLUDES]) AC_SUBST([DRI_LIB_DEPS]) AC_SUBST([DRI_DRIVER_LDFLAGS]) @@ -1969,7 +1965,7 @@ AC_SUBST([ELF_LIB]) AM_CONDITIONAL(NEED_LIBPROGRAM, test "x$with_gallium_drivers" != x -o \ "x$enable_xlib_glx" = xyes -o \ "x$enable_osmesa" = xyes -o \ - "x$enable_megadriver" = xyes -o \ + -n "$DRI_DIRS" -o \ "x$enable_gallium_osmesa" = xyes) AM_CONDITIONAL(HAVE_X11_DRIVER, test "x$enable_xlib_glx" = xyes) AM_CONDITIONAL(HAVE_OSMESA, test "x$enable_osmesa" = xyes) diff --git a/src/mesa/drivers/dri/Makefile.am b/src/mesa/drivers/dri/Makefile.am index cc1eb2b..23ff9f7 100644 --- a/src/mesa/drivers/dri/Makefile.am +++ b/src/mesa/drivers/dri/Makefile.am @@ -47,6 +47,8 @@ endif if HAVE_SWRAST_DRI SUBDIRS+=swrast +MEGADRIVERS_DEPS += swrast/libswrast_dri.la +MEGADRIVERS += swrast_dri.so endif pkgconfigdir = $(libdir)/pkgconfig diff --git a/src/mesa/drivers/dri/swrast/Makefile.am b/src/mesa/drivers/dri/swrast/Makefile.am index 2034705..6373b76 100644 --- a/src/mesa/drivers/dri/swrast/Makefile.am +++ b/src/mesa/drivers/dri/swrast/Makefile.am @@ -30,26 +30,14 @@ AM_CFLAGS = \ -I$(top_srcdir)/src/mapi \ -I$(top_srcdir)/src/mesa/ \ -I$(top_srcdir)/src/mesa/drivers/dri/common \ - -DSWRAST_NO_DRM \ $(DEFINES) \ $(VISIBILITY_CFLAGS) dridir = $(DRI_DRIVER_INSTALL_DIR) if HAVE_SWRAST_DRI -dri_LTLIBRARIES = swrast_dri.la +noinst_LTLIBRARIES = libswrast_dri.la endif -swrast_dri_la_SOURCES = \ - $(SWRAST_C_FILES) - -swrast_dri_la_LDFLAGS = $(DRI_DRIVER_LDFLAGS) - 
-swrast_dri_la_LIBADD = \ - $(DRI_LIB_DEPS) - -# Provide compatibility with scripts for the old Mesa build system for -# a while by putting a link to the driver into /lib of the build tree. -all-local: swrast_dri.la - $(MKDIR_P) $(top_builddir)/$(LIB_DIR); - ln -f .libs/swrast_dri.so $(top_builddir)/$(LIB_DIR)/swrast_dri.so; +libswrast_dri_la_SOURCES = $(SWRAST_C_FILES) +libswrast_dri_la_LIBADD = $(DRI_LIB_DEPS) diff --git a/src/mesa/drivers/dri/swrast/Makefile.sources b/src/mesa/drivers/dri/swrast/Makefile.sources index fc7ef32..70e432f 100644 --- a/src/mesa/drivers/dri/swrast/Makefile.sources +++ b/src/mesa/drivers/dri/swrast/Makefile.sources @@ -1,11 +1,5 @@ SWRAST_DRIVER_FILES = \ swrast.c -SWRAST_COMMON_FILES = \ - ../common/utils.c \ - ../common/dri_util.c \ - ../common/xmlconfig.c - SWRAST_C_FILES = \ - $(SWRAST_COMMON_FILES) \ $(SWRAST_DRIVER_FILES) diff --git a/src/mesa/drivers/dri/swrast/swrast.c b/src/mesa/drivers/dri/s
[Mesa-dev] [PATCH 14/14] mesa: Remove dricore from the build.
No driver uses it any more, and it's been replaced by megadrivers. --- configure.ac | 11 - src/mesa/Makefile.am | 6 +-- src/mesa/drivers/dri/Makefile.am | 2 +- src/mesa/drivers/dri/i965/Makefile.am | 2 +- src/mesa/libdricore/Makefile.am | 85 --- src/mesa/program/Makefile.am | 10 + src/mesa/x86/read_rgba_span_x86.S | 8 7 files changed, 4 insertions(+), 120 deletions(-) delete mode 100644 src/mesa/libdricore/Makefile.am diff --git a/configure.ac b/configure.ac index 81abdf9..ddc4d24 100644 --- a/configure.ac +++ b/configure.ac @@ -1036,15 +1036,6 @@ if test "x$enable_dri" = xyes; then DRI_DRIVER_LDFLAGS="-module -avoid-version -shared -Wl,-Bsymbolic" fi -enable_dricore=no - -# megadriver wants to use libmesa.la, while non-megadrivers want to -# automatically get libdricore. Some day hopefully we'll transition -# everything to megadriver. -MEGADRIVER_DRI_LIB_DEPS=$DRI_LIB_DEPS -DRI_LIB_DEPS="\$(top_builddir)/src/mesa/libdricore/libdricore${VERSION}.la $DRI_LIB_DEPS" - -AM_CONDITIONAL(NEED_LIBDRICORE, test "x$enable_dricore" = xyes) AM_CONDITIONAL(NEED_MEGADRIVER, test -n "$DRI_DIRS") AM_CONDITIONAL(NEED_LIBMESA, test "x$enable_xlib_glx" = xyes -o \ "x$enable_osmesa" = xyes -o \ @@ -1052,7 +1043,6 @@ AM_CONDITIONAL(NEED_LIBMESA, test "x$enable_xlib_glx" = xyes -o \ AC_SUBST([EXPAT_INCLUDES]) AC_SUBST([DRI_LIB_DEPS]) AC_SUBST([DRI_DRIVER_LDFLAGS]) -AC_SUBST([MEGADRIVER_DRI_LIB_DEPS]) AC_SUBST([GALLIUM_DRI_LIB_DEPS]) case $DRI_DIRS in @@ -2128,7 +2118,6 @@ AC_CONFIG_FILES([Makefile src/mesa/drivers/osmesa/Makefile src/mesa/drivers/osmesa/osmesa.pc src/mesa/drivers/x11/Makefile - src/mesa/libdricore/Makefile src/mesa/main/tests/Makefile src/mesa/main/tests/hash_table/Makefile src/mesa/program/Makefile diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am index e9c16e7..f86caee 100644 --- a/src/mesa/Makefile.am +++ b/src/mesa/Makefile.am @@ -19,11 +19,7 @@ # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS # IN THE SOFTWARE. 
-if NEED_LIBDRICORE -DRICORE_SUBDIR = libdricore -endif - -SUBDIRS = program x86 x86-64 . $(DRICORE_SUBDIR) main/tests +SUBDIRS = program x86 x86-64 . main/tests if HAVE_X11_DRIVER SUBDIRS += drivers/x11 diff --git a/src/mesa/drivers/dri/Makefile.am b/src/mesa/drivers/dri/Makefile.am index 23ff9f7..26f49ec 100644 --- a/src/mesa/drivers/dri/Makefile.am +++ b/src/mesa/drivers/dri/Makefile.am @@ -68,7 +68,7 @@ mesa_dri_drivers_la_LIBADD = \ common/libmegadriver_stub.la \ common/libdricommon.la \ $(MEGADRIVERS_DEPS) \ -$(MEGADRIVER_DRI_LIB_DEPS) \ +$(DRI_LIB_DEPS) \ $() if NEED_MEGADRIVER diff --git a/src/mesa/drivers/dri/i965/Makefile.am b/src/mesa/drivers/dri/i965/Makefile.am index 084b3d1..a54b1cc 100644 --- a/src/mesa/drivers/dri/i965/Makefile.am +++ b/src/mesa/drivers/dri/i965/Makefile.am @@ -48,7 +48,7 @@ TEST_LIBS = \ libi965_dri.la \ ../common/libdricommon.la \ ../common/libmegadriver_stub.la \ - $(MEGADRIVER_DRI_LIB_DEPS) \ + $(DRI_LIB_DEPS) \ ../../../libmesa.la \ -lrt \ ../common/libdri_test_stubs.la diff --git a/src/mesa/libdricore/Makefile.am b/src/mesa/libdricore/Makefile.am deleted file mode 100644 index 686e478..000 --- a/src/mesa/libdricore/Makefile.am +++ /dev/null @@ -1,85 +0,0 @@ -# Copyright © 2012 Intel Corporation -# -# Permission is hereby granted, free of charge, to any person obtaining a -# copy of this software and associated documentation files (the "Software"), -# to deal in the Software without restriction, including without limitation -# the rights to use, copy, modify, merge, publish, distribute, sublicense, -# and/or sell copies of the Software, and to permit persons to whom the -# Software is furnished to do so, subject to the following conditions: -# -# The above copyright notice and this permission notice (including the next -# paragraph) shall be included in all copies or substantial portions of the -# Software. 
-# -# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL -# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING -# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS -# IN THE SOFTWARE. - -SRCDIR=$(top_srcdir)/src/mesa/ -BUILDDIR=$(top_builddir)/src/mesa/ -include ../Makefile.sources -include ../../glsl/Makefile.sources - -noinst_PROGRAMS = - -AM_CPPFLAGS = \ - $(INCLUDE_DIRS) \ - $(DEFINES) \ - -DU
[Mesa-dev] [PATCH 08/14] i965: Build the driver into a shared mesa_dri_drivers.so .
Previously, we've split things such that mesa core is in libdricore, exposing the whole Mesa core interface in the global namespace, and the i965_dri.so code all links against that. Along with polluting application namespace terribly, it requires extra PLT indirections and prevents LTO. Instead, we can build all of the driver contents into the same .so with just a few symbols exposed to be referenced from the actual driver .so file, allowing LTO and reducing our exposed symbol count massively. --- configure.ac | 29 +++--- src/mesa/drivers/dri/Makefile.am | 54 ++- src/mesa/drivers/dri/common/Makefile.am | 3 ++ src/mesa/drivers/dri/common/dri_util.c| 10 +++-- src/mesa/drivers/dri/common/megadriver_stub.c | 41 src/mesa/drivers/dri/i965/Makefile.am | 27 +++--- src/mesa/drivers/dri/i965/intel_screen.c | 16 ++-- src/mesa/drivers/dri/i965/intel_screen.h | 2 + 8 files changed, 147 insertions(+), 35 deletions(-) create mode 100644 src/mesa/drivers/dri/common/megadriver_stub.c diff --git a/configure.ac b/configure.ac index 1f0a646..bc111f3 100644 --- a/configure.ac +++ b/configure.ac @@ -705,8 +705,6 @@ fi AM_CONDITIONAL(HAVE_DRI_GLX, test "x$enable_glx" = xyes -a \ "x$enable_dri" = xyes) AM_CONDITIONAL(HAVE_DRI, test "x$enable_dri" = xyes) -AM_CONDITIONAL(NEED_LIBMESA, test "x$enable_xlib_glx" = xyes -o \ - "x$enable_osmesa" = xyes) AC_ARG_ENABLE([shared-glapi], [AS_HELP_STRING([--enable-shared-glapi], @@ -858,8 +856,6 @@ AC_SUBST([GLESv1_CM_PC_LIB_PRIV]) AC_SUBST([GLESv2_LIB_DEPS]) AC_SUBST([GLESv2_PC_LIB_PRIV]) -DRI_LIB_DEPS="\$(top_builddir)/src/mesa/libdricore/libdricore${VERSION}.la" - AC_SUBST([HAVE_XF86VIDMODE]) dnl @@ -1035,10 +1031,32 @@ if test "x$enable_dri" = xyes; then DRI_DRIVER_LDFLAGS="-module -avoid-version -shared -Wl,-Bsymbolic" fi -AM_CONDITIONAL(NEED_LIBDRICORE, test -n "$DRI_DIRS") + +enable_dricore=no +enable_megadriver=no +for driver in $DRI_DIRS; do +if test $driver != "i965"; then +enable_dricore=yes +else +enable_megadriver=yes +fi +done + +# 
megadriver wants to use libmesa.la, while non-megadrivers want to +# automatically get libdricore. Some day hopefully we'll transition +# everything to megadriver. +MEGADRIVER_DRI_LIB_DEPS=$DRI_LIB_DEPS +DRI_LIB_DEPS="\$(top_builddir)/src/mesa/libdricore/libdricore${VERSION}.la $DRI_LIB_DEPS" + +AM_CONDITIONAL(NEED_LIBDRICORE, test "x$enable_dricore" = xyes) +AM_CONDITIONAL(NEED_MEGADRIVER, test "x$enable_megadriver" = xyes) +AM_CONDITIONAL(NEED_LIBMESA, test "x$enable_xlib_glx" = xyes -o \ + "x$enable_osmesa" = xyes -o \ + "x$enable_megadriver" = xyes) AC_SUBST([EXPAT_INCLUDES]) AC_SUBST([DRI_LIB_DEPS]) AC_SUBST([DRI_DRIVER_LDFLAGS]) +AC_SUBST([MEGADRIVER_DRI_LIB_DEPS]) AC_SUBST([GALLIUM_DRI_LIB_DEPS]) case $DRI_DIRS in @@ -1951,6 +1969,7 @@ AC_SUBST([ELF_LIB]) AM_CONDITIONAL(NEED_LIBPROGRAM, test "x$with_gallium_drivers" != x -o \ "x$enable_xlib_glx" = xyes -o \ "x$enable_osmesa" = xyes -o \ + "x$enable_megadriver" = xyes -o \ "x$enable_gallium_osmesa" = xyes) AM_CONDITIONAL(HAVE_X11_DRIVER, test "x$enable_xlib_glx" = xyes) AM_CONDITIONAL(HAVE_OSMESA, test "x$enable_osmesa" = xyes) diff --git a/src/mesa/drivers/dri/Makefile.am b/src/mesa/drivers/dri/Makefile.am index 48d3685..9d15c43 100644 --- a/src/mesa/drivers/dri/Makefile.am +++ b/src/mesa/drivers/dri/Makefile.am @@ -1,4 +1,15 @@ +dridir = $(DRI_DRIVER_INSTALL_DIR) + +AM_CPPFLAGS = \ + -I$(top_srcdir)/src/mesa/ \ + -I$(top_srcdir)/src/mapi/ \ +-I$(top_srcdir)/src/mesa/drivers/dri/common \ +$(LIBDRM_CFLAGS) \ +$() + SUBDIRS = +MEGADRIVERS = +MEGADRIVERS_DEPS = if HAVE_COMMON_DRI SUBDIRS+=common @@ -9,7 +20,9 @@ SUBDIRS+=i915 endif if HAVE_I965_DRI -SUBDIRS+=i965 +SUBDIRS += i965 +MEGADRIVERS_DEPS += i965/libi965_dri.la +MEGADRIVERS += i965_dri.so endif if HAVE_NOUVEAU_DRI @@ -33,3 +46,42 @@ pkgconfig_DATA = dri.pc driincludedir = $(includedir)/GL/internal driinclude_HEADERS = $(top_srcdir)/include/GL/internal/dri_interface.h + +nodist_EXTRA_mesa_dri_drivers_la_SOURCES = dummy.cpp +mesa_dri_drivers_la_SOURCES = 
+mesa_dri_drivers_la_LDFLAGS = \ +-module -avoid-version -shared \ +-Wl,-Bsymbolic \ +$() +mesa_dri_drivers_la_LIBADD = \ +../../libmesa.la \ +common/libmegadriver_stub.la \ +common/libdricommon.la \ +$(MEGADRIVERS_DEPS) \ +$(MEGADRIVER_DRI_LIB_DEPS) \ +$() + +if NEED_MEGADRIVER +dri_LTLIBRARIES = mesa_dri_drivers.la + +# Add a link to allow setting LD_LIBRARY_PATH/LIB
[Mesa-dev] [PATCH 06/14] dri: Pass in the dlsym()ed driver extension to screen creation.
This will allow a megadrivers build to reference the actual driver being loaded from the shared dri_util screen creation code. --- include/GL/internal/dri_interface.h| 27 +++-- src/egl/drivers/dri2/egl_dri2.c| 27 + src/egl/drivers/dri2/egl_dri2.h| 1 + src/gbm/backends/dri/gbm_dri.c | 15 src/gbm/backends/dri/gbm_driint.h | 1 + src/glx/dri2_glx.c | 23 +++--- src/glx/drisw_glx.c| 13 +++--- src/mesa/drivers/dri/common/dri_util.c | 44 +- 8 files changed, 117 insertions(+), 34 deletions(-) diff --git a/include/GL/internal/dri_interface.h b/include/GL/internal/dri_interface.h index 93b6c0b..e07f669 100644 --- a/include/GL/internal/dri_interface.h +++ b/include/GL/internal/dri_interface.h @@ -713,7 +713,7 @@ struct __DRIlegacyExtensionRec { * conjunction with the core extension. */ #define __DRI_SWRAST "DRI_SWRast" -#define __DRI_SWRAST_VERSION 3 +#define __DRI_SWRAST_VERSION 4 struct __DRIswrastExtensionRec { __DRIextension base; @@ -749,6 +749,18 @@ struct __DRIswrastExtensionRec { const uint32_t *attribs, unsigned *error, void *loaderPrivate); + + /** +* createNewScreen() with the driver extensions passed in. +* +* \since version 4 +*/ + __DRIscreen *(*createNewScreen2)(int screen, +const __DRIextension **loader_extensions, +const __DRIextension **driver_extensions, +const __DRIconfig ***driver_configs, +void *loaderPrivate); + }; /** @@ -831,7 +843,7 @@ struct __DRIdri2LoaderExtensionRec { * constructors for DRI2. */ #define __DRI_DRI2 "DRI_DRI2" -#define __DRI_DRI2_VERSION 3 +#define __DRI_DRI2_VERSION 4 #define __DRI_API_OPENGL 0 /**< OpenGL compatibility profile */ #define __DRI_API_GLES 1 /**< OpenGL ES 1.x */ @@ -939,6 +951,17 @@ struct __DRIdri2ExtensionRec { const uint32_t *attribs, unsigned *error, void *loaderPrivate); + + /** +* createNewScreen with the driver's extension list passed in. 
+* +* \since version 4 +*/ +__DRIscreen *(*createNewScreen2)(int screen, int fd, + const __DRIextension **loader_extensions, + const __DRIextension **driver_extensions, + const __DRIconfig ***driver_configs, + void *loaderPrivate); }; diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c index 7c07fd6..fb2f028 100644 --- a/src/egl/drivers/dri2/egl_dri2.c +++ b/src/egl/drivers/dri2/egl_dri2.c @@ -460,6 +460,7 @@ dri2_load_driver(_EGLDisplay *disp) dlclose(dri2_dpy->driver); return EGL_FALSE; } + dri2_dpy->driver_extensions = extensions; return EGL_TRUE; } @@ -480,6 +481,7 @@ dri2_load_driver_swrast(_EGLDisplay *disp) dlclose(dri2_dpy->driver); return EGL_FALSE; } + dri2_dpy->driver_extensions = extensions; return EGL_TRUE; } @@ -545,14 +547,29 @@ dri2_create_screen(_EGLDisplay *disp) dri2_dpy = disp->DriverData; if (dri2_dpy->dri2) { + if (dri2_dpy->dri2->base.version >= 4) { dri2_dpy->dri_screen = - dri2_dpy->dri2->createNewScreen(0, dri2_dpy->fd, dri2_dpy->extensions, -&dri2_dpy->driver_configs, disp); + dri2_dpy->dri2->createNewScreen2(0, dri2_dpy->fd, + dri2_dpy->extensions, + dri2_dpy->driver_extensions, + &dri2_dpy->driver_configs, disp); + } else { + dri2_dpy->dri_screen = +dri2_dpy->dri2->createNewScreen(0, dri2_dpy->fd, +dri2_dpy->extensions, +&dri2_dpy->driver_configs, disp); + } } else { assert(dri2_dpy->swrast); - dri2_dpy->dri_screen = - dri2_dpy->swrast->createNewScreen(0, dri2_dpy->extensions, - &dri2_dpy->driver_configs, disp); + if (dri2_dpy->swrast->base.version >= 4) { + dri2_dpy->dri_screen = +dri2_dpy->swrast->createNewScreen2(0, dri2_dpy->extensions, + dri2_dpy->driver_extensions, + &dri2_dpy->driver_configs, disp); + } else { +dri2_dpy->swrast->createNewScreen(0, dri2_dpy->extensions, +
[Mesa-dev] [PATCH 10/14] dri: Add a tool for generating #defines to namespace driver global symbols.
---
 src/mesa/drivers/dri/gen-symbol-redefs.py | 68 +++
 1 file changed, 68 insertions(+)
 create mode 100755 src/mesa/drivers/dri/gen-symbol-redefs.py

diff --git a/src/mesa/drivers/dri/gen-symbol-redefs.py b/src/mesa/drivers/dri/gen-symbol-redefs.py
new file mode 100755
index 0000000..ebe4aaa
--- /dev/null
+++ b/src/mesa/drivers/dri/gen-symbol-redefs.py
@@ -0,0 +1,68 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+# Copyright © 2013 Intel Corporation
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice (including the next
+# paragraph) shall be included in all copies or substantial portions of the
+# Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+# IN THE SOFTWARE.
+
+import sys
+import argparse
+import re
+import subprocess
+
+# Example usages:
+# ./gen-symbol-redefs.py i915/.libs/libi915_dri.a old_ i915 i830
+# ./gen-symbol-redefs.py r200/.libs/libr200_dri.a r200_ r200
+
+argparser = argparse.ArgumentParser(description="Generates #defines to hide driver global symbols outside of a driver's namespace.")
+argparser.add_argument("file",
+                    metavar = 'file',
+                    help='libdrivername.a file to read')
+argparser.add_argument("newprefix",
+                    metavar = 'newprefix',
+                    help='New prefix to give non-driver global symbols')
+argparser.add_argument('prefixes',
+                       metavar='prefix',
+                       nargs='*',
+                       help='driver-specific prefixes')
+args = argparser.parse_args()
+
+stdout = subprocess.check_output(['nm', args.file])
+
+for line in stdout.splitlines():
+    m = re.match("[0-9a-z]+ [BT] (.*)", line)
+    if not m:
+        continue
+
+    symbol = m.group(1)
+
+    has_good_prefix = re.match(args.newprefix, symbol) != None
+    for prefix in args.prefixes:
+        if re.match(prefix, symbol):
+            has_good_prefix = True
+            break
+    if has_good_prefix:
+        continue
+
+    # This is the single public entrypoint.
+    if re.match("__driDriverGetExtensions", symbol):
+        continue
+
+    print '#define {0:35} {1}{0}'.format(symbol, args.newprefix)
--
1.8.4.rc3
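For readers who want to try the filtering rule from the script above without an archive and nm at hand, here is a minimal sketch, not the patch's code. It assumes Python 3, takes the nm output as a string instead of running nm, and matches prefixes literally with startswith() where the script uses re.match(). The function name symbol_redefs and the sample symbols are made up for illustration.

```python
import re


def symbol_redefs(nm_output, new_prefix, driver_prefixes):
    """Return '#define' lines for global symbols lacking an allowed prefix.

    Hypothetical helper: mirrors the loop in gen-symbol-redefs.py, but works
    on a string of nm output and compares prefixes literally.
    """
    allowed = (new_prefix, *driver_prefixes)
    defines = []
    for line in nm_output.splitlines():
        # Same pattern as the script: address, type B or T, then the name.
        m = re.match(r"[0-9a-z]+ [BT] (.*)", line)
        if not m:
            continue
        symbol = m.group(1)
        # Symbols that already carry an allowed prefix are left alone.
        if symbol.startswith(allowed):
            continue
        # The public entrypoint is looked up by name, so it is not renamed.
        if symbol.startswith("__driDriverGetExtensions"):
            continue
        defines.append("#define {0:35} {1}{0}".format(symbol, new_prefix))
    return defines


# Made-up nm output, for illustration only.
sample = "\n".join([
    "0000000000000000 T intelCreateContext",
    "0000000000000010 T i915_init",
    "0000000000000020 t local_helper",
    "                 U malloc",
    "0000000000000030 B old_state",
    "0000000000000040 T __driDriverGetExtensions_i915",
])

for define in symbol_redefs(sample, "old_", ["i915", "i830"]):
    print(define)
```

Only intelCreateContext gets a #define here: the other lines are either not global B/T symbols, already carry an allowed prefix, or name the public entrypoint.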
[Mesa-dev] [PATCH 07/14] dri: Implement a DRI vtable extension to replace the global driDriverAPI.
As we move to megadrivers, we are unable to build multiple drivers with
the same public global symbol per driver (Think an X Server with an
intel and a nouveau driver, and the X Server implementing indirect for
both -- we have to actually talk to the right driver). By slipping the
driDriverAPI vtable into the driver's extension list, we can replace the
usage of the global symbol with usage of the loader-dlsym()ed driver
information.
---
 include/GL/internal/dri_interface.h    | 17 +++++++++++++++++
 src/mesa/drivers/dri/common/dri_util.c | 11 +++++++++++
 2 files changed, 28 insertions(+)

diff --git a/include/GL/internal/dri_interface.h b/include/GL/internal/dri_interface.h
index e07f669..957dd8c 100644
--- a/include/GL/internal/dri_interface.h
+++ b/include/GL/internal/dri_interface.h
@@ -1268,4 +1268,21 @@ typedef struct __DRIconfigOptionsExtensionRec {
     const char *xml;
 } __DRIconfigOptionsExtension;
 
+/**
+ * This extension provides a driver vtable to a set of common driver helper
+ * functions (driCoreExtension, driDRI2Extension) within the driver
+ * implementation, as opposed to having to pass them through a global
+ * variable.
+ *
+ * It is not intended to be public API to the actual loader, and the vtable
+ * layout may change at any time.
+ */
+#define __DRI_DRIVER_VTABLE "DRI_DriverVtable"
+#define __DRI_DRIVER_VTABLE_VERSION 1
+
+typedef struct __DRIDriverVtableExtensionRec {
+    __DRIextension base;
+    const struct __DriverAPIRec *vtable;
+} __DRIDriverVtableExtension;
+
 #endif
diff --git a/src/mesa/drivers/dri/common/dri_util.c b/src/mesa/drivers/dri/common/dri_util.c
index 283e158..9a99ea9 100644
--- a/src/mesa/drivers/dri/common/dri_util.c
+++ b/src/mesa/drivers/dri/common/dri_util.c
@@ -100,8 +100,19 @@ dri2CreateNewScreen2(int scrn, int fd,
     if (!psp)
         return NULL;
 
+    /* By default, use the global driDriverAPI symbol (non-megadrivers). */
     psp->driver = &driDriverAPI;
 
+    /* If the driver exposes its vtable through its extensions list
+     * (megadrivers), use that instead.
+     */
+    for (int i = 0; driver_extensions[i]; i++) {
+        if (strcmp(driver_extensions[i]->name, __DRI_DRIVER_VTABLE) == 0) {
+            psp->driver =
+                ((__DRIDriverVtableExtension *)driver_extensions[i])->vtable;
+        }
+    }
+
     setupLoaderExtensions(psp, extensions);
 
 #ifndef SWRAST_NO_DRM
--
1.8.4.rc3
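The selection rule in the dri_util.c hunk can be modelled in a few lines. This is a rough Python model of the C logic and an illustration only: the Extension tuple, the function name select_driver_vtable and the string placeholders for vtables are hypothetical stand-ins, and only the "DRI_DriverVtable" name comes from the patch.

```python
from collections import namedtuple

# Hypothetical stand-in for a __DRIextension record plus its vtable pointer.
Extension = namedtuple("Extension", ["name", "version", "vtable"])

# Extension name defined by the patch as __DRI_DRIVER_VTABLE.
DRIVER_VTABLE_NAME = "DRI_DriverVtable"


def select_driver_vtable(driver_extensions, global_vtable):
    """Pick the vtable the way the hunk does.

    Start from the global symbol (non-megadrivers) and override it with the
    vtable of any extension named DRI_DriverVtable (megadrivers). Like the C
    loop, this does not stop at the first match, so the last match wins.
    """
    vtable = global_vtable
    for ext in driver_extensions:
        if ext.name == DRIVER_VTABLE_NAME:
            vtable = ext.vtable
    return vtable


# Illustrative extension lists; the vtables are plain strings here.
core = Extension("DRI_Core", 1, None)
vtable_ext = Extension(DRIVER_VTABLE_NAME, 1, "nouveau_driver_api")

print(select_driver_vtable([core, vtable_ext], "driDriverAPI"))
print(select_driver_vtable([core], "driDriverAPI"))
```

A driver that does not list the vtable extension keeps the global driDriverAPI, which is what lets converted and unconverted drivers share the same screen creation code.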
[Mesa-dev] [PATCH 09/14] nouveau: Build the driver into the shared mesa_dri_drivers.so.
--- configure.ac | 2 +- src/mesa/drivers/dri/Makefile.am | 2 ++ src/mesa/drivers/dri/nouveau/Makefile.am | 23 ++- src/mesa/drivers/dri/nouveau/nouveau_screen.c | 15 +-- src/mesa/drivers/dri/nouveau/nouveau_screen.h | 2 ++ 5 files changed, 24 insertions(+), 20 deletions(-) diff --git a/configure.ac b/configure.ac index bc111f3..92f6a26 100644 --- a/configure.ac +++ b/configure.ac @@ -1035,7 +1035,7 @@ fi enable_dricore=no enable_megadriver=no for driver in $DRI_DIRS; do -if test $driver != "i965"; then +if test $driver != "i965" -a $driver != "nouveau"; then enable_dricore=yes else enable_megadriver=yes diff --git a/src/mesa/drivers/dri/Makefile.am b/src/mesa/drivers/dri/Makefile.am index 9d15c43..6152fcc 100644 --- a/src/mesa/drivers/dri/Makefile.am +++ b/src/mesa/drivers/dri/Makefile.am @@ -27,6 +27,8 @@ endif if HAVE_NOUVEAU_DRI SUBDIRS+=nouveau +MEGADRIVERS_DEPS = nouveau/libnouveau_dri.la +MEGADRIVERS += nouveau_vieux_dri.so endif if HAVE_R200_DRI diff --git a/src/mesa/drivers/dri/nouveau/Makefile.am b/src/mesa/drivers/dri/nouveau/Makefile.am index 7172e62..90dfd64 100644 --- a/src/mesa/drivers/dri/nouveau/Makefile.am +++ b/src/mesa/drivers/dri/nouveau/Makefile.am @@ -23,6 +23,8 @@ include Makefile.sources +if HAVE_NOUVEAU_DRI + AM_CFLAGS = \ -I$(top_srcdir)/include \ -I$(top_srcdir)/src/ \ @@ -35,21 +37,8 @@ AM_CFLAGS = \ dridir = $(DRI_DRIVER_INSTALL_DIR) -if HAVE_NOUVEAU_DRI -dri_LTLIBRARIES = nouveau_vieux_dri.la -endif +noinst_LTLIBRARIES = libnouveau_dri.la +libnouveau_dri_la_SOURCES = $(NOUVEAU_C_FILES) +libnouveau_dri_la_LIBADD = $(NOUVEAU_LIBS) -nouveau_vieux_dri_la_SOURCES = \ - $(NOUVEAU_C_FILES) - -nouveau_vieux_dri_la_LDFLAGS = $(DRI_DRIVER_LDFLAGS) -nouveau_vieux_dri_la_LIBADD = \ - ../common/libdricommon.la \ - $(DRI_LIB_DEPS) \ - $(NOUVEAU_LIBS) - -# Provide compatibility with scripts for the old Mesa build system for -# a while by putting a link to the driver into /lib of the build tree. 
-all-local: nouveau_vieux_dri.la - $(MKDIR_P) $(top_builddir)/$(LIB_DIR); - ln -f .libs/nouveau_vieux_dri.so $(top_builddir)/$(LIB_DIR)/nouveau_vieux_dri.so; +endif diff --git a/src/mesa/drivers/dri/nouveau/nouveau_screen.c b/src/mesa/drivers/dri/nouveau/nouveau_screen.c index 6816406..fa2bfd2 100644 --- a/src/mesa/drivers/dri/nouveau/nouveau_screen.c +++ b/src/mesa/drivers/dri/nouveau/nouveau_screen.c @@ -246,7 +246,7 @@ static const __DRIextension *nouveau_screen_extensions[] = { NULL }; -const struct __DriverAPIRec driDriverAPI = { +const struct __DriverAPIRec nouveau_driver_api = { .InitScreen = nouveau_init_screen2, .DestroyScreen = nouveau_destroy_screen, .CreateBuffer= nouveau_create_buffer, @@ -257,9 +257,20 @@ const struct __DriverAPIRec driDriverAPI = { .UnbindContext = nouveau_context_unbind, }; +static const struct __DRIDriverVtableExtensionRec nouveau_vtable = { + .base = { __DRI_DRIVER_VTABLE, 1 }, + .vtable = &nouveau_driver_api, +}; + /* This is the table of extensions that the loader will dlsym() for. */ -PUBLIC const __DRIextension *__driDriverExtensions[] = { +PUBLIC const __DRIextension *nouveau_driver_extensions[] = { &driCoreExtension.base, &driDRI2Extension.base, + &nouveau_vtable.base, NULL }; + +PUBLIC const __DRIextension **__driDriverGetExtensions_nouveau_vieux(void) +{ + return nouveau_driver_extensions; +} diff --git a/src/mesa/drivers/dri/nouveau/nouveau_screen.h b/src/mesa/drivers/dri/nouveau/nouveau_screen.h index bcf57e2..45b1ee9 100644 --- a/src/mesa/drivers/dri/nouveau/nouveau_screen.h +++ b/src/mesa/drivers/dri/nouveau/nouveau_screen.h @@ -27,6 +27,8 @@ #ifndef __NOUVEAU_SCREEN_H__ #define __NOUVEAU_SCREEN_H__ +const __DRIextension **__driDriverGetExtensions_nouveau_vieux(void); + struct nouveau_context; struct nouveau_screen { -- 1.8.4.rc3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 11/14] i915: Build the driver into the shared mesa_dri_drivers.so.
i915 has symbols for formerly-shared code that conflict with i965, so we
define them away using gen-symbol-redefs.py.

Options considered:

- This option. Downsides: The symbols in profiling and debugging don't
  match the source. The symbol list may change in the future and we
  won't notice without manually running the tool again.

- Use objcopy --localize-hidden to automatically demote our symbols to
  locals. This didn't work on i965 due to c++ weak symbols (which can't
  be localized), but could work on i915. We could do it on i915 only,
  but it does produce libtool warnings at link time due to libtool not
  knowing if the resulting .o file is safe to link (stupid libtool).
  Plus you end up with different symbols of the same name, which is
  confusing for debugging too. On the other hand, no future symbol
  conflicts long term.

- Write our own libelf tool that handles c++ weak symbols like we want
  and apply it to all drivers. All the downsides of above, but applies
  uniformly across drivers.

- Edit the files to just rename all the i915 or i965 symbols that
  conflict. There are on the order of 100 that have a prefix we used to
  share, so it would take a bit of typing. Fewest downsides, but still
  can have conflicts long term.

Ultimately, this is the least invasive change at the moment, and we can
see if the "more symbol conflicts appear later" thing is a real concern
or not.

Note that the ability to compile a version of i915 without INTEL_DEBUG
env support is dropped. It's too useful.
--- configure.ac | 2 +- src/mesa/drivers/dri/Makefile.am | 2 + src/mesa/drivers/dri/i915/Makefile.am | 16 +--- src/mesa/drivers/dri/i915/intel_context.c | 2 - src/mesa/drivers/dri/i915/intel_mipmap_tree.h | 1 + src/mesa/drivers/dri/i915/intel_screen.c | 15 +++- src/mesa/drivers/dri/i915/intel_screen.h | 103 ++ 7 files changed, 123 insertions(+), 18 deletions(-) diff --git a/configure.ac b/configure.ac index 92f6a26..87c353a 100644 --- a/configure.ac +++ b/configure.ac @@ -1035,7 +1035,7 @@ fi enable_dricore=no enable_megadriver=no for driver in $DRI_DIRS; do -if test $driver != "i965" -a $driver != "nouveau"; then +if test $driver != "i965" -a $driver != "nouveau" -a $driver != "i915"; then enable_dricore=yes else enable_megadriver=yes diff --git a/src/mesa/drivers/dri/Makefile.am b/src/mesa/drivers/dri/Makefile.am index 6152fcc..5aff40a 100644 --- a/src/mesa/drivers/dri/Makefile.am +++ b/src/mesa/drivers/dri/Makefile.am @@ -17,6 +17,8 @@ endif if HAVE_I915_DRI SUBDIRS+=i915 +MEGADRIVERS_DEPS += i915/libi915_dri.la +MEGADRIVERS += i915_dri.so endif if HAVE_I965_DRI diff --git a/src/mesa/drivers/dri/i915/Makefile.am b/src/mesa/drivers/dri/i915/Makefile.am index 46dd4c2..93ae663 100644 --- a/src/mesa/drivers/dri/i915/Makefile.am +++ b/src/mesa/drivers/dri/i915/Makefile.am @@ -38,18 +38,8 @@ AM_CFLAGS = \ dridir = $(DRI_DRIVER_INSTALL_DIR) if HAVE_I915_DRI -dri_LTLIBRARIES = i915_dri.la +noinst_LTLIBRARIES = libi915_dri.la endif -i915_dri_la_SOURCES = $(i915_FILES) -i915_dri_la_LDFLAGS = $(DRI_DRIVER_LDFLAGS) -i915_dri_la_LIBADD = \ - ../common/libdricommon.la \ - $(DRI_LIB_DEPS) \ - $(INTEL_LIBS) - -# Provide compatibility with scripts for the old Mesa build system for -# a while by putting a link to the driver into /lib of the build tree. 
-all-local: i915_dri.la - $(MKDIR_P) $(top_builddir)/$(LIB_DIR); - ln -f .libs/i915_dri.so $(top_builddir)/$(LIB_DIR)/i915_dri.so; +libi915_dri_la_SOURCES = $(i915_FILES) +libi915_dri_la_LIBADD = $(INTEL_LIBS) diff --git a/src/mesa/drivers/dri/i915/intel_context.c b/src/mesa/drivers/dri/i915/intel_context.c index d25358b..2748514 100644 --- a/src/mesa/drivers/dri/i915/intel_context.c +++ b/src/mesa/drivers/dri/i915/intel_context.c @@ -58,9 +58,7 @@ #include "utils.h" #include "../glsl/ralloc.h" -#ifndef INTEL_DEBUG int INTEL_DEBUG = (0); -#endif static const GLubyte * diff --git a/src/mesa/drivers/dri/i915/intel_mipmap_tree.h b/src/mesa/drivers/dri/i915/intel_mipmap_tree.h index 2b2a644..1142af6 100644 --- a/src/mesa/drivers/dri/i915/intel_mipmap_tree.h +++ b/src/mesa/drivers/dri/i915/intel_mipmap_tree.h @@ -30,6 +30,7 @@ #include +#include "intel_screen.h" #include "intel_regions.h" #ifdef __cplusplus diff --git a/src/mesa/drivers/dri/i915/intel_screen.c b/src/mesa/drivers/dri/i915/intel_screen.c index 49bae5d..dcf5997 100644 --- a/src/mesa/drivers/dri/i915/intel_screen.c +++ b/src/mesa/drivers/dri/i915/intel_screen.c @@ -1190,7 +1190,7 @@ intelReleaseBuffer(__DRIscreen *screen, __DRIbuffer *buffer) } -const struct __DriverAPIRec driDriverAPI = { +static const struct __DriverAPIRec i915_driver_api = { .InitScreen = intelInitScreen2, .DestroyScreen = intelDestroyScreen, .CreateContext = intelCreateContext, @@ -1203,10 +
[Mesa-dev] [PATCH 12/14] radeon: Build the driver into the shared mesa_dri_drivers.so.
This required some reordering of headers to ensure that the symbol name redefines happened before any prototypes. --- configure.ac | 2 +- src/mesa/drivers/dri/Makefile.am | 4 + src/mesa/drivers/dri/r200/Makefile.am | 18 +--- src/mesa/drivers/dri/radeon/Makefile.am| 18 +--- .../drivers/dri/radeon/radeon_buffer_objects.c | 3 +- .../drivers/dri/radeon/radeon_common_context.h | 2 +- src/mesa/drivers/dri/radeon/radeon_debug.c | 2 +- src/mesa/drivers/dri/radeon/radeon_fog.c | 1 + src/mesa/drivers/dri/radeon/radeon_pixel_read.c| 2 +- src/mesa/drivers/dri/radeon/radeon_screen.c| 22 - src/mesa/drivers/dri/radeon/radeon_screen.h| 98 ++ src/mesa/drivers/dri/radeon/radeon_tile.c | 1 + 12 files changed, 135 insertions(+), 38 deletions(-) diff --git a/configure.ac b/configure.ac index 87c353a..5649aec 100644 --- a/configure.ac +++ b/configure.ac @@ -1035,7 +1035,7 @@ fi enable_dricore=no enable_megadriver=no for driver in $DRI_DIRS; do -if test $driver != "i965" -a $driver != "nouveau" -a $driver != "i915"; then +if test $driver = "swrast"; then enable_dricore=yes else enable_megadriver=yes diff --git a/src/mesa/drivers/dri/Makefile.am b/src/mesa/drivers/dri/Makefile.am index 5aff40a..cc1eb2b 100644 --- a/src/mesa/drivers/dri/Makefile.am +++ b/src/mesa/drivers/dri/Makefile.am @@ -35,10 +35,14 @@ endif if HAVE_R200_DRI SUBDIRS+=r200 +MEGADRIVERS_DEPS += r200/libr200_dri.la +MEGADRIVERS += r200_dri.so endif if HAVE_RADEON_DRI SUBDIRS+=radeon +MEGADRIVERS_DEPS += radeon/libradeon_dri.la +MEGADRIVERS += radeon_dri.so endif if HAVE_SWRAST_DRI diff --git a/src/mesa/drivers/dri/r200/Makefile.am b/src/mesa/drivers/dri/r200/Makefile.am index fc0482a..be405d7 100644 --- a/src/mesa/drivers/dri/r200/Makefile.am +++ b/src/mesa/drivers/dri/r200/Makefile.am @@ -39,20 +39,8 @@ AM_CFLAGS = \ dridir = $(DRI_DRIVER_INSTALL_DIR) if HAVE_R200_DRI -dri_LTLIBRARIES = r200_dri.la +noinst_LTLIBRARIES = libr200_dri.la endif -r200_dri_la_SOURCES = \ -$(R200_C_FILES) - -r200_dri_la_LDFLAGS = 
$(DRI_DRIVER_LDFLAGS) -r200_dri_la_LIBADD = \ - ../common/libdricommon.la \ - $(DRI_LIB_DEPS) \ - $(RADEON_LIBS) - -# Provide compatibility with scripts for the old Mesa build system for -# a while by putting a link to the driver into /lib of the build tree. -all-local: r200_dri.la - $(MKDIR_P) $(top_builddir)/$(LIB_DIR); - ln -f .libs/r200_dri.so $(top_builddir)/$(LIB_DIR)/r200_dri.so; +libr200_dri_la_SOURCES = $(R200_C_FILES) +libr200_dri_la_LIBADD = $(RADEON_LIBS) diff --git a/src/mesa/drivers/dri/radeon/Makefile.am b/src/mesa/drivers/dri/radeon/Makefile.am index d13b803..89102df 100644 --- a/src/mesa/drivers/dri/radeon/Makefile.am +++ b/src/mesa/drivers/dri/radeon/Makefile.am @@ -39,20 +39,8 @@ AM_CFLAGS = \ dridir = $(DRI_DRIVER_INSTALL_DIR) if HAVE_RADEON_DRI -dri_LTLIBRARIES = radeon_dri.la +noinst_LTLIBRARIES = libradeon_dri.la endif -radeon_dri_la_SOURCES = \ -$(RADEON_C_FILES) - -radeon_dri_la_LDFLAGS = $(DRI_DRIVER_LDFLAGS) -radeon_dri_la_LIBADD = \ - ../common/libdricommon.la \ - $(DRI_LIB_DEPS) \ - $(RADEON_LIBS) - -# Provide compatibility with scripts for the old Mesa build system for -# a while by putting a link to the driver into /lib of the build tree. 
-all-local: radeon_dri.la - $(MKDIR_P) $(top_builddir)/$(LIB_DIR); - ln -f .libs/radeon_dri.so $(top_builddir)/$(LIB_DIR)/radeon_dri.so; +libradeon_dri_la_SOURCES = $(RADEON_C_FILES) +libradeon_dri_la_LIBADD = $(RADEON_LIBS) diff --git a/src/mesa/drivers/dri/radeon/radeon_buffer_objects.c b/src/mesa/drivers/dri/radeon/radeon_buffer_objects.c index 5abc52b..40a16c3 100644 --- a/src/mesa/drivers/dri/radeon/radeon_buffer_objects.c +++ b/src/mesa/drivers/dri/radeon/radeon_buffer_objects.c @@ -25,13 +25,12 @@ * */ -#include "radeon_buffer_objects.h" - #include "main/imports.h" #include "main/mtypes.h" #include "main/bufferobj.h" #include "radeon_common.h" +#include "radeon_buffer_objects.h" struct radeon_buffer_object * get_radeon_buffer_object(struct gl_buffer_object *obj) diff --git a/src/mesa/drivers/dri/radeon/radeon_common_context.h b/src/mesa/drivers/dri/radeon/radeon_common_context.h index 8437f34..ab55071 100644 --- a/src/mesa/drivers/dri/radeon/radeon_common_context.h +++ b/src/mesa/drivers/dri/radeon/radeon_common_context.h @@ -7,8 +7,8 @@ #include "tnl/t_context.h" #include "main/colormac.h" -#include "radeon_debug.h" #include "radeon_screen.h" +#include "radeon_debug.h" #include "radeon_drm.h" #include "dri_util.h" #include "tnl/t_vertex.h" diff --git a/src/mesa/drivers/dri/radeon/radeon_debug.c b/src/mesa/drivers/dri/radeon/radeon_debug.c index dd0afb8..7ddba1a 100644 --- a/src/mesa/drivers/dri/radeon/radeon_debug.c +++ b/src/mesa/drivers/dri/ra
[Mesa-dev] [PATCH] R600: Add a ldptr intrinsic to support MSAA.
---
 lib/Target/R600/R600ISelLowering.cpp | 6 +++++-
 lib/Target/R600/R600Instructions.td  | 4 ++++
 lib/Target/R600/R600Intrinsics.td    | 1 +
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/lib/Target/R600/R600ISelLowering.cpp b/lib/Target/R600/R600ISelLowering.cpp
index 126db73..a6778a4 100644
--- a/lib/Target/R600/R600ISelLowering.cpp
+++ b/lib/Target/R600/R600ISelLowering.cpp
@@ -590,7 +590,8 @@ SDValue R600TargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const
     case AMDGPUIntrinsic::R600_txf:
     case AMDGPUIntrinsic::R600_txq:
     case AMDGPUIntrinsic::R600_ddx:
-    case AMDGPUIntrinsic::R600_ddy: {
+    case AMDGPUIntrinsic::R600_ddy:
+    case AMDGPUIntrinsic::R600_ldptr: {
       unsigned TextureOp;
       switch (IntrinsicID) {
       case AMDGPUIntrinsic::R600_tex:
@@ -623,6 +624,9 @@ SDValue R600TargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const
       case AMDGPUIntrinsic::R600_ddy:
         TextureOp = 9;
         break;
+      case AMDGPUIntrinsic::R600_ldptr:
+        TextureOp = 10;
+        break;
       default:
         llvm_unreachable("Unknow Texture Operation");
       }
diff --git a/lib/Target/R600/R600Instructions.td b/lib/Target/R600/R600Instructions.td
index 82ecbad..9dc9303 100644
--- a/lib/Target/R600/R600Instructions.td
+++ b/lib/Target/R600/R600Instructions.td
@@ -881,6 +881,9 @@ def TEX_SAMPLE_C_L : R600_TEX <0x19, "TEX_SAMPLE_C_L">;
 def TEX_SAMPLE_LB : R600_TEX <0x12, "TEX_SAMPLE_LB">;
 def TEX_SAMPLE_C_LB : R600_TEX <0x1A, "TEX_SAMPLE_C_LB">;
 def TEX_LD : R600_TEX <0x03, "TEX_LD">;
+def TEX_LDPTR : R600_TEX <0x03, "TEX_LDPTR"> {
+  let Inst{6-5} = 1;
+}
 def TEX_GET_TEXTURE_RESINFO : R600_TEX <0x04, "TEX_GET_TEXTURE_RESINFO">;
 def TEX_GET_GRADIENTS_H : R600_TEX <0x07, "TEX_GET_GRADIENTS_H">;
 def TEX_GET_GRADIENTS_V : R600_TEX <0x08, "TEX_GET_GRADIENTS_V">;
@@ -899,6 +902,7 @@ defm : TexPattern<6, TEX_LD, v4i32>;
 defm : TexPattern<7, TEX_GET_TEXTURE_RESINFO, v4i32>;
 defm : TexPattern<8, TEX_GET_GRADIENTS_H>;
 defm : TexPattern<9, TEX_GET_GRADIENTS_V>;
+defm : TexPattern<10, TEX_LDPTR, v4i32>;
 
 //===----------------------------------------------------------------------===//
 // Helper classes for common instructions
diff --git a/lib/Target/R600/R600Intrinsics.td b/lib/Target/R600/R600Intrinsics.td
index 58d86b6..b5cb369 100644
--- a/lib/Target/R600/R600Intrinsics.td
+++ b/lib/Target/R600/R600Intrinsics.td
@@ -52,6 +52,7 @@ let TargetPrefix = "R600", isTarget = 1 in {
   def int_R600_txb : TextureIntrinsicFloatInput;
   def int_R600_txbc : TextureIntrinsicFloatInput;
   def int_R600_txf : TextureIntrinsicInt32Input;
+  def int_R600_ldptr : TextureIntrinsicInt32Input;
   def int_R600_txq : TextureIntrinsicInt32Input;
   def int_R600_ddx : TextureIntrinsicFloatInput;
   def int_R600_ddy : TextureIntrinsicFloatInput;
--
1.8.3.1
[Mesa-dev] [PATCH 2/4] r600g/llvm: fix txq for texture buffer
---
 src/gallium/drivers/r600/r600_llvm.c     | 7 +++++--
 src/gallium/drivers/r600/r600_shader.c   | 1 +
 src/gallium/drivers/radeon/radeon_llvm.h | 1 +
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_llvm.c b/src/gallium/drivers/r600/r600_llvm.c
index 03a68e4..54291a1 100644
--- a/src/gallium/drivers/r600/r600_llvm.c
+++ b/src/gallium/drivers/r600/r600_llvm.c
@@ -23,6 +23,7 @@
 #define CONSTANT_BUFFER_0_ADDR_SPACE 8
 #define CONSTANT_BUFFER_1_ADDR_SPACE (CONSTANT_BUFFER_0_ADDR_SPACE + R600_UCP_CONST_BUFFER)
 #define CONSTANT_TXQ_BUFFER (CONSTANT_BUFFER_0_ADDR_SPACE + R600_TXQ_CONST_BUFFER)
+#define LLVM_R600_BUFFER_INFO_CONST_BUFFER (CONSTANT_BUFFER_0_ADDR_SPACE + R600_BUFFER_INFO_CONST_BUFFER)
 
 static LLVMValueRef llvm_load_const_buffer(
     struct lp_build_tgsi_context * bld_base,
@@ -410,8 +411,10 @@ static void llvm_emit_tex(
     if (emit_data->inst->Texture.Texture == TGSI_TEXTURE_BUFFER) {
         switch (emit_data->inst->Instruction.Opcode) {
         case TGSI_OPCODE_TXQ: {
-            LLVMValueRef offset = lp_build_const_int32(bld_base->base.gallivm, 1);
-            LLVMValueRef cvecval = llvm_load_const_buffer(bld_base, offset, R600_BUFFER_INFO_CONST_BUFFER);
+            struct radeon_llvm_context * ctx = radeon_llvm_context(bld_base);
+            ctx->uses_tex_buffers = true;
+            LLVMValueRef offset = lp_build_const_int32(bld_base->base.gallivm, 0);
+            LLVMValueRef cvecval = llvm_load_const_buffer(bld_base, offset, LLVM_R600_BUFFER_INFO_CONST_BUFFER);
             emit_data->output[0] = cvecval;
             return;
         }
diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index ce15cd7..e8e1333 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -1139,6 +1139,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen,
         radeon_llvm_ctx.alpha_to_one = key.alpha_to_one;
         mod = r600_tgsi_llvm(&radeon_llvm_ctx, tokens);
         ctx.shader->has_txq_cube_array_z_comp = radeon_llvm_ctx.has_txq_cube_array_z_comp;
+        ctx.shader->uses_tex_buffers = radeon_llvm_ctx.uses_tex_buffers;
 
         if (r600_llvm_compile(mod, rscreen->b.family, ctx.bc, &use_kill, dump)) {
             radeon_llvm_dispose(&radeon_llvm_ctx);
diff --git a/src/gallium/drivers/radeon/radeon_llvm.h b/src/gallium/drivers/radeon/radeon_llvm.h
index 14a8c34..345ae70 100644
--- a/src/gallium/drivers/radeon/radeon_llvm.h
+++ b/src/gallium/drivers/radeon/radeon_llvm.h
@@ -67,6 +67,7 @@ struct radeon_llvm_context {
     unsigned fs_color_all;
     unsigned alpha_to_one;
     unsigned has_txq_cube_array_z_comp;
+    unsigned uses_tex_buffers;
 
     /*=== Front end configuration ===*/
 
--
1.8.3.1
[Mesa-dev] [PATCH 1/4] r600g/llvm: fix sample cube shadow
---
 src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index 8ff9abd..ac2e511 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -654,7 +654,8 @@ void radeon_llvm_emit_prepare_cube_coords(
             opcode == TGSI_OPCODE_TXB2 ||
             opcode == TGSI_OPCODE_TXL2) {
             coords[3] = coords_arg[4];
-        } else if (opcode == TGSI_OPCODE_TXB ||
+        } else if (opcode == TGSI_OPCODE_TEX ||
+            opcode == TGSI_OPCODE_TXB ||
             opcode == TGSI_OPCODE_TXL) {
             coords[3] = coords_arg[3];
         }
--
1.8.3.1
[Mesa-dev] [PATCH 3/4] r600g/llvm: Undef z and w component of 2D TXP inst
---
 src/gallium/drivers/r600/r600_llvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/r600_llvm.c b/src/gallium/drivers/r600/r600_llvm.c
index 54291a1..c700f26 100644
--- a/src/gallium/drivers/r600/r600_llvm.c
+++ b/src/gallium/drivers/r600/r600_llvm.c
@@ -431,7 +431,7 @@ static void llvm_emit_tex(
         }
     }
 
-    if (emit_data->inst->Instruction.Opcode == TGSI_OPCODE_TEX) {
+    if (emit_data->inst->Instruction.Opcode == TGSI_OPCODE_TEX || emit_data->inst->Instruction.Opcode == TGSI_OPCODE_TXP) {
         LLVMValueRef Vector[4] = {
             LLVMBuildExtractElement(gallivm->builder, emit_data->args[0],
                 lp_build_const_int32(gallivm, 0), ""),
--
1.8.3.1
[Mesa-dev] [PATCH 4/4] r600/llvm: Adds support for MSAA
--- src/gallium/drivers/r600/r600_llvm.c | 47 +++- src/gallium/drivers/r600/r600_shader.c | 1 + src/gallium/drivers/radeon/radeon_llvm.h | 1 + 3 files changed, 48 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/r600/r600_llvm.c b/src/gallium/drivers/r600/r600_llvm.c index c700f26..b955de1 100644 --- a/src/gallium/drivers/r600/r600_llvm.c +++ b/src/gallium/drivers/r600/r600_llvm.c @@ -405,8 +405,9 @@ static void llvm_emit_tex( struct lp_build_emit_data * emit_data) { struct gallivm_state * gallivm = bld_base->base.gallivm; - LLVMValueRef args[6]; + LLVMValueRef args[7]; unsigned c, sampler_src; + struct radeon_llvm_context * ctx = radeon_llvm_context(bld_base); if (emit_data->inst->Texture.Texture == TGSI_TEXTURE_BUFFER) { switch (emit_data->inst->Instruction.Opcode) { @@ -478,6 +479,50 @@ static void llvm_emit_tex( args[c++] = lp_build_const_int32(gallivm, emit_data->inst->Texture.Texture); +if (emit_data->inst->Instruction.Opcode == TGSI_OPCODE_TXF && +(emit_data->inst->Texture.Texture == TGSI_TEXTURE_2D_MSAA || + emit_data->inst->Texture.Texture == TGSI_TEXTURE_2D_ARRAY_MSAA)) { +switch (emit_data->inst->Texture.Texture) { +case TGSI_TEXTURE_2D_MSAA: + args[6] = lp_build_const_int32(gallivm, TGSI_TEXTURE_2D); + break; +case TGSI_TEXTURE_2D_ARRAY_MSAA: + args[6] = lp_build_const_int32(gallivm, TGSI_TEXTURE_2D_ARRAY); + break; +default: + break; +} + +if (ctx->has_compressed_msaa_texturing) { + LLVMValueRef ldptr_args[10] = { + args[0], // Coord + args[1], // Offset X + args[2], // Offset Y + args[3], // Offset Z + args[4], + args[5], + lp_build_const_int32(gallivm, 1), + lp_build_const_int32(gallivm, 1), + lp_build_const_int32(gallivm, 1), + lp_build_const_int32(gallivm, 1) + }; + LLVMValueRef ptr = build_intrinsic(gallivm->builder, + "llvm.R600.ldptr", + emit_data->dst_type, ldptr_args, 10, LLVMReadNoneAttribute); + LLVMValueRef Tmp = LLVMBuildExtractElement(gallivm->builder, args[0], + lp_build_const_int32(gallivm, 3), ""); + Tmp = 
LLVMBuildMul(gallivm->builder, Tmp, lp_build_const_int32(gallivm, 4), ""); + LLVMValueRef ResX = LLVMBuildExtractElement(gallivm->builder, ptr, + lp_build_const_int32(gallivm, 0), ""); + ResX = LLVMBuildBitCast(gallivm->builder, ResX, bld_base->base.int_elem_type, ""); + Tmp = LLVMBuildLShr(gallivm->builder, ResX, Tmp, ""); + Tmp = LLVMBuildAnd(gallivm->builder, Tmp, lp_build_const_int32(gallivm, 0xF), ""); + args[0] = LLVMBuildInsertElement(gallivm->builder, args[0], Tmp, lp_build_const_int32(gallivm, 3), ""); + args[c++] = lp_build_const_int32(gallivm, + emit_data->inst->Texture.Texture); +} +} + emit_data->output[0] = build_intrinsic(gallivm->builder, action->intr_name, emit_data->dst_type, args, c, LLVMReadNoneAttribute); diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index e8e1333..9ef8a8c 100644 --- a/src/gallium/drivers/r600/r600_shader.c +++ b/src/gallium/drivers/r600/r600_shader.c @@ -1137,6 +1137,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, radeon_llvm_ctx.stream_outputs = &so; radeon_llvm_ctx.clip_vertex = ctx.cv_output; radeon_llvm_ctx.alpha_to_one = key.alpha_to_one; + radeon_llvm_ctx.has_compressed_msaa_texturing = ctx.bc->has_compressed_msaa_texturing; mod = r600_tgsi_llvm(&radeon_llvm_ctx, tokens); ctx.shader->has_txq_cube_array_z_comp = radeon_llvm_ctx.has_txq_cube_array_z_comp; ctx.shader->uses_tex_buffers = radeon_llvm_ctx.uses_tex_buffers; diff --git a/src/gallium/drivers/radeon/radeon_llvm.h b/src/gallium/drivers/radeon/radeon_llvm.h index 345ae70..ef09dc8 100644 --- a/src/gallium/drivers/radeon/radeon_llvm.h +++ b/src/gallium/drivers/radeon/radeon_llvm.h @@ -68,6 +68,7 @@ struct radeon_llvm_context { unsigned alpha_to_one; unsigned has_txq_cube_array_z_comp; unsigned uses_tex_buffers; + unsigned has_compressed_msaa_texturing; /*=== Front end configuration ===*/ -- 1.8.3.1 ___ mesa-dev mailing list mesa-dev@lists.f
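The integer arithmetic that the MSAA hunk builds with LLVMBuildMul, LLVMBuildLShr and LLVMBuildAnd can be worked through on plain numbers. The sketch below is an illustration under assumptions, not driver code: the name remap_sample_index and the sample values are invented, and it assumes the X component returned by llvm.R600.ldptr packs one 4-bit entry per sample, which is what the shift by sample * 4 and the 0xF mask in the patch suggest.

```python
def remap_sample_index(packed_word, sample_index):
    """Extract the 4-bit entry for one sample from a packed 32-bit word.

    Mirrors the patch's sequence: multiply the sample index by 4, shift the
    word right by that amount (logical shift), then mask with 0xF.
    """
    shift = sample_index * 4
    # Keep the value within 32 bits so the shift behaves like a logical LShr.
    return ((packed_word & 0xFFFFFFFF) >> shift) & 0xF


# Made-up packed words, for illustration only.
identity = 0x3210   # sample n maps to n
reversed_ = 0x0123  # sample 0 maps to 3, sample 3 maps to 0

print([remap_sample_index(identity, s) for s in range(4)])
print([remap_sample_index(reversed_, s) for s in range(4)])
```

In the patch the extracted value is written back into the fourth component of the coordinate vector before the regular texture fetch intrinsic is emitted.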
Re: [Mesa-dev] [PATCH] gallium: include u_surface.h instead of u_rect.h
On 30.09.2013 17:14, Brian Paul wrote:
> u_rect.h was including u_surface.h just to avoid touching a bunch
> of other source files after some functions were moved from u_rect.h
> to u_surface.h. This patch cleans up that hack.
> ---
> src/gallium/auxiliary/util/u_format.c | 2 +-
> src/gallium/auxiliary/util/u_rect.h | 6 --
> src/gallium/auxiliary/util/u_tile.c | 2 +-
> src/gallium/auxiliary/vl/vl_mpeg12_decoder.c | 2 +-
> 4 files changed, 3 insertions(+), 9 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_format.c b/src/gallium/auxiliary/util/u_format.c
> index 08ef6ab..a8aa571 100644
> --- a/src/gallium/auxiliary/util/u_format.c
> +++ b/src/gallium/auxiliary/util/u_format.c
> @@ -34,9 +34,9 @@
>
>  #include "u_math.h"
>  #include "u_memory.h"
> -#include "u_rect.h"
>  #include "u_format.h"
>  #include "u_format_s3tc.h"
> +#include "u_surface.h"
>
>  #include "pipe/p_defines.h"
>
> diff --git a/src/gallium/auxiliary/util/u_rect.h b/src/gallium/auxiliary/util/u_rect.h
> index 10909b2..c141550 100644
> --- a/src/gallium/auxiliary/util/u_rect.h
> +++ b/src/gallium/auxiliary/util/u_rect.h
> @@ -83,10 +83,4 @@ u_rect_possible_intersection(const struct u_rect *a,
>  }
>  #endif
>
> -
> -/* Include pipe copy/fill rect helpers declarations for backwards compatibility
> - */
> -#include "util/u_surface.h"
> -
> -
>  #endif /* U_RECT_H */
> diff --git a/src/gallium/auxiliary/util/u_tile.c b/src/gallium/auxiliary/util/u_tile.c
> index 62298cd..fb80aec 100644
> --- a/src/gallium/auxiliary/util/u_tile.c
> +++ b/src/gallium/auxiliary/util/u_tile.c
> @@ -37,7 +37,7 @@
>  #include "util/u_format.h"
>  #include "util/u_math.h"
>  #include "util/u_memory.h"
> -#include "util/u_rect.h"
> +#include "util/u_surface.h"
>  #include "util/u_tile.h"
>
>
> diff --git a/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c b/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c
> index f838e74..f91f90b 100644
> --- a/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c
> +++ b/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c
> @@ -29,8 +29,8 @@
>  #include
>
>  #include "util/u_memory.h"
> -#include "util/u_rect.h"
>  #include "util/u_sampler.h"
> +#include "util/u_surface.h"
>  #include "util/u_video.h"
>
>  #include "vl_mpeg12_decoder.h"
>

Reviewed-by: Roland Scheidegger

Roland
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/gs: Fix incorrect numbering of DWORDs in 3DSTATE_GS
On 09/27/2013 09:42 PM, Paul Berry wrote:

In commit 247f90c77e8f3894e963d796628246ba0bde27b5 (i965/gs: Set control data header size/format appropriately for EndPrimitive()), I incorrectly numbered the DWORDs in the 3DSTATE_GS command starting from 1 instead of starting from 0. This caused the control data format to be programmed into the wrong DWORD, resulting in corruption in some geometry shaders that used an output type of points.

This patch numbers the DWORDs starting from 0, as we do for all other commands, which causes the control data format to be programmed into the correct DWORD.
---
src/mesa/drivers/dri/i965/gen7_gs_state.c | 7 ---
1 file changed, 4 insertions(+), 3 deletions(-)

Reviewed-by: Chad Versace
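The off-by-one described in this message can be illustrated with a small sketch. Everything in it is hypothetical: the packet length, the header value, and the position of the "control data format" bit are made up for illustration and are not the real 3DSTATE_GS layout. It only shows how numbering DWORDs from 1 while the documentation counts from 0 moves a field into the neighbouring DWORD.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet: 7 DWORDs. The made-up documentation places the
 * "control data format" bit in DWord 5, counting from 0. None of these
 * values come from the real hardware documentation. */
enum { PKT_LEN_DW = 7, CTRL_FORMAT_DW = 5, CTRL_FORMAT_SHIFT = 31 };

/* Fill pkt[] with a made-up header and the control data format bit.
 * If one_based_bug is non-zero, emulate code written as if the DWORDs
 * were numbered from 1, so the field lands one DWORD too early. */
static void emit_packet(uint32_t *pkt, unsigned ctrl_format, int one_based_bug)
{
   unsigned index = one_based_bug ? CTRL_FORMAT_DW - 1 : CTRL_FORMAT_DW;

   memset(pkt, 0, PKT_LEN_DW * sizeof(*pkt));
   pkt[0] = 0x12340000u | (PKT_LEN_DW - 2);   /* made-up header DWORD */
   pkt[index] |= (uint32_t)ctrl_format << CTRL_FORMAT_SHIFT;
}

/* Small self-check: the correct variant sets DWord 5, the buggy one DWord 4. */
static void emit_packet_demo(void)
{
   uint32_t good[PKT_LEN_DW], bad[PKT_LEN_DW];

   emit_packet(good, 1, 0);
   emit_packet(bad, 1, 1);
   assert(good[5] == 0x80000000u && good[4] == 0);
   assert(bad[4] == 0x80000000u && bad[5] == 0);
}
```

With the buggy numbering the hardware would read DWord 5 as zero and interpret whatever field shares DWord 4 incorrectly, which is the kind of silent misprogramming the commit message describes.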
Re: [Mesa-dev] [PATCH] Use -Bsymbolic when linking libEGL.so
Eric Anholt writes:

> Could you print which thing is trying to get resolved early? I see a
> few egl* calls within main/egl*.c (eglQueryAPI(), eglGetDisplay(),
> eglWaitClient()), and I'm wondering if not having Bsymbolic on them is
> causing an RTLD_NOW (perhaps by the glaze-find-libgl.c?) to try to call
> the ifuncs early.

The first one was eglCreateImageKHR. I tested by removing that from Glaze, and then saw some other ifunc getting resolved. I continued to remove calls (not always one by one), but never succeeded at finding any non-empty subset of egl calls that Glaze could wrap and still work without -Bsymbolic.

> We should certainly be using Bsymbolic, and I'd like to see this go to
> stable.

OK. I'll do that.

Thanks,

-Carl
Re: [Mesa-dev] [PATCH 28/29] i965: Drop random 32-bit assembly implementation of memcpy().
On 09/27/2013 04:46 PM, Kenneth Graunke wrote: > This was only used for uploading batchbuffer data, and only on 32-bit > systems. If this is actually useful, we might want to use it more > widely. But more than likely, it isn't. This probably is still useful, alas. The glibc memcpy wants to do an Atom-friendly backwards walk of the addresses. For some kinds of mappings (uncached?), this breaks write combining and ruins performance. We didn't need this for 64-bit because, at the time, there were no 64-bit Atoms. I don't know what glibc does for memcpy on 64-bit these days. We should at least measure this in some 32-bit games on a couple platforms. > Signed-off-by: Kenneth Graunke > --- > src/mesa/drivers/dri/i965/intel_batchbuffer.c | 2 +- > src/mesa/drivers/dri/i965/intel_context.h | 28 > --- > 2 files changed, 1 insertion(+), 29 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c > b/src/mesa/drivers/dri/i965/intel_batchbuffer.c > index 77f3ada..a1e5ffc 100644 > --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c > +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c > @@ -343,7 +343,7 @@ intel_batchbuffer_data(struct brw_context *brw, > { > assert((bytes & 3) == 0); > intel_batchbuffer_require_space(brw, bytes, is_blit); > - __memcpy(brw->batch.map + brw->batch.used, data, bytes); > + memcpy(brw->batch.map + brw->batch.used, data, bytes); > brw->batch.used += bytes >> 2; > } > > diff --git a/src/mesa/drivers/dri/i965/intel_context.h > b/src/mesa/drivers/dri/i965/intel_context.h > index 58a6e86..89cbb7f 100644 > --- a/src/mesa/drivers/dri/i965/intel_context.h > +++ b/src/mesa/drivers/dri/i965/intel_context.h > @@ -97,34 +97,6 @@ struct intel_batchbuffer { > }; > > /* > - * From linux kernel i386 header files, copes with odd sizes better > - * than COPY_DWORDS would: > - * XXX Put this in src/mesa/main/imports.h ??? 
> - */ > -#if defined(i386) || defined(__i386__) > -static INLINE void * __memcpy(void * to, const void * from, size_t n) > -{ > - int d0, d1, d2; > - __asm__ __volatile__( > - "rep ; movsl\n\t" > - "testb $2,%b4\n\t" > - "je 1f\n\t" > - "movsw\n" > - "1:\ttestb $1,%b4\n\t" > - "je 2f\n\t" > - "movsb\n" > - "2:" > - : "=&c" (d0), "=&D" (d1), "=&S" (d2) > - :"0" (n/4), "q" (n),"1" ((long) to),"2" ((long) from) > - : "memory"); > - return (to); > -} > -#else > -#define __memcpy(a,b,c) memcpy(a,b,c) > -#endif > - > - > -/* > * intel_context.c: > */ > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
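For readers who do not want to decode the inline assembly removed by this patch, here is a rough portable C rendering of what it does: copy n/4 DWORDs from low to high addresses ("rep ; movsl"), then a 2-byte tail if bit 1 of n is set ("movsw"), then a 1-byte tail if bit 0 is set ("movsb"). The name copy_forward is made up for this sketch. It is not a drop-in replacement: a compiler is free to turn the loop back into a call to memcpy, so it does not by itself guarantee the forward walk discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper mirroring the structure of the removed __memcpy():
 * always walks from the lowest address to the highest. */
static void *copy_forward(void *to, const void *from, size_t n)
{
   unsigned char *d = to;
   const unsigned char *s = from;
   size_t dwords = n / 4;

   /* "rep ; movsl": n/4 DWORD copies, ascending addresses.
    * The fixed-size memcpy calls avoid alignment and aliasing problems
    * and normally compile to a plain 32-bit load and store. */
   while (dwords--) {
      uint32_t tmp;
      memcpy(&tmp, s, sizeof(tmp));
      memcpy(d, &tmp, sizeof(tmp));
      d += 4;
      s += 4;
   }

   /* "testb $2 ... movsw": 2-byte tail. */
   if (n & 2) {
      d[0] = s[0];
      d[1] = s[1];
      d += 2;
      s += 2;
   }

   /* "testb $1 ... movsb": 1-byte tail. */
   if (n & 1)
      *d = *s;

   return to;
}

/* Small self-check with an odd size so both tails are exercised. */
static void copy_forward_demo(void)
{
   const char src[11] = "0123456789";
   char dst[11] = { 0 };

   assert(copy_forward(dst, src, sizeof(src)) == dst);
   assert(memcmp(dst, src, sizeof(src)) == 0);
}
```

Whether such a forward copy is still faster than the C library's memcpy on write-combined mappings is exactly the open question in this thread; the sketch only documents the copy order.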
Re: [Mesa-dev] [PATCH 29/29] i965: Merge intel_context.h into brw_context.h.
On 09/27/2013 04:46 PM, Kenneth Graunke wrote: > Signed-off-by: Kenneth Graunke > --- > src/mesa/drivers/dri/i965/brw_context.h | 95 +++- > src/mesa/drivers/dri/i965/intel_context.h | 142 > -- > 2 files changed, 93 insertions(+), 144 deletions(-) > delete mode 100644 src/mesa/drivers/dri/i965/intel_context.h > > diff --git a/src/mesa/drivers/dri/i965/brw_context.h > b/src/mesa/drivers/dri/i965/brw_context.h > index 4d4502a..d4e41a1 100644 > --- a/src/mesa/drivers/dri/i965/brw_context.h > +++ b/src/mesa/drivers/dri/i965/brw_context.h > @@ -33,15 +33,35 @@ > #ifndef BRWCONTEXT_INC > #define BRWCONTEXT_INC > > -#include "intel_context.h" > -#include "brw_structs.h" > +#include > +#include > #include "main/imports.h" > #include "main/macros.h" > +#include "main/mm.h" > +#include "main/mtypes.h" > +#include "brw_structs.h" > > #ifdef __cplusplus > extern "C" { > + /* Evil hack for using libdrm in a c++ compiler. */ > +#define virtual virt > #endif > > +#include "drm.h" > +#include "intel_bufmgr.h" > +#include "i915_drm.h" > +#ifdef __cplusplus > + #undef virtual > +} > +#endif > + > +#ifdef __cplusplus > +extern "C" { > +#endif > +#include "intel_debug.h" > +#include "intel_screen.h" > +#include "intel_tex_obj.h" > + > /* Glossary: > * > * URB - uniform resource buffer. A mid-sized buffer which is > @@ -119,6 +139,9 @@ extern "C" { > * Handles blending and (presumably) depth and stencil testing. > */ > > +#define INTEL_WRITE_PART 0x1 > +#define INTEL_WRITE_FULL 0x2 > +#define INTEL_READ0x4 > > #define BRW_MAX_CURBE(32*16) > > @@ -859,6 +882,39 @@ struct brw_query_object { > int last_index; > }; > > +struct intel_sync_object { > + struct gl_sync_object Base; > + > + /** Batch associated with this sync object */ > + drm_intel_bo *bo; > +}; > + > +struct intel_batchbuffer { > + /** Current batchbuffer being queued up. */ > + drm_intel_bo *bo; > + /** Last BO submitted to the hardware. Used for glFinish(). 
*/ > + drm_intel_bo *last_bo; > + /** BO for post-sync nonzero writes for gen6 workaround. */ > + drm_intel_bo *workaround_bo; > + bool need_workaround_flush; > + > + struct cached_batch_item *cached_items; > + > + uint16_t emit, total; > + uint16_t used, reserved_space; > + uint32_t *map; > + uint32_t *cpu_map; > +#define BATCH_SZ (8192*sizeof(uint32_t)) > + > + uint32_t state_batch_offset; > + bool is_blit; > + bool needs_sol_reset; > + > + struct { > + uint16_t used; > + int reloc_count; > + } saved; > +}; > > /** > * Data shared between brw_context::vs and brw_context::gs > @@ -1369,14 +1425,37 @@ struct brw_context >GLint x, GLint y, GLsizei width, GLsizei height); > }; > > +static INLINE bool > +is_power_of_two(uint32_t value) > +{ > + return (value & (value - 1)) == 0; > +} Gallium has util_is_power_of_two. It seems like these could be merged... > + > /*== > * brw_vtbl.c > */ > void brwInitVtbl( struct brw_context *brw ); > > +/* brw_clear.c */ > +extern void intelInitClearFuncs(struct dd_function_table *functions); > + > /*== > * brw_context.c > */ > +extern void intelFinish(struct gl_context * ctx); > + > +enum { > + DRI_CONF_BO_REUSE_DISABLED, > + DRI_CONF_BO_REUSE_ALL > +}; > + > +void intel_update_renderbuffers(__DRIcontext *context, > +__DRIdrawable *drawable); > +void intel_prepare_render(struct brw_context *brw); > + > +void intel_resolve_for_dri2_flush(struct brw_context *brw, > + __DRIdrawable *drawable); > + > bool brwCreateContext(int api, > const struct gl_config *mesaVis, > __DRIcontext *driContextPriv, > @@ -1482,6 +1561,18 @@ bool brw_is_hiz_depth_format(struct brw_context *ctx, > gl_format format); > bool brw_render_target_supported(struct brw_context *brw, > struct gl_renderbuffer *rb); > > +/* intel_extensions.c */ > +extern void intelInitExtensions(struct gl_context *ctx); > + > +/* intel_state.c */ > +extern int intel_translate_shadow_compare_func(GLenum func); > +extern int intel_translate_compare_func(GLenum func); > +extern int 
intel_translate_stencil_op(GLenum op); > +extern int intel_translate_logic_op(GLenum opcode); > + > +/* intel_syncobj.c */ > +void intel_init_syncobj_functions(struct dd_function_table *functions); > + > /* gen6_sol.c */ > void > brw_begin_transform_feedback(struct gl_context *ctx, GLenum mode, > diff --git a/src/mesa/drivers/dri/i965/intel_context.h > b/src/mesa/drivers/dri/i965/intel_context.h > deleted file mode 100644 > index 89cbb7f..000 > --- a/src/mesa/drivers/dri/i965/intel_context.h > +++ /dev/null > @@ -1,142 +0,0 @@ > -/*
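A side note on the is_power_of_two() helper quoted in this patch: the bit trick `(value & (value - 1)) == 0` also returns true for 0, because 0 has no bits set. The sketch below reproduces the quoted expression and adds a stricter variant; the name is_nonzero_power_of_two is made up here and does not come from Mesa. Whether callers ever pass 0 is not stated in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same expression as the helper in the quoted patch. */
static bool is_power_of_two(uint32_t value)
{
   return (value & (value - 1)) == 0;
}

/* Hypothetical stricter variant that rejects 0. */
static bool is_nonzero_power_of_two(uint32_t value)
{
   return value != 0 && (value & (value - 1)) == 0;
}

/* Small self-check showing where the two variants differ. */
static void power_of_two_demo(void)
{
   assert(is_power_of_two(0));            /* 0 passes the plain bit trick */
   assert(!is_nonzero_power_of_two(0));   /* the strict variant rejects it */
   assert(is_power_of_two(1024) && is_nonzero_power_of_two(1024));
   assert(!is_power_of_two(3) && !is_nonzero_power_of_two(3));
}
```

If the helpers were merged as suggested in the review, the treatment of 0 is one detail worth checking on both sides.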
Re: [Mesa-dev] Janitorial work: no more intel_context.[ch]; tidying
On 09/27/2013 04:45 PM, Kenneth Graunke wrote:
> This series combines brw_context.[ch] and intel_context.[ch],
> and cleans up our context creation code quite a bit. A bunch of
> functionality was awkwardly split between the two sets of files;
> now it's all in one place.
>
> While this series is large, it should be fairly easy reading.
> Patch 28 does have one functional change on 32-bit systems - it removes
> a handcoded assembly version of memcpy. This has not been tested.
>
> Available as 'tidying6' in my tree. Based on Eric's megadriver-prep series.

Patches 1-8, 10-14, 16, 17, 19-27, and 29 are

Reviewed-by: Ian Romanick

Patch 18 is

Acked-by: Ian Romanick

because I didn't look at the changes very closely. :)

I'll leave 9 and 15 to someone else, and we need some data for 28.
[Mesa-dev] [PATCH 2/2] [v4] i965: Use IVB specific formula for depthbuffer
After the last patch, we can replace the region allocated in the miptree creation with a more straightforward (and hopefully smaller) buffer based on the bspec's allocation formula.

Since I am relatively new to this part of the bspec, I would very much appreciate scrutiny during review of this. Some points were ambiguous to me that are likely obvious to others.

To prove the reduced [GPU] memory usage I created a simple script which polls the memory usage of the process through debugfs every 0.1 seconds. The following results show the memory usage difference over 5 runs of xonotic-glx with ultra settings. The data suggests a 10MB savings on average. I've not measured the savings on the CPU side, but I imagine some amount of savings would be present there as well.

x master/mem_usage.txt
+ mine/mem_usage.txt
      N            Min            Max         Median            Avg         Stddev
x 17121       98959360  7.3394995e+08  7.2782234e+08  7.2209615e+08       43633222
+ 17166  1.2538266e+08  7.2241562e+08    7.16288e+08  7.1071472e+08       42964578

Below is the FPS data over those same 5 tests. I'm not sure if the decrease is statistically significant to y'all. I don't have any theories about it.

x master/xonotic.fps
+ mine/xonotic.fps
      N            Min            Max         Median            Avg         Stddev
x     5      27.430746      27.524985       27.50568      27.487017    0.039439874
+     5      27.409173      27.461715      27.441207      27.440883    0.021086805

NOTE: There were a couple of places in the arithmetic where I could have taken some shortcuts. In order to make the code match the spec as much as possible, I've decided not to do this.

One shortcut I did make was the tiling type. Digging through the code it looks like you always want Y-tiled, except when it won't fit, in which case you want X-tiled. I wasn't a fan of the existing helper function since it has a few parameters that are irrelevant for this operation. I suspect people reviewing this might ask me to change this, which is fine; I just wanted to explain the motivation.

v2: copy-paste fix where I used I915_TILING_Y where I meant _X. (Topi)

v3: Updated to directly use the bo/stride instead of intel_region. (Ken, Chad)
Fix the reference count leak on the hiz buffer (Chad)
Don't allow fallback to old mt allocation. It should never happen. (Ben)
Break out hz_depth/width calculation to separate functions. (Ben)
Use cpp = 1, since the calculation takes cpp into account (Ben)

x head/xonotic
+ mine/xonotic
      N            Min            Max         Median            Avg         Stddev
x     5      25.683336      25.898164      25.872499      25.842426    0.089829019
+     5      25.841368      25.934931      25.869051      25.877494    0.039885576

x head/memusage
+ mine/memusage
      N            Min            Max         Median            Avg         Stddev
x 18036       89432064  8.6380954e+08  7.9515648e+08   7.930405e+08       42774265
+ 18030       86548480  8.6262989e+08  7.8178714e+08  7.7978462e+08       42099587

v4: Don't make the physical size calculation. It is unnecessary and just confusion on my part (Chad)

BO size before: 10485760
BO size after: 1310720
This saves 8.75MB; the new BO is 1/8 the original size. I can recalculate the average again if requested.

CC: Chad Versace
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67564
Signed-off-by: Ben Widawsky
---
src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 161 +++---
src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 2 +-
2 files changed, 143 insertions(+), 20 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index e1da9de..7430ba4 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -793,8 +793,12 @@ intel_miptree_release(struct intel_mipmap_tree **mt) intel_region_release(&((*mt)->region)); intel_miptree_release(&(*mt)->stencil_mt); - intel_miptree_release(&(*mt)->hiz_buffer.mt); - (*mt)->hiz_buffer.bo = NULL; + if (&(*mt)->hiz_buffer.mt) + intel_miptree_release(&(*mt)->hiz_buffer.mt); + else { + drm_intel_bo_unreference((*mt)->hiz_buffer.bo); +(*mt)->hiz_buffer.bo = NULL; + } intel_miptree_release(&(*mt)->mcs_mt); intel_miptree_release(&(*mt)->singlesample_mt); 
intel_resolve_map_clear(&(*mt)->hiz_map); @@ -1271,30 +1275,149 @@ intel_miptree_slice_enable_hiz(struct brw_context *brw, return true; } +static unsigned int calculate_z_height(const struct intel_mipmap_tree *mt, + const int level) +{ + unsigned int height = minify(mt->logical_height0, level); + + /* The value of Z_Height and Z_Width must each be multiplied by 2 before +* being applied to the table below if Number of Multisamples is set to +* NUMSAMPLES_4. The value of Z_Height must be multiplied by 2 and Z_Width +* must be multiplied by 4 before being applied to the table below if Number +
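The BO sizes quoted in the v4 note of this message can be checked with a few lines of arithmetic. The helper names below are made up for this sketch; only the two byte counts come from the message. MB is taken here to mean 2^20 bytes, which is an assumption, although it is the reading under which the numbers come out exact.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for checking the quoted BO sizes. */
static uint64_t bytes_saved(uint64_t before, uint64_t after)
{
   return before - after;
}

static double bytes_to_mib(uint64_t bytes)
{
   return (double)bytes / (1024.0 * 1024.0);
}

/* Small self-check using the sizes from the v4 note. */
static void bo_size_demo(void)
{
   const uint64_t before = 10485760;   /* "BO size before" */
   const uint64_t after = 1310720;     /* "BO size after" */

   assert(bytes_saved(before, after) == 9175040);
   assert(bytes_to_mib(bytes_saved(before, after)) == 8.75);
   assert(before / after == 8 && before % after == 0);
}
```

So the old buffer was 10 MiB, the new one is 1.25 MiB (one eighth of the old size), and the difference is 8.75 MiB.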
Re: [Mesa-dev] [PATCH V4 00/13] ARB_texture_gather
On 09/30/2013 03:08 AM, Chris Forbes wrote:
> This series adds support for ARB_texture_gather in core mesa and in i965 for
> Gen7+.
> Notable changes from V3:
>
> - Only emit extra surface state, recompiles, etc if the shader actually uses
> gather4.
> - Use SCS to accomplish the workaround on Haswell [will need testing]
>
> Cc: Kenneth Graunke

With the two small issues in patches 1 and 2 fixed (I accidentally sent those comments in reply to the V3 patches), those patches are

Reviewed-by: Ian Romanick
Re: [Mesa-dev] [PATCH 28/29] i965: Drop random 32-bit assembly implementation of memcpy().
On Tue, Oct 1, 2013 at 2:27 AM, Ian Romanick wrote:
> On 09/27/2013 04:46 PM, Kenneth Graunke wrote:
>> This was only used for uploading batchbuffer data, and only on 32-bit
>> systems. If this is actually useful, we might want to use it more
>> widely. But more than likely, it isn't.
>
> This probably is still useful, alas. The glibc memcpy wants to do an
> Atom-friendly backwards walk of the addresses.

Erm... just curious: are you sure this is done for Atom? Originally such copy-from-highest-to-lowest-address copying was done to enable overlapping copies, but at least POSIX does not require |memcpy()| to support overlapping copies, and users should use |memmove()| instead in such cases. For example, Solaris uses the POSIX interpretation here, AFAIK Apple OSX even hits you with an |abort()| if you attempt an overlapping copy with |memcpy()| or |strcpy()|, and AFAIK "valgrind" will complain about such abuse of |memcpy()|/|strcpy()|/|stpcpy()|, too.

> For some kinds of
> mappings (uncached?), this breaks write combining and ruins performance.

That more or less breaks performance _everywhere_ because automatic prefetch obtains the next cache line and not the previous one.

Bye,
Roland

--
roland.ma...@nrubsig.org
MPEG specialist, C&&JAVA&&Sun&&Unix programmer
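The memcpy()/memmove() distinction raised in this message can be shown with a minimal sketch. The helper name shift_right is made up for illustration. Shifting data to the right within one buffer makes source and destination overlap, which is undefined behaviour for memcpy() in standard C, so the sketch uses memmove(), which is specified to handle overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: move the first len bytes of buf to buf + shift.
 * Source and destination overlap whenever shift < len, so memmove() is
 * required; memcpy() would be undefined behaviour here. The caller must
 * make sure buf has room for len + shift bytes. */
static void shift_right(char *buf, size_t len, size_t shift)
{
   memmove(buf + shift, buf, len);
}

/* Small self-check: "abcdef" shifted right by 2 inside a larger buffer. */
static void shift_right_demo(void)
{
   char buf[16] = "abcdef";

   shift_right(buf, 6, 2);
   /* Bytes 0..1 are untouched, bytes 2..7 hold the moved data. */
   assert(strcmp(buf, "ababcdef") == 0);
}
```

A copy that walks from high to low addresses happens to give the right result for this particular overlap, which is the historical reason for such implementations mentioned above; code that relies on it is still incorrect when it calls memcpy().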
Re: [Mesa-dev] [PATCH 01/29] i965: Rename brwCreateContext's error parameter to dri_ctx_error.
On 09/27/2013 04:45 PM, Kenneth Graunke wrote: > "error" is a very generic name. dri_ctx_error is the name used in > intelInitContext(), which is more specific. > > Signed-off-by: Kenneth Graunke > --- > src/mesa/drivers/dri/i965/brw_context.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_context.c > b/src/mesa/drivers/dri/i965/brw_context.c > index 230e0bb..75034d3 100644 > --- a/src/mesa/drivers/dri/i965/brw_context.c > +++ b/src/mesa/drivers/dri/i965/brw_context.c > @@ -281,7 +281,7 @@ brwCreateContext(int api, > unsigned major_version, > unsigned minor_version, > uint32_t flags, > - unsigned *error, > + unsigned *dri_ctx_error, >void *sharedContextPrivate) If you were to sneak in fixing this one line of bad whitespace, I'd look the other way... > { > __DRIscreen *sPriv = driContextPriv->driScreenPriv; > @@ -291,7 +291,7 @@ brwCreateContext(int api, > struct brw_context *brw = rzalloc(NULL, struct brw_context); > if (!brw) { >printf("%s: failed to alloc context\n", __FUNCTION__); > - *error = __DRI_CTX_ERROR_NO_MEMORY; > + *dri_ctx_error = __DRI_CTX_ERROR_NO_MEMORY; >return false; > } > > @@ -309,7 +309,7 @@ brwCreateContext(int api, > if (!intelInitContext( brw, api, major_version, minor_version, >mesaVis, driContextPriv, > sharedContextPrivate, &functions, > - error)) { > + dri_ctx_error)) { >intelDestroyContext(driContextPriv); >return false; > } > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 14/29] i965: Remove the brw_context::emit_state_always flag.
On 09/27/2013 04:45 PM, Kenneth Graunke wrote: > This was always set to false, and is only used for debugging. > To enable it, simply change the if (0) block and recompile. So, the difference is that you could change emit_state_always in GDB, but I'm not sure that matters. > Signed-off-by: Kenneth Graunke > --- > src/mesa/drivers/dri/i965/brw_context.c | 2 -- > src/mesa/drivers/dri/i965/brw_context.h | 2 -- > src/mesa/drivers/dri/i965/brw_state_upload.c | 3 ++- > 3 files changed, 2 insertions(+), 5 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_context.c > b/src/mesa/drivers/dri/i965/brw_context.c > index 125dbec..cca7145 100644 > --- a/src/mesa/drivers/dri/i965/brw_context.c > +++ b/src/mesa/drivers/dri/i965/brw_context.c > @@ -471,8 +471,6 @@ brwCreateContext(int api, > */ > STATIC_ASSERT(BRW_NUM_STATE_BITS <= 8 * sizeof(brw->state.dirty.brw)); > > - brw->emit_state_always = 0; > - > brw->batch.need_workaround_flush = true; > > ctx->VertexProgram._MaintainTnlProgram = true; > diff --git a/src/mesa/drivers/dri/i965/brw_context.h > b/src/mesa/drivers/dri/i965/brw_context.h > index 656fb3c..7e15186 100644 > --- a/src/mesa/drivers/dri/i965/brw_context.h > +++ b/src/mesa/drivers/dri/i965/brw_context.h > @@ -1021,8 +1021,6 @@ struct brw_context > > uint32_t max_gtt_map_object_size; > > - bool emit_state_always; > - > int gen; > int gt; > > diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c > b/src/mesa/drivers/dri/i965/brw_state_upload.c > index d7fe319..60c8b5e 100644 > --- a/src/mesa/drivers/dri/i965/brw_state_upload.c > +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c > @@ -471,7 +471,8 @@ void brw_upload_state(struct brw_context *brw) > state->brw |= ctx->NewDriverState; > ctx->NewDriverState = 0; > > - if (brw->emit_state_always) { > + if (0) { > + /* Always re-emit all state. 
*/ >state->mesa |= ~0; >state->brw |= ~0; >state->cache |= ~0; > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 20/29] i965: Make brwInitFunctions take brw_context rather than intel_screen.
On 09/27/2013 04:45 PM, Kenneth Graunke wrote: > It actually just wants generation checking, and brw->gen is the usual > way of doing that. In the future, we'll also want to check brw->hw_ctx, > which isn't available from the screen. > > While we're changing the function signature, convert from studly caps to > our usual naming conventions. It was camel case (humps in the middle). Studly caps starts with a cap. Because it's studly. > Signed-off-by: Kenneth Graunke > --- > src/mesa/drivers/dri/i965/brw_context.c | 12 ++-- > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_context.c > b/src/mesa/drivers/dri/i965/brw_context.c > index 6b6bea8..41117cb 100644 > --- a/src/mesa/drivers/dri/i965/brw_context.c > +++ b/src/mesa/drivers/dri/i965/brw_context.c > @@ -209,8 +209,8 @@ intelFinish(struct gl_context * ctx) > } > > static void > -brwInitDriverFunctions(struct intel_screen *screen, > - struct dd_function_table *functions) > +brw_init_driver_functions(struct brw_context *brw, > + struct dd_function_table *functions) > { > _mesa_init_driver_functions(functions); > > @@ -232,14 +232,14 @@ brwInitDriverFunctions(struct intel_screen *screen, > > brwInitFragProgFuncs( functions ); > brw_init_common_queryobj_functions(functions); > - if (screen->devinfo->gen >= 6) > + if (brw->gen >= 6) >gen6_init_queryobj_functions(functions); > else >gen4_init_queryobj_functions(functions); > > functions->QuerySamplesForFormat = brw_query_samples_for_format; > > - if (screen->devinfo->gen >= 7) { > + if (brw->gen >= 7) { >functions->BeginTransformFeedback = gen7_begin_transform_feedback; >functions->EndTransformFeedback = gen7_end_transform_feedback; > } else { > @@ -247,7 +247,7 @@ brwInitDriverFunctions(struct intel_screen *screen, >functions->EndTransformFeedback = brw_end_transform_feedback; > } > > - if (screen->devinfo->gen >= 6) > + if (brw->gen >= 6) >functions->GetSamplePosition = gen6_get_sample_position; > } > > @@ -516,7 +516,7 @@ 
brwCreateContext(int api, > > brwInitVtbl( brw ); > > - brwInitDriverFunctions(screen, &functions); > + brw_init_driver_functions(brw, &functions); > > struct gl_context *ctx = &brw->ctx; > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/29] i965: Move device quirks to brw_device_info.
On 09/27/2013 04:45 PM, Kenneth Graunke wrote: > Signed-off-by: Kenneth Graunke > --- > src/mesa/drivers/dri/i965/brw_context.c | 10 +++--- > src/mesa/drivers/dri/i965/brw_device_info.c | 9 - > src/mesa/drivers/dri/i965/brw_device_info.h | 16 > 3 files changed, 27 insertions(+), 8 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_context.c > b/src/mesa/drivers/dri/i965/brw_context.c > index 266f504..53073aa 100644 > --- a/src/mesa/drivers/dri/i965/brw_context.c > +++ b/src/mesa/drivers/dri/i965/brw_context.c > @@ -362,6 +362,9 @@ brwCreateContext(int api, > brw->has_llc = devinfo->has_llc; > brw->has_hiz = devinfo->has_hiz_and_separate_stencil; > brw->has_separate_stencil = devinfo->has_hiz_and_separate_stencil; > + brw->has_negative_rhw_bug = devinfo->has_negative_rhw_bug; > + brw->needs_unlit_centroid_workaround = > + devinfo->needs_unlit_centroid_workaround; > > brw->must_use_separate_stencil = screen->hw_must_use_separate_stencil; > brw->has_swizzling = screen->hw_has_swizzling; > @@ -451,13 +454,6 @@ brwCreateContext(int api, > if (brw->gen == 6) >brw->urb.gen6_gs_previously_active = false; > > - if (brw->gen == 4 && !brw->is_g4x) > - brw->has_negative_rhw_bug = true; > - > - if (brw->gen <= 7) { > - brw->needs_unlit_centroid_workaround = true; > - } > - > brw->prim_restart.in_progress = false; > brw->prim_restart.enable_cut_index = false; > > diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c > b/src/mesa/drivers/dri/i965/brw_device_info.c > index 7dad8ba8..a215917 100644 > --- a/src/mesa/drivers/dri/i965/brw_device_info.c > +++ b/src/mesa/drivers/dri/i965/brw_device_info.c > @@ -27,6 +27,8 @@ > > static const struct brw_device_info brw_device_info_i965 = { > .gen = 4, > + .has_negative_rhw_bug = true, > + .needs_unlit_centroid_workaround = true, > .max_vs_threads = 16, > .max_gs_threads = 2, > .max_wm_threads = 8 * 4, > @@ -37,6 +39,7 @@ static const struct brw_device_info brw_device_info_i965 = { > > static const struct brw_device_info 
brw_device_info_g4x = { > .gen = 4, > + .needs_unlit_centroid_workaround = true, > .is_g4x = true, > .max_vs_threads = 32, > .max_gs_threads = 2, > @@ -48,6 +51,7 @@ static const struct brw_device_info brw_device_info_g4x = { > > static const struct brw_device_info brw_device_info_ilk = { > .gen = 5, > + .needs_unlit_centroid_workaround = true, > .max_vs_threads = 72, > .max_gs_threads = 32, > .max_wm_threads = 12 * 6, > @@ -61,6 +65,7 @@ static const struct brw_device_info brw_device_info_snb_gt1 > = { > .gt = 2, > .has_hiz_and_separate_stencil = true, > .has_llc = true, > + .needs_unlit_centroid_workaround = true, > .max_vs_threads = 24, > .max_gs_threads = 21, /* conservative; 24 if rendering disabled. */ > .max_wm_threads = 40, > @@ -77,6 +82,7 @@ static const struct brw_device_info brw_device_info_snb_gt2 > = { > .gt = 2, > .has_hiz_and_separate_stencil = true, > .has_llc = true, > + .needs_unlit_centroid_workaround = true, > .max_vs_threads = 60, > .max_gs_threads = 60, > .max_wm_threads = 80, > @@ -92,7 +98,8 @@ static const struct brw_device_info brw_device_info_snb_gt2 > = { > .gen = 7,\ > .has_hiz_and_separate_stencil = true,\ > .must_use_separate_stencil = true, \ > - .has_llc = true > + .has_llc = true, \ > + .needs_unlit_centroid_workaround = true > > static const struct brw_device_info brw_device_info_ivb_gt1 = { > GEN7_FEATURES, .is_ivybridge = true, .gt = 1, > diff --git a/src/mesa/drivers/dri/i965/brw_device_info.h > b/src/mesa/drivers/dri/i965/brw_device_info.h > index 0f4c282..39f4d57 100644 > --- a/src/mesa/drivers/dri/i965/brw_device_info.h > +++ b/src/mesa/drivers/dri/i965/brw_device_info.h > @@ -41,6 +41,22 @@ struct brw_device_info > bool has_llc; > > /** > +* Quirks: > +* @{ > +*/ > + bool has_negative_rhw_bug; > + > + /** > +* Some versions of Gen hardware don't do centroid interpolation correctly > +* on unlit pixels, causing incorrect values for derivatives near triangle > +* edges. 
Enabling this flag causes the fragment shader to use > +* non-centroid interpolation for unlit pixels, at the expense of two > extra > +* fragment shader instructions. > +*/ > + bool needs_unlit_centroid_workaround; > + /** @} */ I believe you want only one * here. > + > + /** > * GPU Limits: > * @{ > */ > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 02/29] i965: Pull out INTEL_DEBUG handling into new intel_debug.[ch] files.
On 09/27/2013 04:45 PM, Kenneth Graunke wrote:
> Now that there isn't an intel_context structure, the split between
> brw_context.[ch] and intel_context.[ch] is rather awkward and arbitrary.
> Removing intel_context.[ch] seems desirable, but not everything really
> belongs in brw_context.[ch], either.
>
> Moving INTEL_DEBUG handling into separate intel_debug.[ch] files should
> make them relatively easy to find.

I'm not sure I have a preference, but it seems like this should be
brw_debug.[ch].

> Signed-off-by: Kenneth Graunke
> ---
>  src/mesa/drivers/dri/i965/Makefile.sources |   1 +
>  src/mesa/drivers/dri/i965/intel_context.c  |  53 +-
>  src/mesa/drivers/dri/i965/intel_context.h  |  77 +---
>  src/mesa/drivers/dri/i965/intel_debug.c    |  88 +++
>  src/mesa/drivers/dri/i965/intel_debug.h    | 109 +
>  5 files changed, 200 insertions(+), 128 deletions(-)
>  create mode 100644 src/mesa/drivers/dri/i965/intel_debug.c
>  create mode 100644 src/mesa/drivers/dri/i965/intel_debug.h
>
> diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
> index f521daa..8da5643 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.sources
> +++ b/src/mesa/drivers/dri/i965/Makefile.sources
> @@ -8,6 +8,7 @@ i965_FILES = \
>  	intel_buffer_objects.c \
>  	intel_buffers.c \
>  	intel_context.c \
> +	intel_debug.c \
>  	intel_extensions.c \
>  	intel_fbo.c \
>  	intel_mipmap_tree.c \
> diff --git a/src/mesa/drivers/dri/i965/intel_context.c b/src/mesa/drivers/dri/i965/intel_context.c
> index 850d9a0..5a73e28 100644
> --- a/src/mesa/drivers/dri/i965/intel_context.c
> +++ b/src/mesa/drivers/dri/i965/intel_context.c
> @@ -55,11 +55,6 @@
>  #include "utils.h"
>  #include "../glsl/ralloc.h"
>
> -#ifndef INTEL_DEBUG
> -int INTEL_DEBUG = (0);
> -#endif
> -
> -
>  static const GLubyte *
>  intelGetString(struct gl_context * ctx, GLenum name)
>  {
> @@ -304,40 +299,6 @@ intel_viewport(struct gl_context *ctx, GLint x, GLint y, GLsizei w, GLsizei h)
>     }
>  }
>
> -static const struct dri_debug_control debug_control[] = {
> -   { "tex",   DEBUG_TEXTURE},
> -   { "state", DEBUG_STATE},
> -   { "ioctl", DEBUG_IOCTL},
> -   { "blit",  DEBUG_BLIT},
> -   { "mip",   DEBUG_MIPTREE},
> -   { "fall",  DEBUG_PERF},
> -   { "perf",  DEBUG_PERF},
> -   { "bat",   DEBUG_BATCH},
> -   { "pix",   DEBUG_PIXEL},
> -   { "buf",   DEBUG_BUFMGR},
> -   { "reg",   DEBUG_REGION},
> -   { "fbo",   DEBUG_FBO},
> -   { "fs",    DEBUG_WM },
> -   { "gs",    DEBUG_GS},
> -   { "sync",  DEBUG_SYNC},
> -   { "prim",  DEBUG_PRIMS },
> -   { "vert",  DEBUG_VERTS },
> -   { "dri",   DEBUG_DRI },
> -   { "sf",    DEBUG_SF },
> -   { "stats", DEBUG_STATS },
> -   { "wm",    DEBUG_WM },
> -   { "urb",   DEBUG_URB },
> -   { "vs",    DEBUG_VS },
> -   { "clip",  DEBUG_CLIP },
> -   { "aub",   DEBUG_AUB },
> -   { "shader_time", DEBUG_SHADER_TIME },
> -   { "no16",  DEBUG_NO16 },
> -   { "blorp", DEBUG_BLORP },
> -   { "vue",   DEBUG_VUE },
> -   { NULL,    0 }
> -};
> -
> -
>  static void
>  intelInvalidateState(struct gl_context * ctx, GLuint new_state)
>  {
> @@ -517,19 +478,7 @@ intelInitContext(struct brw_context *brw,
>
>     intelInitExtensions(ctx);
>
> -   INTEL_DEBUG = driParseDebugString(getenv("INTEL_DEBUG"), debug_control);
> -   if (INTEL_DEBUG & DEBUG_BUFMGR)
> -      dri_bufmgr_set_debug(brw->bufmgr, true);
> -   if ((INTEL_DEBUG & DEBUG_SHADER_TIME) && brw->gen < 7) {
> -      fprintf(stderr,
> -              "shader_time debugging requires gen7 (Ivybridge) or better.\n");
> -      INTEL_DEBUG &= ~DEBUG_SHADER_TIME;
> -   }
> -   if (INTEL_DEBUG & DEBUG_PERF)
> -      brw->perf_debug = true;
> -
> -   if (INTEL_DEBUG & DEBUG_AUB)
> -      drm_intel_bufmgr_gem_set_aub_dump(brw->bufmgr, true);
> +   brw_process_intel_debug_variable(brw);
>
>     intel_batchbuffer_init(brw);
>
> diff --git a/src/mesa/drivers/dri/i965/intel_context.h b/src/mesa/drivers/dri/i965/intel_context.h
> index f35dafa..9ec8c63 100644
> --- a/src/mesa/drivers/dri/i965/intel_context.h
> +++ b/src/mesa/drivers/dri/i965/intel_context.h
> @@ -44,6 +44,7 @@ extern "C" {
>  #include "intel_bufmgr.h"
>
>  #include "intel_screen.h"
> +#include "intel_debug.h"
>  #include "intel_tex_obj.h"
>  #include "i915_drm.h"
>
> @@ -160,82 +161,6 @@ static INLINE void * __memcpy(void * to, const void * from, size_t n)
>
>
>  /*
> - * Debugging:
> - */
> -extern int INTEL_DEBUG;
> -
> -#define DEBUG_TEXTURE   0x1
> -#define DEBUG_STATE     0x2
> -#define DEBUG_IOCTL     0x4
> -#define DEBUG_BLIT      0x8
> -#define DEBUG_MIPTREE   0x10
> -#define DEBUG_PERF      0x20
> -#define DEBUG_BATCH     0x80
> -#define DEBUG_PIXEL     0x100
> -#define DEBUG_BUFMGR    0x200
> -#define DEBUG_REGION    0x400
> -#define DEBUG_FBO       0x800
> -#define DEBUG_GS0x100
Re: [Mesa-dev] Janitorial work: no more intel_context.[ch]; tidying
On 09/27/2013 06:24 PM, Emil Velikov wrote:
> On 28/09/13 00:45, Kenneth Graunke wrote:
>> This series combines brw_context.[ch] and intel_context.[ch],
>> and cleans up our context creation code quite a bit.  A bunch of
>> functionality was awkwardly split between the two sets of files;
>> now it's all in one place.
>>
>> While this series is large, it should be fairly easy reading.
>> Patch 28 does have one functional change on 32-bit systems - it removes
>> a handcoded assembly version of memcpy.  This has not been tested.
>>
> Hi Kenneth
>
> Hope you can bear with me and a couple of silly questions :)
>
> * With the recent split of the intel driver codebase, the new i965
> headers have been getting a bunch of #pragma once over the standard
> #ifndef _HEADER_H_... Are those intentional?

I started using that some in the GLSL compiler code.  The pragma is not a
standard part of C++ or C99, but it is supported by GCC, clang, and the MS
compilers[1].  According to the web (which is never wrong), the pragma is
faster on MS compilers and roughly the same on GCC.

I chose it for a couple of reasons:

- It was less to type, and it was less error prone.

- When changing names of .h files, which I did a lot in the early
  compiler days, I didn't have to change anything in the file.

FWIW... the Google style guide[2] recommends against pragma once, but the
Chromium style guide[3] recommends using both pragma once and the
ifdef/define/endif guards.

We should probably make some decision about this as a group and add the
recommendation to Mesa's style guide.

[1] http://en.wikipedia.org/wiki/Pragma_once
[2] http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml?showone=Windows_Code#Windows_Code
[3] http://dev.chromium.org/developers/coding-style

> * In patch 29 the drm* headers are included quoted, over angle brackets.
> I realise that's a very pedantic point, just curious: is it just a
> copy'n'paste thing or was it planned?
>
> * The inline function is_power_of_two() in patch 29 is used by both
> intel drivers. Possibly move it to macros.h? Gallium has its
> equivalent in auxiliary/util/u_math.h - util_is_power_of_two()
>
> Thanks
> Emil
>> Available as 'tidying6' in my tree.  Based on Eric's megadriver-prep series.
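For readers unfamiliar with the two header-protection styles discussed in this message, here is a minimal sketch of a header that uses both at once, which is the combination the reply attributes to the Chromium guide. The file name, guard macro and function name are made up for illustration, and the power-of-two helper is a common bit trick written from scratch, not a copy of the i965 or Gallium function.

```c
/* example_util.h: hypothetical header, for illustration only */
#pragma once               /* non-standard, but widely supported */
#ifndef EXAMPLE_UTIL_H     /* classic guard, works on any conforming compiler */
#define EXAMPLE_UTIL_H

#include <stdbool.h>
#include <stdint.h>

/* True when at most one bit is set.  Note that this simple form also
 * returns true for 0; callers that care must check for 0 separately. */
static inline bool
example_is_power_of_two(uint32_t value)
{
   return (value & (value - 1)) == 0;
}

#endif /* EXAMPLE_UTIL_H */
```

With only the pragma, renaming the file needs no edit inside it; with the classic guard, the macro name usually has to follow the file name, which is the maintenance cost mentioned above.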
Re: [Mesa-dev] [PATCH V3 02/11] glsl: add texture gather changes
On 09/15/2013 02:58 AM, Chris Forbes wrote:
> From: Maxence Le Dore
>
> V2 [Chris Forbes]:
>    - Add new pattern, fixup parameter reading.
>
> V3: Rebase onto new builtins machinery
>
> Reviewed-by: Kenneth Graunke
> ---
>  src/glsl/builtin_functions.cpp      | 35 +++
>  src/glsl/glcpp/glcpp-parse.y        |  3 +++
>  src/glsl/glsl_parser_extras.cpp     |  1 +
>  src/glsl/glsl_parser_extras.h       |  2 ++
>  src/glsl/ir.cpp                     |  2 +-
>  src/glsl/ir.h                       |  4 +++-
>  src/glsl/ir_clone.cpp               |  1 +
>  src/glsl/ir_hv_accept.cpp           |  1 +
>  src/glsl/ir_print_visitor.cpp       |  3 ++-
>  src/glsl/ir_reader.cpp              |  6 +-
>  src/glsl/ir_rvalue_visitor.cpp      |  1 +
>  src/glsl/opt_tree_grafting.cpp      |  1 +
>  src/glsl/standalone_scaffolding.cpp |  1 +
>  src/mesa/program/ir_to_mesa.cpp     |  5 +
>  14 files changed, 62 insertions(+), 4 deletions(-)
>
> diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
> index 528af0d..a7d454c 100644
> --- a/src/glsl/builtin_functions.cpp
> +++ b/src/glsl/builtin_functions.cpp
> @@ -262,6 +262,13 @@ texture_query_lod(const _mesa_glsl_parse_state *state)
>            state->ARB_texture_query_lod_enable;
>  }
>
> +static bool
> +texture_gather(const _mesa_glsl_parse_state *state)
> +{
> +   return state->is_version(400, 0) ||
> +          state->ARB_texture_gather_enable;
> +}
> +
>  /* Desktop GL or OES_standard_derivatives + fragment shader only */
>  static bool
>  fs_oes_derivatives(const _mesa_glsl_parse_state *state)
> @@ -1807,6 +1814,34 @@ builtin_builder::create_builtins()
>                  _texture(ir_txd, shader_texture_lod_and_rect, glsl_type::vec4_type, glsl_type::sampler2DRectShadow_type, glsl_type::vec4_type, TEX_PROJECT),
>                  NULL);
>
> +   add_function("textureGather",
> +                _texture(ir_tg4, texture_gather, glsl_type::vec4_type, glsl_type::sampler2D_type, glsl_type::vec2_type),
> +                _texture(ir_tg4, texture_gather, glsl_type::ivec4_type, glsl_type::isampler2D_type, glsl_type::vec2_type),
> +                _texture(ir_tg4, texture_gather, glsl_type::uvec4_type, glsl_type::usampler2D_type, glsl_type::vec2_type),
> +
> +                _texture(ir_tg4, texture_gather, glsl_type::vec4_type, glsl_type::sampler2DArray_type, glsl_type::vec3_type),
> +                _texture(ir_tg4, texture_gather, glsl_type::ivec4_type, glsl_type::isampler2DArray_type, glsl_type::vec3_type),
> +                _texture(ir_tg4, texture_gather, glsl_type::uvec4_type, glsl_type::usampler2DArray_type, glsl_type::vec3_type),
> +
> +                _texture(ir_tg4, texture_gather, glsl_type::vec4_type, glsl_type::samplerCube_type, glsl_type::vec3_type),
> +                _texture(ir_tg4, texture_gather, glsl_type::ivec4_type, glsl_type::isamplerCube_type, glsl_type::vec3_type),
> +                _texture(ir_tg4, texture_gather, glsl_type::uvec4_type, glsl_type::usamplerCube_type, glsl_type::vec3_type),
> +
> +                _texture(ir_tg4, texture_gather, glsl_type::vec4_type, glsl_type::samplerCubeArray_type, glsl_type::vec4_type),
> +                _texture(ir_tg4, texture_gather, glsl_type::ivec4_type, glsl_type::isamplerCubeArray_type, glsl_type::vec4_type),
> +                _texture(ir_tg4, texture_gather, glsl_type::uvec4_type, glsl_type::usamplerCubeArray_type, glsl_type::vec4_type),
> +                NULL);
> +
> +   add_function("textureGatherOffset",
> +                _texture(ir_tg4, texture_gather, glsl_type::vec4_type, glsl_type::sampler2D_type, glsl_type::vec2_type, TEX_OFFSET),
> +                _texture(ir_tg4, texture_gather, glsl_type::ivec4_type, glsl_type::isampler2D_type, glsl_type::vec2_type, TEX_OFFSET),
> +                _texture(ir_tg4, texture_gather, glsl_type::uvec4_type, glsl_type::usampler2D_type, glsl_type::vec2_type, TEX_OFFSET),
> +
> +                _texture(ir_tg4, texture_gather, glsl_type::vec4_type, glsl_type::sampler2DArray_type, glsl_type::vec3_type, TEX_OFFSET),
> +                _texture(ir_tg4, texture_gather, glsl_type::ivec4_type, glsl_type::isampler2DArray_type, glsl_type::vec3_type, TEX_OFFSET),
> +                _texture(ir_tg4, texture_gather, glsl_type::uvec4_type, glsl_type::usampler2DArray_type, glsl_type::vec3_type, TEX_OFFSET),
> +                NULL);
> +
>     F(dFdx)
>     F(dFdy)
>     F(fwidth)
> diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
> index 6eaa5f9..c7ad3e9 100644
> --- a/src/glsl/glcpp/glcpp-parse.y
> +++ b/src/glsl/glcpp/glcpp-parse.y
> @@ -1248,6 +1248,9 @@ glcpp_parser_create (const struct gl_extensions *extensions, int api)
>
>  	if (extensions->EXT_shader_integer_mix)
>  	   add_builtin_define(parser, "GL_EXT_shader_integer_mix", 1);
> +
> +	if (extensions->ARB_texture_gather)
> +
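As background for the builtins registered in this patch: textureGather returns one component from each of the four texels that bilinear filtering would read, rather than a blended value. The following CPU-side sketch shows that footprint for a single-channel 2D texture. It is an illustration under stated assumptions, not driver code: the `sketch_` names are hypothetical, addressing is clamp-to-edge, and the component order (x = (i0, j1), y = (i1, j1), z = (i1, j0), w = (i0, j0)) follows my reading of the ARB_texture_gather specification.

```c
/* Largest integer <= x, written out by hand to avoid depending on libm.
 * Only intended for the small magnitudes used in texture addressing. */
static int
sketch_floor(float x)
{
   int i = (int)x;               /* truncates toward zero */
   return ((float)i > x) ? i - 1 : i;
}

static int
sketch_clamp(int v, int lo, int hi)
{
   return v < lo ? lo : (v > hi ? hi : v);
}

/* Emulate a single-channel gather on a width x height texture stored
 * row-major in texels[], with clamp-to-edge addressing.
 * (s, t) are normalized coordinates in [0, 1]. */
static void
sketch_texture_gather(const float *texels, int width, int height,
                      float s, float t, float out[4])
{
   /* Lower-left texel of the 2x2 footprint used by bilinear filtering. */
   int i0 = sketch_floor(s * (float)width - 0.5f);
   int j0 = sketch_floor(t * (float)height - 0.5f);
   int i1 = sketch_clamp(i0 + 1, 0, width - 1);
   int j1 = sketch_clamp(j0 + 1, 0, height - 1);
   i0 = sketch_clamp(i0, 0, width - 1);
   j0 = sketch_clamp(j0, 0, height - 1);

   out[0] = texels[j1 * width + i0];   /* x: (i0, j1) */
   out[1] = texels[j1 * width + i1];   /* y: (i1, j1) */
   out[2] = texels[j0 * width + i1];   /* z: (i1, j0) */
   out[3] = texels[j0 * width + i0];   /* w: (i0, j0) */
}
```

The textureGatherOffset variants would add an integer texel offset to i0 and j0 before clamping; that step is left out here to keep the sketch short.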
Re: [Mesa-dev] [PATCH V3 01/11] mesa: add texture gather changes
On 09/15/2013 02:58 AM, Chris Forbes wrote:
> From: Maxence Le Dore
>
> Reviewed-by: Kenneth Graunke
> ---
>  src/mapi/glapi/gen/ARB_texture_gather.xml | 14 ++
>  src/mapi/glapi/gen/gl_API.xml             |  2 +-
>  src/mesa/main/context.c                   |  4
>  src/mesa/main/extensions.c                |  1 +
>  src/mesa/main/get.c                       |  1 +
>  src/mesa/main/get_hash_params.py          |  6 ++
>  src/mesa/main/mtypes.h                    |  6 ++
>  src/mesa/main/tests/enum_strings.cpp      |  3 +++
>  8 files changed, 36 insertions(+), 1 deletion(-)
>  create mode 100644 src/mapi/glapi/gen/ARB_texture_gather.xml
>
> diff --git a/src/mapi/glapi/gen/ARB_texture_gather.xml b/src/mapi/glapi/gen/ARB_texture_gather.xml
> new file mode 100644
> index 000..cd331ac
> --- /dev/null
> +++ b/src/mapi/glapi/gen/ARB_texture_gather.xml
> @@ -0,0 +1,14 @@
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> \ No newline at end of file

Add the missing newline. :)

> diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
> index 71aa9a7..b1dcf13 100644
> --- a/src/mapi/glapi/gen/gl_API.xml
> +++ b/src/mapi/glapi/gen/gl_API.xml
> @@ -8189,7 +8189,7 @@
>
>
>  xmlns:xi="http://www.w3.org/2001/XInclude"/>
> -
> + xmlns:xi="http://www.w3.org/2001/XInclude"/>
>
>
>
> diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
> index d726d11..ab8137c 100644
> --- a/src/mesa/main/context.c
> +++ b/src/mesa/main/context.c
> @@ -645,6 +645,10 @@ _mesa_init_constants(struct gl_context *ctx)
>     ctx->Const.MinProgramTexelOffset = -8;
>     ctx->Const.MaxProgramTexelOffset = 7;
>
> +   /* GL_ARB_texture_gather */
> +   ctx->Const.MinProgramTextureGatherOffset = -8;
> +   ctx->Const.MaxProgramTextureGatherOffset = 7;
> +
>     /* GL_ARB_robustness */
>     ctx->Const.ResetStrategy = GL_NO_RESET_NOTIFICATION_ARB;
>
> diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
> index 34615e3..337f3ee 100644
> --- a/src/mesa/main/extensions.c
> +++ b/src/mesa/main/extensions.c
> @@ -142,6 +142,7 @@ static const struct extension extension_table[] = {
>     { "GL_ARB_texture_env_crossbar",       o(ARB_texture_env_crossbar),       GLL, 2001 },
>     { "GL_ARB_texture_env_dot3",           o(ARB_texture_env_dot3),           GLL, 2001 },
>     { "GL_ARB_texture_float",              o(ARB_texture_float),              GL,  2004 },
> +   { "GL_ARB_texture_gather",             o(ARB_texture_gather),             GL,  2009 },
>     { "GL_ARB_texture_mirrored_repeat",    o(dummy_true),                     GLL, 2001 },
>     { "GL_ARB_texture_multisample",        o(ARB_texture_multisample),        GL,  2009 },
>     { "GL_ARB_texture_non_power_of_two",   o(ARB_texture_non_power_of_two),   GL,  2003 },
> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
> index 4f6f59a..f07455e 100644
> --- a/src/mesa/main/get.c
> +++ b/src/mesa/main/get.c
> @@ -366,6 +366,7 @@ EXTRA_EXT(ARB_map_buffer_alignment);
>  EXTRA_EXT(ARB_texture_cube_map_array);
>  EXTRA_EXT(ARB_texture_buffer_range);
>  EXTRA_EXT(ARB_texture_multisample);
> +EXTRA_EXT(ARB_texture_gather);
>
>  static const int
>  extra_ARB_color_buffer_float_or_glcore[] = {
> diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
> index 30855c3..987d4a0 100644
> --- a/src/mesa/main/get_hash_params.py
> +++ b/src/mesa/main/get_hash_params.py
> @@ -718,6 +718,12 @@ descriptor=[
>
>  # GL_ARB_texture_cube_map_array
>    [ "TEXTURE_BINDING_CUBE_MAP_ARRAY_ARB", "LOC_CUSTOM, TYPE_INT, TEXTURE_CUBE_ARRAY_INDEX, extra_ARB_texture_cube_map_array" ],
> +
> +# GL_ARB_texture_gather
> +  [ "MIN_PROGRAM_TEXTURE_GATHER_OFFSET_ARB", "CONTEXT_INT(Const.MinProgramTextureGatherOffset), extra_ARB_texture_gather"],
> +  [ "MAX_PROGRAM_TEXTURE_GATHER_OFFSET_ARB", "CONTEXT_INT(Const.MaxProgramTextureGatherOffset), extra_ARB_texture_gather"],
> +  [ "MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB", "CONTEXT_INT(Const.MaxProgramTextureGatherComponents), extra_ARB_texture_gather"],
> +
>  ]},
>
>  # Enums restricted to OpenGL Core profile
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 6d700ec..e24052f 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -2973,6 +2973,11 @@ struct gl_constants
>     /** GL_EXT_gpu_shader4 */
>     GLint MinProgramTexelOffset, MaxProgramTexelOffset;
>
> +   /** GL_ARB_texture_gather */
> +   GLuint MinProgramTextureGatherOffset;
> +   GLuint MaxProgramTextureGatherOffset;
> +   GLuint MaxProgramTextureGatherComponents;
> +
>     /* GL_ARB_robustness */
>     GLenum ResetStrategy;
>
> @@ -3102,6 +3107,7 @@ struct gl_extensions
>     GLboolean ARB_texture_env_crossbar;