----- Original Message -----
> On Sat, Feb 18, 2012 at 4:20 AM, Jose Fonseca <[email protected]> wrote:
> > ----- Original Message -----
> >> On Fri, Feb 17, 2012 at 9:46 PM, Jose Fonseca <[email protected]> wrote:
> >> > Dave,
> >> >
> >> > Ideally there should be only one lp_build_mod() which will invoke
> >> > LLVMBuildSRem or LLVMBuildURem depending on the value of
> >> > bld->type.sign. The point is that this allows the same code
> >> > generation logic to seamlessly target any type without having to
> >> > worry too much about which type it is targeting.
> >>
> >> Yeah, I agree with this for now, but I'm starting to think a lot of
> >> this stuff is redundant now that I've looked at what Tom has done.
> >>
> >> The thing is TGSI doesn't have that many crazy options where you are
> >> going to be targeting instructions at the wrong type, and wrapping
> >> all the basic llvm interfaces with an extra type layer seems to me
> >> like a waste of time in the long term.
> >
> > So far llvmpipe's TGSI -> LLVM IR translation has only been targeting
> > floating-point SIMD instructions.
> >
> > But the truth is that many simple fragment shaders can be partially
> > done with 8-bit and 16-bit SIMD integers, if values are represented
> > as 8-bit and 16-bit unorms. The throughput for these will be much
> > higher: not only can we squeeze in more elements, they also take
> > fewer cycles, and the hardware has several arithmetic units.
> >
> > The point of those lp_build_xxx functions is to handle this
> > transparently. See, e.g., how lp_build_mul handles fixed point.
> > Currently this is only used for blending, but the hope is to
> > eventually use it in the TGSI translation of simple fragment shaders.
> >
> > Maybe not the case for desktop GPUs, but I have also heard that some
> > low-powered devices have shader engines with 8-bit unorms.
> >
> > But of course, not all opcodes can be done correctly, and URem/SRem
> > might not be ones we care about.
> >
> >> I'm happy for now to finish the integer support in the same style as
> >> the current code, but I think moving forward it might be worth
> >> investigating a more direct instruction emission scheme.
> >
> > If you want to invoke LLVMBuildURem/LLVMBuildSRem directly from the
> > TGSI translation I'm fine with it. We can always generalize later.
> >
> >> Perhaps Tom can also comment from his experience.
> >
> > BTW, Tom, I just now noticed that there are two action versions for
> > add:
> >
> > /* TGSI_OPCODE_ADD (CPU Only) */
> > static void
> > add_emit_cpu(
> >    const struct lp_build_tgsi_action * action,
> >    struct lp_build_tgsi_context * bld_base,
> >    struct lp_build_emit_data * emit_data)
> > {
> >    emit_data->output[emit_data->chan] = lp_build_add(&bld_base->base,
> >                                                      emit_data->args[0],
> >                                                      emit_data->args[1]);
> > }
> >
> > /* TGSI_OPCODE_ADD */
> > static void
> > add_emit(
> >    const struct lp_build_tgsi_action * action,
> >    struct lp_build_tgsi_context * bld_base,
> >    struct lp_build_emit_data * emit_data)
> > {
> >    emit_data->output[emit_data->chan] = LLVMBuildFAdd(
> >       bld_base->base.gallivm->builder,
> >       emit_data->args[0],
> >       emit_data->args[1], "");
> > }
> >
> > Why is this necessary? lp_build_add will already call LLVMBuildFAdd
> > internally as appropriate.
> >
> > Is this because some of the functions in lp_bld_arit.c will emit
> > x86 intrinsics? If so, then a "no-x86-intrinsic" flag in the build
> > context would achieve the same effect with less code duplication.
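A minimal sketch of what the merged action could look like, assuming lp_bld_arit.c is
taught to check a hypothetical no_x86_intrinsics flag in the build context before
emitting any x86 intrinsics (the flag name is illustrative; the surrounding types are
the ones from the code quoted above):

   /* TGSI_OPCODE_ADD -- single action for both CPU and GPU targets */
   static void
   add_emit(
      const struct lp_build_tgsi_action * action,
      struct lp_build_tgsi_context * bld_base,
      struct lp_build_emit_data * emit_data)
   {
      /* lp_build_add() already picks LLVMBuildFAdd or LLVMBuildAdd from
       * bld_base->base.type; with the hypothetical no_x86_intrinsics
       * flag set it would also skip the saturated-add x86 intrinsics,
       * so no per-target action is needed here. */
      emit_data->output[emit_data->chan] = lp_build_add(&bld_base->base,
                                                        emit_data->args[0],
                                                        emit_data->args[1]);
   }

The design point is that the target-specific decision lives in one place (the
lp_build_* helper) instead of being duplicated in every action.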
> > If possible I'd prefer a single version of these actions. If not,
> > then I'd prefer to have them split: lp_build_action_cpu.c and
> > lp_build_action_gpu.c.
>
> Yes, this is why I split them up. I can add that flag and merge the
> actions together.
That would be nice. Thanks.

Jose
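For reference, a minimal sketch of the kind of sign-aware lp_build_mod() suggested at
the top of the thread, assuming the usual gallivm build-context fields
(bld->gallivm->builder and bld->type); the helper itself is illustrative rather than
code from the tree:

   #include "lp_bld_type.h"   /* struct lp_build_context, struct lp_type */
   #include "lp_bld_init.h"   /* struct gallivm_state */
   #include <llvm-c/Core.h>   /* LLVMBuildSRem / LLVMBuildURem */

   LLVMValueRef
   lp_build_mod(struct lp_build_context *bld,
                LLVMValueRef a,
                LLVMValueRef b)
   {
      LLVMBuilderRef builder = bld->gallivm->builder;
      const struct lp_type type = bld->type;

      /* One entry point: bld->type.sign decides between the signed and
       * unsigned LLVM remainder instruction (integer types assumed). */
      if (type.sign)
         return LLVMBuildSRem(builder, a, b, "");
      else
         return LLVMBuildURem(builder, a, b, "");
   }

That way the TGSI MOD action can stay signedness-agnostic, in the same style as the
existing lp_build_add/lp_build_mul helpers.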
