2014-05-15 15:27 GMT+04:00 Richard Biener <richard.guent...@gmail.com>:
> On Thu, May 15, 2014 at 1:07 PM, Ilya Enkovich <enkovich....@gmail.com> wrote:
>> 2014-05-14 19:09 GMT+04:00 H.J. Lu <hjl.to...@gmail.com>:
>>> On Wed, May 14, 2014 at 1:18 AM, Ilya Enkovich <enkovich....@gmail.com> wrote:
>>>> 2014-05-13 23:21 GMT+04:00 Jeff Law <l...@redhat.com>:
>>>>> On 05/13/14 02:38, Ilya Enkovich wrote:
>>>>>>>>
>>>>>>>> propagate constant bounds value and remove checks in called function).
>>>>>>>
>>>>>>> So from a linking standpoint, presumably you have to mangle the
>>>>>>> instrumented caller/callee in some manner. Right? Or are you
>>>>>>> dynamically dispatching somehow?
>>>>>>
>>>>>> Originally the idea was to have instrumented clone to have the same
>>>>>> assembler name as the original function. Since instrumented code is
>>>>>> fully compatible with not instrumented code, we always emit only one
>>>>>> version. Usage of the same assembler name allows instrumented and not
>>>>>> instrumented calls to look similar in assembler. It worked fine until
>>>>>> I tried it with LTO where assembler name is used as a unique
>>>>>> identifier. With linker resolutions files it became even harder
>>>>>> to use such approach. To resolve these issues I started to use new
>>>>>> assembler name with postfix, but linked with the original name using
>>>>>> IDENTIFIER_TRANSPARENT_ALIAS. It gives different assembler names for
>>>>>> clones and originals during compilation, but both clone and original
>>>>>> functions have similar name in output assembler.
>>>>>
>>>>> OK. So if I read that correctly, it implies that the existence of bounds
>>>>> information does not change the signature of the callee. This is
>>>>> obviously important for C++.
>>>>>
>>>>> Sounds like I need to sit down with the branch and see how this works in
>>>>> the new scheme.
>>>>
>>>> Both mpx branch and Wiki
>>>> (http://gcc.gnu.org/wiki/Intel%20MPX%20support%20in%20the%20GCC%20compiler)
>>>> page are up-to-date now and may be tried out either in NOP mode or
>>>> with simulator. Let me know if you have any troubles with using it.
>>>>
>>>
>>> I built it. But "-fcheck-pointer-bounds -mmpx" doesn't generate
>>> MPX enabled executable which runs on both MPX-enabled and
>>> non MPX-enabled hardware. I didn't see any MPX run-time library.
>>
>> Just checked out the branch and checked generated code.
>>
>> #cat test.c
>> int
>> test (int *p, int i)
>> {
>>   return p[i];
>> }
>> #gcc -fcheck-pointer-bounds -mmpx test.c -S -O2
>> #cat test.s
>>         .file   "test.c"
>>         .section        .text.unlikely,"ax",@progbits
>> .LCOLDB0:
>>         .text
>> .LHOTB0:
>>         .p2align 4,,15
>>         .globl  test
>>         .type   test, @function
>> test:
>> .LFB1:
>>         .cfi_startproc
>>         movslq  %esi, %rsi
>>         leaq    (%rdi,%rsi,4), %rax
>>         bndcl   (%rax), %bnd0
>>         bndcu   3(%rax), %bnd0
>>         movl    (%rax), %eax
>>         bnd ret
>>         .cfi_endproc
>> ...
>>
>> Checks are here. What do you see in your test?
>
> Wow, that's quite an overhead compared to the non-instrumented variant
>
>         movslq  %esi, %rsi
>         movl    (%rdi,%rsi,4), %eax
>         ret
>

The overhead is actually two instructions: the checks for the lower and
upper bounds. The lea instruction is probably a missed optimization. The
checks are cheap instructions and do not introduce new dependencies for
the load, which is the heaviest instruction here. BTW, the checks are not
the main source of overhead in instrumented code; bounds table management
(storing/loading bounds for stored/loaded pointers) is. Anyway, it is too
early to talk about overhead until we have hardware to measure it on.

> I thought bounds-checking was done with some clever prefixes thus
> that
>
>         movslq  %esi, %rsi
>         bndmovl (%rdi,%rsi,4), %eax, %bnd0
>         bnd ret
>
> would be possible (well, replace with valid ISA).

I doubt it would be possible to encode that while staying backward
compatible with existing hardware. Also, putting all the logic into one
instruction does not mean it executes faster than a sequence of
instructions, especially on out-of-order CPUs.

Ilya

>
> Richard.
>
>> Ilya
>>
>>>
>>> --
>>> H.J.
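
For illustration only, the two checks in the quoted test.s listing can be
read roughly as the plain C below. This is a software sketch, not what the
compiler or the hardware does: struct bounds, access_ok and test_checked are
hypothetical names, the bounds are passed explicitly instead of living in
%bnd0, and a failed check returns 0 where real MPX hardware would raise a
bound-range exception. It assumes the upper bound is the address of the last
valid byte, which is why the upper check in the listing is applied to
3(%rax), the last byte of the 4-byte int.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of one bounds register such as %bnd0. */
struct bounds {
    uintptr_t lb;   /* lowest valid address */
    uintptr_t ub;   /* highest valid address, inclusive (assumption) */
};

/* Return 1 if the bytes [addr, addr + size - 1] lie within b, else 0. */
static int access_ok(struct bounds b, uintptr_t addr, size_t size)
{
    if (addr < b.lb)                /* roughly: bndcl (%rax), %bnd0  */
        return 0;
    if (addr + size - 1 > b.ub)     /* roughly: bndcu 3(%rax), %bnd0 */
        return 0;
    return 1;
}

/* Checked version of test() from test.c: loads p[i] into *out only if
   both checks pass. Returns 1 on success and 0 on a bounds violation. */
static int test_checked(const int *p, int i, struct bounds b, int *out)
{
    /* Address computation, the counterpart of the leaq in the listing.
       Done on integers so an out-of-range index is not undefined here. */
    uintptr_t addr = (uintptr_t)p
                     + (uintptr_t)((intptr_t)i * (intptr_t)sizeof(int));

    if (!access_ok(b, addr, sizeof(int)))
        return 0;

    *out = *(const int *)addr;      /* the actual load (movl) */
    return 1;
}
```

With a four-element int array a and bounds covering exactly that array,
test_checked (a, 2, ...) succeeds, while indexes 4 and -1 are rejected by the
upper and lower check respectively. Note that both comparisons use only the
computed address, so the load itself does not wait on any result the checks
produce.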