GCC's ICF vs. gold's ICF

2019-01-15 Thread Frank Tetzel
Hi,

why is the ICF pass in gcc not folding member functions which depend on
a template parameter but happen to generate identical code?
Is it because it is not identical on the IR level in the compiler?
Can I somehow dump the IR in text form?

The ICF pass in the gold linker can do it at the binary level, which is
kind of mentioned in the gcc man page. I'm just interested in why the
compiler cannot do it on its own.

There is a test program in a blog post I wrote [1].

Best regards,
Frank

[1] https://tetzank.github.io/posts/identical-code-folding/#consolidating-independent-member-functions


Re: GCC's ICF vs. gold's ICF

2019-01-15 Thread Frank Tetzel
> > why is the ICF pass in gcc not folding member functions which
> > depend on a template parameter but happen to generate identical
> > code? Is it because it is not identical on the IR level in the
> > compiler? Can I somehow dump the IR in text form?  
> 
> You can look at the ICF dump generated when you pass
> -fdump-ipa-icf-details
> 
> And yes, ICF has to consider IL differences that may result in
> different allowed followup optimizations while when the IL is final
> (such as link-time) no such considerations have to be made.  A very
> simple example would be signed vs. unsigned integer multiplication
> where from the former IL overflow would be undefined and
> optimizations can exploit that while not for the latter.
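
The signed vs. unsigned point in the quoted reply can be sketched as
follows. This is only an illustration; the function names are made up
and fixed-width 32-bit types are assumed. On x86-64 the two mul_*
functions typically compile to the same multiply instruction, yet their
IL differs: after inlining, the signed round trip may be simplified to
plain x because overflow is undefined, while the unsigned one has to
keep the wrap-around.

```cpp
#include <cassert>
#include <cstdint>

// Same machine-level multiply, different IL semantics.
std::int32_t mul_signed(std::int32_t a, std::int32_t b) { return a * b; }
std::uint32_t mul_unsigned(std::uint32_t a, std::uint32_t b) { return a * b; }

// Signed overflow is undefined, so the compiler is allowed to
// simplify (x * 2) / 2 to x once mul_signed is inlined.
std::int32_t roundtrip_signed(std::int32_t x) { return mul_signed(x, 2) / 2; }

// Unsigned arithmetic wraps modulo 2^32, so the same simplification
// would be wrong here: the high bit of x is lost by the multiply.
std::uint32_t roundtrip_unsigned(std::uint32_t x) {
    return mul_unsigned(x, 2u) / 2u;
}

// Small self-check of the wrap-around case.
void check_roundtrips() {
    assert(roundtrip_signed(21) == 21);
    // 0x80000001 * 2 wraps to 2, and 2 / 2 is 1, not 0x80000001.
    assert(roundtrip_unsigned(0x80000001u) == 1u);
}
```

If the two mul_* functions were folded into one before inlining, one of
the two round trips would be optimized under the wrong overflow rules.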

Thanks for the information. If I read the dump correctly, it also
considers the return type and that seems to be the problem in my tiny
test program.

snippet from dump:

  group: with 1 classes:
class with id: 1, hash: 2170673536, items: 2
  MyArray::operator[](unsigned int)/4 MyArray::operator[](unsigned int)/3 
  false returned: 'alias sets are different' (compatible_types_p:244)
  false returned: 'result types are different' (equals_wpa:676)

The bodies of the functions look identical, but the return types differ.
So in C++, ICF is "disabled" for templated functions with a template
parameter as return type.
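
A minimal sketch of the kind of code that produces such a dump. It is a
stand-in, not the test program from the blog post: the class layout and
the choice of int and float as template arguments are assumptions. Both
instantiations of operator[] compute the same address from a pointer
and a 4-byte element index, but one returns int& and the other float&.

```cpp
#include <cassert>

// Hypothetical stand-in for the MyArray class named in the dump.
template <typename T>
class MyArray {
public:
    explicit MyArray(T* data) : data_(data) {}

    // The return type depends on T, so the two instantiations differ
    // in the IL even if the generated machine code is the same.
    T& operator[](unsigned int index) { return data_[index]; }

private:
    T* data_;
};

// Explicit instantiations force both variants to be emitted
// out of line, so there is something for ICF to compare.
template class MyArray<int>;
template class MyArray<float>;

// Both instantiations behave the same way from the caller's side.
void check_instantiations() {
    int ints[2] = {1, 2};
    float floats[2] = {1.0f, 2.0f};
    MyArray<int> a(ints);
    MyArray<float> b(floats);
    a[1] = 7;
    b[1] = 7.0f;
    assert(ints[1] == 7);
    assert(floats[1] == 7.0f);
}
```

Compiling with g++ -O2 -fdump-ipa-icf-details writes the ICF dump; for
comparison, linking with -fuse-ld=gold -Wl,--icf=all asks gold to fold
at the binary level. Whether either one folds these two functions
depends on the GCC and binutils versions in use.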

But why is the return type preventing code folding? Because we do not
know the calling convention at this point in time?


Re: ROP attack mitigation

2019-05-13 Thread Frank Tetzel
> > Every year there are new vulnerabilities, and some of them are
> > possible because of ROP attacks.
> > There are a couple things that come to mind to thwart ROP attacks
> > that would rely on compiler changes in particular, instead of a
> > kernel change or hardware change.
> > 
> > I was wondering if anyone in the GCC project might be willing to
> > consider these in order to improve computer security?
> > 
> > Idea 1.
> > Use separate stacks for local variables and stack data.
> > - Use RSP for return addresses, as usual.
> > - Use RBP for functions' local storage i.e. to point to a second
> > downward-growing stack that is in the heap, far from the return
> > stack. This way, a buffer overflow (stack smashing attack) would
> > wipe out functions' data but not return addresses, therefore no ROP.
> > I realize that use of two stacks is an old idea, but now that we
> > have many more general purpose registers, using RBP in this way
> > should be economical.
> > 
> > Idea 2.
> > ROP depends on gadgets, which are short segments of code ending in
> > RET. The tricky thing about mitigating ROP is that gadgets are in
> > unexpected places i.e. not the end of the function where the main
> > RET instruction is. Wherever a RET opcode byte is found, even if it
> > is inside of an immediate value, it can be used to form a gadget.
> > Therefore to reduce the number of available ROP gadgets, first make
> > sure that all immediate values in the program lack bytes that
> > include RET opcodes, so simply replace 0xCB and 0xCA bytes (on x86)
> > with other values. For example:
> >   add RAX, 0xCB01
> > becomes
> >   add RAX, 0xC001
> >   add RAX, 0x0B00
> > Similarly if a branch instruction's relative offset contains a 0xCB
> > or 0xCA in the low offset byte, put a few NOPs in front of it to
> > ensure that byte will not be a RET. It might be tricky to avoid the
> > ModRM byte 0xCB (as in ADD BL, CL) but it can be done.
> > 
> > My 2 cents.  
> Bernd and others within Red Hat looked at ROP mitigation a while back.
> We ultimately stopped development of gadget spoilage due to the
> emergence of CET and instead we've put our focus there.
> 
> While CET won't help older processors, it's my belief it's the right
> first major step in ROP mitigation.  At this point we believe RHEL 8
> is fully ready to use CET when processors hit the streets.  I haven't
> tracked Fedora as closely, but if it's not ready it shouldn't be hard
> to make it ready.  I suspect other distros have similar plans around
> CET.
> 
> I'm not aware of anyone seriously looking at separate stacks for GCC
> right now.  To some degree the stack protector serves the same purpose
> without separating stacks.
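
The immediate splitting from Idea 2 in the quoted message can be
sketched like this. It is only a toy: split_immediate and the other
names are made up, it works on a plain 32-bit value, and it ignores
sign extension and the fact that two ADDs set the flags differently
than one. The quoted message only lists 0xCB and 0xCA (far returns);
treating the near-return opcodes 0xC3 and 0xC2 the same way is my
addition.

```cpp
#include <cassert>
#include <cstdint>

// x86 return opcodes: 0xC3/0xC2 (near RET), 0xCB/0xCA (far RET).
bool is_ret_opcode(std::uint8_t byte) {
    return byte == 0xC2 || byte == 0xC3 || byte == 0xCA || byte == 0xCB;
}

// True if any of the four bytes of imm is a RET opcode.
bool contains_ret_byte(std::uint32_t imm) {
    for (int shift = 0; shift < 32; shift += 8) {
        if (is_ret_opcode(static_cast<std::uint8_t>(imm >> shift))) {
            return true;
        }
    }
    return false;
}

struct SplitImmediate {
    std::uint32_t first;
    std::uint32_t second;
};

// Moves the low nibble of every offending byte into a second addend,
// e.g. 0xCB becomes 0xC0 + 0x0B. No carries occur between bytes, so
// first + second equals the original immediate.
SplitImmediate split_immediate(std::uint32_t imm) {
    SplitImmediate result{imm, 0};
    for (int shift = 0; shift < 32; shift += 8) {
        const std::uint8_t byte = static_cast<std::uint8_t>(imm >> shift);
        if (is_ret_opcode(byte)) {
            const std::uint32_t low_nibble =
                static_cast<std::uint32_t>(byte & 0x0F) << shift;
            result.first -= low_nibble;
            result.second |= low_nibble;
        }
    }
    assert(result.first + result.second == imm);
    return result;
}
```

For the example in the quoted message, split_immediate(0xCB01) gives
0xC001 and 0x0B00, which are the two ADD operands shown there.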


Are you talking about Intel CET[1]? It includes a shadow stack for
return addresses. CALL pushes to both stacks, normal and shadow, and
RET pops from both and does a comparison before jumping.
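
As I understand it, the shadow stack behaviour can be modelled in a few
lines. This is a software toy with made-up names, not how the hardware
is implemented: call() stands for CALL, ret() for RET, and
corrupt_top() simulates a buffer overflow overwriting the return
address on the normal stack.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of a normal stack plus a shadow stack of return addresses.
class ShadowStackModel {
public:
    // CALL: push the return address onto both stacks.
    void call(std::uint64_t return_address) {
        normal_.push_back(return_address);
        shadow_.push_back(return_address);
    }

    // Simulated stack smashing: only the normal stack is writable
    // by ordinary stores, so only that copy gets overwritten.
    void corrupt_top(std::uint64_t forged_address) {
        assert(!normal_.empty());
        normal_.back() = forged_address;
    }

    // RET: pop from both stacks and compare. Returns true if the
    // two return addresses match, false if they do not.
    bool ret() {
        assert(!normal_.empty() && !shadow_.empty());
        const std::uint64_t from_normal = normal_.back();
        const std::uint64_t from_shadow = shadow_.back();
        normal_.pop_back();
        shadow_.pop_back();
        return from_normal == from_shadow;
    }

private:
    std::vector<std::uint64_t> normal_;
    std::vector<std::uint64_t> shadow_;
};
```

In the model a mismatch just makes ret() return false; on real hardware
the mismatch raises a control protection fault instead of jumping.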

The other feature of CET is Indirect Branch Tracking which enforces
that every legal target of an indirect call/jump starts with an
'endbranch' instruction. This should limit the number of useful gadgets.
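
To make the Indirect Branch Tracking rule concrete, here is a small
sketch that checks whether a code address would be a legal indirect
branch target in 64-bit mode, i.e. whether it starts with the endbr64
encoding F3 0F 1E FA. The function name is made up, and the check is
done on a plain byte buffer rather than on real executable memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Returns true if the buffer begins with the 4-byte endbr64 encoding.
bool starts_with_endbr64(const std::uint8_t* code, std::size_t size) {
    static const std::uint8_t kEndbr64[4] = {0xF3, 0x0F, 0x1E, 0xFA};
    assert(code != nullptr);
    if (size < sizeof(kEndbr64)) {
        return false;
    }
    return std::memcmp(code, kEndbr64, sizeof(kEndbr64)) == 0;
}
```

A gadget that begins in the middle of a function fails this check, which
is why it cannot be reached through an indirect call or jump when IBT
is enforced.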

I'm not a security expert. I just came across the 'endbr64' instruction
when disassembling the other day and was curious.


[1] https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf