Re: SPARC LEON3 and CAS instruction

2014-04-28 Thread Sebastian Huber

On 2014-04-25 18:31, Eric Botcazou wrote:

>> recent GCC versions support the C11 atomic operations for the SPARC LEON3
>> processor via the CASA instruction.  GCC emits CASA instructions with an ASI
>> of 0x80.  I think this is due to the usage of "cas" if I get the stuff in
>> sync.md right:
>>
>> "(define_insn "*atomic_compare_and_swap_1"
>> [(set (match_operand:I48MODE 0 "register_operand" "=r")
>> (match_operand:I48MODE 1 "mem_noofs_operand" "+w"))
>>  (set (match_dup 1)
>> (unspec_volatile:I48MODE
>>   [(match_operand:I48MODE 2 "register_operand" "r")
>>    (match_operand:I48MODE 3 "register_operand" "0")]
>>   UNSPECV_CAS))]
>> "(TARGET_V9 || TARGET_LEON3) && (mode != DImode || TARGET_ARCH64)"
>> "cas\t%1, %2, %0"
>> [(set_attr "type" "multi")])"
>
> Right, this is a bug, both in the compiler and the assembler, since an ASI of
> 0x80 is not allowed for SPARC-V8.


Ok, I didn't notice this before since it worked well on the LEON4FT in 
supervisor mode.  I work currently with the XtratuM hypervisor and noticed this 
problem with CAS in user mode.





>> According to the LEON3 manual we have:
>>
>> "6.2.7 Compare and Swap instruction (CASA)
>>
>> LEON4 implements the SPARC V9 Compare and Swap Alternative (CASA)
>> instruction. The CASA operates as described in the SPARC V9 manual. The
>> instruction is privileged, except when setting ASI = 0xA (user data)."
>>
>> I would like to use atomic operations in user mode.  Is it possible to add a
>> machine option to GCC to use an ASI of 0x0A for the atomic operations via
>> CASA on LEON3?
>
> Yes, I guess we actually want to emit an ASI of either 0xA (user data) or 0xB
> (supervisor data), predicated on -muser-mode.  I'll prepare a patch.



Thanks, since this -muser-mode seems to be something new, maybe we should 
instead use -mcas=supervisor|user to make it more specific?
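
For reference, the kind of C11 code affected by this can be sketched as follows.
This is only a minimal illustration, assuming a compiler that ships
<stdatomic.h>; the names `counter`, `increment_if_below` and `current_value`
are made up for the example and come from neither GCC nor RTEMS.  Which ASI
the generated CASA instruction carries is decided by the compiler and is not
visible at this level.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared counter; the name is illustrative only. */
static atomic_uint counter;

/* Atomically increment the counter unless it has already reached `limit`.
   Returns true if the increment happened.  On targets that provide a
   compare-and-swap instruction, the exchange below is typically
   implemented with that instruction. */
bool increment_if_below(unsigned limit)
{
    unsigned old = atomic_load(&counter);

    while (old < limit) {
        /* On failure, `old` is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&counter, &old, old + 1))
            return true;
    }
    return false;
}

/* Read the current counter value. */
unsigned current_value(void)
{
    return atomic_load(&counter);
}
```

Calling increment_if_below(2) three times on a fresh counter succeeds twice
and then fails, leaving the counter at 2.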


--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

This message is not a business communication within the meaning of the EHUG.


Re: SPARC LEON3 and CAS instruction

2014-04-28 Thread Eric Botcazou
> Thanks, since this -muser-mode seems to be something new, maybe we should
> instead use -mcas=supervisor|user to make it more specific?

I don't think so, we might need to extend its purview in the future.

-- 
Eric Botcazou


Re: SPARC LEON3 and CAS instruction

2014-04-28 Thread Sebastian Huber

On 2014-04-28 10:02, Eric Botcazou wrote:

>> Thanks, since this -muser-mode seems to be something new, maybe we should
>> instead use -mcas=supervisor|user to make it more specific?
>
> I don't think so, we might need to extend its purview in the future.


Ok, this makes sense.  Which default do you have in mind for the -muser-mode
option?


--
Sebastian Huber, embedded brains GmbH



Re: SPARC LEON3 and CAS instruction

2014-04-28 Thread Eric Botcazou
> Ok, this makes sense.  Which default do you have in mind for the -muser-mode
> option?

-mno-user-mode the default, it's usually what's done in this case I think.

-- 
Eric Botcazou


Re: Support to OpenACC and OpenMP 4.0 in GCC 4.9

2014-04-28 Thread Paolo Leoni
Hi Thomas,
thank you for your response.
I'm primarily a user of OpenACC.

Sorry, I'm a bit of a newbie at building GCC; could you post here a
simple list of commands for building GCC 4.9 with OpenACC
support?

I've found this:
http://gcc.1065356.n5.nabble.com/OpenACC-or-OpenMP-4-0-target-directives-td986517.html

but I get an error when I type "git fetch"...

Thank you in any case.

Paolo L.


Re: How to access points-to information for function pointers

2014-04-28 Thread Richard Biener
On Sat, Apr 26, 2014 at 4:07 PM, Richard Biener wrote:
> On April 26, 2014 12:31:34 PM CEST, Swati Rathi wrote:
>>
>> On Friday 25 April 2014 11:11 PM, Richard Biener wrote:
>>> On April 25, 2014 5:54:09 PM CEST, Swati Rathi wrote:
>>>> Hello,
>>>>
>>>> I am trying to print points-to information for SSA variables as
>>>> below.
>>>>
>>>>   for (i = 1; i < num_ssa_names; i++)
>>>>     {
>>>>       tree ptr = ssa_name (i);
>>>>       struct ptr_info_def *pi;
>>>>
>>>>       if (ptr == NULL_TREE
>>>>           || SSA_NAME_IN_FREE_LIST (ptr))
>>>>         continue;
>>>>
>>>>       pi = SSA_NAME_PTR_INFO (ptr);
>>>>       if (pi)
>>>>         dump_points_to_info_for (file, ptr);
>>>>     }
>>>>
>>>> -
>>>> My test program is given below :
>>>>
>>>> int main()
>>>> {
>>>>   int *p, i, j;
>>>>   void (*fp1)();
>>>>
>>>>   if (i)
>>>>     {
>>>>       p = &i;
>>>>       fp1 = fun1;
>>>>     }
>>>>   else
>>>>     {
>>>>       p = &j;
>>>>       fp1 = fun2;
>>>>     }
>>>>
>>>>   fp1();
>>>>
>>>>   printf ("\n%d %d\n", *p, i);
>>>>   return 0;
>>>> }
>>>> -
>>>> I get the output as :-
>>>>
>>>> p_1, points-to vars: { i j }
>>>> fp1_2, points-to vars: { }
>>>> -
>>>>
>>>> Why is the pointees for function pointer not getting dumped?
>>> It's just not saved.
>>
>>Can we modify the code to preserve values for function pointer SSA
>>names?
>
> Sure.

Index: gcc/tree-ssa-structalias.c
===
--- gcc/tree-ssa-structalias.c  (revision 209782)
+++ gcc/tree-ssa-structalias.c  (working copy)
@@ -6032,7 +6032,8 @@ set_uids_in_ptset (bitmap into, bitmap f

   if (TREE_CODE (vi->decl) == VAR_DECL
  || TREE_CODE (vi->decl) == PARM_DECL
- || TREE_CODE (vi->decl) == RESULT_DECL)
+ || TREE_CODE (vi->decl) == RESULT_DECL
+ || TREE_CODE (vi->decl) == FUNCTION_DECL)
{
  /* If we are in IPA mode we will not recompute points-to
 sets after inlining so make sure they stay valid.  */

Note that there isn't a convenient way to go back from a bit in the
points-to bitmap to the actual FUNCTION_DECL referred to.

Richard.

>>What is the reason that it is not preserved for function pointers?
>
> Nobody uses this information.
>
>>Another alternative approach would be to replicate the code (of
>>pass_ipa_pta) and use the information before deleting it.
>>
>>Is there any other way to access this information?
>
> You can of course recompute it when needed.
>
> Richard.
>
>>>
>>>> How can I access this information?
>>>>
>>>>
>>>> Regards,
>>>> Swati
>>>
>
>


Announcement of Wide-Int Reviewers

2014-04-28 Thread David Edelsohn
I am pleased to announce that the GCC Steering Committee has
appointed Richard Sandiford, Mike Stump and Ken Zadeck as Wide-Int Reviewers.

Please join me in congratulating Richard, Mike and Ken
on their new role.  Please update your listings in the MAINTAINERS file.

Happy hacking!
David



Re: Support to OpenACC and OpenMP 4.0 in GCC 4.9

2014-04-28 Thread Thomas Schwinge
Hi Paolo!

On Mon, 28 Apr 2014 10:31:15 +0200, Paolo Leoni wrote:
> I'm primarily a user of OpenACC.

Be aware that there is not yet any actual offloading implemented on the
gomp-4_0-branch.

> Sorry, I'm a bit of a newbie at building GCC; could you post here a
> simple list of commands for building GCC 4.9 with OpenACC
> support?

There is no OpenACC support in the released GCC 4.9, but only in the two
development branches mentioned before.

How to build GCC depends on your environment; see the GCC website,
»Documentation«, »Installation«.
Possibly a »[...]/configure --prefix=[...]« and the usual »make«
invocations will be enough.

> I've found this:
> http://gcc.1065356.n5.nabble.com/OpenACC-or-OpenMP-4-0-target-directives-td986517.html
> 
> but I get an error when I type "git fetch"...

Always paste the exact command you used, and the error message you got.
Anyway, if the Git instructions don't work for you, see the GCC website,
»"Live" Sources«.


Grüße,
 Thomas




Re: c++ compiler results

2014-04-28 Thread Plinky Dermis
#include 
#include 
#include 
#include 
#include 

// goal: make this code work with constexpr, tested on g++ 4.7.2 (-std=c++11)


//#define constexpr


std::string operator"" _s(const char *literal_string, size_t chars)
{
  return std::string(literal_string, chars);
}


template 
constexpr bool typematch(const T& a, const U& b)
{
  return typeid(a) == typeid(b);
}


template 
struct MatchTypes_t
{
  MatchTypes_t() = default;


  template
  constexpr const void *matchType(const M& match, const std::tuple& t) 
const
  {
const size_t tsize = std::tuple_size>::value;
if (typematch(std::get(t), match)) {
  constexpr auto find = MatchTypes_t();
  return find.matchType(match, t)
? nullptr : static_cast(&(std::get(t)));
} else {
  constexpr auto find = MatchTypes_t();
  return find.matchType(match, t);
}
  }
};


template <>
struct MatchTypes_t<0>
{
  template
  constexpr const void *matchType(const M& match, const std::tuple& t) 
const
  {
return nullptr;
  }
};


namespace std {


template 
const M& get(std::tuple& t, std::exception e = std::bad_typeid(), M dummy 
= M()) throw(std::exception)
{
  const size_t tsize = std::tuple_size>::value;
  constexpr auto find = MatchTypes_t();
  constexpr const void *ptr = find.matchType(dummy, t);
  if (!ptr) { // static_assert(ptr, "bad_typeid or more than one of that type");
throw e;
  }
  return *(static_cast(ptr));
}


} // namespace std


int main()
{
  auto vary = std::make_tuple('A', "mixed set of"_s, -1.2, 2);

  const auto &s = std::get(vary);
  const auto d = std::get(vary);
  return ((s == "mixed set of") && (d == -1.2)) ? 0 : 1;
} 

status of wide-int patch.

2014-04-28 Thread Kenneth Zadeck
At this point we believe that we have addressed all of the concerns
that the community has raised about the wide-int branch.  We have also
had each of the sections of the branch approved by the area maintainers.


We are awaiting a clean build on the arm and are currently retesting
x86-64, s390, and p7, but assuming that those are clean, we are ready to
merge this branch into trunk in the next day or so.  Other port
maintainers may wish to consider testing on the branch before we commit.
Otherwise we will fix any regressions after the merge.


Thanks for all of the help we have received along the way.

Kenny


OpenMP: Lowering to builtin function calls in the front ends, instead of creating trivial tree nodes?

2014-04-28 Thread Thomas Schwinge
Hi!

While working on some OpenACC constructs in the C front end (notably
those tagged as »Executable Directives«, OpenACC 2.0, 2.12), Jim has
noticed that for a certain class of OpenMP constructs (corresponding in
"style" to the OpenACC Executable Directives), these are directly lowered
to builtin function calls in the front ends, instead of creating trivial
tree nodes, and passing these nodes through all the following compiler
infrastructure, unaltered.  These are, for example,
gcc/c-family/c-omp.c:c_finish_omp_barrier, or
gcc/c/c-typeck.c:c_finish_omp_cancel.  The same is done in the Fortran
front end, for example, fortran/trans-openmp.c:gfc_trans_omp_barrier.
Such Open* constructs semantically map to runtime library function calls,
without needing any, say, variable remapping in structured blocks
attached to the constructs, for example.

When contributing their OpenACC Fortran front end work to the
gomp-4_0-branch, Samsung, on the other hand, have added new tree nodes,
such as OACC_WAIT, which then are to be translated into runtime library
calls in the omp-low.c's expansion routines.  (I'm not criticizing that;
it's of course a sensible thing to do, to handle all Open* constructs the
same way.)

We now wonder which style is preferable.  Lowering early into builtin
function calls reduces the number of required tree node codes as well as
some boiler-plate code in the middle end, but potentially adds code
duplication in several front ends (and possibly elsewhere in the middle
end code, too) for building argument lists for runtime library calls, and
so on, which is avoided by centralizing all the library function calls in
one place, in omp-low.c's expansion routines.


Grüße,
 Thomas




Re: OpenMP: Lowering to builtin function calls in the front ends, instead of creating trivial tree nodes?

2014-04-28 Thread Jakub Jelinek
On Mon, Apr 28, 2014 at 07:47:13PM +0200, Thomas Schwinge wrote:
> While working on some OpenACC constructs in the C front end (notably
> those tagged as »Executable Directives«, OpenACC 2.0, 2.12), Jim has
> noticed that for a certain class of OpenMP constructs (corresponding in
> "style" to the OpenACC Executable Directives), these are directly lowered
> to builtin function calls in the front ends, instead of creating trivial
> tree nodes, and passing these nodes through all the following compiler
> infrastructure, unaltered.  These are, for example,
> gcc/c-family/c-omp:c_finish_omp_barrier, or
> gcc/c/c-typeck.c:c_finish_omp_cancel.  The same is done in the Fortran
> front end, for example, fortran/trans-openmp.c:gfc_trans_omp_barrier.
> Such Open* constructs semantically map to runtime library function calls,
> without needing any, say, variable remapping in structured blocks
> attached to the constructs, for example.
> 
> When contributing their OpenACC Fortran front end work to the
> gomp-4_0-branch, Samsung, on the other hand, have added new tree nodes,
> such as OACC_WAIT, which then are to be translated into runtime library
> calls in the omp-low.c's expansion routines.  (I'm not criticizing that;
> it's of course a sensible thing to do, to handle all Open* constructs the
> same way.)
> 
> We now wonder which style is to prefer?  Lowering early into builtin
> function calls reduces the number of required tree node codes as well as
> some boiler-plate code in the middle end, but potentially adds code
> duplication in several front ends (and possibly elsewhere in the middle
> end code, too) for building argument lists for runtime library calls, and
> so on, which is avoided by centralizing all the library function calls in
> one place, in omp-low.c's expansion routines.

Depends.  The reason we lower some constructs immediately is that the
middle-end either doesn't care about them at all, or only minimally, and the
code to generate the builtin calls in the frontends is smaller than the
amount of code that would be needed to support the extra tree AND gimple
code, often by a significant amount.
Even the taskgroup construct was initially just lowered to 2 function calls
in the FEs, but then I found I needed to treat it as a region for cancellation
diagnostics in the middle-end, which is why it is a tree/gimple code now.

So I'd suggest: if you don't need new tree/gimple codes and the lowering in
the FEs is sufficiently small, lower early; if you need the construct during
gimplification, cfg creation, omp lowering or expansion, consider
tree/gimple codes.
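
To make the "lower early" case concrete, here is a small user-level sketch.
It is an illustration only: the function name `barrier_demo` is made up, and
the comment about GOMP_barrier reflects my understanding of what the front
ends emit for a stand-alone barrier directive, not something stated in this
thread.  Build with -fopenmp for real threads; without it the pragmas are
ignored and the code runs on one thread.

```c
/* Returns 1 if every thread that entered the region also passed the
   barrier and observed the counter afterwards. */
int barrier_demo(void)
{
    int arrived = 0;   /* threads that reached the barrier */
    int checked = 0;   /* threads that saw arrived > 0 after the barrier */

    #pragma omp parallel shared(arrived, checked)
    {
        #pragma omp atomic
        arrived++;

        /* A directive with no structured block attached.  Assumption:
           the front end turns this directly into a call to the libgomp
           entry point GOMP_barrier instead of creating a tree node. */
        #pragma omp barrier

        #pragma omp atomic
        checked += (arrived > 0);
    }

    return checked == arrived;
}
```

A construct like this needs no variable remapping and no region tracking,
which is why a plain function call is enough to represent it.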

Jakub


Re: Add stdatomic.h

2014-04-28 Thread Joseph S. Myers
On Sun, 20 Apr 2014, Sebastian Huber wrote:

> Hello,
> 
> I test currently the GCC 4.9 release branch.  Should <stdatomic.h> work with
> C++?  I use GCC as a cross-compiler for RTEMS targets.  RTEMS uses Newlib as C

<stdatomic.h> is very C-specific; C++ programs are expected to use
<atomic> instead (although it may be possible for programs to use a common
subset that works in both languages in some cases).

-- 
Joseph S. Myers
jos...@codesourcery.com


Improving Asan code on ARM targets

2014-04-28 Thread Yury Gribov

Hi all,

I've recently noticed that GCC generates suboptimal code for Asan on ARM 
targets. E.g. for a 4-byte memory access check


(shadow_val != 0) & (last_byte >= shadow_val)

we get the following sequence:

        mov     r2, r0, lsr #3
        and     r3, r0, #7
        add     r3, r3, #3
        add     r2, r2, #536870912
        ldrb    r2, [r2]        @ zero_extendqisi2
        sxtb    r2, r2
        cmp     r3, r2
        movlt   r3, #0
        movge   r3, #1
        cmp     r2, #0
        moveq   r3, #0
        cmp     r3, #0
        bne     .L5
        ldr     r0, [r0]

Obviously a shorter code is possible:

        mov     r3, r0, lsr #3
        and     r1, r0, #7
        add     r1, r1, #4
        add     r3, r3, #536870912
        ldrb    r3, [r3]        @ zero_extendqisi2
        sxtb    r3, r3
        cmp     r3, #0
        cmpne   r1, r3
        bgt     .L5
        ldr     r0, [r0]

A 30% improvement looked quite important given that Asan usually
increases code size by 1.5-2x, so I decided to investigate this. It
turned out that the ARM backend already has full support for dominated
comparisons (the cmp-cmpne-bgt sequence above) and can generate efficient
code if we provide it with a slightly more explicit gimple sequence:


(shadow_val != 0) & (last_byte + 1 > shadow_val)

Ideally the backend should be able to perform this transform itself. But I'm
not sure this is possible: it needs to know that last_byte + 1 cannot
overflow, and this info is not available in RTL (because we don't have a
VRP pass there).
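
The equivalence of the two forms can be checked exhaustively with a small
stand-alone program.  This is only a sketch under stated assumptions: the
helper names are made up, the arithmetic is done in plain int rather than in
the shadow type that asan.c uses, and last_byte is taken to be
(addr & 7) + size - 1, as in the comment in build_check_stmt.

```c
#include <stdint.h>

/* Current form: (shadow != 0) & (last_byte >= shadow). */
static int check_ge(uintptr_t addr, int size, int8_t shadow)
{
    int last_byte = (int)(addr & 7) + size - 1;
    return (shadow != 0) & (last_byte >= shadow);
}

/* Rewritten form: (shadow != 0) & (last_byte + 1 > shadow). */
static int check_gt(uintptr_t addr, int size, int8_t shadow)
{
    int last_byte = (int)(addr & 7) + size - 1;
    return (shadow != 0) & (last_byte + 1 > shadow);
}

/* Count the inputs on which the two forms disagree, over all offsets
   within an 8-byte granule, the access sizes 1, 2 and 4, and every
   possible signed shadow byte. */
int count_mismatches(void)
{
    static const int sizes[] = { 1, 2, 4 };
    int mismatches = 0;

    for (int s = 0; s < 3; s++)
        for (uintptr_t addr = 0; addr < 8; addr++)
            for (int shadow = INT8_MIN; shadow <= INT8_MAX; shadow++)
                if (check_ge(addr, sizes[s], (int8_t)shadow)
                    != check_gt(addr, sizes[s], (int8_t)shadow))
                    mismatches++;

    return mismatches;
}
```

Since last_byte is at most 7 + 3 = 10 here, the + 1 cannot overflow, so the
two predicates agree on every input; that range fact is exactly what RTL
does not know.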


I have attached a simple patch which changes the Asan pass to generate the
ARM-friendly code. I've only bootstrapped/regtested on x64 but I can
perform additional tests on ARM if the patch makes sense. As far as I can
tell it does not worsen sanitized code on other platforms (x86/x64)
while significantly improving ARM (15% less code for bzip).


The patch is certainly not ideal:
* it makes target-specific changes in machine-independent code
* it does not help with 1-byte accesses (forwprop pass thinks that it's 
always beneficial to convert x + 1 > y to x >= y so it reverts my change)
* it only improves Asan code whereas it would be great if ARM backend 
could improve generic RTL code
but it achieves significant improvement on ARM without hurting other 
platforms.


So my questions are:
* is this kind of target-specific tweaking acceptable in middle-end?
* if not - what would be a better option?

-Y
2014-04-29  Yury Gribov  

	* asan.c (build_check_stmt): Change generated code to improve
	code generated for ARM.

diff --git a/gcc/asan.c b/gcc/asan.c
index d7c282e..f00705a 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1543,18 +1543,17 @@ build_check_stmt (location_t location, tree base, gimple_stmt_iterator *iter,
 {
   /* Slow path for 1, 2 and 4 byte accesses.
 	 Test (shadow != 0)
-	  & ((base_addr & 7) + (size_in_bytes - 1)) >= shadow).  */
+	  & ((base_addr & 7) + size_in_bytes) > shadow).  */
   gimple_seq seq = NULL;
   gimple shadow_test = build_assign (NE_EXPR, shadow, 0);
   gimple_seq_add_stmt (&seq, shadow_test);
   gimple_seq_add_stmt (&seq, build_assign (BIT_AND_EXPR, base_addr, 7));
   gimple_seq_add_stmt (&seq, build_type_cast (shadow_type,
   gimple_seq_last (seq)));
-  if (size_in_bytes > 1)
-gimple_seq_add_stmt (&seq,
- build_assign (PLUS_EXPR, gimple_seq_last (seq),
-   size_in_bytes - 1));
-  gimple_seq_add_stmt (&seq, build_assign (GE_EXPR, gimple_seq_last (seq),
+  gimple_seq_add_stmt (&seq,
+   build_assign (PLUS_EXPR, gimple_seq_last (seq),
+ size_in_bytes));
+  gimple_seq_add_stmt (&seq, build_assign (GT_EXPR, gimple_seq_last (seq),
shadow));
   gimple_seq_add_stmt (&seq, build_assign (BIT_AND_EXPR, shadow_test,
gimple_seq_last (seq)));


Re: Improving Asan code on ARM targets

2014-04-28 Thread Andrew Pinski
On Mon, Apr 28, 2014 at 10:50 PM, Yury Gribov  wrote:
> Hi all,
>
> I've recently noticed that GCC generates suboptimal code for Asan on ARM
> targets. E.g. for a 4-byte memory access check
>
> (shadow_val != 0) & (last_byte >= shadow_val)
>
> we get the following sequence:
>
>         mov     r2, r0, lsr #3
>         and     r3, r0, #7
>         add     r3, r3, #3
>         add     r2, r2, #536870912
>         ldrb    r2, [r2]        @ zero_extendqisi2
>         sxtb    r2, r2
>         cmp     r3, r2
>         movlt   r3, #0
>         movge   r3, #1
>         cmp     r2, #0
>         moveq   r3, #0
>         cmp     r3, #0
>         bne     .L5
>         ldr     r0, [r0]
>
> Obviously a shorter code is possible:
>
>         mov     r3, r0, lsr #3
>         and     r1, r0, #7
>         add     r1, r1, #4
>         add     r3, r3, #536870912
>         ldrb    r3, [r3]        @ zero_extendqisi2
>         sxtb    r3, r3
>         cmp     r3, #0
>         cmpne   r1, r3
>         bgt     .L5
>         ldr     r0, [r0]

Does the patch series located at:
http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01407.html
http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01405.html
fix this code generation issue?  I suspect it does and improves more
than just the above code.

Thanks,
Andrew Pinski

>
> A 30% improvement looked quite important given that Asan usually increases
> code-size by 1.5-2x so I decided to investigate this. It turned out that ARM
> backend already has full support for dominated comparisons (cmp-cmpne-bgt
> sequence above) and can generate efficient code if we provide it with a
> slightly more explicit gimple sequence:
>
> (shadow_val != 0) & (last_byte + 1 > shadow_val)
>
> Ideally the backend should be able to perform this transform itself. But I'm
> not sure this is possible: it needs to know that last_byte + 1 cannot overflow,
> and this info is not available in RTL (because we don't have a VRP pass
> there).
>
> I have attached a simple patch which changes the Asan pass to generate the
> ARM-friendly code. I've only bootstrapped/regtested on x64 but I can perform
> additional tests on ARM if the patch makes sense. As far as I can tell it
> does not worsen sanitized code on other platforms (x86/x64) while
> significantly improving ARM (15% less code for bzip).
>
> The patch is certainly not ideal:
> * it makes target-specific changes in machine-independent code
> * it does not help with 1-byte accesses (forwprop pass thinks that it's
> always beneficial to convert x + 1 > y to x >= y so it reverts my change)
> * it only improves Asan code whereas it would be great if ARM backend could
> improve generic RTL code
> but it achieves significant improvement on ARM without hurting other
> platforms.
>
> So my questions are:
> * is this kind of target-specific tweaking acceptable in middle-end?
> * if not - what would be a better option?
>
> -Y


Re: How to access points-to information for function pointers

2014-04-28 Thread Swati Rathi

On Monday 28 April 2014 02:46 PM, Richard Biener wrote:

> On Sat, Apr 26, 2014 at 4:07 PM, Richard Biener wrote:
>> On April 26, 2014 12:31:34 PM CEST, Swati Rathi wrote:
>>> On Friday 25 April 2014 11:11 PM, Richard Biener wrote:
>>>> On April 25, 2014 5:54:09 PM CEST, Swati Rathi wrote:
>>>>> Hello,
>>>>>
>>>>> I am trying to print points-to information for SSA variables as
>>>>> below.

>>>>>   for (i = 1; i < num_ssa_names; i++)
>>>>>     {
>>>>>       tree ptr = ssa_name (i);
>>>>>       struct ptr_info_def *pi;
>>>>>
>>>>>       if (ptr == NULL_TREE
>>>>>           || SSA_NAME_IN_FREE_LIST (ptr))
>>>>>         continue;
>>>>>
>>>>>       pi = SSA_NAME_PTR_INFO (ptr);
>>>>>       if (pi)
>>>>>         dump_points_to_info_for (file, ptr);
>>>>>     }
>>>>>
>>>>> -
>>>>> My test program is given below :
>>>>>
>>>>> int main()
>>>>> {
>>>>>   int *p, i, j;
>>>>>   void (*fp1)();
>>>>>
>>>>>   if (i)
>>>>>     {
>>>>>       p = &i;
>>>>>       fp1 = fun1;
>>>>>     }
>>>>>   else
>>>>>     {
>>>>>       p = &j;
>>>>>       fp1 = fun2;
>>>>>     }
>>>>>
>>>>>   fp1();
>>>>>
>>>>>   printf ("\n%d %d\n", *p, i);
>>>>>   return 0;
>>>>> }
>>>>> -
>>>>> I get the output as :-
>>>>>
>>>>> p_1, points-to vars: { i j }
>>>>> fp1_2, points-to vars: { }
>>>>> -
>>>>>
>>>>> Why is the pointees for function pointer not getting dumped?

>>>> It's just not saved.
>>>
>>> Can we modify the code to preserve values for function pointer SSA
>>> names?
>>
>> Sure.
>
> Index: gcc/tree-ssa-structalias.c
> ===
> --- gcc/tree-ssa-structalias.c  (revision 209782)
> +++ gcc/tree-ssa-structalias.c  (working copy)
> @@ -6032,7 +6032,8 @@ set_uids_in_ptset (bitmap into, bitmap f
>
>    if (TREE_CODE (vi->decl) == VAR_DECL
>   || TREE_CODE (vi->decl) == PARM_DECL
> - || TREE_CODE (vi->decl) == RESULT_DECL)
> + || TREE_CODE (vi->decl) == RESULT_DECL
> + || TREE_CODE (vi->decl) == FUNCTION_DECL)
> {
>   /* If we are in IPA mode we will not recompute points-to
>      sets after inlining so make sure they stay valid.  */

Thanks a lot. :) This is of great help.


> Note that there isn't a convenient way to go back from a bit in the
> points-to bitmap to the actual FUNCTION_DECL referred to.

The bitmap is set by identifying the bit using  DECL_PT_UID.
For variables, referenced_var_lookup returns the associated variable.

For FUNCTION_DECL's, all we need to do is store a mapping between uid 
and FUNCTION_DECL.

Is this correct?
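
What I have in mind can be sketched outside of GCC like this.  Everything
here is hypothetical and only illustrates the idea: `struct fn_decl` stands
in for a FUNCTION_DECL, its `uid` field plays the role of DECL_PT_UID, the
points-to set is a plain 64-bit mask instead of a GCC bitmap, and
`uid_to_fn`, `record_fn` and `pointees` are made-up names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_UID 64  /* arbitrary bound for this sketch */

/* Stand-in for a FUNCTION_DECL; GCC's real tree type is not used here. */
struct fn_decl {
    unsigned uid;       /* plays the role of DECL_PT_UID */
    const char *name;
};

/* Hypothetical side table from uid to declaration. */
static const struct fn_decl *uid_to_fn[MAX_UID];

/* Remember a function declaration under its uid. */
void record_fn(const struct fn_decl *fn)
{
    assert(fn->uid < MAX_UID);
    uid_to_fn[fn->uid] = fn;
}

/* Walk a points-to set stored as a 64-bit mask and collect the names of
   the recorded functions whose uid bit is set.  Returns the number of
   names written to `out` (at most `cap`). */
size_t pointees(unsigned long long ptset, const char **out, size_t cap)
{
    size_t n = 0;

    for (unsigned uid = 0; uid < MAX_UID && n < cap; uid++)
        if (((ptset >> uid) & 1) && uid_to_fn[uid])
            out[n++] = uid_to_fn[uid]->name;

    return n;
}
```

Bits without a recorded function are simply skipped, so the same walk also
works for sets that mix variables and functions.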



> Richard.
>
>>> What is the reason that it is not preserved for function pointers?
>>
>> Nobody uses this information.
>>
>>> Another alternative approach would be to replicate the code (of
>>> pass_ipa_pta) and use the information before deleting it.
>>>
>>> Is there any other way to access this information?
>>
>> You can of course recompute it when needed.
>>
>> Richard.
>>
>>>>> How can I access this information?
>>>>>
>>>>>
>>>>> Regards,
>>>>> Swati






Re: Improving Asan code on ARM targets

2014-04-28 Thread Konstantin Serebryany
+ eugeni.stepanov

On Tue, Apr 29, 2014 at 10:04 AM, Andrew Pinski  wrote:
> On Mon, Apr 28, 2014 at 10:50 PM, Yury Gribov  wrote:
>> Hi all,
>>
>> I've recently noticed that GCC generates suboptimal code for Asan on ARM
>> targets. E.g. for a 4-byte memory access check
>>
>> (shadow_val != 0) & (last_byte >= shadow_val)
>>
>> we get the following sequence:
>>
>>         mov     r2, r0, lsr #3
>>         and     r3, r0, #7
>>         add     r3, r3, #3
>>         add     r2, r2, #536870912
>>         ldrb    r2, [r2]        @ zero_extendqisi2
>>         sxtb    r2, r2
>>         cmp     r3, r2
>>         movlt   r3, #0
>>         movge   r3, #1
>>         cmp     r2, #0
>>         moveq   r3, #0
>>         cmp     r3, #0
>>         bne     .L5
>>         ldr     r0, [r0]
>>
>> Obviously a shorter code is possible:
>>
>>         mov     r3, r0, lsr #3
>>         and     r1, r0, #7
>>         add     r1, r1, #4
>>         add     r3, r3, #536870912
>>         ldrb    r3, [r3]        @ zero_extendqisi2
>>         sxtb    r3, r3
>>         cmp     r3, #0
>>         cmpne   r1, r3
>>         bgt     .L5
>>         ldr     r0, [r0]
>
> Does the patch series located at:
> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01407.html
> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01405.html
> fix this code generation issue?  I suspect it does and improves more
> than just the above code.
>
> Thanks,
> Andrew Pinski
>
>>
>> A 30% improvement looked quite important given that Asan usually increases
>> code-size by 1.5-2x so I decided to investigate this. It turned out that ARM
>> backend already has full support for dominated comparisons (cmp-cmpne-bgt
>> sequence above) and can generate efficient code if we provide it with a
>> slightly more explicit gimple sequence:
>>
>> (shadow_val != 0) & (last_byte + 1 > shadow_val)
>>
>> Ideally the backend should be able to perform this transform itself. But I'm
>> not sure this is possible: it needs to know that last_byte + 1 cannot overflow,
>> and this info is not available in RTL (because we don't have a VRP pass
>> there).
>>
>> I have attached a simple patch which changes the Asan pass to generate the
>> ARM-friendly code. I've only bootstrapped/regtested on x64 but I can perform
>> additional tests on ARM if the patch makes sense. As far as I can tell it
>> does not worsen sanitized code on other platforms (x86/x64) while
>> significantly improving ARM (15% less code for bzip).
>>
>> The patch is certainly not ideal:
>> * it makes target-specific changes in machine-independent code
>> * it does not help with 1-byte accesses (forwprop pass thinks that it's
>> always beneficial to convert x + 1 > y to x >= y so it reverts my change)
>> * it only improves Asan code whereas it would be great if ARM backend could
>> improve generic RTL code
>> but it achieves significant improvement on ARM without hurting other
>> platforms.
>>
>> So my questions are:
>> * is this kind of target-specific tweaking acceptable in middle-end?
>> * if not - what would be a better option?
>>
>> -Y