date:20140605

Re: RTL representation of i386 shrdl instruction is incorrect?

2014-06-05 Thread Richard Biener

On Thu, Jun 5, 2014 at 12:03 AM, Niranjan Hasabnis wrote:
> Hello,
>
> I was studying i386 machine description for my research purpose,
> and I stumbled upon following MD entry for 'shrdl' x86 instruction.
> It is obtained from the most recent i386.md file.
>
> (define_insn "x86_shrd"
>   [(set (match_operand:SI 0 "nonimmediate_operand" "+r*m")
>         (ior:SI (ashiftrt:SI (match_dup 0)
>                   (match_operand:QI 2 "nonmemory_operand" "Ic"))
>                 (ashift:SI (match_operand:SI 1 "register_operand" "r")
>                   (minus:QI (const_int 32) (match_dup 2)))))
>    (clobber (reg:CC FLAGS_REG))]
>   ""
>   "shrd{l}\t{%s2%1, %0|%0, %1, %2}"
>   [(set_attr "type" "ishift")
>    (set_attr "prefix_0f" "1")
>    (set_attr "mode" "SI")
>    (set_attr "pent_pair" "np")
>    (set_attr "athlon_decode" "vector")
>    (set_attr "amdfam10_decode" "vector")
>    (set_attr "bdver1_decode" "vector")])
>
> It seems to me that the RTL representation for 'shrdl' is incorrect.
>
> Semantics of shrdl instruction as per Intel manual is:
> "The instruction shifts the first operand (destination operand) to the right
> the number of bits specified by the third operand (count operand).
> The second operand (source operand) provides bits to shift in from the
> left (starting with the most significant bit of the destination operand)."
> And the way RTL does it is by inclusive-or of arithmetically
> right-shifted destination and left-shifted source operand.
>
> But the problem is that: in case of a destination (reg/mem) containing
> negative value, arithmetically right-shifted destination will have top bits
> set to 1. Inclusive-or with such a value is going to generate a
> result with top bits set to 1 instead of moving contents of source
> into top bits of destination.
>
> E.g., when ebx = 0xb72f60d0 and ebp = 0xbfcbd2c8,
> shrdl $16, %ebp, %ebx (ebx is dest, ebp is src)
> produces 0xd2c8b72f in ebx.
> But the corresponding RTL produces 0xffffb72f in ebx.
>
> So it seems to me that instead of 'ashiftrt', RTL should have 'lshiftrt'.
> Can anyone help me with this confusion?

The way I read your explanation you are correct.  It should be possible
to write a testcase that is miscompiled - just try to produce the
matched RTL pattern in C and feed it with operands at runtime that
end up producing a bogus value when shrdl is used.

Oh, and you might want to file a bugreport then ;)
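
A quick way to see the discrepancy is to model both behaviours in plain C. The sketch below is only an illustration: the function names shrd_hw and shrd_rtl_ashiftrt are invented here, the count is assumed to satisfy 0 < count < 32, and the arithmetic shift is emulated by hand because right-shifting a negative signed value is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Behaviour of shrdl as described in the Intel manual: shift dest right
   logically and fill the vacated top bits from the low bits of src.  */
uint32_t shrd_hw(uint32_t dest, uint32_t src, unsigned count)
{
    assert(count > 0 && count < 32);   /* keeps both shifts well defined */
    return (dest >> count) | (src << (32 - count));
}

/* Literal reading of the RTL pattern:
   (ior (ashiftrt dest count) (ashift src (32 - count))).
   The arithmetic shift is emulated by filling the top bits with
   copies of the sign bit.  */
uint32_t shrd_rtl_ashiftrt(uint32_t dest, uint32_t src, unsigned count)
{
    assert(count > 0 && count < 32);
    uint32_t shifted = dest >> count;
    if (dest & UINT32_C(0x80000000))
        shifted |= ~(UINT32_MAX >> count);   /* sign-fill the top bits */
    return shifted | (src << (32 - count));
}
```

With the operands from the report (dest 0xb72f60d0, src 0xbfcbd2c8, count 16), shrd_hw yields 0xd2c8b72f, while shrd_rtl_ashiftrt yields 0xffffb72f, because the sign bits of the destination swamp the bits shifted in from the source.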

Richard.

> --
>
> Thanks,
> Niranjan Hasabnis,
> PhD student,
> Secure Systems Lab,
> Stony brook University,
> Stony brook, NY.

Gimplilfy ICE in gnat.dg/array18.adb

2014-06-05 Thread BELBACHIR Selim

Hi,

On my private port, I am unable to debug an ICE with GCC 4.7.3 (GNAT 7.1.2) in
the internal test testsuite/gnat.dg/array18.adb.

Here is the test source code:
-
with Array18_Pkg; use Array18_Pkg;

procedure Array18 is
   A : String (1 .. 1);
begin
   A := F;
end;
-
-
package Array18_Pkg is

   function N return Positive;

   subtype S is String (1 .. N);

   function F return S;

end Array18_Pkg;
-

The size of the String returned by 'F' can't be known at compile time. GNAT
compares the size of the returned string to the size of 'A' at runtime (to
decide whether to call last_chance_handler).


The ICE is an assert inside force_constant_size of gimplify.c (at line 717)  :
---
static void
force_constant_size (tree var)
{
  /* The only attempt we make is by querying the maximum size of objects
     of the variable's type.  */

  HOST_WIDE_INT max_size;

  gcc_assert (TREE_CODE (var) == VAR_DECL);

  max_size = max_int_size_in_bytes (TREE_TYPE (var));

  gcc_assert (max_size >= 0);  /* <-- ICE here: max_size == -1 */

  DECL_SIZE_UNIT (var)
    = build_int_cst (TREE_TYPE (DECL_SIZE_UNIT (var)), max_size);
  DECL_SIZE (var)
    = build_int_cst (TREE_TYPE (DECL_SIZE (var)), max_size * BITS_PER_UNIT);
}
---

The 'var' parameter contains  :
---
 
unit size 
align 8 symtab 0 alias set -1 canonical type 0x2ab02348 
precision 8 min  max  context 
pointer_to_this >
sizes-gimplified visited nonaliased-component BLK
size 
readonly visited
arg 0 
readonly visited arg 0 > arg 1 >
unit size 
readonly visited arg 0 >
align 8 symtab 0 alias set 0 canonical type 0x2abb2f18
domain 
sizes-gimplified visited SI
size 
unit size 
align 32 symtab 0 alias set -1 canonical type 0x2abb2e70 
precision 32 min  max  
index type 
chain > context 

chain >
used ignored BLK file 
/vues_statiques/FPGA/belbachir/prism/compiler/gcc_test/internal/../../gcc/gcc/testsuite/gnat.dg/array18.adb
 line 9 col 9 size  unit size 
align 8>
---



I used the -fdump-tree-all option and ran both my cross compiler and the native
x86_64 compiler (same GCC/GNAT version) to compare the dumps.

The results already differ in the first dump:

>> X86_64 array18.adb.003t.original :
--
Array18 ()
{
  typedef array18__TTaSP1___XDLU_1__1 array18__TTaSP1___XDLU_1__1;
  typedef  struct ;
  typedef character array18__TaS[1:1];
  character a[1:1];
  character R.0[1:(sizetype) (integer) array18_pkg__R1s];

typedef array18__TTaSP1___XDLU_1__1 array18__TTaSP1___XDLU_1__1;
typedef  struct ;
typedef character array18__TaS[1:1];
character a[1:1];
  if (array18_pkg__R1s != 1)
{
  .gnat_last_chance_handler ("array18.adb", 9);
}
  else
{

}
character R.0[1:(sizetype) (integer) array18_pkg__R1s];
  R.0 = array18_pkg.f ();
  a = VIEW_CONVERT_EXPR(R.0);
  return;
}


_GLOBAL.SZ0.ada_array18 (integer p0, integer p1)
{
  return p1 <= p0 ? (bitsizetype) ((((sizetype) p0 - (sizetype) p1) + 1) * 8) : 0;


_GLOBAL.SZ1.ada_array18 (integer p0, integer p1)
{
  return p1 <= p0 ? ((sizetype) p0 - (sizetype) p1) + 1 : 0;
--


>> MyPort  array18.adb.003t.original :
--
Array18 ()
{
  typedef array18__TTaSP1___XDLU_1__1 array18__TTaSP1___XDLU_1__1;
  typedef  struct ;
  typedef character array18__TaS[1:1];
  typedef struct array18__a___PAD array18__a___PAD;
  struct array18__a___PAD a;

typedef array18__TTaSP1___XDLU_1__1 array18__TTaSP1___XDLU_1__1;
typedef  struct ;
typedef character array18__TaS[1:1];
typedef struct array18__a___PAD array18__a___PAD;
struct array18__a___PAD a;
  if (array18_pkg__R1s != 1)
{
  .gnat_last_chance_handler ("array18.adb", 9);
}
  else
{

}
  a = {.F=VIEW_CONVERT_EXPR(array18_pkg.f ())};
  return;
}


_GLOBAL.SZ0.ada_array18 (integer p0, integer p1)
{
  return p1 <= p0 ? ((bitsizetype) ((sizetype) p0 - (sizetype) p1) + 1) * 8 : 0;


_GLOBAL.SZ1.ada_array18 (integer p0, integer p1)
{
  return p1 <= p0 ? ((sizetype) p0 - (sizetype) p1) + 1 : 0;
--
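
For readers unfamiliar with these dumps, the two _GLOBAL.SZ helpers appear to compute the size of an array whose upper and lower bounds are p0 and p1: SZ1 in bytes and SZ0 in bits. A minimal C sketch of the same arithmetic, assuming one-byte elements and using made-up names (size_in_bytes, size_in_bits), might look like this:

```c
#include <stddef.h>

/* Mirrors _GLOBAL.SZ1: number of one-byte elements in the range
   p1 .. p0, or 0 when the range is empty (p1 > p0).  */
size_t size_in_bytes(int p0, int p1)
{
    return p1 <= p0 ? ((size_t) p0 - (size_t) p1) + 1 : 0;
}

/* Mirrors _GLOBAL.SZ0: the same size expressed in bits.  */
size_t size_in_bits(int p0, int p1)
{
    return size_in_bytes(p0, p1) * 8;
}
```

For the String (1 .. 1) object in the test, this gives 1 byte and 8 bits. A bound that is only known at runtime, such as N in String (1 .. N), makes the result a runtime value as well.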



The next dump, 004t.gimple, is incomplete because the ICE occurs during the
gimplify pass.


Using a debugger, I saw that the x86_64 port never enters force_constant_size
(where the ICE occurs on my port)...

Can someone give me a hint to solve my problem? I have no idea which part of
my backend could be related to GENERIC or GIMPLE generation, and I'm very
unfamiliar with this part of GCC.


Regards,


Selim Belbachir

Re: Cross-testing libsanitizer

2014-06-05 Thread Christophe Lyon

On 3 June 2014 14:46, Christophe Lyon wrote:
> On 3 June 2014 12:16, Yury Gribov wrote:
>>>> Is this 8G of RAM? If yes - I'd be curious to know which part of
>>>> libsanitizer needs so much memory.
>>>
>>>
>>> Here is what I have in gcc.log:
>>> ==12356==ERROR: AddressSanitizer failed to allocate 0x200001000
>>> (8589938688) bytes at address ff000 (errno: 12)
>>> ==12356==ReserveShadowMemoryRange failed while trying to map
>>> 0x200001000 bytes. Perhaps you're using ulimit -v
>>
>>
>> Interesting. AFAIK Asan maps shadow memory with NORESERVE flag so it should
>> not consume any RAM at all...
>>
>
> Thanks for the reminder in fact I posted a qemu patch in February
> http://lists.gnu.org/archive/html/qemu-devel/2014-02/msg00319.html
> I thought it was applied, but it's not yet in trunk
>
> I used to use a patched qemu, but when I upgraded to 2.0 I forgot
> about this patch.
> I am going to re-check with a patched qemu, and ping them.
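
For context on the NORESERVE remark quoted above: on Linux, MAP_NORESERVE asks the kernel not to reserve swap space for a mapping, so a large address range can be mapped without committing memory up front. The following is a minimal sketch of such a reservation using plain mmap. It is not libsanitizer's actual code, and the helper name reserve_address_range is made up for illustration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS / MAP_NORESERVE on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Reserve `size` bytes of address space without asking the kernel to
   set aside backing store.  Returns NULL on failure.  */
void *reserve_address_range(size_t size)
{
    assert(size > 0);   /* mmap rejects zero-length mappings */
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

The 8589938688 bytes in the log are 8 GiB plus one 4 KiB page. A request of that size can still fail with errno 12 (ENOMEM) if the address-space limit (ulimit -v) or the environment running the test does not allow it.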

So after applying my patch to qemu, I no longer see this error.
Now, all execution tests time out after generating ASAN:SEGV.

Which means I have to investigate what is going on :-(

It worked better in February :-(

Christophe.

Re: RTL representation of i386 shrdl instruction is incorrect?

2014-06-05 Thread Niranjan Hasabnis

Hi Richard,

Thanks for your reply. I looked into how that particular RTL template
is used. It seems to be used only when shifting a 64-bit data type on
a 32-bit machine. This is the underlying assumption encoded in i386.c,
which generates that RTL only when the instruction mode is DImode. If
that is the case, then it does not matter whether one uses an
arithmetic or a logical shift to right-shift the lower 4 bytes of an
8-byte value. In other words, the mapping between the RTL template and
shrdl is incorrect, but the underlying assumption in i386.c guards
against the bug.
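
As a rough sketch of the double-word case described above, a 64-bit logical right shift on a 32-bit machine can be split into a shrdl on the low half and a plain shift on the high half. The code assumes 0 < count < 32, the function name lshr64_via_halves is invented for illustration, and this is not the splitting code from i386.c.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit logical right shift built from two 32-bit operations.  */
uint64_t lshr64_via_halves(uint64_t x, unsigned count)
{
    assert(count > 0 && count < 32);   /* keeps all shifts well defined */
    uint32_t lo = (uint32_t) x;
    uint32_t hi = (uint32_t) (x >> 32);

    /* shrdl: the low half receives the bits shifted out of the high half.  */
    uint32_t new_lo = (lo >> count) | (hi << (32 - count));
    /* shrl: the high half is shifted on its own.  */
    uint32_t new_hi = hi >> count;

    return ((uint64_t) new_hi << 32) | new_lo;
}
```

The low half only comes out right if the bits from the high half land in its top positions, which is the hardware behaviour of shrdl rather than the sign-filling behaviour that the ashiftrt in the pattern would imply.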



-- 

Thanks,
Niranjan Hasabnis,
PhD student,
Secure Systems Lab,
Stony brook University,
Stony brook, NY.

Re: [RFC] PR61300 K&R incoming args

2014-06-05 Thread Jeff Law


On 05/31/14 00:30, Alan Modra wrote:
> On Fri, May 30, 2014 at 11:27:52AM -0600, Jeff Law wrote:
>> On 05/26/14 01:38, Alan Modra wrote:
>>> PR61300 shows a need to differentiate between incoming and outgoing
>>> REG_PARM_STACK_SPACE for the PowerPC64 ELFv2 ABI, due to code like
>>> function.c:assign_parm_is_stack_parm determining that a stack home
>>> is available for incoming args if REG_PARM_STACK_SPACE is non-zero.
>>>
>>> Background: The ELFv2 ABI requires a parameter save area only when
>>> stack is actually used to pass parameters, and since varargs are
>>> passed on the stack, unprototyped calls must pass both on the stack
>>> and in registers.  OK, easy you say, !prototype_p(fun) means a
>>> parameter save area is needed.  However, a prototype might not be in
>>> scope when compiling an old K&R style C function body, but this does
>>> *not* mean a parameter save area has necessarily been allocated.  A
>>> caller may well have a prototype in scope at the point of the call.
>>
>> Ugh.  This reminds me a lot of the braindamage we had to deal with
>> in the original PA abi's handling of FP values.
>>
>> In the general case, how can any function ever be sure as to whether
>> or not its prototype was in scope at a call site?  Yea, we can know
>> for things with restricted scope, but if it's externally visible, I
>> don't see how we're going to know the calling context with absolute
>> certainty.
>>
>> What am I missing here?


> When compiling the function body you don't need to know whether a
> prototype was in scope at the call site.  You just need to know the
> rules.  :)  For functions with variable argument lists, you'll always
> have a parameter save area.  For other functions, whether or not you
> have a parameter save area just depends on the number of arguments and
> their types (ie. whether you run out of registers for parameter
> passing), and you have that whether or not the function is
> prototyped.
>
> A simple example might help clear up any confusion.

I think the confusion was all mine. I think Ulrich's comments in the BZ
and the paragraph above really help clarify things.  This is entirely an
issue on the callee side.

On the callee side, we want to be looking strictly at the number/types
of the arguments.  The prototype/lack thereof isn't relevant.

> Given
>   void fun1(int a, int b, double c);
>   void fun2(int a, ...);
>    ...
>   fun1 (1, 2, 3.0);
>   fun2 (1, 2, 3.0);
>
> A call to fun1 with a prototype in scope won't allocate a parameter
> save area, and will pass the first arg in r3, the second in r4, and
> the third in f1.
>
> A call to fun2 with a prototype in scope will allocate a parameter
> save area of 64 bytes (the minimum size of a parameter save area), and
> will pass the first arg in r3, the second in the second slot of the
> parameter save area, and the third in the third slot of the parameter
> save area.  Now the first eight slots/double-words of the parameter
> save area are passed in r3 thru r10, so this means the second arg is
> actually passed in r4 and the third in r5, not the stack!

Right, so for a call to fun2 the caller allocates, but the callee
flushes the parameter registers into the allocated space.  Right?  And
the allocated space would be contiguous with the space used to pass
arguments once you've exhausted the argument registers?  FP values get
passed in integer and FP regs to varargs or unprototyped functions.

[ Why, oh why couldn't we have done all this ~10 years ago when all the
aspects of the PA ABIs were still fresh in my head.  There's a lot of
similarities, but remembering how it all worked is proving difficult. ]

> A call to fun1 or fun2 without a prototype in scope will allocate a
> parameter save area, and pass the first arg in r3, the second in r4,
> and the third in both f1 and r5.
>
> When compiling fun1 body, the first arg is known to be in r3, the
> second in r4, and the third in f1, and we don't use the parameter save
> area for storing incoming args to a stack slot.  (At least, after
> PR61300 is fixed..)  It doesn't matter if the parameter save area was
> allocated or not, we just don't use it.

Right.  It's reasonably similar to how certain aspects of the PA ABIs
worked, at least from the callee's viewpoint.

> When compiling fun2 body, the first arg is known to be in r3, the
> second in r4 and the third in r5.  Since the function has a variable
> argument list, registers r4 thru r10 are saved to the parameter
> save area stack, and we set up our va_list pointer to the second
> double-word of the parameter save area stack.  Of course, code
> optimisation might lead to removing the saves and using the args
> in their incoming regs, but this is conceptually what happens.

Right.  That's also consistent with the old PA32 ABI.
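
To make the fun2 case concrete at the source level, here is a small, hedged sketch of a varargs callee in portable C. The name fun2_sum and the choice to add the arguments are invented for illustration. The register saves and va_list setup described above are what a compiler for this ABI would conceptually generate behind va_start and va_arg, and nothing in the C source depends on them.

```c
#include <assert.h>
#include <stdarg.h>

/* Callee with a variable argument list, modelled on fun2 (int a, ...).
   It reads one int and one double after the named argument.  */
double fun2_sum(int a, ...)
{
    va_list ap;
    va_start(ap, a);                 /* start after the last named arg */
    int b = va_arg(ap, int);         /* second argument */
    double c = va_arg(ap, double);   /* third argument */
    va_end(ap);
    return a + b + c;
}

/* Example call matching the thread's fun2 (1, 2, 3.0).  */
void fun2_example(void)
{
    assert(fun2_sum(1, 2, 3.0) == 6.0);
}
```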

And so the problem you're trying to solve is that when compiling the
callee.  You incorrectly assumed that if there was not a prototype for
the callee's definition that the caller had set up the save area and
that you could flush arguments to it.  That's not true in the case where
the caller had a prototype for the callee in-scope (and the callee was
not a varargs function).

Right?  Just want to make sure I understand the problem.

Re: RTL representation of i386 shrdl instruction is incorrect?

2014-06-05 Thread Marc Glisse


On Thu, 5 Jun 2014, Niranjan Hasabnis wrote:

> Thanks for your reply. I looked into some of the details of how that
> particular RTL template is used. It seems to me that the particular
> RTL template is used only when shifting 64-bit data type on a 32-bit
> machine. This is the underlying assumption encoded in i386.c file
> which generates that particular RTL only when instruction mode is
> DImode. If that is the case, then it won't matter whether one uses
> arithmetic shift or logical shift to right shift lower 4-bytes of a 8-byte
> value. In other words, the mapping between RTL template and shrdl
> is incorrect, but the underlying assumption in i386.c guards the bug.

This is still a bug, please file a PR. The use of (match_dup 0) apparently 
prevents combine from matching the insn (that's just a guess from my notes 
in PR 55583, I don't have access to my gcc machine right now to check), 
but that doesn't mean we shouldn't fix things.


--
Marc Glisse

Re: Gimplilfy ICE in gnat.dg/array18.adb

2014-06-05 Thread Eric Botcazou

> Can someone give me a hint to solve my problem ? I have no idea which part
> of my backend could be related to the GENERIC or GIMPLE generation and I'm
> very unfamiliar with this part of GCC.

Look at the patch installed in conjunction with gnat.dg/array18.adb.

-- 
Eric Botcazou

gcc-4.8-20140605 is now available

2014-06-05 Thread gccadmin

Snapshot gcc-4.8-20140605 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.8-20140605/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_8-branch 
revision 211290

You'll find:

 gcc-4.8-20140605.tar.bz2 Complete GCC

  MD5=b1d39c37aac3269cc5ff2ed305139e19
  SHA1=14b69930b581493a0d836b5961b3f27d793e5e34

Diffs from 4.8-20140529 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.8
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.

Re: [RFC] PR61300 K&R incoming args

2014-06-05 Thread Alan Modra

On Thu, Jun 05, 2014 at 01:19:19PM -0600, Jeff Law wrote:
> And so the problem you're trying to solve is that when compiling the
> callee.  You incorrectly assumed that if there was not a prototype
> for the callee's definition that the caller had set up the save area
> and that you could flush arguments to it.  That's not true in the
> case where the caller had a prototype for the callee in-scope (and
> the callee was not a varargs function).
> 
> Right?  Just want to make sure I understand the problem.

Exactly correct.

-- 
Alan Modra
Australia Development Lab, IBM
