Issue with __int128 in powerpc64le

2014-12-19 Thread Roger Ferrer Ibáñez
Hi,

I'm observing weird behaviour on PowerPC64 Little Endian that does
not seem to occur on other architectures supporting __int128. The
following code, when compiled with -O1, generates wrong output.

-- test.c
#include <stdio.h>

typedef unsigned __int128 uint128_t;

/* Print a 128-bit value as <high, low> 64-bit halves (little-endian layout). */
#define PRINT(value) \
    { union u { uint128_t i; unsigned long long l[2]; } _t = { .i = value }; \
      fprintf(stderr, "%s => <%016llx, %016llx>\n", #value, _t.l[1], _t.l[0]); }

__attribute__((noinline))
uint128_t get_int(uint128_t value, unsigned int num_bytes)
{
    uint128_t mask = ~(uint128_t)0;
    mask <<= (uint128_t)(8 * num_bytes); /* assuming 1 byte = 8 bits */
    mask = ~mask;
    value &= mask;

    return value;
}

int main(int argc, char* argv[])
{
    uint128_t x = 0;
    x = get_int(10, /* num_bytes */ 1);

    PRINT(x);

    return 0;
}
-- end of test.c

$ gcc -v
Using built-in specs.
COLLECT_GCC=/home/Computational/rferrer/gcc/install/bin/gcc
COLLECT_LTO_WRAPPER=/home/Computational/rferrer/gcc/install/libexec/gcc/powerpc64le-unknown-linux-gnu/5.0.0/lto-wrapper
Target: powerpc64le-unknown-linux-gnu
Configured with: ../gcc-src/configure
--prefix=/home/Computational/rferrer/gcc/install
--enable-languages=c,c++,fortran
--with-gmp=/home/Computational/rferrer/gcc/install
--with-mpfr=/home/Computational/rferrer/gcc/install
--with-mpc=/home/Computational/rferrer/gcc/install --enable-multiarch
--disable-multilib
Thread model: posix
gcc version 5.0.0 20141218 (experimental) (GCC)

$ make
gcc  -O0 -o test.O0 test.c
./test.O0
x => <0000000000000000, 000000000000000a>
gcc  -O1 -o test.O1 test.c
./test.O1
x => 

It looks like GCC somehow forgets to perform the logical not in the
optimized version.

I'd file a PR, but I'm not sure whether I'm triggering some sort of
undefined behaviour in the shift/not/and sequence in 'get_int'.

Is this a bug in GCC or in the code above?

Kind regards,


-- 
Roger Ferrer Ibáñez


Re: Issue with __int128 in powerpc64le

2014-12-19 Thread Richard Biener
On Fri, Dec 19, 2014 at 10:13 AM, Roger Ferrer Ibáñez wrote:
> Is this a bug in GCC or in the code above?

Please open a bug in either case.

Richard.



Re: Issue with __int128 in powerpc64le

2014-12-19 Thread Roger Ferrer Ibáñez
Done, it is PR64358.

Kind regards,

-- 
Roger Ferrer Ibáñez


GCC 4.8.4 Released

2014-12-19 Thread Jakub Jelinek
The GNU Compiler Collection version 4.8.4 has been released.

GCC 4.8.4 is the fourth bug-fix release containing important fixes for
regressions and serious bugs in GCC 4.8.3 with over 80 bugs fixed since
the previous release.

This release is available from the FTP servers listed at:

  http://www.gnu.org/order/ftp.html

Please do not contact me directly regarding questions or comments about
this release.  Instead, use the resources available from
http://gcc.gnu.org.

As always, a vast number of people contributed to this GCC release -- far
too many to thank them individually!


GCC 4.8.5 Status Report (2014-12-19)

2014-12-19 Thread Jakub Jelinek
Status
======

GCC 4.8.4 has been released, the branch is again open for regression
bugfixes and documentation fixes.  GCC 4.8.5 could be tentatively released
in April next year.


Quality Data
============

Priority          #   Change from Last Report
--------        ---   -----------------------
P1                0   +-   0
P2              123   +-   0
P3                6   +-   0
--------        ---   -----------------------
Total           129   +-   0


Previous Report
===============

https://gcc.gnu.org/ml/gcc/2014-12/msg00080.html


Re: Missing git tags for released GCC

2014-12-19 Thread H.J. Lu
On Tue, Nov 25, 2014 at 3:12 AM, Jonathan Wakely wrote:
> On 16 November 2014 at 15:51, H.J. Lu wrote:
>> Hi,
>>
>> Git tags are missing for GCC 4.9.1, 4.9.2, 4.8.3 and 4.7.4.
>
> I can't create the tags but these are the release commits:
>
> git tag gcc-4_9_2-release c1283af40b65f1ad862cf5b27e2d9ed10b2076b6
> git tag gcc-4_9_1-release c6fa1b412663593960e6240eb66d82fa41a1fa0b
> git tag gcc-4_8_3-release 6bbf0dec66c0e719b06cd2fe67559fda6df09000
> git tag gcc-4_7_4-release ae10eb82fe34c18640ad65c2ab94ffc53f349315

I added them together with gcc-4_8_4-release.

Thanks.

-- 
H.J.


GCC document link broken and some document questions

2014-12-19 Thread Qun-Ying
Hi,

I found that the links to the "GNAT User's Guide" are broken for versions
4.8.4 and 4.9.2.

The links are
https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gnat_ugn_unw/ and
https://gcc.gnu.org/onlinedocs/gcc-4.8.4/gnat_ugn_unw/

There are links to the guide at

https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gnat_ugn/
and
https://gcc.gnu.org/onlinedocs/gcc-4.8.4/gnat_ugn/

But the documents there have some Texinfo markup left over (HTML only; the
PDF seems OK).

All the other links, to the PDF/PostScript files and the HTML tarball,
have the same problem: "_unw" needs to be removed from the filename.

Also for GCC 4.9.2:
https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gnat_ugn/Switches-for-gcc.html#Switches-for-gcc

It mentions the flag "-fdump-xref", but the flag does not work:
cc1: error: unrecognized command line option ‘-fdump-xref’
This seems to be a mismatch in the documentation.

For the generated PDF and PS files, I am not sure why the table of
contents is placed at the end of the file. That defeats its purpose.

Thanks

-- 
Qun-Ying


RE: Instruction scheduler with respect to prefetch instructions.

2014-12-19 Thread Ajit Kumar Agarwal


-----Original Message-----
From: paul_kon...@dell.com [mailto:paul_kon...@dell.com] 
Sent: Saturday, December 13, 2014 9:46 PM
To: Ajit Kumar Agarwal
Cc: vmaka...@redhat.com; l...@redhat.com; richard.guent...@gmail.com; 
gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; 
Nagaraju Mekala
Subject: Re: Instruction scheduler with respect to prefetch instructions.


> On Dec 13, 2014, at 5:22 AM, Ajit Kumar Agarwal wrote:
> 
> Hello All:
> 
> Since prefetch instructions have no direct consumers in the code
> stream, they give the instruction scheduler considerable freedom. They
> are typically assigned lower priorities than most other instructions in
> the code stream.
> This tends to cause all the prefetch instructions to be placed
> together in the final schedule. Placing them in clumps rather than
> spreading them evenly degrades performance.
> 
> Spreading the prefetch instructions evenly gives better speedup
> ratios than placing them in clumps, for dirty misses.

>> I can believe that’s true for some processors; is it true for all of them?  I
>> have the impression that some MIPS processors don’t mind clumped prefetches,
>> so long as you don’t exceed the limit on total number of concurrently
>> pending memory accesses.

I think clumped prefetches are okay on architectures where prefetches are
placed based on prefetch distance, as long as they do not exceed the limit
on concurrently pending memory accesses. However, because prefetches have
no direct consumers in the code stream, the clumps generated by the
scheduler sometimes do exceed that limit, unless the scheduler has some
special mechanism, such as information passed to it from the
prefetch-generation phase.

Because prefetches have no direct consumers in the code stream, the
instruction scheduler is free to clump them at the end of the basic block,
which invalidates the position at which each prefetch instruction was
generated based on the prefetch distance.

Prefetch algorithms based on prefetch distance take care of the cases
where clumped prefetches degrade performance due to dirty misses.

My question: is there any special way of handling prefetch instructions in
the instruction scheduler to overcome the above?
 
Thanks & Regards
Ajit
