Re: operator new[] overflow (PR 19351)

2010-12-04 Thread Florian Weimer
* Joe Buck:

> It's wasted code if the multiply instruction detects the overflow.
> It's true that the cost is small (maybe just one extra instruction
> and the same number of tests, maybe one more on architectures where you
> have to load a large constant), but it is slightly worse code than what
> Chris Lattner showed.

It's possible to improve slightly on the LLVM code by using the
overflow flag (at least on i386/amd64), as explained in this blog
post:



My patch emits a run-time division if a VLA is used in an allocator.
But that's a semi-deprecated GCC extension, so I don't think we need
to care.

> Still, it's certainly an improvement on the current
> situation and the cost is negligible compared to the call to the
> allocator.  Since it's a security issue, some form of the patch should
> go in.

Well, should I resubmit, with the fix for the problem building
size_t(-1)?
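
For readers following along, the check under discussion can be sketched in
portable C++ roughly as below. This is only an illustration of the idea, not
the patch: the helper name checked_array_size is hypothetical, and array
cookies are ignored. On overflow it yields size_t(-1), a size the allocator
cannot satisfy, so the allocation fails instead of returning a short buffer.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helper (not from the patch): compute count * elem_size,
// saturating to size_t(-1) when the product would not fit in size_t.
inline std::size_t checked_array_size(std::size_t count, std::size_t elem_size) {
    const std::size_t max = std::numeric_limits<std::size_t>::max();
    // Division-based test: count * elem_size overflows exactly when
    // count > max / elem_size (for elem_size != 0).
    if (elem_size != 0 && count > max / elem_size)
        return max;  // size_t(-1): operator new[] cannot satisfy this request
    return count * elem_size;
}
```

When elem_size is a compile-time constant, as it is for a plain new T[n],
max / elem_size folds to a constant, so no division is executed at run time.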


Re: operator new[] overflow (PR 19351)

2010-12-04 Thread Gabriel Dos Reis
On Sat, Dec 4, 2010 at 7:22 AM, Florian Weimer  wrote:
> * Joe Buck:
>
>> It's wasted code if the multiply instruction detects the overflow.
>> It's true that the cost is small (maybe just one extra instruction
>> and the same number of tests, maybe one more on architectures where you
>> have to load a large constant), but it is slightly worse code than what
>> Chris Lattner showed.
>
> It's possible to improve slightly on the LLVM code by using the
> overflow flag (at least on i386/amd64), as explained in this blog
> post:
>
> 
>
> My patch emits a run-time division if a VLA is used in an allocator.
> But that's a semi-deprecated GCC extension, so I don't think we need
> to care.

Personally, the VLA issue is not one I would care much about.
If it can be done without much cost, fine.  Otherwise, I would
not tie the checking of the standard construct to it.

>
>> Still, it's certainly an improvement on the current
>> situation and the cost is negligible compared to the call to the
>> allocator.  Since it's a security issue, some form of the patch should
>> go in.
>
> Well, should I resubmit, with the fix for the problem building
> size_t(-1)?

I think that would help.

-- Gaby


Re: PATCH: 2 stage BFD linker for LTO plugin

2010-12-04 Thread H.J. Lu
On Fri, Dec 3, 2010 at 10:07 PM, H.J. Lu  wrote:
> On Fri, Dec 3, 2010 at 6:34 PM, H.J. Lu  wrote:
>> On Fri, Dec 3, 2010 at 6:23 PM, Dave Korn  wrote:
>>> On 04/12/2010 01:24, H.J. Lu wrote:
>>>
>>>> I checked in a patch to implement stage 2 linking. Everything
>>>> seems to work, including "gcc -static ... -lm".
>>>
>>>  Any chance you could send a complete diff?
>>>
>>
>> I will submit a complete diff after I fix a few corner cases.
>> In the meantime, you can clone my git tree and do a "git diff".
>>
>
> Hi,
>
> This patch implements 2 stage BFD linker for LTO plugin.
> It works with current LTO API on all cases I tested.
>
> Known issue:  --whole-archive will call plugin on archives with IR
> in stage 2 linking. But ld never calls plugin to get back object files.
> I will try to avoid it in a follow up patch.
>

This turns out not to be a problem. In stage 2 linking, for --whole-archive
we call the plugin to get the symbols in the IR element of an archive and
it will be ignored for stage 2 linking.  It is OK since we have already got
the trans object files back for stage 2 linking.

BTW, the new linker passed bootstrap-lto with all default languages.
I am planning to include this patch in the next Linux binutils.


-- 
H.J.


gcc-4.6-20101204 is now available

2010-12-04 Thread gccadmin
Snapshot gcc-4.6-20101204 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.6-20101204/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.6 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/trunk revision 167460

You'll find:

 gcc-4.6-20101204.tar.bz2 Complete GCC (includes all of below)

  MD5=f95126861cd874bb05ea3419820c2884
  SHA1=3c0e3622e47b4ed1f3da432e91ecf5d0e4be0f93

 gcc-core-4.6-20101204.tar.bz2 C front end and core compiler

  MD5=493b7a26526ee4c04e8d8d85a3872cde
  SHA1=37313c1f4a2d5bba649c667e98c4452917467d46

 gcc-ada-4.6-20101204.tar.bz2 Ada front end and runtime

  MD5=e834a9620fcc7d97a7564d54f9b23255
  SHA1=fe69a975879744ff73a1d997de17f47953ff6bea

 gcc-fortran-4.6-20101204.tar.bz2 Fortran front end and runtime

  MD5=3590c8763f8ba3a8fd4ce8fd7b04fff6
  SHA1=da8182622520176643c7db069f4d6a663252233a

 gcc-g++-4.6-20101204.tar.bz2 C++ front end and runtime

  MD5=b6ea2902d3ca16ae9ad2f3c5322fdf3b
  SHA1=1838f5bf72d5322e2d04f913e2976ebcb3f8a867

 gcc-java-4.6-20101204.tar.bz2 Java front end and runtime

  MD5=1a11a4032a710d5547ee30cb05695b20
  SHA1=8ec8ea4c5e0c5e5c59b028e06b96f300ee9e99da

 gcc-objc-4.6-20101204.tar.bz2 Objective-C front end and runtime

  MD5=534727d4ac1e852721b99fa23b8ac1b5
  SHA1=b7993f3e398ae7697ca56c8a21256ce2b83bdf88

 gcc-testsuite-4.6-20101204.tar.bz2   The GCC testsuite

  MD5=1b8347c688078769a81672a233b6532e
  SHA1=ff94bca4ea7dcbce115d9cfb6474c5a0b7602fcc

Diffs from 4.6-20101127 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.6
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: PATCH: 2 stage BFD linker for LTO plugin

2010-12-04 Thread H.J. Lu
On Sat, Dec 4, 2010 at 9:34 AM, H.J. Lu  wrote:
> On Fri, Dec 3, 2010 at 10:07 PM, H.J. Lu  wrote:
>> On Fri, Dec 3, 2010 at 6:34 PM, H.J. Lu  wrote:
>>> On Fri, Dec 3, 2010 at 6:23 PM, Dave Korn wrote:
>>>> On 04/12/2010 01:24, H.J. Lu wrote:
>>>>
>>>>> I checked in a patch to implement stage 2 linking. Everything
>>>>> seems to work, including "gcc -static ... -lm".
>>>>
>>>>  Any chance you could send a complete diff?
>>>>
>>>
>>> I will submit a complete diff after I fix a few corner cases.
>>> In the meantime, you can clone my git tree and do a "git diff".
>>>
>>
>> Hi,
>>
>> This patch implements 2 stage BFD linker for LTO plugin.
>> It works with current LTO API on all cases I tested.
>>
>> Known issue:  --whole-archive will call plugin on archives with IR
>> in stage 2 linking. But ld never calls plugin to get back object files.
>> I will try to avoid it in a follow up patch.
>>
>
> This turns out not to be a problem. In stage 2 linking, for --whole-archive
> we call the plugin to get the symbols in the IR element of an archive and
> it will be ignored for stage 2 linking.  It is OK since we have already got
> the trans object files back for stage 2 linking.
>
> BTW, the new linker passed bootstrap-lto with all default languages.
> I am planning to include this patch in the next Linux binutils.
>

I missed the IR object in an archive:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42690#c34

This updated patch fixed it.  OK for trunk?

Thanks.

-- 
H.J.
---
bfd/

2010-12-03  H.J. Lu  

PR ld/12248
* bfd.c (BFD_PLUGIN): New.
(BFD_FLAGS_SAVED): Add BFD_PLUGIN.
(BFD_FLAGS_FOR_BFD_USE_MASK): Likewise.

* bfd-in2.h: Regenerated.

ld/

2010-12-03  H.J. Lu  

PR ld/12248
* ldfile.c (ldfile_try_open_bfd): Turn on BFD_PLUGIN and set
claimed to false on non-object files and unclaimed object files.
Set stage1.

* ldlang.c (cmdline_list): New.
(cmdline_next_claimed_output): Likewise.
(cmdline_list_init): Likewise.
(cmdline_get_stage2_input_files): Likewise.
(debug_cmdline_list): Likewise.
(cmdline_list_append): Likewise.
(cmdline_set_next_claimed_output): Likewise.
(cmdline_list_insert_claimed_output): Likewise.
(new_afile): Set stage1 to FALSE.
(lang_init): Call cmdline_list_init.
(lang_process): Call plugin_active_plugins_p to check plugin
support.  Check cmdline_next_claimed_output before opening
stage 2 input.  Call debug_cmdline_list if trace_file_tries
is set.  Call cmdline_get_stage2_input_files to get stage 2
input files.

* ldlang.h (lang_input_statement_struct): Add stage1.
(cmdline_enum_type): New.
(cmdline_header_type): Likewise.
(cmdline_input_statement_type): Likewise.
(cmdline_claimed_output_type): Likewise.
(cmdline_union_type): Likewise.
(cmdline_list_type): Likewise.
(cmdline_list_append): Likewise.
(cmdline_list_insert_claimed_output): Likewise.
(cmdline_set_next_claimed_output): Likewise.

* ldmain.c (add_archive_element): Call
cmdline_set_next_claimed_output with archive BFD.  Set
BFD_PLUGIN if input isn't claimed by plugin.

* lexsup.c (parse_args): Call cmdline_list_append if needed.

* plugin.c (plugin_opt_plugin_arg): Ignore -pass-through=.
(add_input_file): Replace lang_add_input_file with
cmdline_list_insert_claimed_output.
(add_input_library): Likewise.

ld/testsuite/

2010-12-03  H.J. Lu  

PR ld/12248
* ld-plugin/func1i.c: New.
* ld-plugin/func2.c: Likewise.
* ld-plugin/func2h.c: Likewise.
* ld-plugin/func3p.c: Likewise.

* ld-plugin/plugin.exp: Add object files for symbols claimed
or created by testplugin.
* ld-plugin/plugin-7.d: Updated.
* ld-plugin/plugin-8.d: Likewise.
* ld-plugin/plugin-9.d: Likewise.

diff --git a/bfd/bfd-in2.h b/bfd/bfd-in2.h
index e7805b6..10dbe83 100644
--- a/bfd/bfd-in2.h
+++ b/bfd/bfd-in2.h
@@ -5085,14 +5085,17 @@ struct bfd
   /* Decompress sections in this BFD.  */
 #define BFD_DECOMPRESS 0x1
 
+  /* This BFD has been processed by the linker plugin.  */
+#define BFD_PLUGIN 0x2
+
   /* Flags bits to be saved in bfd_preserve_save.  */
 #define BFD_FLAGS_SAVED \
-  (BFD_IN_MEMORY | BFD_COMPRESS | BFD_DECOMPRESS)
+  (BFD_IN_MEMORY | BFD_COMPRESS | BFD_DECOMPRESS | BFD_PLUGIN)
 
   /* Flags bits which are for BFD use only.  */
 #define BFD_FLAGS_FOR_BFD_USE_MASK \
   (BFD_IN_MEMORY | BFD_COMPRESS | BFD_DECOMPRESS | BFD_LINKER_CREATED \
-   | BFD_TRADITIONAL_FORMAT | BFD_DETERMINISTIC_OUTPUT)
+   | BFD_TRADITIONAL_FORMAT | BFD_DETERMINISTIC_OUTPUT | BFD_PLUGIN)
 
   /* Currently my_archive is tested before adding origin to
  anything. I believe that this can become always an add of
diff --git a/bfd/bfd.c b/bfd/bfd.c
index a9ce7cc..7265156 100644
--- a/

combine two load insns

2010-12-04 Thread roy rosen
Hi,

If I have two load SI insns. Is there any way to combine them into one
load DI insn?
Not using peephole which can catch only this limited case of being
sequential insns.
I have seen something done in ARM (*arith_adjacentmem) but it is very
awkward and would not be realistic if the DI is being used by many
different intrinsics.

Thanks, Roy.
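
For concreteness, the source-level shape of the question might look like the
sketch below. The function names are hypothetical and not taken from any GCC
port. The first function performs two adjacent 32-bit (SI) loads, the second
a single 64-bit (DI) load of the same bytes. Whether a backend actually
merges the first form into the second is exactly what is being asked, so
nothing here claims that it does.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical illustration: two separate 32-bit loads from adjacent memory.
inline std::uint64_t load_pair_separately(const std::uint32_t* p) {
    std::uint32_t parts[2] = {p[0], p[1]};  // two SI-sized loads
    std::uint64_t out;
    std::memcpy(&out, parts, sizeof out);   // reassemble in memory order
    return out;
}

// Hypothetical illustration: the same eight bytes fetched as one 64-bit load.
inline std::uint64_t load_pair_combined(const std::uint32_t* p) {
    std::uint64_t out;
    std::memcpy(&out, p, sizeof out);       // one DI-sized load
    return out;
}
```

Both functions copy the same eight bytes in memory order, so they return the
same value regardless of endianness.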


Re: operator new[] overflow (PR 19351)

2010-12-04 Thread Chris Lattner

On Dec 4, 2010, at 5:22 AM, Florian Weimer wrote:

> * Joe Buck:
> 
>> It's wasted code if the multiply instruction detects the overflow.
>> It's true that the cost is small (maybe just one extra instruction
>> and the same number of tests, maybe one more on architectures where you
>> have to load a large constant), but it is slightly worse code than what
>> Chris Lattner showed.
> 
> It's possible to improve slightly on the LLVM code by using the
> overflow flag (at least on i386/amd64), as explained in this blog
> post:
> 
> 

Ah, great point.  I improved the clang codegen to this:

$ cat t.cc 
void *test(long count) {
  return new int[count];
}
$ clang t.cc -S -o - -O3 -mkernel -fomit-frame-pointer -mllvm -show-mc-encoding
        .section        __TEXT,__text,regular,pure_instructions
        .globl  __Z4testl
        .align  4, 0x90
__Z4testl:                              ## @_Z4testl
## BB#0:                                ## %entry
        movl    $4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
        movq    %rdi, %rax              ## encoding: [0x48,0x89,0xf8]
        mulq    %rcx                    ## encoding: [0x48,0xf7,0xe1]
        movq    $-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
        cmovnoq %rax, %rdi              ## encoding: [0x48,0x0f,0x41,0xf8]
        jmp     __Znam                  ## TAILCALL
                                        ## encoding: [0xeb,A]
                                        ##   fixup A - offset: 1, value: __Znam-1, kind: FK_PCRel_1
.subsections_via_symbols

This could be further improved by inverting the cmov condition to avoid the 
first movq, which we'll tackle as a general regalloc improvement.

Thanks for the pointer!

-Chris
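
The multiply-then-select pattern in the listing above can be approximated at
the source level as in the sketch below. It assumes a compiler that provides
the __builtin_mul_overflow builtin (GCC and Clang do in later releases than
the ones in this thread); the helper name array_bytes_or_max is hypothetical.
It illustrates the idea only and is not what either compiler emits for new[].

```cpp
#include <cstddef>

// Hypothetical helper mirroring the listing: multiply, then select
// size_t(-1) if the multiplication overflowed.
inline std::size_t array_bytes_or_max(std::size_t count, std::size_t elem_size) {
    std::size_t bytes;
    // __builtin_mul_overflow returns true when the product does not fit
    // in the result type; compilers can lower this to a flag test.
    if (__builtin_mul_overflow(count, elem_size, &bytes))
        return static_cast<std::size_t>(-1);  // forces the allocator to fail
    return bytes;
}
```

Unlike a division-based check, this form needs no precomputed limit, which
is why it maps naturally onto a multiply followed by a conditional move.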