Re: TLS Implementation Across Architectures

2020-06-29 Thread Alexandre Oliva
On Jun 25, 2020, Joel Sherrill  wrote:

> Is there some documentation on how it is implemented on architectures not
> in Ulrich's paper?

Uli's paper pre-dates GNU2 TLS, and I'm not sure whether he updated it to
cover it, so https://www.fsfla.org/~lxoliva/writeups/TLS/ might be useful.

-- 
Alexandre Oliva, freedom fighter    he/him    https://FSFLA.org/blogs/lxo/
Free Software Evangelist  Stallman was right, but he's left :(
GNU Toolchain Engineer   Live long and free, and prosper ethically


RFC noipa sizeof function for record relayout at link time

2020-06-29 Thread Erick Ochoa

Hello,

I have been working on link time optimization for C that may change the 
size of structs (at link time). We are close to sharing the results we 
have so far, but there are a couple of missing pieces left to work on:


Implementations of sizeof and offsetof that support this change in 
struct layout at link time.


== What is the problem? ==

Currently, for both sizeof and offsetof, the C parser replaces these
expressions with trees that hold the values they evaluate to at parse
time. For example:


// source code
struct astruct a;
memset(&a, 0, sizeof(a));

// parse time
memset(&a, 0, 64);

// after dead field elimination
// struct astruct is now 56 bytes long
memset(&a, 0, 64); // <-- we are overwriting memory!

At link time, we really shouldn't change the value 64 since we can't and 
shouldn't assume that the value 64 came from a sizeof statement. The 
source code could have been written this way:


// source code
struct astruct a;
memset(&a, 0, 64);

regardless of whether the struct astruct has a length of 64.

** We need to identify which trees come from sizeof statements **

== What do we want? ==

What we really want is to make sure that our transformation performs the 
following changes (or no changes!) depending on the source code.


If the value for memset's argument comes from a sizeof statement:

// source code
struct astruct a;
memset(&a, 0, sizeof(a));

// parse time
memset(&a, 0, 64);

// after dead field elimination
memset(&a, 0, 56);

However, in the case in which no sizeof is used, we want to do the 
following:


// source code
struct astruct a;
memset(&a, 0, 64);

// parse time
memset(&a, 0, 64);

// after dead field elimination
memset(&a, 0, 64);

== How do we get what we want? ==

Ideally what we want is to:

* Be able to change the value returned by sizeof and offsetof at link time:
  * possibly a global variable?
* Identify which values come from a sizeof expression:
  * matching identifiers?
* Not define or redefine any valid C identifier:
  * in gimple we can have an identifier with a dot in it.
* Disable constant propagation and other optimizations:
  * possibly __attribute__((noipa))
* Be able to work with parallel compilation (make -j)
* Be able to work with any Makefile
  * No generating C code and then compiling and linking it at the end.

So, I've been thinking about multiple options:

* Extending gimple to add support for a sizeof statement
* A function per struct generated during compilation (sizeof & offsetof)
* A variable per struct generated during compilation (sizeof and more 
for offsetof)


I think extending gimple to add support for a sizeof statement gets us
everything we want; however, this would possibly involve rewriting many
parts of GCC. As such, I am somewhat opposed to this.


I then thought of generating global variables at parse time, during
compilation. In this scheme, I would replace each sizeof expression with a
reference to a global variable (or function) that is initialized with
the value returned by sizeof at parse time. At link time we can replace
the initialization value if needed. For example:


// The parser is parsing a C file
// and encounters a sizeof expression:
sizeof(struct astruct);

// Parsing is paused.
// Does a global variable that identifies this struct exist?
// I.e. does size_t __lto.sizeof.astruct exist?
// If it doesn't, create it:

size_t __lto.sizeof.astruct = 64;

// Back to the parser:
// instead of replacing
// sizeof(struct astruct) with 64,
// replace it with the following gimple:

__lto.sizeof.astruct

// Continue parsing until the end of the file.

// If at link time we detect that we will delete a field from astruct,
// then we will have to look at the initialization value of
// __lto.sizeof.astruct and replace it with the new value:

size_t __lto.sizeof.astruct = 56;

This strategy can be used with global functions instead of variables and 
it is similar. The only differences would be we would create a global 
function instead of a variable and we would call that function to obtain 
the value.
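
A minimal sketch of both variants in plain C, assuming a made-up layout for struct astruct. The names lto_sizeof_astruct, lto_sizeof_astruct_fn, clear_astruct and simulate_relayout are hypothetical stand-ins (the dotted gimple identifier is not valid C), and the link-time update is only simulated by an assignment:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout; the real struct astruct is not shown in this thread. */
struct astruct {
    long a;
    long b;
    char rest[48];
};

/* Variable variant: stand-in for __lto.sizeof.astruct. */
size_t lto_sizeof_astruct = sizeof(struct astruct);

/* Function variant: callers obtain the size through a call. */
size_t lto_sizeof_astruct_fn(void)
{
    return lto_sizeof_astruct;
}

/* What a use of sizeof(struct astruct) in a memset call would become. */
void clear_astruct(struct astruct *p)
{
    /* The recorded size must never exceed the object actually allocated. */
    assert(lto_sizeof_astruct_fn() <= sizeof *p);
    memset(p, 0, lto_sizeof_astruct_fn());
}

/* Simulates the link-time pass rewriting the initialization value. */
void simulate_relayout(size_t new_size)
{
    lto_sizeof_astruct = new_size;
}
```

In this sketch the literal 64 never appears in the callers, so changing the single stored value is enough to update every use that originated from sizeof.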


For offsetof, the scheme changes as follows:

// The parser encounters offsetof:
offsetof(struct astruct, b);

// Parsing is paused.
// Does a global variable that identifies this struct AND field exist?

// The previous field has a size of 8
size_t __lto.offsetof.astruct._8 = 8;

// Back to the parser:
// instead of replacing
// offsetof(struct astruct, b) with 8,
// replace it with the following gimple:

__lto.offsetof.astruct._8

// Continue parsing until the end of the file.

// If at link time we detect that we will delete the previous field,
// then we can rewrite all the offsetof variables for this struct that
// refer to the fields that follow the deleted field:

__lto.offsetof.astruct._8 = 0;
// Because the previous field was deleted, for example.
// All variables referring to offsets allocated
// further than the one deleted will need an update as well.
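
The offset bookkeeping above can be sketched in plain C as follows. This is an illustration only: the struct fields and the names lto_offsetof_astruct_b, lto_offsetof_astruct_c and simulate_field_removal are hypothetical, and the sketch ignores any padding or alignment changes that a real relayout would have to account for.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: field a is the one that gets removed. */
struct astruct {
    long a;
    long b;
    long c;
};

/* Stand-ins for the per-field __lto.offsetof.astruct.* variables. */
size_t lto_offsetof_astruct_b = offsetof(struct astruct, b);
size_t lto_offsetof_astruct_c = offsetof(struct astruct, c);

/* Simulates deleting a field of removed_size bytes at removed_offset:
   every recorded offset beyond it shrinks by that size. */
void simulate_field_removal(size_t removed_offset, size_t removed_size)
{
    size_t *offsets[] = { &lto_offsetof_astruct_b, &lto_offsetof_astruct_c };

    for (size_t i = 0; i < sizeof offsets / sizeof offsets[0]; i++) {
        if (*offsets[i] > removed_offset) {
            assert(*offsets[i] >= removed_size);
            *offsets[i] -= removed_size;
        }
    }
}
```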

== What's the WORST CASE performance of having an u

GCC 10.1.1 Status Report (2020-06-29)

2020-06-29 Thread Richard Biener


Status
======

The GCC 10 branch is in regression and documentation fixing mode.

We're close to two months after the GCC 10.1 release, which means
a first bugfix release is about to happen.  The plan is to release
in mid-July, and I am targeting a release candidate for the middle
of next week, no later than July 17th.

Branch status looks mostly good so this is a heads up for backporting
of important regression fixes that already happened on trunk as well
as checking build status of non-primary targets.


Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1
P2              216   +   8
P3               47   +  33
P4              174   +   1
P5               22   +   1
--------        ---   -----------------------
Total P1-P3     263   +  41
Total           459   +  43


Previous Report
===============

https://gcc.gnu.org/pipermail/gcc/2020-April/000504.html


Re: RFC noipa sizeof function for record relayout at link time

2020-06-29 Thread Richard Biener via Gcc
On Mon, Jun 29, 2020 at 11:56 AM Erick Ochoa
 wrote:
>
> Hello,
>
> I have been working on link time optimization for C that may change the
> size of structs (at link time). We are close to sharing the results we
> have so far, but there are a couple of missing pieces left to work on:
>
> Implementations of sizeof and offsetof that support this change in
> struct layout at link time.
>
> == What is the problem? ==
>
> Currently, for both sizeof and offsetof, the C parser will replace these
> statements with trees that correspond to the value returned by sizeof
> and offsetof at parse time. For example:
>
> // source code
> struct astruct a;
> memset(a, 0, sizeof(a));
>
> // parse time
> memset(a, 0, 64);
>
> // after dead field elimination
> // struct astruct is now 56 bytes long
> memset(a, 0, 64); // <-- we are overwriting memory!
>
> At link time, we really shouldn't change the value 64 since we can't and
> shouldn't assume that the value 64 came from a sizeof statement. The
> source code could have been written this way:
>
> // source code
> struct astruct a;
> memset(a, 0, 64);
>
> regardless of whether the struct astruct has a length of 64.
>
> ** We need to identify which trees come from sizeof statements **
>
> == What do we want? ==
>
> What we really want is to make sure that our transformation performs the
> following changes (or no changes!) depending on the source code.
>
> If the value for memset's argument comes from a sizeof statement:
>
> // source code
> struct astruct a;
> memset(a, 0, sizeof(a));
>
> // parse time
> memset(a, 0, 64);
>
> // after dead field elimination
> memset(a, 0, 56);
>
> However, in the case in which no sizeof is used, we want to do the
> following:
>
> // source code
> struct astruct a;
> memset(a, 0, 64);
>
> // parse time
> memset(a, 0, 64);
>
> // after dead field elimination
> memset(a, 0, 64);
>
> == How do we get what we want? ==
>
> Ideally what we want is to:
>
> * Be able to change the value returned by sizeof and offsetof at link time:
>   * possibly a global variable?
> * Identify which values come from sizeof statement:
>   * matching identifiers?
> * No re/define valid C identifiers:
>   * in gimple we can have an identifier with a dot in it.
> * Disable constant propagation and other optimizations:
>   * possibly __attribute__((noipa))
> * Be able to work with parallel compilation (make -j)
> * Be able to work with any Makefile
>   * No C code generation and then compile and link gen code at the end.
>
> So, I've been thinking about multiple options:
>
> * Extending gimple to add support for a sizeof statement
> * A function per struct generated during compilation (sizeof & offsetof)
> * A variable per struct generated during compilation (sizeof and more
> for offsetof)
>
> I think extending gimple to add support for a sizeof statement gets us
> all what we want, however this would involve rewriting possibly many
> parts of GCC. As such, I am somewhat opposed to this.
>
> I then thought of generating global variables during parse/time
> compilation. In this scheme, I would replace sizeof statements with a
> reference to a global variable (or function) that is initialized with
> the value returned by the sizeof statement during parse time. At link
> time we can replace initialization value if needed. For example:
>
> // The parser is parsing a C file
> // it encounters a sizeof statement
> sizeof(struct astruct);
>
> // Parsing is paused.
> // Does a global variable that identifies this struct exists?
> // I.e. size_t __lto.sizeof.astruct exists?
> // If it doesn't create it.
>
> size_t __lto.sizeof.astruct = 64
>
> // Back to the parser
> // instead of replacing
> // sizeof(struct astruct) with 64
> // replace with the following gimple:
>
> __lto.sizeof.astruct
>
> // Continue parsing until the end of file compilation.
>
> // If at link time we detect that we will delete a field from astruct
> // Then we will have to look at the initialization value of
> // __lto.sizeof.astruct and replace it with the new value.
>
> size_t __lto.sizeof.$identifier = 56
>
> This strategy can be used with global functions instead of variables and
> it is similar. The only differences would be we would create a global
> function instead of a variable and we would call that function to obtain
> the value.
>
> For offsetof, we will need to change in the following way:
>
> // Parser encounter offsetof
> offsetof(struct astruct, b);
>
> / Parsing is paused.
> // Does a global variable that identifies this struct AND field exists?
>
> // The previous field has a size of 8
> size_t __lto.offsetof.astruct._8 = 8
>
> // Back to the parser
> // instead of replacing
> // offsetof(struct astruct, b) with 8
> // replace with the following gimple:
>
> __lto.offsetof.astruct._8
>
> // Continue parsing until the end of file compilation.
>
> // If at link time we detect that we will delete the previous field
> // then we can rewrite all the offsetof for this struct and which refer
>

Re: RFC noipa sizeof function for record relayout at link time

2020-06-29 Thread Richard Biener via Gcc
On Mon, Jun 29, 2020 at 1:05 PM Richard Biener
 wrote:
>
> On Mon, Jun 29, 2020 at 11:56 AM Erick Ochoa
>  wrote:
> >
> > Hello,
> >
> > I have been working on link time optimization for C that may change the
> > size of structs (at link time). We are close to sharing the results we
> > have so far, but there are a couple of missing pieces left to work on:
> >
> > Implementations of sizeof and offsetof that support this change in
> > struct layout at link time.
> >
> > == What is the problem? ==
> >
> > Currently, for both sizeof and offsetof, the C parser will replace these
> > statements with trees that correspond to the value returned by sizeof
> > and offsetof at parse time. For example:
> >
> > // source code
> > struct astruct a;
> > memset(a, 0, sizeof(a));
> >
> > // parse time
> > memset(a, 0, 64);
> >
> > // after dead field elimination
> > // struct astruct is now 56 bytes long
> > memset(a, 0, 64); // <-- we are overwriting memory!
> >
> > At link time, we really shouldn't change the value 64 since we can't and
> > shouldn't assume that the value 64 came from a sizeof statement. The
> > source code could have been written this way:
> >
> > // source code
> > struct astruct a;
> > memset(a, 0, 64);
> >
> > regardless of whether the struct astruct has a length of 64.
> >
> > ** We need to identify which trees come from sizeof statements **
> >
> > == What do we want? ==
> >
> > What we really want is to make sure that our transformation performs the
> > following changes (or no changes!) depending on the source code.
> >
> > If the value for memset's argument comes from a sizeof statement:
> >
> > // source code
> > struct astruct a;
> > memset(a, 0, sizeof(a));
> >
> > // parse time
> > memset(a, 0, 64);
> >
> > // after dead field elimination
> > memset(a, 0, 56);
> >
> > However, in the case in which no sizeof is used, we want to do the
> > following:
> >
> > // source code
> > struct astruct a;
> > memset(a, 0, 64);
> >
> > // parse time
> > memset(a, 0, 64);
> >
> > // after dead field elimination
> > memset(a, 0, 64);

But why do you think the difference in handling of sizeof(a) vs.
a constant is warranted?  Nothing requires the user to write
sizeof(a) whenever the size of 'a' is semantically needed; the
user can just write the literal 64 here.

It's the same with malloc sites btw.

So it seems you cannot use the presence or absence
of 'sizeof' to derive semantics.
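
To illustrate the malloc case, here is a small sketch with two allocation sites that request the same number of bytes. The 64-byte layout and the function names are assumptions made for the example; nothing in the source code marks one site as "derived from sizeof" once the parser has folded it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical 64-byte struct; its real fields are not shown in the thread. */
struct astruct {
    char bytes[64];
};

/* The literal below is only valid while the layout stays at 64 bytes. */
static_assert(sizeof(struct astruct) == 64, "layout assumed by the literal");

/* Allocation site that spells the size with sizeof. */
struct astruct *alloc_with_sizeof(void)
{
    return malloc(sizeof(struct astruct));
}

/* Allocation site that spells the same size as a literal. */
struct astruct *alloc_with_literal(void)
{
    return malloc(64);
}
```

Both functions express the same intent, which is why a transformation that shrinks the struct has to treat them the same way or not run at all.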

> > == How do we get what we want? ==
> >
> > Ideally what we want is to:
> >
> > * Be able to change the value returned by sizeof and offsetof at link time:
> >   * possibly a global variable?
> > * Identify which values come from sizeof statement:
> >   * matching identifiers?
> > * No re/define valid C identifiers:
> >   * in gimple we can have an identifier with a dot in it.
> > * Disable constant propagation and other optimizations:
> >   * possibly __attribute__((noipa))
> > * Be able to work with parallel compilation (make -j)
> > * Be able to work with any Makefile
> >   * No C code generation and then compile and link gen code at the end.
> >
> > So, I've been thinking about multiple options:
> >
> > * Extending gimple to add support for a sizeof statement
> > * A function per struct generated during compilation (sizeof & offsetof)
> > * A variable per struct generated during compilation (sizeof and more
> > for offsetof)
> >
> > I think extending gimple to add support for a sizeof statement gets us
> > all what we want, however this would involve rewriting possibly many
> > parts of GCC. As such, I am somewhat opposed to this.
> >
> > I then thought of generating global variables during parse/time
> > compilation. In this scheme, I would replace sizeof statements with a
> > reference to a global variable (or function) that is initialized with
> > the value returned by the sizeof statement during parse time. At link
> > time we can replace initialization value if needed. For example:
> >
> > // The parser is parsing a C file
> > // it encounters a sizeof statement
> > sizeof(struct astruct);
> >
> > // Parsing is paused.
> > // Does a global variable that identifies this struct exists?
> > // I.e. size_t __lto.sizeof.astruct exists?
> > // If it doesn't create it.
> >
> > size_t __lto.sizeof.astruct = 64
> >
> > // Back to the parser
> > // instead of replacing
> > // sizeof(struct astruct) with 64
> > // replace with the following gimple:
> >
> > __lto.sizeof.astruct
> >
> > // Continue parsing until the end of file compilation.
> >
> > // If at link time we detect that we will delete a field from astruct
> > // Then we will have to look at the initialization value of
> > // __lto.sizeof.astruct and replace it with the new value.
> >
> > size_t __lto.sizeof.$identifier = 56
> >
> > This strategy can be used with global functions instead of variables and
> > it is similar. The only differences would be we would create a global
> > function instead of a variable and we would call that function to obt

Re: RFC noipa sizeof function for record relayout at link time

2020-06-29 Thread Jakub Jelinek via Gcc
On Mon, Jun 29, 2020 at 01:05:20PM +0200, Richard Biener via Gcc wrote:
> > // source code
> > struct astruct a;
> > memset(a, 0, sizeof(a));
> >
> > // parse time
> > memset(a, 0, 64);

Actually, I don't see the point at all: it doesn't matter whether the user
used sizeof(a), or 64 knowing that the structure type is 64 bytes, or,
say, a constexpr/consteval that used the sizeof somewhere in some complex
expression, or whatever else.
One just shouldn't try to remove fields at all in these cases, similar to
many other cases where it is not safely possible.
Actually, for memset (a, 0, 64); you could remove any fields that can be
removed and whose offsetof is >= 64, because such a call doesn't affect
them.
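
The last observation can be written down as a simple check. This is a sketch under assumptions: the struct layout and the helper name field_untouched_by_prefix_write are made up for illustration, and the check only covers a write that starts at the beginning of the object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout: tail starts at or after byte 64. */
struct astruct {
    char head[64];
    long tail;
};

static_assert(offsetof(struct astruct, tail) >= 64,
              "tail lies beyond the first 64 bytes");

/* A write of len bytes starting at the beginning of the object cannot
   touch a field whose offset is >= len. */
bool field_untouched_by_prefix_write(size_t field_offset, size_t len)
{
    return field_offset >= len;
}
```

For memset(&a, 0, 64) on this layout, tail passes the check while head does not, so only tail would remain a removal candidate as far as this call is concerned.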

Jakub



Re: Automatically generated ChangeLog files - script

2020-06-29 Thread Martin Liška

@Alexander: May I please remind you of this?

Martin

On 6/24/20 10:28 AM, Martin Liška wrote:

On 6/22/20 3:15 PM, Alexandre Oliva wrote:

On May 26, 2020, Martin Liška  wrote:


On 5/26/20 12:15 PM, Pierre-Marie de Rodat wrote:

 * contracts.adb, einfo.adb, exp_ch9.adb, sem_ch12.adb,



It's not supported right now and it will make the filename parsing
much more complicated.


Hello.

I support the patch:



Another colleague recently ran into a problem with either:

* $filename <$case>:

or

* $filename [$condition]:

I can't recall which one it was, but the following patch is supposed to
implement both.  Alas, I couldn't figure out how to test it:
git_check_commit.py is failing with:

Traceback (most recent call last):
  File "contrib/gcc-changelog/git_check_commit.py", line 38, in <module>
    not args.non_strict_mode):
  File "/l/tmp/build/gcc/contrib/gcc-changelog/git_repository.py", line 57, in parse_git_revisions
    elif file.renamed_file:
AttributeError: 'Diff' object has no attribute 'renamed_file'


accept <case> and [cond] in ChangeLog

From: Alexandre Oliva 

Only '(' and ':' currently terminate file lists in ChangeLog entries
in the ChangeLog parser.  This rules out such legitimate entries as:

* filename <CASE>:
* filename [COND]:

This patch extends the ChangeLog parser to recognize these forms.


for  contrib/ChangeLog

* gcc-changelog/git_commit.py: Support CASE and COND.
---
  contrib/gcc-changelog/git_commit.py |   16 ++++++++--------
  1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/contrib/gcc-changelog/git_commit.py 
b/contrib/gcc-changelog/git_commit.py
index 4a78793..537c667 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -154,6 +154,7 @@ changelog_regex = re.compile(r'^(?:[fF]or +)?([a-z0-9+-/]*)ChangeLog:?')
  pr_regex = re.compile(r'\tPR (?P<component>[a-z+-]+\/)?([0-9]+)$')
  dr_regex = re.compile(r'\tDR ([0-9]+)$')
  star_prefix_regex = re.compile(r'\t\*(?P<spaces>\ *)(?P<content>.*)')
+end_of_location_regex = re.compile(r'[[<(:]')


Please escape the '[':
+end_of_location_regex = re.compile(r'[\[<(:]')

and please add a test case for it.

Thanks,
Martin


  LINE_LIMIT = 100
  TAB_WIDTH = 8
@@ -203,14 +204,13 @@ class ChangeLogEntry:
  line = m.group('content')
  if in_location:
-    # Strip everything that is not a filename in "line": entities
-    # "(NAME)", entry text (the colon, if present, and anything
-    # that follows it).
-    if '(' in line:
-    line = line[:line.index('(')]
-    in_location = False
-    if ':' in line:
-    line = line[:line.index(':')]
+    # Strip everything that is not a filename in "line":
+    # entities "(NAME)", cases "<CASE>", conditions
+    # "[COND]", entry text (the colon, if present, and
+    # anything that follows it).
+    m = end_of_location_regex.search(line)
+    if m:
+    line = line[:m.start()]
  in_location = False
  # At this point, all that's left is a list of filenames








Re: Customized coverage instrumentation for multiple C files

2020-06-29 Thread Martin Liška

On 6/27/20 5:33 AM, Shuai Wang via Gcc wrote:

virtual unsigned int execute(function *fun) override

which has no idea about the .C file information. In LLVM all .C files are
roughly maintained in separate "modules" but I just don't know how to
access such information in GIMPLE.


Hey.

Please take a look at coverage_begin_function (unsigned lineno_checksum,
unsigned cfg_checksum).
Note that the situation can be more complicated, as a function can come
from a header file.
Alternatively, you may experiment with aux_base_name.

Martin


Re: RFC noipa sizeof function for record relayout at link time

2020-06-29 Thread Erick Ochoa




On 29.06.20 04:05, Richard Biener wrote:

On Mon, Jun 29, 2020 at 11:56 AM Erick Ochoa
 wrote:


Hello,

I have been working on link time optimization for C that may change the
size of structs (at link time). We are close to sharing the results we
have so far, but there are a couple of missing pieces left to work on:

Implementations of sizeof and offsetof that support this change in
struct layout at link time.

== What is the problem? ==

Currently, for both sizeof and offsetof, the C parser will replace these
statements with trees that correspond to the value returned by sizeof
and offsetof at parse time. For example:

// source code
struct astruct a;
memset(a, 0, sizeof(a));

// parse time
memset(a, 0, 64);

// after dead field elimination
// struct astruct is now 56 bytes long
memset(a, 0, 64); // <-- we are overwriting memory!

At link time, we really shouldn't change the value 64 since we can't and
shouldn't assume that the value 64 came from a sizeof statement. The
source code could have been written this way:

// source code
struct astruct a;
memset(a, 0, 64);

regardless of whether the struct astruct has a length of 64.

** We need to identify which trees come from sizeof statements **

== What do we want? ==

What we really want is to make sure that our transformation performs the
following changes (or no changes!) depending on the source code.

If the value for memset's argument comes from a sizeof statement:

// source code
struct astruct a;
memset(a, 0, sizeof(a));

// parse time
memset(a, 0, 64);

// after dead field elimination
memset(a, 0, 56);

However, in the case in which no sizeof is used, we want to do the
following:

// source code
struct astruct a;
memset(a, 0, 64);

// parse time
memset(a, 0, 64);

// after dead field elimination
memset(a, 0, 64);

== How do we get what we want? ==

Ideally what we want is to:

* Be able to change the value returned by sizeof and offsetof at link time:
* possibly a global variable?
* Identify which values come from sizeof statement:
* matching identifiers?
* No re/define valid C identifiers:
* in gimple we can have an identifier we a dot in it.
* Disable constant propagation and other optimizations:
* possibly __attribute__((noipa))
* Be able to work with parallel compilation (make -j)
* Be able to work with any Makefile
* No C code generation and then compile and link gen code at the end.

So, I've been thinking about multiple options:

* Extending gimple to add support for a sizeof statement
* A function per struct generated during compilation (sizeof & offsetof)
* A variable per struct generated during compilation (sizeof and more
for offsetof)

I think extending gimple to add support for a sizeof statement gets us
all what we want, however this would involve rewriting possibly many
parts of GCC. As such, I am somewhat opposed to this.

I then thought of generating global variables during parse/time
compilation. In this scheme, I would replace sizeof statements with a
reference to a global variable (or function) that is initialized with
the value returned by the sizeof statement during parse time. At link
time we can replace initialization value if needed. For example:

// The parser is parsing a C file
// it encounters a sizeof statement
sizeof(struct astruct);

// Parsing is paused.
// Does a global variable that identifies this struct exists?
// I.e. size_t __lto.sizeof.astruct exists?
// If it doesn't create it.

size_t __lto.sizeof.astruct = 64

// Back to the parser
// instead of replacing
// sizeof(struct astruct) with 64
// replace with the following gimple:

__lto.sizeof.astruct

// Continue parsing until the end of file compilation.

// If at link time we detect that we will delete a field from astruct
// Then we will have to look at the initialization value of
// __lto.sizeof.astruct and replace it with the new value.

size_t __lto.sizeof.$identifier = 56

This strategy can be used with global functions instead of variables and
it is similar. The only differences would be we would create a global
function instead of a variable and we would call that function to obtain
the value.

For offsetof, we will need to change in the following way:

// Parser encounter offsetof
offsetof(struct astruct, b);

/ Parsing is paused.
// Does a global variable that identifies this struct AND field exists?

// The previous field has a size of 8
size_t __lto.offsetof.astruct._8 = 8

// Back to the parser
// instead of replacing
// offsetof(struct astruct, b) with 8
// replace with the following gimple:

__lto.offsetof.astruct._8

// Continue parsing until the end of file compilation.

// If at link time we detect that we will delete the previous field
// then we can rewrite all the offsetof for this struct and which refer
// to the fields that follow the deleted field

__lto.offsetof.astruct._8 = 0;
// Because the previous field was deleted for example
// All variables referring to offsets allocated
// further than the one deleted 

Re: RFC noipa sizeof function for record relayout at link time

2020-06-29 Thread Erick Ochoa




On 29.06.20 04:08, Richard Biener wrote:

On Mon, Jun 29, 2020 at 1:05 PM Richard Biener
 wrote:


On Mon, Jun 29, 2020 at 11:56 AM Erick Ochoa
 wrote:


Hello,

I have been working on link time optimization for C that may change the
size of structs (at link time). We are close to sharing the results we
have so far, but there are a couple of missing pieces left to work on:

Implementations of sizeof and offsetof that support this change in
struct layout at link time.

== What is the problem? ==

Currently, for both sizeof and offsetof, the C parser will replace these
statements with trees that correspond to the value returned by sizeof
and offsetof at parse time. For example:

// source code
struct astruct a;
memset(a, 0, sizeof(a));

// parse time
memset(a, 0, 64);

// after dead field elimination
// struct astruct is now 56 bytes long
memset(a, 0, 64); // <-- we are overwriting memory!

At link time, we really shouldn't change the value 64 since we can't and
shouldn't assume that the value 64 came from a sizeof statement. The
source code could have been written this way:

// source code
struct astruct a;
memset(a, 0, 64);

regardless of whether the struct astruct has a length of 64.

** We need to identify which trees come from sizeof statements **

== What do we want? ==

What we really want is to make sure that our transformation performs the
following changes (or no changes!) depending on the source code.

If the value for memset's argument comes from a sizeof statement:

// source code
struct astruct a;
memset(a, 0, sizeof(a));

// parse time
memset(a, 0, 64);

// after dead field elimination
memset(a, 0, 56);

However, in the case in which no sizeof is used, we want to do the
following:

// source code
struct astruct a;
memset(a, 0, 64);

// parse time
memset(a, 0, 64);

// after dead field elimination
memset(a, 0, 64);


But why do you think the difference of handling of sizeof(a) vs.
a constant is warranted?  It's by no means required that
whenever semantically the size of 'a' is needed you need to
write sizeof(a) but the user can just write literal 64 here.

It's the same with malloc sites btw.

So it seems you cannot use the presence or not presence
of 'sizeof' to derive semantics.


I think the difference in handling sizeof(a) vs. a constant is warranted
because whenever we delete a field from the structure, we are changing
the size of the structure.


I understand that the user can write a number instead of sizeof(a).
However, we argue that the user's "knowledge" of a type's size might be
speculation. The value of the result of the sizeof operator is
implementation-defined.


Therefore the semantics of memset(a, 0, 64) are precisely those of
memset(a, 0, 64), while memset(a, 0, sizeof(a)) is not equivalent to
memset(a, 0, 64), since its size argument depends on the compiler and
target machine (and, we argue, on the optimizations chosen).
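
A small sketch of why a hard-coded size is only a guess: the padding in a struct is chosen by the compiler and ABI, so the total differs between targets. The struct and helper names below are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct whose size depends on the target's alignment rules. */
struct sample {
    char tag;
    long value;
};

/* Sum of the member sizes, ignoring any padding the ABI inserts. */
size_t sample_member_bytes(void)
{
    return sizeof(char) + sizeof(long);
}

/* Padding bytes chosen by the compiler/ABI for this target. */
size_t sample_padding_bytes(void)
{
    assert(sizeof(struct sample) >= sample_member_bytes());
    return sizeof(struct sample) - sample_member_bytes();
}
```

A literal written by the user matches sizeof(struct sample) only for the targets where this padding happens to take the value the user assumed.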





== How do we get what we want? ==

Ideally what we want is to:

* Be able to change the value returned by sizeof and offsetof at link time:
* possibly a global variable?
* Identify which values come from sizeof statement:
* matching identifiers?
* No re/define valid C identifiers:
* in gimple we can have an identifier we a dot in it.
* Disable constant propagation and other optimizations:
* possibly __attribute__((noipa))
* Be able to work with parallel compilation (make -j)
* Be able to work with any Makefile
* No C code generation and then compile and link gen code at the end.

So, I've been thinking about multiple options:

* Extending gimple to add support for a sizeof statement
* A function per struct generated during compilation (sizeof & offsetof)
* A variable per struct generated during compilation (sizeof and more
for offsetof)

I think extending gimple to add support for a sizeof statement gets us
all what we want, however this would involve rewriting possibly many
parts of GCC. As such, I am somewhat opposed to this.

I then thought of generating global variables during parse/time
compilation. In this scheme, I would replace sizeof statements with a
reference to a global variable (or function) that is initialized with
the value returned by the sizeof statement during parse time. At link
time we can replace initialization value if needed. For example:

// The parser is parsing a C file
// it encounters a sizeof statement
sizeof(struct astruct);

// Parsing is paused.
// Does a global variable that identifies this struct exists?
// I.e. size_t __lto.sizeof.astruct exists?
// If it doesn't create it.

size_t __lto.sizeof.astruct = 64

// Back to the parser
// instead of replacing
// sizeof(struct astruct) with 64
// replace with the following gimple:

__lto.sizeof.astruct

// Continue parsing until the end of file compilation.

// If at link time we detect that we will delete a field from astruct
// Then we will have to look at the initialization value of
// __lto.sizeof.astruct and replace it with the new value.

size_t __lto.sizeof.$identifier = 

Re: RFC noipa sizeof function for record relayout at link time

2020-06-29 Thread Martin Jambor
Hi,

On Mon, Jun 29 2020, Erick Ochoa wrote:
> == How do we get what we want? ==
>
> Ideally what we want is to:
>

[...]

> * Disable constant propagation and other optimizations:
>   * possibly __attribute__((noipa))

[...]

> == What's the WORST CASE performance of having an unknown sizeof? ==
>
> I was concerned that the values generated by sizeof might be used by 
> constant propagation and other optimizations might depend on this value. 
> So, in order to have an idea of the impact of this transformation might 
> have on code, I manually changed all sizeof statements on MCF for an 
> equivalent function __lto_sizeof_T() where T identifies the type. I made 
> all these functions static and used the attribute noipa. There was no 
> measurable performance degradation when compared with an untransformed 
> run of MCF. Of course, this is only a data point and the performance 
> might depend on the application/workload/architecture... but it is good 
> to have a data point!

Note that attribute noipa also disables inlining.
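
For reference, the kind of manual rewrite described in the quoted experiment might look like the sketch below. It is an assumption about the shape of the code, not the code used on MCF: the struct layout and `bytes_for` are invented, and the `LTO_OPAQUE` macro is a hypothetical helper that falls back to `noinline` on compilers that do not know `noipa`.

```c
#include <assert.h>
#include <stddef.h>

/* Use noipa where the compiler supports it; otherwise fall back to
 * noinline, which is weaker (it only blocks inlining). */
#if defined(__has_attribute)
#  if __has_attribute(noipa)
#    define LTO_OPAQUE __attribute__((noipa))
#  endif
#endif
#ifndef LTO_OPAQUE
#  define LTO_OPAQUE __attribute__((noinline))
#endif

/* Invented layout for illustration. */
struct astruct {
    long long fields[8];
};

/* Stand-in for __lto_sizeof_T() with T = astruct: static, and opaque
 * to inter-procedural analysis. */
static LTO_OPAQUE size_t __lto_sizeof_astruct(void)
{
    return sizeof(struct astruct);
}

/* A size computation that previously used sizeof directly and now
 * has to make a call at run time. */
static size_t bytes_for(size_t count)
{
    size_t elem = __lto_sizeof_astruct();
    assert(elem != 0);
    return count * elem;
}
```

Every former `sizeof(struct astruct)` in the program becomes a call such as the one in `bytes_for`, so the value can no longer be used where a constant is required.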

>
> == What are some known unknowns? ==
>
>
> * Does noipa work for variables?
>   * I think attribute noipa works for functions but I am not sure if it
> works for variables.

No, but there are not too many IPA analyses/optimizations working on
global variables.  And making the attributes work for them should not be
hard.

> * After changing the definition of struct can we re-run all link time 
> optimizations?
>   * I would not like to sacrifice performance. Because I might have
> hindered constant propagation all optimizations which depend on it might
> suffer. Therefore, I was wondering if after changing the fields I can
> delete the noipa attribute and re-run all link time optimizations
> somehow? (However, the experiment tells us that this might not be a 
> worry. Perhaps there might be other benchmarks which are more affected 
> by this transformation.)

That would need quite a surgery in the pass manager, and it would
require re-running all body analyses at the link phase, something we're
trying to avoid.

If you need to disable IPA-CP (and IPA-SRA) changing a particular
parameter (e.g. because an earlier IPA pass has changed it beyond
recognition), ipa_param_adjustments::get_surviving_params should place
false in the corresponding vector element.

If you need to disable propagation of a specific value in a specific
call, you need to prevent creation of the associated jump function.

But of course, if the call gets inlined the propagation will happen
anyway, so if you are afraid that propagation of any value anywhere can
possibly be based on an offsetof or sizeof which you are changing, then
I don't think your problems are limited just to IPA (link time
optimization) propagation.

I'd do what Richi has suggested and enter some conservative mode if
sizeof and especially offsetof were used on a type (you might still be
able to handle memset from offset zero until the end of the structure as
a special case?).

Martin
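
As a rough illustration of the distinction suggested above, the sketch below contrasts a memset that covers the structure from offset zero to its end with one whose start and length are derived from offsetof. The struct and the function names are invented, and which patterns a transformation could actually handle is an open question in this thread, not something the sketch settles.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Invented layout for illustration. */
struct astruct {
    int id;
    int unused;
    double value;
};

/* Whole-object clear: starts at offset zero and runs to the end of
 * the structure. This is the candidate special case. */
static void clear_all(struct astruct *a)
{
    memset(a, 0, sizeof *a);
}

/* Partial clear: both the start and the length depend on offsetof,
 * the kind of use that would suggest a conservative mode. */
static void clear_tail(struct astruct *a)
{
    size_t start = offsetof(struct astruct, value);
    assert(start < sizeof *a);
    memset((char *)a + start, 0, sizeof *a - start);
}
```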


Re: RFC noipa sizeof function for record relayout at link time

2020-06-29 Thread Erick Ochoa




On 29.06.20 06:05, Martin Jambor wrote:
> Hi,
>
> On Mon, Jun 29 2020, Erick Ochoa wrote:
>> == How do we get what we want? ==
>>
>> Ideally what we want is to:
>>
>
> [...]
>
>> * Disable constant propagation and other optimizations:
>> * possibly __attribute__((noipa))
>
> [...]
>
>> == What's the WORST CASE performance of having an unknown sizeof? ==
>>
>> I was concerned that the values generated by sizeof might be used by
>> constant propagation and other optimizations might depend on this value.
>> So, in order to have an idea of the impact of this transformation might
>> have on code, I manually changed all sizeof statements on MCF for an
>> equivalent function __lto_sizeof_T() where T identifies the type. I made
>> all these functions static and used the attribute noipa. There was no
>> measurable performance degradation when compared with an untransformed
>> run of MCF. Of course, this is only a data point and the performance
>> might depend on the application/workload/architecture... but it is good
>> to have a data point!
>
> Note that attribute noipa also disables inlining.

This is good! We want to basically make the values returned by these
functions as opaque as possible. At least as a worst case analysis for
how bad removing the sizeof parse time substitution all the way until
runtime would be.

>> == What are some known unknowns? ==
>>
>> * Does noipa work for variables?
>> * I think attribute noipa works for functions but I am not sure if it
>> works for variables.
>
> No, but there are not too many IPA analyses/optimizations working on
> global variables.  And making the attributes work for them should not be
> hard.
>
>> * After changing the definition of struct can we re-run all link time
>> optimizations?
>> * I would not like to sacrifice performance. Because I might have
>> hindered constant propagation all optimizations which depend on it might
>> suffer. Therefore, I was wondering if after changing the fields I can
>> delete the noipa attribute and re-run all link time optimizations
>> somehow? (However, the experiment tells us that this might not be a
>> worry. Perhaps there might be other benchmarks which are more affected
>> by this transformation.)
>
> That would need quite a surgery in the pass manager, and it would
> require re-running all body analyses at the link phase, something we're
> trying to avoid.

I understand. We might skip this since the experiment showed that there
was no performance degradation. But again... only one data point !=
generalization.

> If you need to disable IPA-CP (and IPA-SRA) changing a particular
> parameter (e.g. because an earlier IPA pass has changed it beyond
> recognition), ipa_param_adjustments::get_surviving_params should place
> false in the corresponding vector element.

Awesome! Thanks! I have been looking for something like this for a
little while.

> If you need to disable propagation of a specific value in a specific
> call, you need to prevent creation of the associated jump function.
>
> But of course, if the call gets inlined the propagation will happen
> anyway, so if you are afraid that propagation of any value anywhere can
> possibly be based on an offsetof or sizeof which you are changing, then
> I don't think your problems are limited just to IPA (link time
> optimization) propagation.

Sorry, I have some problem understanding this. You mentioned that noipa
disables inlining (which is good). Here you state that "if the call gets
inlined" then the value will be propagated.

Are you saying that this mini benchmark experiment was flawed and that
we should also look at how to disable non-ipa passes?

Thanks!

> I'd do what Richi has suggested and enter some conservative mode if
> sizeof and especially offsetof were used on a type (you might still be
> able to handle memset from offset zero until the end of the structure as
> a special case?).
>
> Martin


Re: RFC noipa sizeof function for record relayout at link time

2020-06-29 Thread Martin Jambor
Hi,

On Mon, Jun 29 2020, Erick Ochoa wrote:
> On 29.06.20 06:05, Martin Jambor wrote:
>> Hi,
>> 
>> On Mon, Jun 29 2020, Erick Ochoa wrote:
>>> == How do we get what we want? ==
>>>
>>> Ideally what we want is to:
>>>
>> 
>> [...]
>> 
>>> * Disable constant propagation and other optimizations:
>>> * possibly __attribute__((noipa))
>> 
>> [...]
>> 
>>> == What's the WORST CASE performance of having an unknown sizeof? ==
>>>
>>> I was concerned that the values generated by sizeof might be used by
>>> constant propagation and other optimizations might depend on this value.
>>> So, in order to have an idea of the impact of this transformation might
>>> have on code, I manually changed all sizeof statements on MCF for an
>>> equivalent function __lto_sizeof_T() where T identifies the type. I made
>>> all these functions static and used the attribute noipa. There was no
>>> measurable performance degradation when compared with an untransformed
>>> run of MCF. Of course, this is only a data point and the performance
>>> might depend on the application/workload/architecture... but it is good
>>> to have a data point!
>> 
>> Note that attribute noipa also disables inlining.
>
> This is good! We want to basically make the values returned by these 
> functions as opaque as possible. At least as a worst case analysis for 
> how bad removing the sizeof parse time substitution all the way until 
> runtime would be.
>
>> 
>>>
>>> == What are some known unknowns? ==
>>>
>>>
>>> * Does noipa work for variables?
>>> * I think attribute noipa works for functions but I am not sure if it
>>> works for variables.
>> 
>> No, but there are not too many IPA analyses/optimizations working on
>> global variables.  And making the attributes work for them should not be
>> hard.
>> 
>>> * After changing the definition of struct can we re-run all link time
>>> optimizations?
>>> * I would not like to sacrifice performance. Because I might have
>>> hindered constant propagation all optimizations which depend on it might
>>> suffer. Therefore, I was wondering if after changing the fields I can
>>> delete the noipa attribute and re-run all link time optimizations
>>> somehow? (However, the experiment tells us that this might not be a
>>> worry. Perhaps there might be other benchmarks which are more affected
>>> by this transformation.)
>> 
>> That would need quite a surgery in the pass manager, and it would
>> require re-running all body analyses at the link phase, something we're
>> trying to avoid.
>
> I understand. We might skip this since the experiment showed that there 
> was no performance degradation. But again... only one data point != 
> generalization.
>
>> 
>> If you need to disable IPA-CP (and IPA-SRA) changing a particular
>> parameter (e.g. because an earlier IPA pass has changed it beyond
>> recognition), ipa_param_adjustments::get_surviving_params should place
>> false in the corresponding vector element.
>
> Awesome! Thanks! I have been looking for something like this for a 
> little while.
>
>> 
>> If you need to disable propagation of a specific value in a specific
>> call, you need to prevent creation of the associated jump function.
>> 
>> But of course, if the call gets inlined the propagation will happen
>> anyway, so if you are afraid that propagation of any value anywhere can
>> possibly be based on an offsetof or sizeof which you are changing, then
>> I don't think your problems are limited just to IPA (link time
>> optimization) propagation.
>
> Sorry, I have some problem understanding this. You mentioned that noipa 
> disables inlining (which is good). Here you state that "if the call gets 
> inlined" then the value will be propagated.
>
> Are you saying that this mini benchmark experiment was flawed and that 
> we should also look at how to disable non-ipa passes?

My point was simply that if you do not want inter-procedural
propagations to happen, then it is not enough to somehow disable/limit
IPA-CP; you also have to disable inlining, because inlining can convert
them to plain old intra-procedural propagation.  And disabling inlining
is going to have huge performance implications.  So I'd try to avoid
such a need as much as possible.

Martin
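
To make the point about inlining concrete, here is a small sketch with invented names and an invented struct layout. Once `size_inlinable` is inlined into its caller, the size is an ordinary constant inside that function, and ordinary intra-procedural passes are free to fold it; `size_not_inlined` keeps the value behind a call. Whether a given compiler actually folds either case is up to its optimizer, so the sketch only shows where the opportunity arises.

```c
#include <assert.h>
#include <stddef.h>

/* Invented layout for illustration (assumes an 8-byte long long). */
struct astruct {
    long long fields[8];
};

/* May be inlined: after inlining, the caller sees the constant
 * directly and no inter-procedural analysis is involved. */
static inline size_t size_inlinable(void)
{
    return sizeof(struct astruct);
}

/* Never inlined: the value stays behind a call boundary. */
static __attribute__((noinline)) size_t size_not_inlined(void)
{
    return sizeof(struct astruct);
}

/* Candidate for folding to a constant once the callee is inlined. */
static size_t words_via_inlinable(void)
{
    size_t bytes = size_inlinable();
    assert(bytes % sizeof(long long) == 0);
    return bytes / sizeof(long long);
}

/* Same computation, but the size comes from a real call. */
static size_t words_via_call(void)
{
    size_t bytes = size_not_inlined();
    assert(bytes % sizeof(long long) == 0);
    return bytes / sizeof(long long);
}
```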


Re: RFC noipa sizeof function for record relayout at link time

2020-06-29 Thread Joseph Myers
On Mon, 29 Jun 2020, Erick Ochoa wrote:

> We are not targeting C++ at the moment. What contexts exist in C where we
> require constant expressions? Off the top of my head I have array sizes and
> initialization of static variables? In such cases, then yes we agree that we

Bit-field widths, static assertions, array designators in initializers, 
values in enum declarations, case labels, 

It's not always possible to determine the type of an expression without 
knowing the values of constant expressions within it.  An integer constant 
expression cast to (void *) is a null pointer constant if it has value 0, 
but not if it has another value.  Now look at the rules for the type of 
conditional expressions between two pointers, which depend on whether one 
is a null pointer constant.  glibc's  uses that when building 
with older compilers.  GNU __builtin_choose_expr yields simpler cases 
where a type depends on the value of an integer constant expression.

-- 
Joseph S. Myers
jos...@codesourcery.com
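
The contexts listed above can be seen in a short C11 sketch in which every use of sizeof must be an integer constant expression, so none of them could be replaced by a run-time variable or function call. The struct, the names, and the `COND_IS_INT_PTR` macro are invented for illustration; the last part uses `_Generic` to observe how the type of a conditional expression depends on whether the cast value is 0.

```c
#include <assert.h>
#include <stddef.h>

/* Invented layout: exactly 64 bytes. */
struct astruct {
    char bytes[64];
};

/* Bit-field width: 64 / 16 = 4 bits. */
struct packed_flags {
    unsigned int tag : sizeof(struct astruct) / 16;
};

/* Static assertion. */
_Static_assert(sizeof(struct astruct) == 64, "layout assumed by this sketch");

/* Value in an enum declaration. */
enum { ASTRUCT_SIZE = sizeof(struct astruct) };

/* Array designator in an initializer: the array gets 64 elements. */
static const unsigned char last_marker[] = {
    [sizeof(struct astruct) - 1] = 1
};

/* Case label. */
static int is_astruct_size(size_t n)
{
    switch (n) {
    case sizeof(struct astruct):
        return 1;
    default:
        return 0;
    }
}

/* (void *) applied to an integer constant expression is a null pointer
 * constant only when the value is 0. That decides whether the
 * conditional has type int * (null pointer constant) or void *. */
#define COND_IS_INT_PTR(ice) \
    _Generic(1 ? (int *)0 : (void *)(ice), int *: 1, void *: 0)

static void check_types(void)
{
    assert(COND_IS_INT_PTR(sizeof(struct astruct) - 64) == 1);
    assert(COND_IS_INT_PTR(sizeof(struct astruct) - 63) == 0);
}
```

In the last case, changing the size of the struct would change not just a value but the type of an expression, which is why these uses are hard to defer to link time.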


Emit a variable defined in gcc

2020-06-29 Thread Harshit Sharma via Gcc
Hello,
I am working on a gcc patch for asan. The patch is almost ready except for one
thing. To make sure that the user has applied this patch before using the asan
feature, I want to declare an additional variable in gcc which is referenced
by our source code, so that if this patch is missing, the user gets an error
when compiling the code because the reference to this variable will not be
resolved.

I am still new to gcc development. So, can anyone tell me how I can make
gcc emit this variable?


Thanks,
Harshit