from:"David Wohlferd"

Using the asm suffix

2015-08-16 Thread David Wohlferd

As a followup to my update to the inline asm docs, I'm cleaning up the 
docs for 'Asm Labels.'  The changes I want to make are pretty 
straight-forward (attached; comments welcome).  But then I came across 
this line of code (from 
https://github.com/rschmukler/cs537-p5/blob/master/xv6/proc.h#L38):


extern struct proc *proc asm("%gs:4");

This x86 code says that 'proc' is located at an offset of 4 bytes from 
the gs register.


There isn't any description of using asm like this in the current Asm 
Labels docs.  But 'gs:4' isn't really a label.  There also isn't any 
description of it in the Explicit Reg Vars section.  But 'gs:4' isn't 
really a register either.  So apparently using asm like this this isn't 
documented anywhere.


Which makes me wonder:  Is this not doc'ed because using 'asm' like this 
isn't supported?  Or is there a supported feature here that needs to be 
doc'ed?


dw
Index: extend.texi
===
--- extend.texi	(revision 226751)
+++ extend.texi	(working copy)
@@ -8367,8 +8367,14 @@
 
 You can specify the name to be used in the assembler code for a C
 function or variable by writing the @code{asm} (or @code{__asm__})
-keyword after the declarator as follows:
+keyword after the declarator.
+It is up to you to make sure that the assembler names you choose do not
+conflict with any other assembler symbols.
 
+@subsubheading Assembler names for data:
+
+This sample shows how to specify the assembler name for data:
+
 @smallexample
 int foo asm ("myfoo") = 2;
 @end smallexample
@@ -8379,33 +8385,30 @@
 @samp{_foo}.
 
 On systems where an underscore is normally prepended to the name of a C
-function or variable, this feature allows you to define names for the
+variable, this feature allows you to define names for the
 linker that do not start with an underscore.
 
 It does not make sense to use this feature with a non-static local
 variable since such variables do not have assembler names.  If you are
 trying to put the variable in a particular register, see @ref{Explicit
-Reg Vars}.  GCC presently accepts such code with a warning, but will
-probably be changed to issue an error, rather than a warning, in the
-future.
+Reg Vars}.
 
-You cannot use @code{asm} in this way in a function @emph{definition}; but
-you can get the same effect by writing a declaration for the function
-before its definition and putting @code{asm} there, like this:
+@subsubheading Assembler names for functions:
 
+To specify the assember name for functions, write a declaration for the 
+function before its definition and put @code{asm} there, like this:
+
 @smallexample
-extern func () asm ("FUNC");
-
-func (x, y)
- int x, y;
-/* @r{@dots{}} */
+extern int func (int x, int y) asm ("MYFUNC");
+ 
+int func (int x, int y)
+@{
+   /* @r{@dots{}} */
 @end smallexample
 
-It is up to you to make sure that the assembler names you choose do not
-conflict with any other assembler symbols.  Also, you must not use a
-register name; that would produce completely invalid assembler code.  GCC
-does not as yet have the ability to store static variables in registers.
-Perhaps that will be added.
+@noindent
+This specifies that the name to be used for the function @code{func} in
+the assembler code should be @code{MYFUNC}.
 
 @node Explicit Reg Vars
 @subsection Variables in Specified Registers

Re: Using the asm suffix

2015-08-17 Thread David Wohlferd




There isn't any description of using asm like this in the current Asm
Labels docs.

And there shouldn't be.  It's a hack.


Ok, good.  After experimenting with this, I wasn't looking forward to 
trying to describe what did and didn't work.


dw

Re: Using the asm suffix

2015-08-17 Thread David Wohlferd


Thank you for the review and comments.

On 8/17/2015 3:41 AM, Segher Boessenkool wrote:

On Sun, Aug 16, 2015 at 06:33:40PM -0700, David Wohlferd wrote:

  On systems where an underscore is normally prepended to the name of a C
-function or variable, this feature allows you to define names for the
+variable, this feature allows you to define names for the
  linker that do not start with an underscore.

Why remove this?


This doc section (Controlling Names Used in Assembler Code) describes 
how the asm suffix affects both data and functions. However, it jumbles 
the two descriptions together.


My intent here is to break this clearly into two @subsubheadings: 
'Assembler names for data' and 'Assembler names for functions'. Since 
data is the first section, I removed the word 'function' here.



  It does not make sense to use this feature with a non-static local
  variable since such variables do not have assembler names.  If you are
  trying to put the variable in a particular register, see @ref{Explicit
-Reg Vars}.  GCC presently accepts such code with a warning, but will
-probably be changed to issue an error, rather than a warning, in the
-future.
+Reg Vars}.

And this?


Vague statements about possible changes that may or not ever be written 
are not helpful in docs.  In this case the statement is particularly 
unhelpful since even the warning appears to be gone.



+To specify the assember name for functions, write a declaration for the

 ^ typo


Some typos are more embarrassing than others.


+function before its definition and put @code{asm} there, like this:
+
  @smallexample
-extern func () asm ("FUNC");
-
-func (x, y)
- int x, y;
-/* @r{@dots{}} */
+extern int func (int x, int y) asm ("MYFUNC");
+
+int func (int x, int y)
+@{
+   /* @r{@dots{}} */
  @end smallexample

If you want to modernise the code, drop "extern" as well?  :-)


Ok.


-Also, you must not use a
-register name; that would produce completely invalid assembler code.  GCC
-does not as yet have the ability to store static variables in registers.
-Perhaps that will be added.

And why remove these?


Again with the vague statements about possible future changes. Also, 
whether or not static variables can be stored in registers has nothing 
to do with Asm Labels.  If this is still true, it belongs in Explicit 
Reg Vars.


Update attached.

dw
Index: extend.texi
===
--- extend.texi	(revision 226751)
+++ extend.texi	(working copy)
@@ -8367,8 +8367,14 @@
 
 You can specify the name to be used in the assembler code for a C
 function or variable by writing the @code{asm} (or @code{__asm__})
-keyword after the declarator as follows:
+keyword after the declarator.
+It is up to you to make sure that the assembler names you choose do not
+conflict with any other assembler symbols.
 
+@subsubheading Assembler names for data:
+
+This sample shows how to specify the assembler name for data:
+
 @smallexample
 int foo asm ("myfoo") = 2;
 @end smallexample
@@ -8379,33 +8385,30 @@
 @samp{_foo}.
 
 On systems where an underscore is normally prepended to the name of a C
-function or variable, this feature allows you to define names for the
+variable, this feature allows you to define names for the
 linker that do not start with an underscore.
 
 It does not make sense to use this feature with a non-static local
 variable since such variables do not have assembler names.  If you are
 trying to put the variable in a particular register, see @ref{Explicit
-Reg Vars}.  GCC presently accepts such code with a warning, but will
-probably be changed to issue an error, rather than a warning, in the
-future.
+Reg Vars}.
 
-You cannot use @code{asm} in this way in a function @emph{definition}; but
-you can get the same effect by writing a declaration for the function
-before its definition and putting @code{asm} there, like this:
+@subsubheading Assembler names for functions:
 
+To specify the assembler name for functions, write a declaration for the 
+function before its definition and put @code{asm} there, like this:
+
 @smallexample
-extern func () asm ("FUNC");
-
-func (x, y)
- int x, y;
-/* @r{@dots{}} */
+int func (int x, int y) asm ("MYFUNC");
+ 
+int func (int x, int y)
+@{
+   /* @r{@dots{}} */
 @end smallexample
 
-It is up to you to make sure that the assembler names you choose do not
-conflict with any other assembler symbols.  Also, you must not use a
-register name; that would produce completely invalid assembler code.  GCC
-does not as yet have the ability to store static variables in registers.
-Perhaps that will be added.
+@noindent
+This specifies that the name to be used for the function @code{func} in
+the assembler code should be @code{MYFUNC}.
 
 @node Explicit Reg Vars
 @subsection Variables in Specified Registers

Re: Using the asm suffix

2015-08-19 Thread David Wohlferd


(Resending due to email glitch)

Thanks again for your comments.

On 8/18/2015 2:23 AM, Segher Boessenkool wrote:

On Mon, Aug 17, 2015 at 09:55:48PM -0700, David Wohlferd wrote:
  On systems where an underscore is normally prepended to the name 
of a C

-function or variable, this feature allows you to define names for the
+variable, this feature allows you to define names for the
  linker that do not start with an underscore.

Why remove this?

This doc section (Controlling Names Used in Assembler Code) describes
how the asm suffix affects both data and functions. However, it jumbles
the two descriptions together.

Probably because they are the same thing...


My intent here is to break this clearly into two @subsubheadings:
'Assembler names for data' and 'Assembler names for functions'. Since
data is the first section, I removed the word 'function' here.

I missed that, sorry.  Or, did you forget to add the same text to the
"function" description?

This patch would be much easier to review if you did one change per 
patch.


I'm not sure what 'one change' might mean in this context.  This topic 
is all a single texi node (ie it all ends up on a single html page).  
This page isn't that big.  Would looking at the output help clarify this:


Current: https://gcc.gnu.org/onlinedocs/gcc/Asm-Labels.html
Proposed: http://limegreensocks.com/gcc/Asm-Labels.html


  It does not make sense to use this feature with a non-static local
  variable since such variables do not have assembler names.  If 
you are
  trying to put the variable in a particular register, see 
@ref{Explicit

-Reg Vars}.  GCC presently accepts such code with a warning, but will
-probably be changed to issue an error, rather than a warning, in the
-future.
+Reg Vars}.

And this?

Vague statements about possible changes that may or not ever be written
are not helpful in docs.  In this case the statement is particularly
unhelpful since even the warning appears to be gone.

I don't agree it is a vague statement about possible future changes; it
is more like a statement of intent.


Saying this will "probably" change at some (unspecified) time "in the 
future" seems the very definition of vague.  Especially when you 
consider that the docs have been threatening this change for over 15 
years now.



It tells the reader "don't write code like this".


If this is intended just for emphasis, how about replacing the existing 
text ("It does not make sense to use this feature with a non-static 
local variable since such variables do not have assembler names.") with 
"Do not use this feature with a non-static local variable." or maybe "It 
is not supported to use this feature with a non-static local variable 
since such variables do not have assembler names."


Saying "It does not make sense" to do something doesn't mean the same 
thing (to me) as "Do not do this" or "It is not supported to..."



And the warning is still there ("ignoring asm-specifier for non-static
local variable").


Huh.  I was unable to get gcc to produce this warning.  Is there a trick?

int main()
{
   int a asm("asdf");
   a = 13;
   return a;
}

I have tried 4.9.2 and 5, with -O2 and -O0, and I'm getting no warning 
(or error) messages.



-Also, you must not use a
-register name; that would produce completely invalid assembler 
code.  GCC
-does not as yet have the ability to store static variables in 
registers.

-Perhaps that will be added.

And why remove these?

Again with the vague statements about possible future changes. Also,
whether or not static variables can be stored in registers has nothing
to do with Asm Labels.  If this is still true, it belongs in Explicit
Reg Vars.

The first part ("must not use a register name") is an important warning.


Clarifying this is a good idea.  Although limiting it to only saying 
"don't use register names" seems a little, well, limiting. Who knows 
what kind of offsets or asm qualifiers they might try to cram in here?  
How about:


"Only label names valid for your assembler are permitted within the asm."

This would go right after the warning about conflicts.


The second part (about statics) might well be better moved, but it should
not be _re_moved just like that!  And it is still true (gives an error,
"multiple storage classes").


I have already started work on the changes I think are needed for 
Explicit Reg Vars (the last section of gcc docs I'm planning on doing).  
But I want to finish the (relatively easy) Asm Labels stuff first.  
Should I put this change in there?  Or would someone just tell me all 
the Asm Labels stuff should go in its own patch?


Also, I'm not seeing the multiple storage classes message either:

int main()
{
   static int a asm("asdf");
   a = 5;
   return a;
}

dw

Re: Using the asm suffix

2015-08-19 Thread David Wohlferd


[snip]


how about replacing the existing
text ("It does not make sense to use this feature with a non-static
local variable since such variables do not have assembler names.") with
"Do not use this feature with a non-static local variable." or maybe "It
is not supported to use this feature with a non-static local variable
since such variables do not have assembler names."

"You cannot use this feature ..." etc.?  Keep the part about not having
assembler names, it is useful.


Due to the quirks of the English language, I'm not sure 'cannot' is the 
right word here.  More correct would be 'cannot reliably' but I don't 
want to be that wishy-washy.


And I'm a little iffy about the 'since such variables do not have 
assembler names,' as it seemed a bit bold to make assertions about the 
implementation details for all assemblers (past, present and future) for 
all platforms.  But you are right, it does convey a bit of the 'why' for 
this limitation, so keeping it is a good idea.  How about:


"gcc does not support using this feature with a non-static local 
variable since typically such variables do not have assembler names."


BTW, the trick for getting the "ignoring asm-specifier for non-static 
local variable" message was renaming my file from sta5.cpp to sta5.c.  
Seems like this should apply to both, but whatever.



The first part ("must not use a register name") is an important warning.

Clarifying this is a good idea.  Although limiting it to only saying
"don't use register names" seems a little, well, limiting. Who knows
what kind of offsets or asm qualifiers they might try to cram in here?

Register names is the common case to hurt you...  "r0" etc. ;-)


But as we have seen (%gs:4), people are willing to try other things.  
And rather than try to list all the things that don't work (register, 
registers with offsets, etc), I'm hoping we can find a way to specify 
the one thing that is supported.



How about:

"Only label names valid for your assembler are permitted within the asm."

The assembler calls it "symbol names" (labels end in a colon).  But it
won't help: e.g. "r0" is a perfectly fine name for the assembler, too!

It should say something like "if the string you put here is seen as
something else than a symbol name by the assembler, anywhere the compiler
puts it, you're on your own", but that is pretty vague as well.


Ok, how about "Only symbol names that define labels are permitted within 
the asm."


A bit awkward, but I believe it conveys the intent.


You forgot to make it "register", so it is not a register variable.


Ahh, true.  I suppose we could say something about don't use 'register' 
with 'asm labels', but it doesn't seem worth the effort.


dw

Re: Using the asm suffix

2015-09-07 Thread David Wohlferd

In order for the doc maintainers to approve this patch, I need to have 
someone sign off on the technical accuracy.  Now that I have included 
the points we have discussed (attached), hopefully we are there.


Original text: https://gcc.gnu.org/onlinedocs/gcc/Asm-Labels.html
Proposed text: http://limegreensocks.com/gcc/Asm-Labels.html

Still pending is the line I removed about 'static variables in 
registers' that belongs in the Reg Vars section.  I have additional 
changes I want to make to Reg Vars sections, so once this patch is 
accepted, I'll post that work.


dw
Index: extend.texi
===
--- extend.texi	(revision 226751)
+++ extend.texi	(working copy)
@@ -8367,8 +8367,14 @@
 
 You can specify the name to be used in the assembler code for a C
 function or variable by writing the @code{asm} (or @code{__asm__})
-keyword after the declarator as follows:
+keyword after the declarator.
+It is up to you to make sure that the assembler names you choose do not
+conflict with any other assembler symbols, or reference registers.
 
+@subsubheading Assembler names for data:
+
+This sample shows how to specify the assembler name for data:
+
 @smallexample
 int foo asm ("myfoo") = 2;
 @end smallexample
@@ -8379,33 +8385,30 @@
 @samp{_foo}.
 
 On systems where an underscore is normally prepended to the name of a C
-function or variable, this feature allows you to define names for the
+variable, this feature allows you to define names for the
 linker that do not start with an underscore.
 
-It does not make sense to use this feature with a non-static local
-variable since such variables do not have assembler names.  If you are
-trying to put the variable in a particular register, see @ref{Explicit
-Reg Vars}.  GCC presently accepts such code with a warning, but will
-probably be changed to issue an error, rather than a warning, in the
-future.
+GCC does not support using this feature with a non-static local variable 
+since such variables do not have assembler names.  If you are
+trying to put the variable in a particular register, see 
+@ref{Explicit Reg Vars}.
 
-You cannot use @code{asm} in this way in a function @emph{definition}; but
-you can get the same effect by writing a declaration for the function
-before its definition and putting @code{asm} there, like this:
+@subsubheading Assembler names for functions:
 
+To specify the assembler name for functions, write a declaration for the 
+function before its definition and put @code{asm} there, like this:
+
 @smallexample
-extern func () asm ("FUNC");
-
-func (x, y)
- int x, y;
-/* @r{@dots{}} */
+int func (int x, int y) asm ("MYFUNC");
+ 
+int func (int x, int y)
+@{
+   /* @r{@dots{}} */
 @end smallexample
 
-It is up to you to make sure that the assembler names you choose do not
-conflict with any other assembler symbols.  Also, you must not use a
-register name; that would produce completely invalid assembler code.  GCC
-does not as yet have the ability to store static variables in registers.
-Perhaps that will be added.
+@noindent
+This specifies that the name to be used for the function @code{func} in
+the assembler code should be @code{MYFUNC}.
 
 @node Explicit Reg Vars
 @subsection Variables in Specified Registers

Proposed doc update for Explicit Reg Vars 1/3

2015-10-12 Thread David Wohlferd

Having updated the docs for Basic asm, Extended asm, and Asm Labels, I 
am now sending my patches for the last of the inline asm sections: 
Explicit Reg Vars.


My first attempt to update this got postponed (see 
https://gcc.gnu.org/ml/gcc-patches/2014-06/msg02369.html).  This patch 
addresses the previous concerns.


Note that there is nothing actually "wrong" with the existing text. It 
does not provide inaccurate information or miss key details.  The 
problem is that (from a compiler user's point of view) the text is hard 
to follow.  It reads as though people just have just dropped in new text 
as it occurs to them, without ever going back to put things in context, 
add formatting, etc.


Using Explicit Register variables is a feature that doesn't get a lot of 
attention (why should it?), which probably explains why no one has taken 
time to polish it up.  But now it has annoyed someone who is willing to 
work on it.


The are 3 web pages associated with Explicit Reg Vars (Menu, Global, and 
Local), so I am sending this as 3 patches.  The first patch (attached) 
is the menu page.


The current menu page has a couple of flaws:

1) It tries to condense the entire contents of the other 2 pages into a 
single paragraph each.  This loses a lot of detail and nuance, as well 
as introducing unnecessary duplication of information with the subpages.
2) What the heck is a "reg var"?  Why can't we just call this a 
"register variable"?


Instead of trying to fix/clarify/update the duplicated information, this 
patch removes it, then provides enough information to differentiate the 
two types of register variables, and finally directs people to the 
appropriate subpages which already contain all the information from this 
page (and more).


It also changes the section names (and refs within the docs) to avoid 
the unnecessary abbreviations.  As part of this, it also uses @anchor to 
ensure that any external links to the old names will still resolve 
correctly.  Is this the standard?  Or do we just let them fail?


For people who find the HTML easier to review:

Here's the current text: 
https://gcc.gnu.org/onlinedocs/gcc/Explicit-Reg-Vars.html
And here's the new: 
http://limegreensocks.com/gcc/Explicit-Register-Variables.html


dw


Index: extend.texi
===
--- extend.texi	(revision 228690)
+++ extend.texi	(working copy)
@@ -7254,7 +7254,8 @@
 * Extended Asm::   Inline assembler with operands.
 * Constraints::Constraints for @code{asm} operands
 * Asm Labels:: Specifying the assembler name to use for a C symbol.
-* Explicit Reg Vars::  Defining variables residing in specified registers.
+* Explicit Register Variables::  Defining variables residing in specified 
+   registers.
 * Size of an asm:: How GCC calculates the size of an @code{asm} block.
 @end menu
 
@@ -7774,7 +7775,8 @@
 the optimizers to produce the best possible code. 
 If you must use a specific register, but your Machine Constraints do not
 provide sufficient control to select the specific register you want, 
-local register variables may provide a solution (@pxref{Local Reg Vars}).
+local register variables may provide a solution (@pxref{Local Register 
+Variables}).
 
 @item cvariablename
 Specifies a C lvalue expression to hold the output, typically a variable name.
@@ -8004,7 +8006,8 @@
 the compiler chooses the most efficient one based on the current context.
 If you must use a specific register, but your Machine Constraints do not
 provide sufficient control to select the specific register you want, 
-local register variables may provide a solution (@pxref{Local Reg Vars}).
+local register variables may provide a solution (@pxref{Local Register 
+Variables}).
 
 Input constraints can also be digits (for example, @code{"0"}). This indicates 
 that the specified input must be in the same place as the output constraint 
@@ -8086,7 +8089,8 @@
 Clobber descriptions may not in any way overlap with an input or output 
 operand. For example, you may not have an operand describing a register class 
 with one member when listing that register in the clobber list. Variables 
-declared to live in specific registers (@pxref{Explicit Reg Vars}) and used 
+declared to live in specific registers (@pxref{Explicit Register 
+Variables}) and used 
 as @code{asm} input or output operands must have no part mentioned in the 
 clobber description. In particular, there is no way to specify that input 
 operands get modified without also specifying them as output operands.
@@ -8442,7 +8446,7 @@
 GCC does not support using this feature with a non-static local variable 
 since such variables do not have assembler names.  If you are
 trying to put the variable in a particular register, see 
-@ref{Explicit Reg Vars}.
+@ref{Explicit Register Variables}.
 
 @subsubheading Assembler names for functions:
 
@@ -8461,50 +8465,34 @@
 This specifies that the name to be used f

Proposed doc update for Explicit Reg Vars 2/3

2015-10-12 Thread David Wohlferd


Patch 2/3 is the update for the Global Register Variables page (attached).

Reviewing this patch is going to be difficult.

It's a lot easier to review a patch that just has a few lines of text 
being added.  However, this leads to 'chunky' docs with a bunch of 
disjointed paragraphs (which is what this page has now). Eventually, 
someone needs to take the whole thing, and rearrange, reformat and 
reorganize it.  Which is what this patch does.


So there isn't any new content (except the part recently removed from 
asm labels about statics), just the old content rephrased and 
reformatted.  There is, however, some decidedly unhelpful text that got 
removed (paraphrasing):


- There are unsourced, unsubstantiated reports that on some platforms, 
certain things might or might not work.

- Eventually the compiler may work differently than it does now.

Argh!  Who thinks this stuff is helpful?  And while it was painful, I 
kept in the part that says "when specifying register names, make sure 
they are really register names."


As for the review, perhaps reading the html is easier than reading the 
patch?


Here's the current text: 
https://gcc.gnu.org/onlinedocs/gcc/Global-Reg-Vars.html
And here's the new: 
http://limegreensocks.com/gcc/Global-Register-Variables.html


dw
Index: extend.texi
===
--- extend.texi	(revision 228690)
+++ extend.texi	(working copy)
@@ -8506,7 +8506,8 @@
 @cindex global register variables
 @cindex registers, global variables in
 
-You can define a global register variable in GNU C like this:
+You can define a global register variable and associate it with a specified register
+like this:
 
 @smallexample
 register int *foo asm ("a5");
@@ -8513,62 +8514,77 @@
 @end smallexample
 
 @noindent
-Here @code{a5} is the name of the register that should be used.  Choose a
-register that is normally saved and restored by function calls on your
-machine, so that library routines will not clobber it.
+Here @code{a5} is the name of the register that should be used.  Note that 
+this is the same syntax used for defining local register variables, but for 
+a global variable the declaration appears outside a function.  The 
+@code{register} keyword is required, and cannot be combined with 
+@code{static}.  The register name must be a valid register name for the
+target platform.
 
-Naturally the register name is CPU-dependent, so you need to
-conditionalize your program according to CPU type.  The register
-@code{a5} is a good choice on a 68000 for a variable of pointer
-type.  On machines with register windows, be sure to choose a ``global''
-register that is not affected magically by the function call mechanism.
+Registers can be a limited resource on some systems and allowing the 
+compiler to manage their usage usually results in the best code.  However, 
+under special circumstances it can make sense to reserve some globally.
+For example this may be useful in programs such as programming language 
+interpreters that have a couple of global variables that are accessed 
+very often.
 
-In addition, different operating systems on the same CPU may differ in how they
-name the registers; then you need additional conditionals.  For
-example, some 68000 operating systems call this register @code{%a5}.
+After defining a global register variable, for the duration of 
+the current compilation:
 
-Eventually there may be a way of asking the compiler to choose a register
-automatically, but first we need to figure out how it should choose and
-how to enable you to guide the choice.  No solution is evident.
+@itemize @bullet
+@item The register is reserved entirely for this use, and will not be 
+allocated for any other purpose.
+@item The register is not saved and restored by any functions.
+@item Stores into this register are never deleted even if they appear to be 
+dead, but references may be deleted, moved or simplified.
+@end itemize
 
-Defining a global register variable in a certain register reserves that
-register entirely for this use, at least within the current compilation.
-The register is not allocated for any other purpose in the functions
-in the current compilation, and is not saved and restored by
-these functions.  Stores into this register are never deleted even if they
-appear to be dead, but references may be deleted or moved or
-simplified.
+Note that these points @emph{only} apply to code that is compiled with the
+definition.  The behavior of code that is merely linked in (for example 
+code from libraries) is not affected.
 
-It is not safe to access the global register variables from signal
-handlers, or from more than one thread of control, because the system
-library routines may temporarily use the register for other things (unless
-you recompile them specially for the task at hand).
+If you want to recompile source files that do not actually use your global 
+register variable so they do not use the specified reg

Proposed doc update for Explicit Reg Vars 3/3

2015-10-12 Thread David Wohlferd


Patch 3/3 is the update for the Local Register Variables page (attached).

This patch starts with a question.  Looking at bug 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64951 (register variable 
with template function) is this a bug that will be fixed? Or a 
limitation that should be doc'ed?  Both the current docs and the patch 
ignore this bug.


As with patch #2, this is primarily about reformatting/reorganizing.  
Although it also adds the limitation from asm labels re statics.


I was hoping to modify the text to say that local register variables can 
"only" be used to call Extended asm.  This would greatly simplify this 
section.  But there has been pushback on this (despite the fact that no 
one has really suggested any other reasonable use).  So I have softened 
this, and listed things from the existing docs that are explicitly not 
supported.


For people who find the HTML easier to review:

Here's the current text: 
https://gcc.gnu.org/onlinedocs/gcc/Local-Reg-Vars.html
And here's the new: 
http://limegreensocks.com/gcc/Local-Register-Variables.html


dw


Index: extend.texi
===
--- extend.texi	(revision 228690)
+++ extend.texi	(working copy)
@@ -8604,7 +8604,7 @@
 @cindex specifying registers for local variables
 @cindex registers for local variables
 
-You can define a local register variable with a specified register
+You can define a local register variable and associate it with a specified register
 like this:
 
 @smallexample
@@ -8614,45 +8614,19 @@
 @noindent
 Here @code{a5} is the name of the register that should be used.  Note
 that this is the same syntax used for defining global register
-variables, but for a local variable it appears within a function.
+variables, but for a local variable the declaration appears within a 
+function.  The @code{register} keyword is required, and cannot be combined 
+with @code{static}.  The register name must be a valid register name for the
+target platform.
 
-Naturally the register name is CPU-dependent, but this is not a
-problem, since specific registers are most often useful with explicit
-assembler instructions (@pxref{Extended Asm}).  Both of these things
-generally require that you conditionalize your program according to
-CPU type.
-
-In addition, operating systems on one type of CPU may differ in how they
-name the registers; then you need additional conditionals.  For
-example, some 68000 operating systems call this register @code{%a5}.
-
-Defining such a register variable does not reserve the register; it
-remains available for other uses in places where flow control determines
-the variable's value is not live.
-
-This option does not guarantee that GCC generates code that has
-this variable in the register you specify at all times.  You may not
-code an explicit reference to this register in the assembler
-instruction template part of an @code{asm} statement and assume it
-always refers to this variable.
-However, using the variable as an input or output operand to the @code{asm}
-guarantees that the specified register is used for that operand.  
-@xref{Extended Asm}, for more information.
-
-Stores into local register variables may be deleted when they appear to be dead
-according to dataflow analysis.  References to local register variables may
-be deleted or moved or simplified.
-
-As with global register variables, it is recommended that you choose a
-register that is normally saved and restored by function calls on
-your machine, so that library routines will not clobber it.  
-
-Sometimes when writing inline @code{asm} code, you need to make an operand be a 
-specific register, but there's no matching constraint letter for that 
-register. To force the operand into that register, create a local variable 
+The intended use for this feature is to specify registers 
+for input and output operands when calling Extended @code{asm} 
+(@pxref{Extended Asm}). This may be necessary if the constraints for a 
+particular machine don't provide sufficient control to select the desired 
+register.  To force an operand into a register, create a local variable 
 and specify the register in the variable's declaration. Then use the local 
-variable for the asm operand and specify any constraint letter that matches 
-the register:
+variable for the @code{asm} operand and specify any constraint letter that 
+matches the register:
 
 @smallexample
 register int *p1 asm ("r0") = @dots{};
@@ -8661,11 +8635,11 @@
 asm ("sysint" : "=r" (result) : "0" (p1), "r" (p2));
 @end smallexample
 
-@emph{Warning:} In the above example, be aware that a register (for example r0) can be 
-call-clobbered by subsequent code, including function calls and library calls 
-for arithmetic operators on other variables (for example the initialization 
-of p2). In this case, use temporary variables for expressions between the 
-register assignments:
+@emph{Warning:} In the above example, be aware that a register

Re: Proposed doc update for Explicit Reg Vars 1/3

2015-10-20 Thread David Wohlferd


Abot the patches themselves...  Hard to review again, sigh...


I know, and I'm sorry.

I just can't see any way to completely re-org the text without the patch 
becoming a nightmare.  I was hoping the html links would make that 
easier, but I guess not.  On the plus side, Explicit reg vars is the 
last section I plan to do this to.  I appreciate you taking the time.



The current menu page has a couple of flaws:

1) It tries to condense the entire contents of the other 2 pages into a
single paragraph each.  This loses a lot of detail and nuance, as well
as introducing unnecessary duplication of information with the subpages.

It provides an introduction, which is quite helpful in this case.  Without
some blurb the two-entry menu looks silly too.

Can you move the intro to the separate pages instead of losing it altogether?


I did keep a small amount of intro on the the menu page.  If you feel 
there's more that we should keep, I'm certainly willing to re-visit 
this.  Perhaps after we resolve the local/global stuff so we know what 
we really want to say.


Jeff has already checked in the patch for this menu page (but not the 
Local or Global subpages), so you can see what I've left here on the gcc 
website 
(https://gcc.gnu.org/onlinedocs/gcc/Explicit-Register-Variables.html).



Two spaces after a full stop


Oops.Again.You can probably just automatically add this 
to every review you send me.  It's just so automatic for me to type this 
way.  In my (feeble) defense, the original text had this too.


Lastly, if some external website is linking to "Explicit Reg Vars", what 
do we want to have happen now that we have renamed that to "Explicit 
Register Variables"?  Should the link just fail?  I've added the @anchor 
so it doesn't, but I'm not sure that's the standard for gcc.  Who should 
I be asking?


dw

Re: Proposed doc update for Explicit Reg Vars 2/3

2015-10-20 Thread David Wohlferd




- Eventually the compiler may work differently than it does now.
That is helpful.  It's a way signaling that things may change and that 
depending on the precise syntax and semantics may be unwise.


From time to time, particularly with GCC extensions, it has been 
necessary to declare certain usage as invalid or to change the 
implementation is significantly visible ways.


We never like doing that, but it is sometimes unavoidable.  As a 
general rule, the more an extension exposes how GCC works internally, 
the more likely it has been to need significant changes over time.  
(asms being the easiest example to cite).


So please consider keeping something which signals the semantics might 
change.


The quote in question is:

Eventually there may be a way of asking the compiler to choose a 
register automatically, but first we need to figure out how it should 
choose and how to enable you to guide the choice. No solution is evident.


I struggle (and fail) to find anything in this text worth keeping.

So for the example, "a5" is a particularly bad choice these days on 
the m68k as it's the PIC register.  It may be advisable to just use 
r for some value of N and try to be processor agnostic here.


"a5" is what the original text used, but I have no preference.  I've 
changed it to r12, which works on my x64.


dw

Re: Proposed doc update for Explicit Reg Vars 2/3

2015-10-20 Thread David Wohlferd



Line too long. I know quite a bit of doc does that, but that's no 
excuse :-)


Reduced to < 79.


+Registers can be a limited resource on some systems and allowing the

They are a limited resource on almost all systems.  "Scarce resource"?


"Scarce" it is.  I've left the rest alone for the moment, but how would 
you feel about:


"Registers are a scarce resource on most systems and allowing the"


+After defining a global register variable, for the duration of
+the current compilation:

It's probably better to say "for the current compilation unit"?  There now
is LTO and whatnot.


Changed to "for the current compilation unit".


+All global register variable declarations must precede all function
+definitions.  If such a declaration appears after function definitions,
+the declaration would be too late to prevent the register from being used
+for other purposes in the preceding functions.

This isn't true anymore, not even with -fno-toplevel-reorder or -O0.


I'm going to interpret this as a recommendation to remove this text, 
rather than just an FYI.  Done.



+When selecting a register, choose one that is normally saved and
+restored by function calls on your machine.  This ensures that code
+which is unaware of this reservation (such as library routines) will
+restore it before returning.

The compiler also warns, possibly for the unlikely case that the user has
not read the documentation.


I'm going to interpret this comment as just an FYI, and NOT something 
that should be added to the docs.


I've attached the new patch for the Globals.  For review purposes, you 
can just diff it with the previous one.  Viewed that way, the changes 
are pretty minor.


dw
Index: extend.texi
===
--- extend.texi	(revision 229108)
+++ extend.texi	(working copy)
@@ -8473,7 +8473,7 @@
 GNU C allows you to associate specific hardware registers with C 
 variables.  In almost all cases, allowing the compiler to assign
 registers produces the best code.  However under certain unusual
-circumstances,  more precise control over the variable storage is 
+circumstances, more precise control over the variable storage is 
 required.
 
 Both global and local variables can be associated with a register.  The
@@ -8492,69 +8492,80 @@
 @cindex registers, global variables in
 @cindex registers, global allocation
 
-You can define a global register variable in GNU C like this:
+You can define a global register variable and associate it with a specified 
+register like this:
 
 @smallexample
-register int *foo asm ("a5");
+register int *foo asm ("r12");
 @end smallexample
 
 @noindent
-Here @code{a5} is the name of the register that should be used.  Choose a
-register that is normally saved and restored by function calls on your
-machine, so that library routines will not clobber it.
+Here @code{r12} is the name of the register that should be used. Note that 
+this is the same syntax used for defining local register variables, but for 
+a global variable the declaration appears outside a function. The 
+@code{register} keyword is required, and cannot be combined with 
+@code{static}. The register name must be a valid register name for the
+target platform.
 
-Naturally the register name is CPU-dependent, so you need to
-conditionalize your program according to CPU type.  The register
-@code{a5} is a good choice on a 68000 for a variable of pointer
-type.  On machines with register windows, be sure to choose a ``global''
-register that is not affected magically by the function call mechanism.
+Registers can be a scarce resource on some systems and allowing the 
+compiler to manage their usage usually results in the best code. However, 
+under special circumstances it can make sense to reserve some globally.
+For example this may be useful in programs such as programming language 
+interpreters that have a couple of global variables that are accessed 
+very often.
 
-In addition, different operating systems on the same CPU may differ in how they
-name the registers; then you need additional conditionals.  For
-example, some 68000 operating systems call this register @code{%a5}.
+After defining a global register variable, for the current compilation
+unit:
 
-Eventually there may be a way of asking the compiler to choose a register
-automatically, but first we need to figure out how it should choose and
-how to enable you to guide the choice.  No solution is evident.
+@itemize @bullet
+@item The register is reserved entirely for this use, and will not be 
+allocated for any other purpose.
+@item The register is not saved and restored by any functions.
+@item Stores into this register are never deleted even if they appear to be 
+dead, but references may be deleted, moved or simplified.
+@end itemize
 
-Defining a global register variable in a certain register reserves that
-register entirely for this use, at least within the current compilation.
-The register is not alloca

Re: Proposed doc update for Explicit Reg Vars 3/3

2015-10-20 Thread David Wohlferd

I'm trying to sum up what was discussed here.  What I'm hearing is 
(quoting Jeff):


> the technical reality is I can't see a use outside the extended asm.

Andrew has discussed some other uses, but as Jeff observed:

> Given the way the optimizers and register allocation work,
> I don't think we can make guarantees around [Andrew's] use
> of the feature.  It happens to still work and may work
> forever, but I'm not going to set it in stone.

If the only usage we are prepared to "set in stone" is extended asm, 
then it follows that that is the only supported usage.  Areas of 
"non-guaranteed behavior" are things the docs should actively discourage 
people from using.


Given this, I'm going to go ahead and re-work the local register 
variables page (probably tomorrow) stating extended asm is the only 
supported usage.  Although I also think it's important to mention 
Andrew's point.  If someone sees it in code somewhere, at least the docs 
will give them some idea what is going on.


Stop me if I've got this all wrong...

> dealing with any blow-back we get along the way

Since no one is proposing any code changes here, it's not like people's 
programs are going to stop working tomorrow.  It's possible that some 
future code change to the Local Register Variables feature (perhaps 
provoked by this doc change) will break something.  But if the broken 
use case is deemed valid, it's the *code* change that should get the 
blow-back, not the doc change.  Hopefully when that happens, the new use 
case will then be added to the Local Register Variables docs.  Probably 
as a chunky block at the end of the page, but still...


Or did I miss your point?

Nobody has commented on 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64951 (register variable 
with template function).  While I'm updating the page, is this a 
limitation of Local Register Variables that should be doc'ed?


dw

Re: Proposed doc update for Explicit Reg Vars 2/3

2015-10-21 Thread David Wohlferd



I installed this patch from David with an update to use the "Registers 
are a scarce resource ..." text.


2 down, 1 to go.  Thanks.

dw

Re: Proposed doc update for Explicit Reg Vars 3/3

2015-10-22 Thread David Wohlferd




Given this, I'm going to go ahead and re-work the local register
variables page (probably tomorrow) stating extended asm is the only
supported usage.  Although I also think it's important to mention
Andrew's point.  If someone sees it in code somewhere, at least the docs
will give them some idea what is going on.

Stop me if I've got this all wrong...

Seems reasonable.


An updated Local Register Variables patch is attached with the changes 
discussed.  It also includes removing the extra space after '.' that 
Segher has been giving me grief about and Jeff's request re Globals:


> Signaling that this stuff may change and that we'd like better 
solutions for certain issues is, IMHO, worth keeping.  Please keep it.


I remain unconvinced that this text is useful for compiler-users, and it 
seems odd to keep gcc's todo list in the user documentation. But I trust 
your judgement, so I have restored the verbatim text to the global page 
(at the bottom).



Nobody has commented on
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64951 (register variable
with template function).  While I'm updating the page, is this a
limitation of Local Register Variables that should be doc'ed?
Your call -- folks will be switching from a development to a bugfixing 
mindset shortly as the gcc6 development window closes. Hopefully 
someone will take a look at this particular issue at that time.


Ahh.  I assumed that with your Jedi mind powers, you would just 'know' 
whether this was fixable.  I'll just wait to see what happens.


So, this pretty much finishes all the code and doc changes I wanted to 
do to gcc.  While doing this work, I have found a number of bugzilla 
entries where it looks like the needed work has already been done 
(sometimes by the doc changes I've been sending), but the bug hasn't yet 
been resolved.  While I can add comments, bugzilla won't let me resolve 
them or I would update them myself.


When people are in bug-fixing mode seems like the right time to pursue 
this.  How will I know when it's time?  Is there a web page?


dw
Index: extend.texi
===
--- extend.texi	(revision 229159)
+++ extend.texi	(working copy)
@@ -8471,12 +8471,12 @@
 @cindex specified registers
 
 GNU C allows you to associate specific hardware registers with C 
-variables.  In almost all cases, allowing the compiler to assign
-registers produces the best code.  However under certain unusual
+variables. In almost all cases, allowing the compiler to assign
+registers produces the best code. However under certain unusual
 circumstances, more precise control over the variable storage is 
 required.
 
-Both global and local variables can be associated with a register.  The
+Both global and local variables can be associated with a register. The
 consequences of performing this association are very different between
 the two, as explained in the sections below.
 
@@ -8579,6 +8579,10 @@
 variables, and to restore them in a @code{longjmp}. This way, the same
 thing happens regardless of what @code{longjmp} does.
 
+Eventually there may be a way of asking the compiler to choose a register 
+automatically, but first we need to figure out how it should choose and 
+how to enable you to guide the choice. No solution is evident.
+
 @node Local Register Variables
 @subsubsection Specifying Registers for Local Variables
 @anchor{Local Reg Vars}
@@ -8586,56 +8590,34 @@
 @cindex specifying registers for local variables
 @cindex registers for local variables
 
-You can define a local register variable with a specified register
-like this:
+You can define a local register variable and associate it with a specified 
+register like this:
 
 @smallexample
-register int *foo asm ("a5");
+register int *foo asm ("r12");
 @end smallexample
 
 @noindent
-Here @code{a5} is the name of the register that should be used.  Note
-that this is the same syntax used for defining global register
-variables, but for a local variable it appears within a function.
+Here @code{r12} is the name of the register that should be used. Note
+that this is the same syntax used for defining global register variables, 
+but for a local variable the declaration appears within a function. The 
+@code{register} keyword is required, and cannot be combined with 
+@code{static}. The register name must be a valid register name for the
+target platform.
 
-Naturally the register name is CPU-dependent, but this is not a
-problem, since specific registers are most often useful with explicit
-assembler instructions (@pxref{Extended Asm}).  Both of these things
-generally require that you conditionalize your program according to
-CPU type.
+As with global register variables, it is recommended that you choose 
+a register that is normally saved and restored by function calls on your 
+machine, so that calls to library routines will not clobber it.
 
-In addition, operating systems on one type of CPU may differ in how they
-name the r

inline asm and multi-alternative constraints

2015-10-25 Thread David Wohlferd

Does gcc's inline asm support multi-alternative constraints?  Or are 
they only supported for md?


The fact that it is doc'ed with the other constraints 
(https://gcc.gnu.org/onlinedocs/gcc/Constraints.html) says it works for 
inline.  But https://gcc.gnu.org/bugzilla/show_bug.cgi?id=10396#c17 says 
it only works for md.


I've got a patch ready to remove this section from the non-md docs 
(attached).  But there probably needs to be more support than a 11 year 
old comment to approve it.


Dropping a supported feature is always controversial.  But if it doesn't 
work, perhaps less so.  After all, doc'ing something that doesn't work 
is just as bad.


dw

PS If it *is* supported, then the docs need some work.
Index: extend.texi
===
--- extend.texi	(revision 229293)
+++ extend.texi	(working copy)
@@ -7902,9 +7902,6 @@
 When supported, the target will define the preprocessor symbol
 @code{__GCC_ASM_FLAG_OUTPUTS__}.
 
-Because of the special nature of the flag output operands, the constraint
-may not include alternatives.
-
 Most often, the target has only one flags register, and thus is an implied
 operand of many instructions.  In this case, the operand should not be
 referenced within the assembler template via @code{%0} etc, as there's
Index: md.texi
===
--- md.texi	(revision 229293)
+++ md.texi	(working copy)
@@ -1093,7 +1093,6 @@
 @ifclear INTERNALS
 @menu
 * Simple Constraints::  Basic use of constraints.
-* Multi-Alternative::   When an insn has two alternative constraint-patterns.
 * Modifiers::   More precise control over effects of constraints.
 * Machine Constraints:: Special constraints for some particular machines.
 @end menu
@@ -1450,6 +1449,7 @@
 @code{sign_extend}.
 @end ifset
 
+@ifset INTERNALS
 @node Multi-Alternative
 @subsection Multiple Alternative Constraints
 @cindex multiple alternative constraints
@@ -1530,6 +1530,8 @@
 the first, 1 for the second alternative, etc.).  @xref{Output Statement}.
 @end ifset
 
+@end ifset
+
 @ifset INTERNALS
 @node Class Preferences
 @subsection Register Class Preferences

Re: inline asm and multi-alternative constraints

2015-10-30 Thread David Wohlferd


On 10/29/2015 1:47 PM, Richard Henderson wrote:

On 10/27/2015 02:05 PM, Jeff Law wrote:

On 10/25/2015 09:41 PM, David Wohlferd wrote:

Does gcc's inline asm support multi-alternative constraints?  Or are
they only supported for md?

dw

PS If it *is* supported, then the docs need some work.

I think Richard corrected me last I spoke on this topic :-)  They *are*
supported.  ie, something like this should work on a ciscy target.

   asm("add %0,%1" : "=r,m"(x) : "rim,ri"(y))

Correct.

They are supported, so long as the assembly can use the same text for all
alternatives.  Thus multiple alternatives in inline asm is basically useless
for RISC targets, but occasionally useful for CISC targets.


Aha.  Thank you for the info.

Ok, I have discarded the other patch.  I'm not sure what this means for 
10396, since that seemed to be the solution there.


I have updated the non-md text with (most of) the changes I think it 
needs (attached).  These changes are pleasantly minor, mostly just 
adding some example text and a bit of formatting.


However.  Trying to actually use the information on this page is turning 
up some problems.


First, it could use a bit more clarity in the text that describes how 
gcc chooses among the alternatives.  There are apparently 3 criteria 
that affect this decision: # of statements needed to copy params, order 
of alternatives, and flags.  But it appears that they aren't all 
weighted equally.  For example no number of '?' seem to be able to 
override an alternative that causes a reload (change the example below 
to use eax instead of ebx).  But before I try to re-phrase this 
paragraph, I'm hoping someone can provide more details.  What exactly 
are the rules here?


Second, attempting to use ! and $ isn't working the way the docs led me 
to expect (actually, I can't make them work at all).  Starting with this 
(contrived) i386 code, gcc selects the second alternative (using -O2 for 
x64).  This makes sense to me.


int main()
{
   int x = 3;
   int y = 17;

   asm("or %1,%0" : "+r,b"(x) : "r,i"(y));

   return x;
}

It took two '?' in front of the 'b' to convince gcc to use the first 
alternative.  This seems reasonable, and shows that the first 
alternative is indeed viable.


The problem started when I tried to replace the '??' with '!'. Being 
"severe," I figured one '!' should do the work of two '?'. But even 
using multiple '!' doesn't cause it to switch to the first alternative.  
Parsing the current docs:


"! - Disparage severely the alternative that the '!' appears in. This 
alternative can still be used if it fits without reloading, but if 
reloading is needed, some other alternative will be used. "


The alternative I am trying to disparage uses ebx.  Using that register 
causes a push/pop.  That seems to me like "reloading is needed," so I 
expected that by definition the other alternative would be used.  Even 
if reloading isn't a factor (or if I don't understand it correctly), the 
alternative '!' is applied to should still be "severely" disparaged (ie 
at least as much as 2 '?').  But apparently it's not.


Using '^' does change the selection if I use two of them.  But according 
to the docs, '^' *only* applies if "the operand with the '^' needs a 
reload."  Since the '^' is having an effect, doesn't that imply that 
there is a reload associated with the second alternative?  But if there 
is, why didn't '!' work as expected?


'$' also doesn't affect the selection, probably for the same reasons as '!'.

I don't know if ! and $ are broken, or if the docs just aren't 
explaining them well enough.  But something isn't working here.


Lastly, while it is a core concept to compiler-writers, 'reloads' isn't 
really a compiler-user concept.  Perhaps some of these flags aren't user 
manual appropriate (I'm looking at you !^$).


dw

PS

Current text: https://gcc.gnu.org/onlinedocs/gcc/Multi-Alternative.html
Proposed text: http://limegreensocks.com/gcc/Multi_002dAlternative.html

Index: md.texi
===
--- md.texi	(revision 229470)
+++ md.texi	(working copy)
@@ -1465,6 +1465,8 @@
 constraint for an operand is made from the letters for this operand
 from the first alternative, a comma, the letters for this operand from
 the second alternative, a comma, and so on until the last alternative.
+All operands for a single instruction must have the same number of 
+alternatives.
 @ifset INTERNALS
 Here is how it is done for fullword logical-or on the 68000:
 
@@ -1483,8 +1485,20 @@
 @samp{%} in the cons

Re: inline asm and multi-alternative constraints

2015-11-02 Thread David Wohlferd


On 11/2/2015 3:43 PM, Sandra Loosemore wrote:

On 11/02/2015 04:06 PM, Jeff Law wrote:

On 10/30/2015 09:09 PM, David Wohlferd wrote:


I have updated the non-md text with (most of) the changes I think it
needs (attached).  These changes are pleasantly minor, mostly just
adding some example text and a bit of formatting.

However.  Trying to actually use the information on this page is 
turning

up some problems.

I think the fundamental problem here is we ought not be exposing those
modifiers to the user.  They're inherently tied to the details of the
register allocation and reloading passes.


This is what I'm thinking as well. 


I agree.

The only reason I didn't delete them before is that removing support for 
an existing feature can be contentious.  But I see no practical way to 
doc this for compiler-users.  And even as an inline-asm aficionado, I 
can't think of any use for them anyway.  Since there seems to be 
consensus, they're gone.


Why would a user even need multi-alternative constraints in inline 
asm?  An insn template might be instantiated in many different 
contexts and need to deal with different flavors of operands, but 
inline asm code is generally unique and the programmer writing it 
knows very well what the operands are supposed to be (this one is a 
register, that one is an address, etc).


I wouldn't go that far.  There are libraries that use inline asm 
(sometimes in headers), and a library cannot know how it may be used.  
The only practical approach for them is to list all options so gcc can 
do its best.


Choosing the most efficient form of a logical-or instruction is hardly 
a good motivating example, either -- nobody writes inline asm to do that.


However, it does make a good example for machine definitions, which also 
@includes this file.  And while perhaps not an optimal sample for inline 
asm, it is a reasonable one, since other platforms (including x86) have 
a similar instruction which has similar limitations.


I'm under the impression that the primary uses of inline asm are 
either to access machine instructions not exposed by builtins, or to 
provide a block of highly tuned code replacing all/most of a C 
function body.


I have attached a new patch.  It removes the flags and the paragraph 
that tries to describe how the alternative is chosen from the non-md 
docs.  Now it just says the compiler will choose the most efficient 
alternative.


Other than the line about "All operands for a single instruction must 
have the same number of alternatives", the 'internals' docs (which also 
includes this file) should be unaffected.


dw

Current text: https://gcc.gnu.org/onlinedocs/gcc/Multi-Alternative.html
Proposed text: http://limegreensocks.com/gcc/Multi_002dAlternative.html

Index: md.texi
===
--- md.texi	(revision 229470)
+++ md.texi	(working copy)
@@ -1465,6 +1465,8 @@
 constraint for an operand is made from the letters for this operand
 from the first alternative, a comma, the letters for this operand from
 the second alternative, a comma, and so on until the last alternative.
+All operands for a single instruction must have the same number of 
+alternatives.
 @ifset INTERNALS
 Here is how it is done for fullword logical-or on the 68000:
 
@@ -1482,9 +1484,7 @@
 @samp{0} for operand 1, and @samp{dmKs} for operand 2.  The @samp{=} and
 @samp{%} in the constraints apply to all the alternatives; their
 meaning is explained in the next section (@pxref{Class Preferences}).
-@end ifset
 
-@c FIXME Is this ? and ! stuff of use in asm()?  If not, hide unless INTERNAL
 If all the operands fit any one alternative, the instruction is valid.
 Otherwise, for each alternative, the compiler counts how many instructions
 must be added to copy the operands so that that alternative applies.
@@ -1521,7 +1521,6 @@
 the alternative only if the operand with the @samp{$} needs a reload.
 @end table
 
-@ifset INTERNALS
 When an insn pattern has multiple alternatives in its constraints, often
 the appearance of the assembler code is determined mostly by which
 alternative was matched.  When this is so, the C code for writing the
@@ -1529,7 +1528,22 @@
 the ordinal number of the alternative that was actually satisfied (0 for
 the first, 1 for the second alternative, etc.).  @xref{Output Statement}.
 @end ifset
+@ifclear INTERNALS
 
+So the first alternative for the 68000's logical-or could be written as 
+@code{"+m" (output) : "ir" (input)}.  The second could be @code{"+r" 
+(output): "irm" (input)}.  However, the fact that two memory locations 
+cannot be used in a single instruction prevents simply using @code{"+rm" 
+(output) : "irm" (input)}.  Using multi-alternatives, this might be 
+written as @code{"+m,r" (output) : "ir,irm" (input)}.  This describes
+all the available altern

Re: inline asm and multi-alternative constraints

2015-11-06 Thread David Wohlferd


On 11/6/2015 4:46 PM, Segher Boessenkool wrote:

On Fri, Nov 06, 2015 at 03:29:43PM -0700, Jeff Law wrote:

It's never easy to predict whether or not something like this will be
contentious.  Worst case is you post, it's contentious, we iterate a bit
and reach some kind of resolution (ok, worst case is no resolution is
reached, but that doesn't happen to often).

In this case I simply don't see a way to sensibly document those
modifiers without bringing in the implementation details of register
class preferencing, reload, IRA & LRA.  And once those details are
brought into the picture, everyone loses.


This very point is what made it clear to me that these flags should be 
removed.  If there is no practical way to describe to the target 
audience how they work or when to use them, they don't belong here.


I'll take partial credit for asking the questions that highlighted the 
problem, but Jeff taking the time to respond to them is what got us to 
the right answer here (thanks Jeff!).



I'm sure there's someone out there using '?' and '!' in a
multi-alternative asm constraint.  They may even read the docs and
complain and we can try to educate them why those modifiers are no
longer documented.

Another reason why we shouldn't document such things is that it makes
it harder to change (anything about) those things later, although they
really are implementation details.


True.  In this case, they were already documented and the change was to 
remove them, something I hesitate to do.  But I think what Jeff checked 
in (thanks Jeff!) gives us the right answer for this page.



The same goes for some constraints and almost all output modifiers.


Are you suggesting more doc changes?  Looking thru the pages you reference:

- Starting with 'modifiers', "=+&" and (reluctantly) "%" seem reasonable 
for inline asm.  But both "#*" seem sketchy.


- Under 'simple constraints', "mringX" all (more or less) make sense to 
me.  But "oV<>sp" are not things I can envision using.


- The 'machine constraints' for i386 (the only machine I know) all seem 
reasonable.  However for platforms that support autoincrement 
(powerpc?), apparently using "m" needs more docs (per 
https://gcc.gnu.org/ml/gcc/2008-03/msg01079.html).


Are these the things to which you are referring?  I've always assumed 
the parts that seem obscure here were due to my i386-centric view of the 
world.  Are some of them actually md-only?


There are other minor changes I'd make on some of these pages.  But 
mostly they are not worth it unless I'm doing something else there too.  
So if there's something here you think needs changing, let me know and 
I'll take a crack at it.


Other than that, I'll keep working my way thru the doc issues in the 
inline-asm bugs.  I've done what I can for 10396.


dw

basic asm and memory clobbers

2015-11-08 Thread David Wohlferd

It seems like a doc update is what is needed to close PR24414 (Old-style 
asms don't clobber memory).  I'm working on this now (phase 1) in the 
unlikely event that someone is inspired to make a code change here instead.


Like  Richard Henderson, I rather expected basic (or "old-style") asm to 
perform a memory clobber (it doesn't).  This bug is now over a decade 
old, so presumably the question of how this is going to work is 
settled.  IAC, the docs should reflect the current behavior.


Based on this issue plus what I have learned since last updating this 
page, I'm proposing the attached patch.


dw

CCing the commenters from the bug.
Index: extend.texi
===
--- extend.texi	(revision 229910)
+++ extend.texi	(working copy)
@@ -7353,7 +7353,8 @@
 @end itemize
 
 Safely accessing C data and calling functions from basic @code{asm} is more 
-complex than it may appear. To access C data, it is better to use extended 
+complex than it may appear. To access C data (including both local and
+global register variables), use extended
 @code{asm}.
 
 Do not expect a sequence of @code{asm} statements to remain perfectly 
@@ -7376,6 +7377,12 @@
 visibility of any symbols it references. This may result in GCC discarding 
 those symbols as unreferenced.
 
+Basic @code{asm} statements are not treated as though they used a "memory"
+clobber, although they do implicitly perform a clobber of the flags
+(@pxref{Clobbers}). Also, there is no implicit clobbering of registers,
+so any registers changed must be restored to their original value before
+exiting the @code{asm}.
+
 The compiler copies the assembler instructions in a basic @code{asm} 
 verbatim to the assembly language output file, without 
 processing dialects or any of the @samp{%} operators that are available with

Re: inline asm and multi-alternative constraints

2015-11-11 Thread David Wohlferd


On 11/9/2015 1:52 PM, Jeff Law wrote:

On 11/07/2015 12:50 AM, David Wohlferd wrote:


- Starting with 'modifiers', "=+&" and (reluctantly) "%" seem reasonable
for inline asm.  But both "#*" seem sketchy.
Right.  =+& are no-brainer yes, as are the constants 0-9.  % is 
probably OK as well.


#* are similar to !? in that they are inherently tied into the 
register class preferencing implementation and documenting them would 
be inadvisable.


Actually, #* are already doc'ed in the user guide.  Are you advising 
they be removed?


If so, the attached patch does this.  It also removes references to 
define_peephole2 and define_splits from the user guide version of this 
page.  There are other parts of this page that are more md than ug, but 
these are the ones that annoyed me the most.


dw

Original: https://gcc.gnu.org/onlinedocs/gcc/Modifiers.html
Proposed: http://limegreensocks.com/gcc/Modifiers.html

Index: md.texi
===
--- md.texi	(revision 229910)
+++ md.texi	(working copy)
@@ -1646,7 +1646,9 @@
 GCC can only handle one commutative pair in an asm; if you use more,
 the compiler may fail.  Note that you need not use the modifier if
 the two alternatives are strictly identical; this would only waste
-time in the reload pass.  The modifier is not operational after
+time in the reload pass.
+@ifset INTERNALS
+The modifier is not operational after
 register allocation, so the result of @code{define_peephole2}
 and @code{define_split}s performed after reload cannot rely on
 @samp{%} to make the intended insn match.
@@ -1665,7 +1667,6 @@
 @samp{*} additionally disparages slightly the alternative if the
 following character matches the operand.
 
-@ifset INTERNALS
 Here is an example: the 68000 has an instruction to sign-extend a
 halfword in a data register, and can also sign-extend a value by
 copying it into an address register.  While either kind of register is

Re: inline asm and multi-alternative constraints

2015-11-11 Thread David Wohlferd




On 11/9/2015 2:03 AM, Richard Earnshaw wrote:

On 09/11/15 09:57, Richard Earnshaw wrote:

On 07/11/15 09:23, Segher Boessenkool wrote:

On Fri, Nov 06, 2015 at 11:50:40PM -0800, David Wohlferd wrote:

The same goes for some constraints and almost all output modifiers.

Are you suggesting more doc changes?  Looking thru the pages you reference:

- Starting with 'modifiers', "=+&" and (reluctantly) "%" seem reasonable
for inline asm.  But both "#*" seem sketchy.

Output modifiers, not constraint modifiers -- things like "%X0" in
the output template.  Many are only useful in the machine description,
but some (like that 'X' for rs6000) are vital for asm as well.


They're not just useful, they're essential on AArch64 and ARM.  They're
needed, for example, to get the 32/64-bit register sizing correct.


On the other hand, we have %S on ARM which cannot ever be used from
inline assembler: it matches the result of a match_operator rule with
complex internal structure that could never be generated from user code.

R.


I don't know enough about ARM to be comfortable making a change here 
myself.  But dropping this out of the docs is simple enough.  All you 
need to do is wrap it with


@ifset INTERNALS
@end ifset

Then it will still show up in the Machine Description section, but not 
the users' guide.


dw

Re: basic asm and memory clobbers

2015-11-15 Thread David Wohlferd


On 11/9/2015 1:32 AM, Segher Boessenkool wrote:

On Sun, Nov 08, 2015 at 04:10:01PM -0800, David Wohlferd wrote:

It seems like a doc update is what is needed to close PR24414 (Old-style
asms don't clobber memory).

What is needed to close the bug is to make the compiler work properly.


The question of course is, what does 'properly' mean?  My assertion is 
that 10 years on, 'properly' means whatever it's doing now. Changing it 
at this point will probably break more than it fixes, and (as you said) 
there is a plausible work-around using extended asm.


So while this bug could be resolved as 'invalid' (since the compiler is 
behaving 'properly'), I'm thinking to split the difference and 'fix' it 
with a doc patch that describes the supported behavior.



Whether that means clobbering memory or not, I don't much care -- with
the status quo, if you want your asm to clobber memory you have to use
extended asm; if basic asm is made to clobber memory, if you want your
asm to *not* clobber memory you have to use extended asm (which you
can with no operands by writing e.g.  asm("bork" : );  ).  So both
behaviours are available whether we make a change or not.

But changing things now will likely break user code.


  Safely accessing C data and calling functions from basic @code{asm} is more
-complex than it may appear. To access C data, it is better to use extended
+complex than it may appear. To access C data (including both local and
+global register variables), use extended
  @code{asm}.

I don't think this makes things clearer.  Register vars are described
elsewhere already;


The docs for local register variables describe this limitation.  But 
globals does not.


Whether this information belongs in local register, global register, 
basic asm, or all 3 depends on which section of the docs users will be 
reading when they need to know this information.



if you really think it needs mentioning here, put
it at the end (in its own sentence), don't break up this sentence.


Ok.


(dot space space).


+Basic @code{asm} statements are not treated as though they used a "memory"
+clobber, although they do implicitly perform a clobber of the flags
+(@pxref{Clobbers}).

They do not clobber the flags.  Observe:


Ouch.  i386 shows the same thing for basic asm.

Having to preserve the flags is ugly, but since that's the behavior, 
let's write it down.



===
void f(int a)
{
 a = a >> 2;
 if (a <= 0)
 asm("OHAI");
 if (a >= 0)
 asm("OHAI2");
}
===

Compiling this for powerpc gives (-m32, edited):

f:
 srawi. 9,3,2# this sets cr0
 ble 0,.L5   # this uses cr0
.L2:
 OHAI2
 blr
 .p2align 4,,15
.L5:
 OHAI
 bnelr 0 # this uses cr0
 b .L2

which shows that CR0 (which is "cc") is live over the asm.  So are all
other condition regs.

It is true for cc0 targets I guess, but there aren't many of those left.


Also, there is no implicit clobbering of registers,
+so any registers changed must be restored to their original value before
+exiting the @code{asm}.

One of the important uses of asm is to set registers GCC does not know
about, so you might want to phrase this differently.


Ahh, good point.

What would you say to "general purpose registers?"

Update attached.

dw
Index: extend.texi
===
--- extend.texi	(revision 229910)
+++ extend.texi	(working copy)
@@ -7353,8 +7353,9 @@
 @end itemize
 
 Safely accessing C data and calling functions from basic @code{asm} is more 
-complex than it may appear. To access C data, it is better to use extended 
-@code{asm}.
+complex than it may appear. To access C data use extended @code{asm}. Do
+not attempt to directly access local or global register variables from
+within basic @code{asm} (@pxref{Explicit Register Variables}).
 
 Do not expect a sequence of @code{asm} statements to remain perfectly 
 consecutive after compilation. If certain instructions need to remain 
@@ -7376,6 +7377,11 @@
 visibility of any symbols it references. This may result in GCC discarding 
 those symbols as unreferenced.
 
+Basic @code{asm} statements are not treated as though they used a "memory"
+clobber (@pxref{Clobbers}). Also, neither the flags nor the general-purpose
+registers are clobbered, so any changes must be restored to their original
+value before exiting the @code{asm}.
+
 The compiler copies the assembler instructions in a basic @code{asm} 
 verbatim to the assembly language output file, without 
 processing dialects or any of the @samp{%} operators that are available with

Re: basic asm and memory clobbers

2015-11-16 Thread David Wohlferd


On 11/16/2015 1:29 PM, Jeff Law wrote:

On 11/15/2015 06:23 PM, David Wohlferd wrote:

On 11/9/2015 1:32 AM, Segher Boessenkool wrote:

On Sun, Nov 08, 2015 at 04:10:01PM -0800, David Wohlferd wrote:
It seems like a doc update is what is needed to close PR24414 
(Old-style

asms don't clobber memory).

What is needed to close the bug is to make the compiler work properly.


The question of course is, what does 'properly' mean?  My assertion is
that 10 years on, 'properly' means whatever it's doing now. Changing it
at this point will probably break more than it fixes, and (as you said)
there is a plausible work-around using extended asm.

So while this bug could be resolved as 'invalid' (since the compiler is
behaving 'properly'), I'm thinking to split the difference and 'fix' it
with a doc patch that describes the supported behavior.
I'd disagree.  A traditional asm has to be considered an opaque blob 
that read/write/clobber any register or memory location.


When I first encountered basic asm, my expectation was that of course it 
clobbers.  It HAS to, right?  But that said, let me give my best devil's 
advocate impersonation and ask: Why?


- There is no standard that says it must do this.
- I'm only aware of 1 person who has ever asked for this change. And the 
request has been deemed so unimportant it has languished for a very long 
time.
- There is a plausible work-around with extended asm, which (mostly) has 
clear semantics regarding clobbers.
- While the change probably won't introduce bad code, if it does it will 
be in ways that are going to be difficult to track down, in an area 
where few have the expertise to debug.
- Existing code that currently does things 'right' (ie push/pop any 
modified registers) will suddenly be doing things 'wrong,' or at least 
wastefully.
- Other than top-level asm, it seems like every existing basic asm will 
(probably) get a new performance penalty (memory usage + code size + 
cycles) to allow for situations they may already be handling correctly 
or that don't apply.


True, these aren't particularly compelling reasons to not make the 
change.  But I also don't see any compelling benefits to offset them.


For existing users, presumably they have already found whatever solution 
they need and will just be annoyed that they have to revisit their code 
to see the impact of this change.  Will they need to #if to ensure 
consistent performance/function between gcc versions?  For future users, 
they will have the docs telling them the behavior, and pointing them to 
the (now well documented) extended asm.  Where's the benefit?


If someone were proposing basic asm as a new feature, I'd absolutely be 
arguing that it should clobber everything.  Or I might argue that basic 
asm should only be allowed at top-level (where I don't believe 
clobbering matters?) and everything else should be extended asm so we 
KNOW what to clobber (hmm...).


But changing this so gcc tries (probably futilely) to emulate other 
implementations of asm...  That seems like a weak case to support a 
change to this long-time behavior.  Unless there are other benefits I'm 
just not seeing?


--
Ok, that's my best shot.  You have way more expertise and experience 
here than I do, so I expect that after you think it over, you'll make 
the right call.  And despite my attempt here to defend the opposite 
side, I'm not entirely sure what the right call is.  But these seem like 
the right questions.


Either way, let me know if I can help.

It's also the case that assuming an old style asm can read or clobber 
any memory location is the safe, conservative thing to do. 


Well, safe-r.  Even if you make this change, embedding basic asm in C 
routines still seems risky.  Well, riskier than extended which is risky 
enough.


So the right thing in my mind is to ensure that behaviour 


The right thing in my mind is to find ways to prod people into using 
extended asm instead of basic.  Then they explicitly specify their 
requirements rather than depending on clunky all-or-nothing defaults.


Maybe to the extent of gcc deprecating (non-top level) basic over time 
(-fallow-basic-asm=[none|top|any] where v6 defaults to 'any' and v7 
defaults to 'top').  I'd be surprised if gcc went this way, but that 
doesn't mean it wouldn't be better.



and document it.


and to document it.


Andrew's logic is just plain wrong in that BZ.





Whether that means clobbering memory or not, I don't much care -- with
the status quo, if you want your asm to clobber memory you have to use
extended asm; if basic asm is made to clobber memory, if you want your
asm to *not* clobber memory you have to use extended asm (which you
can with no operands by writing e.g.  asm("bork" : );  ).  So both
behaviours a

Re: basic asm and memory clobbers

2015-11-19 Thread David Wohlferd




Unless there are other benefits I'm just not seeing?
When we fix 24414 by honoring the "uses/clobbers all hard registers 
and memory" semantics for old-style asms, those old-style asms will be 
*less* likely to cause problems in the presence of ever-improving 
optimization techniques.


Ok, this is a good point.  In fact, it may resolve existing problems 
that people don't know they have.


However I still have concerns that people might be surprised by the 
change in behavior.  Looking thru the linux kernel source (a significant 
collection of inline asm containing both basic (~878) and extended 
(4833) statements), it seems there are places where they really are 
going to want the "clobber nothing" semantics.


For that reason, I'd like to propose adding 2 new clobbers to extended 
asm as part of this work:


"clobberall" - This gives extended the same semantics as whatever the 
new basic asm will be using.

"clobbernone" - This gives the same semantics as the current basic asm.

Clobbernone may seem redundant, since not specifying any clobbers should 
do the same thing.  But actually it doesn't, at least on i386.  At 
present, there is no way for extended asm to not clobber "cc".  I don't 
know if other platforms have similar issues.


When basic asm changes, I expect that having a way to "just do what it 
used to do" is going to be useful for some people.



Either way, let me know if I can help.
About the only immediate task would be to ensure that the 
documentation for traditional asms clearly documents the desired 
semantics and somehow note that there are known bugs in the 
implementation (ie 24414, handling of flags registers, and probably 
other oddities)


Given that gcc is at phase 3, I'm guessing this work won't be in v6?  Or 
would this be considered "general bugfixing?"


The reason I ask is I want to clearly document what the current behavior 
is as well as informing them about what's coming.  If this isn't 
changing until v7, the text can be updated then to reflect the new behavior.



And I suspect it's still a lot less intrusive than you might think.


I tried to picture the most basic case I can think of that uses 
something clobber-able:


   for (int x=0; x < 1000; x++)
  asm("#stuff");

This generates very simple and highly performant code:

movl$1000, %eax
.L2:
#stuff
subl$1, %eax
jne .L2

Using extended asm to simulate the clobberall gives:

movl$1000, 44(%rsp)
.L2:
#stuff
subl$1, 44(%rsp)
jne .L2

It allocates an extra 4 bytes, and changed everything to memory accesses 
instead of using a register.  Obviously not a huge performance impact on 
this tiny sample, but it does suggest to me that sometimes there could be.


My point being simply that people may want the old behavior, so we need 
to be sure there's a way to get it (ie "clobbernone").



+Basic @code{asm} statements are not treated as though they used a

"memory"
+clobber, although they do implicitly perform a clobber of the flags
+(@pxref{Clobbers}).

They do not clobber the flags.  Observe:


Ouch.  i386 shows the same thing for basic asm.

Sadly, I suspect this isn't consistent across targets.


Bigger ouch.  I'll follow up on this after the discussion about changing
basic asm is complete (which may render this moot).

It likely depends on how the target models the flags.


I'm not quite sure how to proceed here.  I'm pretty sure no one wants me 
to write "basic asm doesn't clobber flags, except that maybe it does on 
some (unspecified) platforms."


I've tried to follow the code, but without any particular success. I was 
hoping to see decode_reg_name_and_count (or decode_reg_name) being 
called from platform-specific routines and handling -3, but not so much.


Using users as beta testers is normally frowned upon (outside of 
Microsoft), but perhaps the solution here is to just say that it doesn't 
clobber flags (currently the most common case?), and update the docs if 
and when people complain?  Yes, that's bad, but saying nothing at all 
isn't any better.  And we know it's true for at least 2 platforms.


dw

Re: basic asm and memory clobbers

2015-11-20 Thread David Wohlferd


On 11/20/2015 2:17 AM, Andrew Haley wrote:

On 20/11/15 01:23, David Wohlferd wrote:

I tried to picture the most basic case I can think of that uses
something clobber-able:

 for (int x=0; x < 1000; x++)
asm("#stuff");

This generates very simple and highly performant code:

  movl$1000, %eax
.L2:
  #stuff
  subl$1, %eax
  jne .L2

Using extended asm to simulate the clobberall gives:

  movl$1000, 44(%rsp)
.L2:
  #stuff
  subl$1, 44(%rsp)
  jne .L2

It allocates an extra 4 bytes, and changed everything to memory accesses
instead of using a register.

Can you show us your code?  I get

xx:
movl$1000, %eax
.L2:
#stuff
subl$1, %eax
jne .L2
rep; ret

for

void xx() {
   for (int x=0; x < 1000; x++)
 asm volatile("#stuff" : : : "memory");
}

What you're describing looks like a bug: x doesn't have its address
taken.


The intent for 24414 is to change basic asm such that it will become 
(quoting jeff) "an opaque blob that read/write/clobber any register or 
memory location."  Such being the case, "memory" is not sufficient:


#define CLOBBERALL "eax", "ebx", "ecx", "edx", "r8", "r9", "r10", "r11", 
"r12", "r13", "r14", "r15", "edi", "esi", "ebp", "cc", "memory"


int main()
{
   for (int x=0; x < 1000; x++)
  asm("#":::CLOBBERALL);
}

dw

Re: basic asm and memory clobbers

2015-11-20 Thread David Wohlferd


On 11/19/2015 7:14 PM, Segher Boessenkool wrote:

On Thu, Nov 19, 2015 at 05:23:55PM -0800, David Wohlferd wrote:

For that reason, I'd like to propose adding 2 new clobbers to extended
asm as part of this work:

"clobberall" - This gives extended the same semantics as whatever the
new basic asm will be using.
"clobbernone" - This gives the same semantics as the current basic asm.

I don't think this is necessary or useful.  They are also awful names:
"clobberall" cannot clobber everything (think of the stack pointer),


I'm not emotionally attached to the names.  But providing the same 
capability to extended that we are proposing for basic doesn't seem so 
odd.  Shouldn't extended be able to do (at least) everything basic does?


My first thought is that it allows people to incrementally start 
migrating from (new) basic to extended (something I think we should 
encourage).  Or use it as a debug tool to see if the failure you are 
experiencing from your asm is due to a missing clobber.  Since the 
capability will already be implemented for basic, providing a way to 
access it from extended seems trivial (if we can agree on a name).


As you say, clobbering the stack pointer presents special challenges 
(although gcc has a specific way of dealing with stack register 
clobbers, see 52813).  This is why I described the feature as having 
"the same semantics as whatever the new basic asm will be using."



and "clobbernone" does clobber some (those clobbered by any asm),


Seems like a quibble.  Those other things (I assume you mean things like 
pipelining?) most users aren't even aware of (or they wouldn't be so 
eager to use inline asm in the first place).  Would it be more palatable 
if we called it "v5BasicAsmMode"?  "ClobberMin"?



Clobbernone may seem redundant, since not specifying any clobbers should
do the same thing.  But actually it doesn't, at least on i386.  At
present, there is no way for extended asm to not clobber "cc".  I don't
know if other platforms have similar issues.

Some do.  The purpose is to stay compatible with asm written for older
versions of the compiler.


Backward compatibility is important.  I understand that due to the cc0 
change in x86, existing code may have broken without always clobbering 
cc.  This was seen as the safest way to ensure that didn't happen.  
However no solution was/is available for people who correctly knew 
whether their asm clobbers the flags.


Mostly I'm ok with that.  All the ways that I can think of to try to 
re-allow people to start using the cc clobber are just not worth it.  I 
simply can't believe there are many cases where there's going to be a 
benefit.


But as I said: backward compatibility is important.  Providing a way for 
people who need/want the old basic asm semantics seems useful. And I 
don't believe we can (quite) do that without clobbernone.



When basic asm changes, I expect that having a way to "just do what it
used to do" is going to be useful for some people.

24414 says the documented behaviour hasn't been true for at least
fourteen years.  It isn't likely anyone is relying on that behaviour.


?

To my knowledge, there was no documentation of any sort about what basic 
asm clobbered until I added it.  But what people are (presumably) 
relying on is that whatever it did in the last version, it's going to 
continue to do that in the next.  And albeit with good intentions, we 
are planning on changing that.



but perhaps the solution here is to just say that it doesn't
clobber flags (currently the most common case?), and update the docs if
and when people complain?  Yes, that's bad, but saying nothing at all
isn't any better.  And we know it's true for at least 2 platforms.

Saying nothing at all at least is *correct*.


We don't know that saying "it doesn't clobber flags" is wrong either.  
All we know is that jeff said "I suspect this isn't consistent across 
targets."


But that's neither here nor there.  The real question is, if we can't 
say that, what can we say?


- If 24414 is going in v6, then we can doc that it does the clobber and 
be vague about the old behavior.
- If 24414 isn't going in v6, then what?  I suppose we can say that it 
can vary by platform.  We could even provide your sample code as a means 
for people to discover their platform's behavior.



It isn't necessary for users to know what registers the compiler
considers to be clobbered by an asm, unless they actually clobber
something in the assembler code themselves.


I'm not sure I follow.

If someone has code that uses a register, currently they must restore 
the value before exiting the asm or risk disaster.  So they might write 
asm("push eax ; DoSomethingWith eax ; pop eax").

Re: basic asm and memory clobbers

2015-11-20 Thread David Wohlferd


On 11/20/2015 3:14 AM, Andrew Haley wrote:

On 20/11/15 10:37, David Wohlferd wrote:

The intent for 24414 is to change basic asm such that it will become
(quoting jeff) "an opaque blob that read/write/clobber any register or
memory location."  Such being the case, "memory" is not sufficient:

#define CLOBBERALL "eax", "ebx", "ecx", "edx", "r8", "r9", "r10", "r11",
"r12", "r13", "r14", "r15", "edi", "esi", "ebp", "cc", "memory"

Hmm.  I would not be at all surprised to see this cause reload
failures.  You certainly shouldn't clobber the frame pointer on
any machine which needs one.


If I don't clobber ebp, gcc just uses it:

movl$1000, %ebp
.L2:
#
subl$1, %ebp
jne .L2

The original purpose of this code was to attempt to show that this kind 
of "clobbering everything" behavior (the proposed new behavior for basic 
asm) could have non-trivial impact on existing routines. While I've been 
told that changing the existing "clobber nothing" approach to this kind 
of "clobber everything" is "less intrusive than you might think," I'm 
struggling to believe it.  It seems to me that one asm("nop") thrown 
into a driver routine to fix a timing problem could end up making a real 
mess.


But actually we're kind of past that.  When Jeff, Segher, (other) Andrew 
and Richard all say "this is how it's going to work," it's time for me 
to set aside my reservations and move on.


So now I'm just trying my best to make sure that if it *is* an issue, 
people have a viable solution readily available.  And to make sure it's 
all correctly doc'ed (which is what started this whole mess).


dw

Re: basic asm and memory clobbers

2015-11-20 Thread David Wohlferd


On 11/20/2015 8:14 AM, Richard Henderson wrote:

On 11/20/2015 04:34 PM, Jakub Jelinek wrote:

Isn't that going to break too much code though?  I mean, e.g. including
libgcc...


I don't know.  My suspicion is very little.

But that's actually what I'd like to know before we start adjusting 
code in other ways wrt basic asms.


I can provide a little data here.

In an effort to gain some perspective, I've been looking at inline asm 
usage in the linux kernel (4.3).  Clearly this isn't "typical usage," 
but it is probably one of the biggest users of inline asm, and likely 
has the best justifications for doing so (being an OS and all).


There are ~5,711 instances of inline asm in use.  Of those, ~4,833 are 
extended and ~878 are basic.


I don't have any numbers about how many are top level vs in function, 
but let me see what I can do.


A quick look at libgcc shows that there are 109 extended and 45 basic 
asm statements.  I'll see how many end up being top-level, but it looks 
like most of them.


dw

Re: basic asm and memory clobbers

2015-11-21 Thread David Wohlferd


On 11/20/2015 3:55 PM, David Wohlferd wrote:

On 11/20/2015 8:14 AM, Richard Henderson wrote:

On 11/20/2015 04:34 PM, Jakub Jelinek wrote:

Isn't that going to break too much code though?  I mean, e.g. including
libgcc...


I don't know.  My suspicion is very little.

But that's actually what I'd like to know before we start adjusting 
code in other ways wrt basic asms.


I can provide a little data here.

In an effort to gain some perspective, I've been looking at inline asm 
usage in the linux kernel (4.3).  Clearly this isn't "typical usage," 
but it is probably one of the biggest users of inline asm, and likely 
has the best justifications for doing so (being an OS and all).


There are ~5,678 instances of inline asm in use.  Of those, ~4,833 are 
extended and ~845 are basic.


I don't have any numbers about how many are top level vs in function, 
but let me see what I can do.


Ok, the news here is mixed.  Of those 845:

- Only 50 of them look like top level asm.  I was hoping for more.
- 457 are in 6 files in the lib/raid6 directory, so there's a bunch that 
can be done quickly.
- That leaves 338 miscellaneous other uses spread throughout some 200 
files across multiple platforms.  That seems like a lot.


Despite the concerns expressed by Jeff about the difficulties in 
changing from basic to extended, it looks to me like they don't need any 
conversion (other s/%/%%/).  Adding the trailing colon should be 
sufficient to provide the semantics they have now, which apparently is 
deemed sufficient.


A quick look at libgcc shows that there are 109 extended and 45 basic 
asm statements.  I'll see how many end up being top-level, but it 
looks like most of them.


Of the 45 basic asm statements, only 9 aren't top-level.  They all 
appear to be trivial to change to extended.


To sum up:

- Some projects (like libgcc) are going to be simple to update. Maybe an 
hour's work all told.
- Some projects (like testsuite) are going to take longer.  While the 
changes are mostly straight-forward, the number of files involved will 
be a factor.
- I took a look at the Mingw-w64 project.  It has ~20 non-top level 
asms, so also pretty simple to update.
- But some projects (like linux kernel) are going to be more 
challenging.  Not so much making the changes (although that will take a 
while) as convincing yourself that the change was harmless and still 
compiles on all supported platforms.


Yes, this represents a very limited sample.  And one weighted towards 
projects that are more likely to use asm.  But it does give some sense 
of the scale.


So, what now?

While I'd like to take the big step and start kicking out warnings for 
non-top-level right now, that may be too bold for phase 3.  A more 
modest step for v6 would just provide a way to find them (maybe 
something like -Wnon-top-basic-asm or -Wonly-top-basic-asm) and doc the 
current behavior as well as the upcoming change.


Adding the warning is not something I can do.  But I'll create the doc 
patch once someone confirms that this is the plan.


dw

Re: basic asm and memory clobbers

2015-11-21 Thread David Wohlferd

On 11/19/2015 5:53 PM, Sandra Loosemore wrote:

On 11/19/2015 06:23 PM, David Wohlferd wrote:

About the only immediate task would be to ensure that the
documentation for traditional asms clearly documents the desired
semantics and somehow note that there are known bugs in the
implementation (ie 24414, handling of flags registers, and probably
other oddities)

Given that gcc is at phase 3, I'm guessing this work won't be in v6? Or
would this be considered "general bugfixing?"

The reason I ask is I want to clearly document what the current behavior
is as well as informing them about what's coming. If this isn't
changing until v7, the text can be updated then to reflect the new
behavior.

Documentation fixes are accepted all the way through Stage 4, since
there's less risk of introducing regressions in user programs from
accidental documentation mistakes than code errors.

The code change isn't yet finalized. I'm hoping to doc something
vaguely like:

"basic asm (other than at top level) is being deprecated because blah> potentially unsafe due to optimizations . You can
locate the statements that will no longer be supported using
-Wonly-top-basic-asm. Change them to use extended asm instead."

What's your take on having the user guide link to the gcc wiki? If we
do make this change, I'd kinda like to create a "how to convert basic
asm to extended." But it doesn't seem like a good fit for the user
docs. But if the user docs don't reference the wiki, I doubt anyone
would ever find it.

OTOH, I'd discourage adding anything to the docs about anticipated
changes in future releases, except possibly to note that certain
features or behavior are deprecated and may be removed in future
releases (with a suggestion about what you should do instead).

I'd love to see the doc folks make a pass and remove every "some day
this won't work" text that doesn't include this. If there is no way for
users to prepare, you aren't helping.

And remove all the "some day there might be a new feature" stuff too.
It just wastes users' time trying to figure out if "some day" has
arrived yet. And it makes them cry when the new feature, which is
exactly what they need, isn't there yet.

We've already got too many "maybe someday this will be fixed" notes in
the manual that are not terribly useful to users.

You'd get my vote to remove them all. If I got a vote.

Re: basic asm and memory clobbers

2015-11-23 Thread David Wohlferd


On 11/23/2015 2:04 AM, Andrew Haley wrote:

On 21/11/15 12:56, David Wohlferd wrote:

So, what now?

While I'd like to take the big step and start kicking out warnings for
non-top-level right now, that may be too bold for phase 3.  A more
modest step for v6 would just provide a way to find them (maybe
something like -Wnon-top-basic-asm or -Wonly-top-basic-asm) and doc the
current behavior as well as the upcoming change.

Warnings would be good.


Richard's suggestion was:

> I'm suggesting that we don't accept [basic asm] at all inside a 
function.  One must audit the source and make a conscious decision to 
write asm("bla" : ); instead.  Accepting basic asm outside of a function 
is perfectly ok.


I'm really not a compiler-writer, but I've taken a shot at implementing 
this.  While Richard is talking about completely deprecating this 
feature (a direction I support), I've started by emitting warnings, and 
by having the warnings disabled by default. This allows people to 
experiment with the new direction without getting clobbered by it.  My 
intent is something like this:


-Wonly-top-basic-asm
Warn if basic @code{asm} statements are used inside a function (ie not
at file scope/top level).  Due to the potential for unsafe optimizations,
always use extended instead of basic asm inside functions.  This
warning is disabled by default and is not enabled by -Wall or -Wextra.

I probably won't include the bits about Wall or Wextra in the actual doc 
patch.  They're here more to provoke comments in case someone thinks 
this behavior should change.  I'm open to suggestions about alternate 
names, too.


I've got this working for both c and c++.  It doesn't affect other 
places that use "asm" like "explicit register variables," "asm labels," 
"extended asm" or "top level basic asm."  It is also pleasantly small.  
As written, it should be useful to find places in current code that are 
at risk.


However (there's always a 'however'), it doesn't correctly handle 
"naked" functions (ie __attribute__((naked)) ).  By definition, naked 
functions can *only* include basic asm 
(https://gcc.gnu.org/ml/gcc/2014-05/msg00172.html).  So generating a 
warning for them is incorrect.


I'll need help fixing that.

I don't know if it is possible from within the parsers (where my current 
code is being added) to walk back up and get the attributes for the 
function.  I assume not.  In that case, I'll need some help finding some 
place up the call stack where you can.  Suggestions welcome.


The patch is at http://www.LimeGreenSocks.com/gcc/24414f.zip and 
includes test code.



My warning still holds: there are modes of compilation on some
machines where you can't clobber all registers without causing reload
failures.  This is why Jeff didn't fix this in 1999.  So, if we really
do want to clobber "all" registers in basic asm it'll take a lot of
work.


I was always reluctant to see this change made.  In addition to the 
issues you mention, I had questions about the impact on the surrounding 
code.  I like Richard's direction much better.  We can start with a 
disabled warning, then upgrade as seems warranted.


dw

Re: basic asm and memory clobbers

2015-11-23 Thread David Wohlferd


On 11/23/2015 12:37 PM, Jeff Law wrote:

On 11/23/2015 03:04 AM, Andrew Haley wrote:

On 21/11/15 12:56, David Wohlferd wrote:

So, what now?

While I'd like to take the big step and start kicking out warnings for
non-top-level right now, that may be too bold for phase 3.  A more
modest step for v6 would just provide a way to find them (maybe
something like -Wnon-top-basic-asm or -Wonly-top-basic-asm) and doc the
current behavior as well as the upcoming change.


Warnings would be good.

My warning still holds: there are modes of compilation on some
machines where you can't clobber all registers without causing reload
failures.  This is why Jeff didn't fix this in 1999.  So, if we really
do want to clobber "all" registers in basic asm it'll take a lot of
work.
Exactly.  In retrospect, I probably should have generated more tests 
for those conditions back in '99.  Essentially they'd document a class 
of problems we'd like to fix over time.


I know some have been addressed in various forms, but it hasn't been 
systematic.


My recommendation here is to:

  1. Note in the docs what the behaviour should be.  This guides where
  we want to go from an implementation standpoint.  I think it'd be fine
  to *suggest* only using old style asms at the toplevel, but I'm less
  convinced that mandating that restriction is wise.


I hear your concerns about mandating this.  Perhaps starting by 
providing an option to find them, then (someday) enabling that option by 
default?



  2. As we come across failures for adhere to the desired behaviour,
  fix or document them as known inconsistencies.  If we find that some
  are inherently un-fixable, then we'll need to tighten the docs around
  those.


It's your expectation that extended asm won't be sufficient to resolve 
these issues?


The more I think about it, I'm just not keen on forcing all those 
old-style asms to change.


If you mean you aren't keen to change them to "clobber all," I'm with 
you.  If you are worried about changing them from basic to extended, 
what kinds of problems do you foresee?  I've been reading a lot of basic 
asm lately, and it seems to me that most of it would be fine with a 
simple colon.  Certainly no worse than the current behavior.


dw

Re: basic asm and memory clobbers

2015-11-23 Thread David Wohlferd


On 11/23/2015 1:44 PM, paul_kon...@dell.com wrote:

On Nov 23, 2015, at 4:36 PM, David Wohlferd  wrote:

...

The more I think about it, I'm just not keen on forcing all those old-style 
asms to change.

If you mean you aren't keen to change them to "clobber all," I'm with you.  If 
you are worried about changing them from basic to extended, what kinds of problems do you 
foresee?  I've been reading a lot of basic asm lately, and it seems to me that most of it 
would be fine with a simple colon.  Certainly no worse than the current behavior.

I'm not sure.  I have some asm("sync") which I think assume that this means 
asm("sync"::"memory")


Another excellent reason to nudge people towards using extended asm.  If 
you saw asm("sync":::"memory"), you would *know* what it did, without 
having to read the docs (which don't say anyway).


I'm pretty confident that asm("") doesn't clobber memory on i386, but 
maybe that behavior is platform-specific.  Since i386 doesn't have 
"sync", I assume you are on something else?


If you have a chance to experiment, I'd love confirmation from other 
platforms that asm("blah") is the same as asm("blah":).  Feel free to 
email me off list to discuss.


dw

Re: basic asm and memory clobbers

2015-11-24 Thread David Wohlferd

On 11/24/2015 8:58 AM, paul_kon...@dell.com wrote:

On Nov 23, 2015, at 8:39 PM, David Wohlferd wrote:

On 11/23/2015 1:44 PM, paul_kon...@dell.com wrote:

On Nov 23, 2015, at 4:36 PM, David Wohlferd wrote:

...

The more I think about it, I'm just not keen on forcing all those old-style
asms to change.

If you mean you aren't keen to change them to "clobber all," I'm with you. If
you are worried about changing them from basic to extended, what kinds of problems do you
foresee? I've been reading a lot of basic asm lately, and it seems to me that most of it
would be fine with a simple colon. Certainly no worse than the current behavior.

I'm not sure. I have some asm("sync") which I think assume that this means
asm("sync"::"memory")

Another excellent reason to nudge people towards using extended asm. If you saw
asm("sync":::"memory"), you would *know* what it did, without having to read
the docs (which don't say anyway).

I'm pretty confident that asm("") doesn't clobber memory on i386, but maybe that behavior
is platform-specific. Since i386 doesn't have "sync", I assume you are on something else?

Yes, MIPS.

If you have a chance to experiment, I'd love confirmation from other platforms that
asm("blah") is the same as asm("blah":). Feel free to email me off list to
discuss.

I'm really concerned with loosening the meaning of basic asm. I wish I could
find the documentation that says, or implies, that it is a memory clobber.
And/or that it is implicitly volatile.

The problem is that it's clear from existing code that this assumption was made, and that
defining it otherwise would break such code. For example, the code I quoted clearly
won't work if stores are moved across the asm("sync").

Given the ever improving optimizers, these things are time bombs -- code that
worked for years might suddenly break when the compiler is upgraded.

If such breakage is to be done, it must at least come with a warning (which
must default to ON). But I'd prefer to see the more conservative approach
(more clobbers) taken.

It looks like Ian has already addressed most of your concerns. Just to
emphasize one point:

The current behavior of basic asm is to NOT clobber memory. So your
existing code that performs asm("sync") is already at risk.

The 'fix' I am proposing is to give warnings for every use of basic asm
inside functions (top-level asm is not a problem). Users should change
all such code to use extended so that they can (must) explicitly specify
what their asm clobbers (if anything). Armed with this information, the
optimizers can do their work safely. And people maintaining the code
can finally be clear about what the asm really does.

And the code change should be simple. They can get the same "clobber
nothing" behavior that basic asm has always performed by simply adding a
colon on the end ( asm("sync":) ). Or they can use any of extended
asm's features to get different behavior ( asm("sync":::"memory") ).

Or to put that another way, correctly written basic asm can be converted
to extended by just adding a colon. Incorrect code may require a bit
more work.

basic asm and memory clobbers - Proposed solution

2015-11-24 Thread David Wohlferd

I have solved the problem with my previous patch.  Here's the update 
(feedback welcome): http://www.LimeGreenSocks.com/gcc/24414g.zip


Based on my understanding from the previous thread, this patch now does 
what it needs to do (code-wise) to resolve this "basic asm and memory 
clobbers" issue.  As mentioned previously, this patch introduces a new 
warning (-Wonly-top-basic-asm), which is disabled by default.  When 
enabled, it triggers a warning for any basic asm inside a function, 
unless the function has the "naked" attribute.


An argument can be made that the default for this warning should be 
'enabled.'  Yes, this will break builds that use basic asm and -Werror, 
but it can easily be disabled with -Wno-only-top-basic-asm.  And if we 
don't enable it, no one is going to know the check is even available.  
Then hidden problems like the one Paul was just describing still won't 
be found, and optimizations will continue to have unexpected side 
effects.  OTOH, I can also see changing this to 'enabled' as more 
appropriate in the next phase 1.


Now that I'm done with the code fix, I'm working on an update to the 
docs.  Obviously they should be checked in as part of the code fix. I'm 
planning to actually use the word "deprecated" when describing the use 
of basic asm within functions.  Seems like a big step.


But there's no point in my proceeding any further until someone in 
authority agrees that this is the desired solution.  I'm not actually 
sure who that is, but further work is a waste of time if no one is 
prepared to approve it.


If you are that person, the questions to be answered are:

1) Is the idea of changing basic asm to "clobber everything" dead?
2) Is creating a warning for the use of "basic asm inside a function" 
the solution for this issue?

3) Should the warning be enabled by default in v6?
4) Should the warning be enabled by Wall or Wextra?
5) Should the v6 docs explicitly describe using "basic asm inside a 
function" as deprecated?


If you're looking for my recommendations, I would say: Yes, Yes, 
(reluctantly) No, No and Yes.


With this information in hand, I'll take a shot at finishing this off.

For something that started out as a simple 3 sentence doc patch, this 
sure has turned into a project...


dw

Re: basic asm and memory clobbers

2015-11-26 Thread David Wohlferd


On 11/26/2015 8:26 AM, Hans-Peter Nilsson wrote:

On Thu, 26 Nov 2015, Segher Boessenkool wrote:

On Thu, Nov 26, 2015 at 05:30:48AM -0500, Hans-Peter Nilsson wrote:

On Fri, 20 Nov 2015, Richard Henderson wrote:

I'd be perfectly happy to deprecate and later completely remove basic asm
within functions.

We've explictly promised (directed to kernel people IIRC) that
the empty basic asm; 'asm ("")', has forward-compatible
outlining magic, so people would not have to keep adding
ever-new attributes like noinline,noclone to avoid finding their
functions "spilling over", so please exclude that.  To wit, from
extend.texi:

@item noinline
@cindex @code{noinline} function attribute
This function attribute prevents a function from being considered for
inlining.
@c Don't enumerate the optimizations by name here; we try to be
@c future-compatible with this mechanism.
If the function does not have side-effects, there are optimizations
other than inlining that cause function calls to be optimized away,
although the function call is live.  To keep such calls from being
optimized away, put
@smallexample
asm ("");
@end smallexample

@noindent
(@pxref{Extended Asm}) in the called function, to serve as a
special
side-effect.

Any asm without outputs (including basic asm) is volatile, so it can
not be optimised away.  Putting an empty asm in a noinline function
should always work as you want, it's not because it is basic asm.

I know, the point is that we've promised the above and we
shouldn't back down on this mechanism to instead deprecate and
warn about it, as seems to be the direction.


That is indeed the direction I am heading unless someone stops me.

I was not aware of this magic.  Thanks for pointing it out.  To be 
clear, wouldn't asm("":) have the same effect?


Since we have committed to this, we don't want to change it lightly.  
Also, looking at existing inline asm, this is probably one of the most 
common uses.  Allowing this exception will probably ease the migration, 
even if it makes the docs a little clunkier.


So how about this:

1) Change the docs to say asm("":), so future coders will do "the right 
thing."

2) Allow the exception for v6.
3) Re-evaluate if-and-when we continue with the deprecation process.


This happens to be expressed as a basic asm and was chosen
because the right things happen regarding backwards-
compatibility.  More care has to be taken for forward-
compatibility.  I'm a bit worried that it still works only
because of happenstance.

I know what (I hope) you're thinking but it's hard to produce a
test-case that fails when a future yet unknown optimization
causes "spill-over", but I can imagine a quick return-path
seeming inlinable in LTO.


dw

Re: basic asm and memory clobbers

2015-11-26 Thread David Wohlferd


>> To be clear, wouldn't asm("":) have the same effect?
>
> That does not matter.  It'd require source-code changes to
> users' code.

My suggestion was to allow the exception to the "basic asm in a 
function" warning, but change the docs to show using the new syntax.  
This does not require any user code change.  But it kinda does matter 
whether or not it will work.


Copy/pasting my suggestion:

1) Change the docs to say asm("":), so future coders will do "the right 
thing."

2) Allow the exception for v6.
3) Re-evaluate if-and-when we continue with the deprecation process.

It may not be clear from this, but I don't expect step 3 to happen until 
at least v7.


And saying "the right thing" may be a bit flip.  But if deprecating 
basic asm is the path we are choosing, then telling users to use this 
syntax is how the text should read.  Assuming it works.


> BTW, does that syntax work for really olden gcc
> (say 2.95 era)?

The oldest docs I can find are 2.95.3, and they talk about the extended 
asm syntax, so I assume so: 
https://gcc.gnu.org/onlinedocs/gcc-2.95.3/gcc_4.html#SEC93


> Either way, we promised 'asm("")'.  I'm pretty
> sure the empty string is identifiable.

It is.  I've already updated my code to support this exception.

dw

Re: basic asm and memory clobbers - Proposed solution

2015-11-27 Thread David Wohlferd


On 11/24/2015 6:50 PM, David Wohlferd wrote:
I have solved the problem with my previous patch. Here's the update 
(feedback welcome): http://www.LimeGreenSocks.com/gcc/24414g.zip


Based on my understanding from the previous thread, this patch now 
does what it needs to do (code-wise) to resolve this "basic asm and 
memory clobbers" issue.  As mentioned previously, this patch 
introduces a new warning (-Wonly-top-basic-asm), which is disabled by 
default.  When enabled, it triggers a warning for any basic asm inside 
a function, unless the function has the "naked" attribute.


An argument can be made that the default for this warning should be 
'enabled.'  Yes, this will break builds that use basic asm and 
-Werror, but it can easily be disabled with -Wno-only-top-basic-asm.  
And if we don't enable it, no one is going to know the check is even 
available.  Then hidden problems like the one Paul was just describing 
still won't be found, and optimizations will continue to have 
unexpected side effects. OTOH, I can also see changing this to 
'enabled' as more appropriate in the next phase 1.


Now that I'm done with the code fix, I'm working on an update to the 
docs.  Obviously they should be checked in as part of the code fix. 
I'm planning to actually use the word "deprecated" when describing the 
use of basic asm within functions.  Seems like a big step.


But there's no point in my proceeding any further until someone in 
authority agrees that this is the desired solution.  I'm not actually 
sure who that is, but further work is a waste of time if no one is 
prepared to approve it.


If you are that person, the questions to be answered are:

1) Is the idea of changing basic asm to "clobber everything" dead?
2) Is creating a warning for the use of "basic asm inside a function" 
the solution for this issue?

3) Should the warning be enabled by default in v6?
4) Should the warning be enabled by Wall or Wextra?
5) Should the v6 docs explicitly describe using "basic asm inside a 
function" as deprecated?


If you're looking for my recommendations, I would say: Yes, Yes, 
(reluctantly) No, No and Yes.


With this information in hand, I'll take a shot at finishing this off.

For something that started out as a simple 3 sentence doc patch, this 
sure has turned into a project...


I have incorporated the feedback from Hans-Peter Nilsson, who pointed 
out that the 'noinline' function attribute explicitly documents behavior 
related to using asm("").  The patch now includes an exception for 
this.  Given how often this bit of code is used in various projects, 
allowing this exception will surely ease the transition.


I'm told that some people won't review patches unless they are included 
as attachments, so this time the patch is attached.


The patch includes first cut docs for the warning message, but I'm still 
hoping to hear from someone before trying to update the basic asm docs.


dw
Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt	(revision 230734)
+++ gcc/c-family/c.opt	(working copy)
@@ -585,6 +585,10 @@
 C++ ObjC++ Var(warn_namespaces) Warning
 Warn on namespace definition.
 
+Wonly-top-basic-asm
+C C++ Var(warn_only_top_basic_asm) Warning
+Warn on unsafe uses of basic asm.
+
 Wsized-deallocation
 C++ ObjC++ Var(warn_sized_deallocation) Warning EnabledBy(Wextra)
 Warn about missing sized deallocation functions.
Index: gcc/c/c-parser.c
===
--- gcc/c/c-parser.c	(revision 230734)
+++ gcc/c/c-parser.c	(working copy)
@@ -5880,7 +5880,18 @@
   labels = NULL_TREE;
 
   if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN) && !is_goto)
+  {
+/* Warn on basic asm used inside of functions, 
+   EXCEPT when in naked functions. Also allow asm(""). */
+if (warn_only_top_basic_asm && (TREE_STRING_LENGTH (str) != 1) )
+  if (lookup_attribute ("naked", 
+DECL_ATTRIBUTES (current_function_decl)) 
+== NULL_TREE)
+warning_at(asm_loc, OPT_Wonly_top_basic_asm, 
+   "asm statement in function does not use extended syntax");
+
 goto done_asm;
+  }
 
   /* Parse each colon-delimited section of operands.  */
   nsections = 3 + is_goto;
Index: gcc/cp/parser.c
===
--- gcc/cp/parser.c	(revision 230734)
+++ gcc/cp/parser.c	(working copy)
@@ -17594,6 +17594,8 @@
   bool goto_p = false;
   required_token missing = RT_NONE;
 
+  location_t asm_loc = cp_lexer_peek_token (parser->lexer)->location;
+
   /* Look for the `asm' keyword.  */
   cp_parser_require_keyword (parser, RID_ASM, RT_ASM);
 
@@ -17752,6 +17754,16 @@
 	  /

Re: AW: basic asm and memory clobbers - Proposed solution

2015-11-27 Thread David Wohlferd


On 11/27/2015 11:02 PM, Bernd Edlinger wrote:

Hi,


I just found this in the docs:

The compiler copies the assembler instructions in a basic @code{asm}
verbatim to the assembly language output file, without
processing dialects or any of the @samp{%} operators that are available with
extended @code{asm}. This results in minor differences between basic
@code{asm} strings and extended @code{asm} templates. For example, to refer to
registers you might use @samp{%eax} in basic @code{asm} and
@samp{%%eax} in extended @code{asm}.


So it might be good to warn about % in asm statements too, because changing
anything on the asm syntax can be is quite dangerous.


There are a few differences to be aware of between basic and extended, 
and yes that is one of them.


I'm putting together a "How to convert basic asm to extended asm" 
guide.  Not quite sure where to put it yet.  Doesn't really belong in 
the User Guide, but I worry no one would ever see it in the wiki.



And I wonder what the exact equivalence of asm("") is
in extended asm ("":::) or asm volatile ("":::) ?


From the docs: "asm statements that have no output operands, including 
asm goto statements, are implicitly volatile."  In other words, those 2 
are the same.



Well, I start to think that Jeff is right, and we should treat a asm ("") as if 
it
were asm volatile ("" ::: )


I believe Segher is already looking at this: 
https://gcc.gnu.org/ml/gcc/2015-11/msg00196.html



but if the asm ("nonempty with optional %") we should
treat it as asm volatile ("nonempty with optional %%" ::: "memory").
Our docs should say that explicitly, and the implementation should follow that
direction.


Bernd.

Re: basic asm and memory clobbers - Proposed solution

2015-11-29 Thread David Wohlferd




On 11/28/2015 10:30 AM, paul_kon...@dell.com wrote:

On Nov 28, 2015, at 2:02 AM, Bernd Edlinger  wrote:

...
Well, I start to think that Jeff is right, and we should treat a asm ("") as if 
it
were asm volatile ("" ::: ) but if the asm ("nonempty with optional %") we 
should
treat it as asm volatile ("nonempty with optional %%" ::: "memory").

I agree.  Even if that goes beyond the letter of what the manual has promised 
before, it is the cautious answer, and it matches expectations of a lot of 
existing code.


Trying to guess what people might have been expecting is a losing game.  
There is a way for people to be clear about what they want to clobber, 
and that's to use extended asm.  The way to clear up the ambiguity is to 
start deprecating basic asm, not to add to the confusion by changing its 
behavior after all these years.


And the first step to do that is to provide a means of finding them.  
That's what the patch at 
https://gcc.gnu.org/ml/gcc/2015-11/msg00198.html does.


Once they are located, people can decide for themselves what to do. If 
they favor the 'cautious' approach, they can change their asms to use 
:::"memory" (or start clobbering registers too, to be *really* safe).  
For people who require maximum backward compatibility and/or minimum 
impact, they can use :::.


Have you tried that patch?  How many warnings does it kick out for your 
projects?


dw

Re: basic asm and memory clobbers - Proposed solution

2015-12-01 Thread David Wohlferd

On 12/1/2015 8:08 AM, Richard Earnshaw wrote:
> Formatting nit: the '== NULL_TREE)' should line up with the start of
> 'lookup_attribute'.

> Same here.

Ok.  Other than that, how do we proceed here?

When pursuing a course to "deprecate and later completely remove basic 
asm within functions," I assume I need a global maintainer or two to 
sign off on this?

While Richard Henderson's post to that effect may have gotten lost in 
all the discussion (and my ultra-slow-motion roll out plan may have 
confused things further), that's what's meant by #5 on my "List of 
questions for a person in authority":

1) Is the idea of changing basic asm to clobber things dead?
2) Is creating a warning for the use of "basic asm inside a function" 
the solution for this issue?

3) Should the warning be enabled by default in v6?
4) Should the warning be enabled by -Wall or -Wextra?
5) Should the v6 docs explicitly describe using "basic asm inside a 
function" as deprecated?

Saying it's dead in the docs is the first step to making it dead in the 
code.  This patch just implements an optional warning (unless #3,4 crank 
it up to a default warning), but the intent is that eventually (v7? v8?) 
this turns into a fatal error.

One off-hand comment by someone (even a gm) doesn't seem quite enough to 
approve this.  And some guidance about how quickly we want to get there 
would also be useful.  I've been trying to do the work, but I could use 
some direction from someone who understands the gcc vision.

dw

Re: basic asm and memory clobbers - Proposed solution

2015-12-01 Thread David Wohlferd


On 12/1/2015 10:10 AM, Bernd Edlinger wrote:
> And a test case is missing too.
>
> I think this warning concentrates now only on basic asm.
> And people will be probably fix it in the most easy way,
> by just adding a colon.

Probably true.  At least I hope it's that easy for most people.

> But IMHO asm("bla":) isn't any better than asm("bla").
> I think _any_ asm with non-empty assembler string, that
> claims to clobber _nothing_ is highly suspicious, and worth to be
> warned about.  I don't see any exceptions from this rule.

There's one right now in the basic asm docs: asm("int $3");

And I've seen others: asm volatile ("nop"), asm(".byte 0xf1\n"). I've 
seen a bunch more, but you get the idea.


dw

Re: basic asm and memory clobbers - Proposed solution

2015-12-01 Thread David Wohlferd

On 11/30/2015 4:01 AM, Andrew Haley wrote:
>> There is a way for people to be clear about what they want to clobber,
>> and that's to use extended asm.  The way to clear up the ambiguity 
is to
>> start deprecating basic asm, not to add to the confusion by changing 
its

>> behavior after all these years.
>
> Well, I disagree.  The warning is good, but so is the memory clobber.
> They're not exclusive.

As Richard Henderson put it: "I'd be perfectly happy to deprecate and 
later completely remove basic asm within functions."

So my intent is that today's optional warning becomes tomorrow's default 
warning which eventually turns into a fatal error.  We could speed that 
up a bit by making it a default warning now and a fatal error as soon as 
we re-enter phase 1.  I just thought it might be better to update the 
docs now and delay the change to give more of a heads up.

Yes, we could pursue both clobbering and completely removing at the same 
time, but I don't think we should.  Since (currently) basic asm always 
has no clobbers describing how to update basic asm to extended is fairly 
simple: To get the same behavior in extended as you used to get from 
basic, just add a colon.  If some versions of gcc perform clobbers, then 
knowing how to convert gets way harder.

dw

Re: basic asm and memory clobbers - Proposed solution

2015-12-01 Thread David Wohlferd


On 12/1/2015 7:56 PM, Segher Boessenkool wrote:

On Tue, Dec 01, 2015 at 08:41:22PM -0700, Jeff Law wrote:

Isn't "asm" conditionally supported for ISO C++?  In which case it's not
mandatory and semantics are implementation defined.

Yes.


My strong preference is still to document the desired semantics for GCC
and treat anything that does not adhere to those semantics as a bug.

I agree, but I think the semantics should be what they currently are.
If we want a "clobbers everything" version we could make a new syntax
for that, so that not all current asm-without-operands has to pay the
hefty price.


I think a non-default warning is fine.  A default warning (or
-Wall/-Westra) is probably undesirable, though I'm still willing to be
convinced either way on that.

I don't think the warning is a good idea until we have decided what
direction we want this to go in.


If we are headed toward removing "basic asm in a function," then the 
warning (combined with deprecation in the docs) is a logical first 
step.  It lets people find and begin to fix this code with only a 
trivial change to v6.  But if that's not where we're headed, I don't see 
who would ever use it.


I also agree that we shouldn't change the semantics of basic asm to 
"clobber everything."  The coding for the change is not immediately 
clear, the backward compatibility issues are a challenge, and future 
moves to extended become much more complicated.  I understand that 
clobbering may fix problems (that no one is asking us to fix), but at a 
performance, maintenance or 'new bugs' price that I can't see how to 
justify.


I think requiring people to specify what their asm affects (aka extended 
asm) is the right answer.  But if removing "basic asm in a function," is 
not the direction we want to go, my vote is to just doc the current 
behavior.


dw

Re: AW: basic asm and memory clobbers - Proposed solution

2015-12-02 Thread David Wohlferd


On 12/2/2015 3:34 AM, Bernd Edlinger wrote:

Hi,


Surely in code like that, you would make "x" volatile?  Memory clobbers
are not a substitute for correct use of volatile accesses.

No,

It is as I wrote, a memory clobber is the only way to guarantee that
the asm statement is not move somewhere else.

I changed the example to use volatile and compiled it
with gcc-Version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)

volatile int x;
void
test()
{
x = 1;
asm volatile("nop");
x = 0;
}

gcc -S -O2 test.c gives:


test:
.LFB0:
.cfi_startproc
movl$1, x(%rip)
movl$0, x(%rip)
#APP
# 6 "test.c" 1
nop
# 0 "" 2
#NO_APP
ret
.cfi_endproc


While it works with asm volatile("nop" ::: "memory").
Likewise for "cli" and "sti" if you try to implement critical sections.
Although, these instructions do not touch any memory, we
need the memory clobber to prevent code motion.


If the goal is to order things wrt x, why wouldn't you just reference x?

   x = 1;
   asm volatile("nop":"+m"(x));
   x = 0;

If you have a dependency, stating it explicitly seems a much better 
approach than hoping that the implied semantics of a memory clobber 
might get you what you want.  Not only is it crystal clear for the 
optimizers what the ordering needs to be, people maintaining the code 
can better understand your intent as well.


In summary: Yes inline asm can float.  Clobbers might help.  But I don't 
believe this is related to the "remove basic asm in functions" work.  If 
a warning here is merited (and I'm not yet convinced), a separate case 
should be made for it.


However it does seem like a good fit for a section in a "How to convert 
basic asm to extended asm" doc.  The extended docs don't mention this 
need or how extended can be used to address it.  It's a good reason for 
basic asm users to switch.


dw

Re: basic asm and memory clobbers - Proposed solution

2015-12-11 Thread David Wohlferd


On 12/1/2015 7:41 PM, Jeff Law wrote:
> My strong preference is still to document the desired semantics for 
GCC and treat anything that does not adhere to those semantics as a bug.


Despite nearly 100 posts over 2 threads, we don't seem to be reaching 
either a consensus or a conclusion.  How do we move this from discussion 
to decision?


If the decision were mine to make, I'd just deprecate "basic asm in a 
function" and be done with it.  But it's not.  I don't know how the GCC 
project makes its decisions on issues like this, but the closest we've 
gotten is a "strong preference" for an approach with which I disagree.


I'm going to try to sum up what we've discussed, along with my take on 
the pluses and minuses of each proposed solution.  If after reading this 
the global reviewers make a final decision (even one I disagree with), 
I'll try to help move it forward.  Otherwise, I guess the discussion 
fades away and 24414 just sits for another 10 years.


-
There are three problems related to basic asm that we are trying to solve:

#1 The existing (and historical) docs don't describe basic asm's 
behavior regarding clobbers (registers, memory, etc).
#2 People have written basic asm code based on incorrect assumptions 
about its behavior (easy to do given #1).
#3 Because basic asm has no dependencies (except 'volatile'), 
improvements to optimizers can move it in unexpected ways, sometimes 
breaking existing projects.


-
And here are the three solutions that have been proposed:

Solution 1:
Just document the current behavior of basic asm.

People who have written incorrect code can be pointed at the text and 
told to fix their code.  People whose code is damaged by optimizations 
can rewrite using extended to add dependencies (possibly doc this?).


Solution 2:
Change the docs to say that basic asm clobbers everything (memory, all 
registers, etc) or perhaps just memory (some debate here), but that due 
to bugs that will eventually be addressed, it doesn't currently work 
this way.  Eventually (presumably in a future phase 1) modify the code 
to implement this.


People who have written their code incorrectly may have some 
hard-to-find problems solved for them.  This is particularly valuable 
for projects that are no longer being maintained.  And while clobbers 
aren't the best solution to dependencies, they can help.


Solution 3:
Deprecate (and eventually disallow) the use of basic asm within 
functions (perhaps excluding asm("") and naked functions).  Do this by 
emitting warnings (and eventually fatal errors).  Doc this plan.


Unless GCC adds a built-in assember, basic asm is always going to be a 
problem.  The whole point of extended asm is to provide the information 
GCC needs to properly interface C code with asm. Warnings can point out 
where people have failed to provide that information, allowing them to 
correct it.



Here's my take on the pros/cons of each:

Solution 1: Document the current behavior of basic asm.

Pro:
- Simplest to implement.
- Most backward compatible.

Con:
- Doesn't solve the problems for people who wrote their code wrong, 
except to let them know that they have.
- Doesn't help with dependencies.  Mentioning the problem and pointing 
them to extended might help.  A bit.

- GCC will need to continue to tweak optimizers to work around problems.


Solution 2: Change basic to clobber memory and/or everything.

Pro:
- Docing the current and intended behavior lets people know what to 
expect going forward, even though we aren't prepared to implement the 
change in phase 3.
- When the code change is checked in, long-time problems (of a kind that 
are REALLY hard to find) may be fixed.

- Adding a memory clobber (or clobber all) helps with dependencies. A bit.

Con:
- We haven't agreed exactly what the implementation details are. Docing 
them before writing the implementation is risky.

- Docing "someday" fixes is risky in general since someday may never come.
- Implementation may be harder than it sounds (for example how to handle 
frame pointers).
- New users of basic asm won't be able to depend on any specific 
behavior (since we're explicitly saying the behavior will change). To be 
sure they'll always get the behavior they want, they'll have to use 
extended.
- Existing users who realize they should have used the memory clobber 
won't want to wait for a future version of the compiler to fix this.  
They'll have to use extended too.
- Existing users who want to future-proof their code to avoid having 
clobbers thrust upon them will also have to use extended asm.
- While changing the "clobber nothing" semantics to "clobber everything" 
is WAY safer than doing the reverse, it's still not 100% safe.  By 
definition basic asm is a fragile area.  Any changes can conceivably 
result in failures.
- Even without failures, adding memory/re

Re: basic asm and memory clobbers - Proposed solution

2015-12-12 Thread David Wohlferd


On 12/12/2015 1:51 AM, Andrew Haley wrote:

Solution 2:
Change the docs to say that basic asm clobbers everything (memory, all
registers, etc) or perhaps just memory (some debate here), but that due
to bugs that will eventually be addressed, it doesn't currently work
this way.

You've missed the most practical solution, which meets most common
usage: clobber memory, but not registers.


Actually, that's intended to be part of Solution #2 ("or perhaps just 
memory").  Sorry if that wasn't clear.



That allows most of the
effects that people intuitively want and expect,


The nice thing about guessing what people want and expect is that it can 
be molded however we need to suit our point of view.  I could make a 
similar case that what people want is the behavior from other compilers 
that have built-in assemblers.  Those compilers can 'see' what registers 
are being used by the asm and adapt.  Since GCC can't provide that 
capability, people might 'expect' it to clobber every register if that's 
what is needed to support the same behavior.


I'm not trying to advocate for clobbering registers.  My point is simply 
that I don't see how we can *know* what people want and expect.  You can 
say they expect memory.  Jeff can say they expect memory+registers.  I 
can say they expect it to work the same way tomorrow that it did 
yesterday.  Who's right?  I'm not prepared to say.  But I don't think 
any of us are wrong.



but avoids the breakage of register clobbers.


You are right "just memory clobber" does remove some performance issues, 
and the breakage that could result from clobbering registers.  Also, 
while I have never written nor read an optimizer, my guess is that the 
biggest 'win' as far as positioning the basic asm comes from the memory 
clobber (compared to clobbering registers).  So this does get us most of 
the benefits of "clobber everything" with less effort and fewer 
consequences.


However breakage and performance issues can still result solely from 
adding memory clobbers.  And as I mentioned, "just memory clobber" may 
not be the behavior people expect.  And if we aren't solving that, might 
there be a second update later to add registers?  Talk about confusing 
semantics...



It allows basic asm to be used in a
sensible way by pushing and popping all used registers.


If I were using basic asm, this would indeed seem like a sensible approach.

However, it is not the most efficient.  If I can clobber registers, 
push/pop are just wasted cycles.  This just seems like another argument 
for deprecating basic asm and pushing people to extended.


--
Imagine for a moment:

If the only way right now to do inline asm in gcc was extended, and I 
proposed adding basic, how could I justify this 'new' feature? Other 
than 'top-level,' I can't think of a single benefit that it would 
provide.  Pointless push/pops, problems with positioning and the other 
headaches it causes have no upside.


Contra-wise, what if starting today all new inline asm were written 
using extended?  How would that be a bad thing?  What would be the 
downside?  Yes, people can still write bad code with extended, but its 
semantics are well-understood and have been for a long time (certainly 
compared to basic).  Additionally, the abilities it provides allow 
people to solve problems that basic never will.


Which means (to me), the only real justification for the continued 
existence of basic asm is backward compatibility.  Which makes the 
arguments for changing its behavior (whether a little or a lot) kinda weird.


I still vote for doing everything we can think of to discourage people 
from using basic and begin using extended:


- Change the docs to flat out deprecate basic (excluding top-level).
- Add the warning so people's integrated dev environments will show the 
suspect lines.
- Make the warning a default, but overridable (-Wno-only-top-basic-asm), 
so people who HAVE to support the old syntax still can.
- Make sure the docs for the warning describe (link to?) how to change 
asm from basic to extended and why.
- Any bugs people report or posts people make involving basic asm get 
resolved as 'Try it with extended.'


Every year these items are in place would make basic asm less important 
as people slowly move to extended.  Maybe by the time we get around to 
flagging it as a fatal error (say in 2025), no one will care.


True, this won't magically solve people's problems for them if they've 
been using basic asm wrong.  But it does point out that there are 
(potential) problems, where those problems are and motivates people to 
fix them (which seems like quite reasonable behavior for a compiler).  
More importantly, we haven't broken or degraded anything that was 
already working.


I'm just afraid that instead of pursuing any of these solutions, we are 
going to pursue Solution #0: Do nothing.  A rather unsatisfying outcome 
after all this effort.


dw

PS I was surprised by how many people

Re: basic asm and memory clobbers - Proposed solution

2015-12-13 Thread David Wohlferd

Is there a decision maker still teetering on the edge of making a call 
here?  Or have they all moved on and we are just talking among 
ourselves?  I keep worrying that if I don't reply, someone will swoop 
in, read the last message in the thread, and charge off to make a 
changes based on that.  So...

On 12/13/2015 1:25 AM, Bernd Edlinger wrote:

That is also my preferred solution,

I'm really not sure what the point of doing the memory clobber is. Yes, 
I see that it is easier to code than "clobber everything."  And yes, it 
might help with certain types of bugs for people who have or might 
someday misuse asm.

But if we are trying to give users what they expect, shouldn't we be 
doing the "clobber everything" Jeff suggested 
(https://gcc.gnu.org/ml/gcc/2015-11/msg00081.html)?  Surely that's the 
"safest" answer, and it's as likely to be what people expect as anything 
else.  Adding the memory clobber will already break backward 
compatibility and risk breaking existing code.  Why do all that for half 
a solution?  Yes it's harder, but please don't tell me we plan to change 
things yet again some day to do the rest?

> The rationale for this is: there are lots of ways to write basic asm
> statements, that may appear to work, if they clobber memory.

> because you can use all global values simply by name, users will
> expect that to be supported.

> people who know next
> to nothing about assembler will be encouraged to "fix" this warning in a
> part of the code which they probably do not understand at all.

So in the end, we are proposing a potentially breaking change, not 
because we are violating a standard, not to give users what (some people 
think) they expect, and not even because someone has actually reported a 
problem.  But because someone might use it wrong, and if we just point 
the statement out to them, they might not know how to correct it.

That seems like a lot of risk for so very little gain.

And if we do end up changing this, using basic asm will become more 
dangerous than ever.  In addition to all the old problems (that will 
still exist in earlier versions), now its behavior VARIES between 
versions.  Programmers who want specific behavior (rather than our 
assumption-du-jour) will be forced to CHANGE THEIR CODE to account for 
this variability.

So my solution highlights at-risk statements that must be changed to use 
extended asm.  Your solution is that people will need to realize for 
themselves that the statements must change to extended in order to work 
the same way in newer versions, and then must find them all for themselves.

Further, if people (correctly) fix the warnings I'm proposing, then they 
get correct behavior for all supported versions of the compiler.  If 
people do the nothing you are proposing, they may still have problems 
even after the fix (if they assume registers get clobbered) or may have 
new problems (performance or otherwise), and will still have problems 
using any version of the compiler earlier than the one expected to be 
released sometime late next year.

I prepared a patch for next stage1,
the details are still under discussion, but it is certainly do-able with
not too much effort.

See https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00938.html

I was unaware of this conversion.  Unfortunate that you didn't think to 
cc me on this, as there are several points I would have made regarding 
the proposed code and doc changes.  But mostly, I would have said what 
Bernd Schmidt said: "I'm not sure there was any consensus in that other 
thread" and "I think it would be best if we could deprecate basic asms 
in functions, or at least warn about them in -Wall."

I have been talking myself blue in the fingers on this topic since early 
November and I honestly don't know if I have convinced anyone of 
anything.  At this point, I really don't know what more I can say.  It's 
pretty much all in https://gcc.gnu.org/ml/gcc/2015-12/msg00092.html

FWIW.

dw

Re: basic asm and memory clobbers - Proposed solution

2015-12-15 Thread David Wohlferd


On 12/14/2015 1:53 AM, Andrew Haley wrote:
> This just seems like another argument for deprecating basic asm and 
pushing people to extended.

Yes.  I am not arguing against deprecation.  We should do that.


You know, there are several people who seem to generally support this 
direction.  Not enough to call it a consensus, but perhaps the beginning 
of one:


- Andrew Haley
- David Wohlferd
- Richard Henderson
- Segher Boessenkool
- Bernd Schmidt

Anyone else want to add their name here?

Maybe it's the implementation details that have other people concerned.  
My thought is that for v6 we change the docs to say something like:


-
Unlike top level, using basic asm within a function is deprecated. No 
new code should use this feature, but should use extended asm instead.  
Existing code should begin replacing such usage. Instances of affected 
code can be found using -Wonly-top-basic-asm. For help making this 
conversion, see "How to convert Basic asm to Extended asm."

-

With this, we only need to add the warning as a non-default.  This will 
(hopefully) stop new users from using this, and can begin the work of 
removing it from existing code.


The problem with that approach is that (I'm told) some people don't read 
the docs.  Imagine that.  Such being the case, they won't even know this 
is happening.  That's why I think that in the next phase 1 (v7?), we 
should change the warning to 'on' by default (or maybe as part of 
-Wall?).  People will still be able to override it with no-, but at 
least it will raise awareness and chase more of it out of people's code.


What about the final step of actually removing support for basic asm in 
a function as some people propose?  Should we really do this?  If so, 
when?  That's a more difficult question.  If we make the warning a 
default in v7, then it would be *at least* v8 before this should be 
considered.  Trying to make any plan that far ahead seems... 
optimistic.  Perhaps by then we'll have more feedback upon which to make 
a decision.


We *could* be more aggressive and start right off with making the 
warning 'on' by default.  But to give people the best chance to prepare, 
perhaps starting with the non-default is best.


Even if we don't all agree about _removing_ "basic asm in a function," 
can we find consensus that having less of it is a good thing?  Because 
this approach gets us that.


dw

Re: basic asm and memory clobbers - Proposed solution

2015-12-16 Thread David Wohlferd


On 12/15/2015 12:42 PM, paul_kon...@dell.com wrote:

In the codebase for the product I work on, I see about 200 of them.  Many of those are the likes of 
asm("sync") for MIPS, which definitely wants to be treated as if it were asm ("sync" : : 
: "memory").


That's right, I meant to ask you about this last time you mentioned this.

Now that you are aware that this is a problem, what do you intend to do 
about it?  Jeff is saying that this may not be fixed until at least v7, 
so waiting for a compiler fix may take a while.


Will you be updating your source?

Are you just finding these with grep, or have you tried the 
-Wonly-top-basic-asm patch?



That's not counting the hundreds I see in gdb/stubs -- those are "outside a 
function" flavor.


Fortunately, these aren't a problem for memory or register clobbers.

dw

Re: basic asm and memory clobbers - Proposed solution

2015-12-16 Thread David Wohlferd


On 12/15/2015 1:13 PM, Jeff Law wrote:

Sadly, I'm putting most of this discussion into my gcc-7 queue anyway.


Fair enough.  If "clobbers" is what we're going to do, that sounds more 
like a phase 1 thing.


That said, some people who have this problem may prefer to fix it sooner 
rather than later.  If we soften the text for the docs to remove the 
word "deprecate," would just adding -Wonly-top-basic-asm (non-default) 
be useful for v6?


dw

Re: basic asm and memory clobbers - Proposed solution

2015-12-16 Thread David Wohlferd


On 12/15/2015 5:01 PM, paul_kon...@dell.com wrote:

On Dec 15, 2015, at 5:22 PM, David Wohlferd  wrote:

On 12/14/2015 1:53 AM, Andrew Haley wrote:

This just seems like another argument for deprecating basic asm and pushing 
people to extended.

Yes.  I am not arguing against deprecation.  We should do that.

You know, there are several people who seem to generally support this 
direction.  Not enough to call it a consensus, but perhaps the beginning of one:

- Andrew Haley
- David Wohlferd
- Richard Henderson
- Segher Boessenkool
- Bernd Schmidt

Anyone else want to add their name here?

No, but I want to speak in opposition.


Fair enough.


"Deprecate" means two things: warn now, remove later.


Yup.  That's what I'm proposing.  Although "later" could be a decade 
down the road.  That's how long 24414 has been sitting.



For reasons stated by others, I object to "remove later".



So "warn now, remove never" I would support, but not "deprecate".


So how about:

- Update the basic asm docs to describe basic asm's current (and 
historical) semantics (ie clobber nothing).
- Emphasize how that might be different from users' expectations or the 
behavior of other compilers.
- Warn that this could change in future versions of gcc.  To avoid 
impacts from this change, use extended.

- Mention -Wonly-top-basic-asm as a way to locate affected statements.

Would that be something you could support?

What's your take on making -Wonly-top-basic-asm a default (either now or 
v7)?  Is making it a non-default a waste of time because no one will 
ever see it?  Or is making it a default too aggressive? What about 
adding it to -Wall?


dw

Re: basic asm and memory clobbers - Proposed solution

2015-12-16 Thread David Wohlferd


On 12/15/2015 2:43 PM, Joseph Myers wrote:

On Tue, 15 Dec 2015, David Wohlferd wrote:


Unlike top level, using basic asm within a function is deprecated. No new code
should use this feature, but should use extended asm instead.  Existing code
should begin replacing such usage. Instances of affected code can be found
using -Wonly-top-basic-asm. For help making this conversion, see "How to
convert Basic asm to Extended asm."

I think the typical use of basic asm is: you want to manipulate I/O
registers or other such state unknown to the compiler (not any registers
the compiler might use itself), and you want to do it in a way that is
maximally compatible with as many compilers as possible (hence limiting
yourself to the syntax subset that's in the C++ standard, for example).
Compatibility with a wide range of other compilers is the critical thing
here; this is not a GCC-invented feature, and considerations for
deprecating an externally defined feature are completely different from
considerations for GCC-invented features.



Do you have evidence that it is
now unusual for compilers to support basic asm without supporting
GCC-compatible extended asm, or that other compiler providers generally
consider basic asm deprecated?


On the contrary, I would be surprised to learn that there are ANY 
compilers (other than clang) that support gcc's extended asm format.  
And although there is no standard that seems to require it, I'm not 
certainly not prepared to say that basic asm is "generally deprecated."


But the fact that there is no standard may make "doing what other 
compilers do" challenging.


For example quoting from Bernd's email regarding the windriver diab 
compiler: "non-scratch registers must be preserved."  Implying that 
scratch registers (which they apparently only list for ARM) are 
considered clobbered.


That seems like a sensible approach (and avoids the frame pointer 
problem).  But I'm not prepared to extrapolate from that how all 
compilers do or should handle basic asm.  However it does mean that the 
suggestion being proposed here to have basic asm only clobber memory 
would not be compatible with windriver's approach to basic asm.  And 
Jeff's proposal for gcc to clobber "all registers" wouldn't be 
compatible with them either.


I agree that "oh, surprise! gcc does this differently than compiler X!" 
is a bad thing.  But without standards, trying to be compatible with how 
"everyone else" does it may not be practical.  You'll note that 
windriver made no particular effort to be compatible with gcc. And of 
course any change will make gcc v7 work differently than gcc v4, v5, v6.


If compatibility with other compilers is an important criteria when 
determining how gcc should handle basic asm, someone's going to need to 
do some research and some prioritizing.


In the meantime, raising awareness of this issue via docs and warnings 
seems a low-cost way to start.


dw

Re: basic asm and memory clobbers - Proposed solution

2015-12-18 Thread David Wohlferd


On 12/17/2015 6:03 AM, Jeff Law wrote:

On 12/17/2015 03:39 AM, Andrew Haley wrote:

On 17/12/15 01:41, David Wohlferd wrote:

On the contrary, I would be surprised to learn that there are ANY
compilers (other than clang) that support gcc's extended asm format.


Prepare to be surprised: Sun Studio compilers seem to support it
just fine.

And Intel's ICC.


Ok, I admit it:  I'm surprised.

dw

Re: AW: basic asm and memory clobbers - Proposed solution

2015-12-18 Thread David Wohlferd


On 12/17/2015 11:30 AM, Bernd Edlinger wrote:

On Thu, 17 Dec 2015 15:13:07, Bernd Schmidt wrote:

   What's your take on making -Wonly-top-basic-asm a default (either now or
   v7)?  Is making it a non-default a waste of time because no one will
   ever see it?  Or is making it a default too aggressive? What about
   adding it to -Wall?

Depends if anyone has one in system headers I guess. We could try to add it to 
-Wall.

Sorry, but if I have to object.

Adding this warning to -Wall is too quickly and will bring the ia64, tilegx and 
mep ports into trouble.


It doesn't look to me like adding the warnings will affect gcc itself.  
But I do see how it could have an impact on people using gcc on those 
platforms, if the warning causes them to convert to extended asm.



Each of them invokes some special semantics for basic asm:


I'm collecting these for my "How to convert basic asm to extended asm" 
document.  This may need to go in the gcc wiki instead of the user 
guide, since people may find important conversion tips like these 
asynchronous to gcc's releases.



mep: mep_interrupt_saved_reg looks for ASM_INPUT in the body, and saves 
different registers if found.


I'm trying to follow this code.  A real challenge since I know nothing 
about mep.  But what I see is:


- This routine only applies to functions marked as 
__attribute__((interrupt)).
- To correctly generate entry/exit, it walks thru each register (up to 
FIRST_PSEUDO_REGISTER) to see if it is used by the routine.  If there is 
any basic asm within the routine (regardless of its contents), the 
register is considered 'in use.'


The net result is that every register gets saved/restored by the 
entry/exit of this routine if it contains basic asm.  The reason this is 
important is that if someone just adds a colon, it would suddenly *not* 
save/restore all the registers.  Depending on what the asm does, this 
could be bad.


Does that sound about right?

This is certainly worth mentioning in the 'convert' doc.

I wonder how often this 'auto-clobber' feature is used, though.  I don't 
see it mentioned in the 'interrupt' attribute docs for mep, and it's not 
in the basic asm docs either.  If your interrupt doesn't need many 
registers, it seems like you'd want to know this and possibly use 
extended.  And you'd really want to know if you are doing a (redundant) 
push/pop in your interrupt.



tilegx:  They never wrap {} around inline asm. But extended asm, are handled 
differently, probably automatically surrounded by braces.


I know nothing about tilegx either.  I've tried to read the code, and it 
seems like basic asm does not get 'bundled' while extended might be.


Bundling for tilegx (as I understand it) is when you explicitly fill 
multiple pipelines by doing something like this:


   { add r3,r4,r5 ; add r7,r8,r9 ; lw r10,r11 }

So if you have a basic asm statement, you wouldn't want it automatically 
bundled by the compiler, since your asm could be more than 3 statements 
(the max?).  Or your asm may do its own bundling. So it makes sense to 
never output braces when outputting basic asm.


I know I'm guessing about what this means, but doesn't it seem like 
those same limitations would apply to extended?  I wonder if this is a 
bug.  I don't see any discussion of bundling (of this sort) in the docs.



ia64:  ASM_INPUT emits stop-bits.  extended asm does not the same.  That was 
already mentioned by Segher.


I already had this one from Segher's email.


Given all this, I'm more convinced than ever of the value of 
-Wonly-top-basic-asm.  Basic asm is quirky, has undocumented bits and 
its behavior is incompatible with other compilers, even though it uses 
the same syntax.  If I had any of this in my projects, I'd sure want to 
find it and look it over.


But maybe Bernd is right and it's best to leave the warning disabled in 
v6, even by -Wall.  I may ask this question again in the next phase 1...


With that in mind, how do you feel about the basic plan:

- Update the basic asm docs to describe basic asm's current (and 
historical) semantics (ie clobber nothing).
- Emphasize how that might be different from users' expectations or the 
behavior of other compilers.
- Warn that this could change in future versions of gcc.  To avoid 
impacts from this change, use extended.
- Reference the "How to convert from basic asm to extended asm" guide 
(where ever it ends up living).

- Mention -Wonly-top-basic-asm as a way to locate affected statements.
- -Wonly-top-basic-asm is disabled by default and not enabled by -Wall 
or -Wextra.


Does this seem like a safe and useful change for v6?

dw

Re: basic asm and memory clobbers - Proposed solution

2015-12-19 Thread David Wohlferd


On 12/18/2015 11:55 AM, Bernd Edlinger wrote:

On 18.12.2015 10:27, David Wohlferd wrote:

On 12/17/2015 11:30 AM, Bernd Edlinger wrote:
Adding this warning to -Wall is too quickly and will bring the ia64, 
tilegx and mep ports into trouble. 

It doesn't look to me like adding the warnings will affect gcc
itself.  But I do see how it could have an impact on people using gcc
on those platforms, if the warning causes them to convert to extended
asm.


At least we should not start a panic until we have really understood all
the details, how to do that.


Phase 1 is a better place to start a panic.


mep: mep_interrupt_saved_reg looks for ASM_INPUT in the body, and
saves different registers if found.

I'm trying to follow this code.  A real challenge since I know nothing
about mep.  But what I see is:

- This routine only applies to functions marked as
__attribute__((interrupt)).
- To correctly generate entry/exit, it walks thru each register (up to
FIRST_PSEUDO_REGISTER) to see if it is used by the routine.  If there
is any basic asm within the routine (regardless of its contents), the
register is considered 'in use.'

The net result is that every register gets saved/restored by the
entry/exit of this routine if it contains basic asm.  The reason this
is important is that if someone just adds a colon, it would suddenly
*not* save/restore all the registers.  Depending on what the asm does,
this could be bad.

Does that sound about right?


Yes.


Seems like a doc update would be appropriate here too then, if anyone 
wanted to volunteer...



This is certainly worth mentioning in the 'convert' doc.

I wonder how often this 'auto-clobber' feature is used, though.  I
don't see it mentioned in the 'interrupt' attribute docs for mep, and
it's not in the basic asm docs either.  If your interrupt doesn't need
many registers, it seems like you'd want to know this and possibly use
extended.  And you'd really want to know if you are doing a
(redundant) push/pop in your interrupt.


tilegx:  They never wrap {} around inline asm. But extended asm, are
handled differently, probably automatically surrounded by braces.

I know nothing about tilegx either.  I've tried to read the code, and
it seems like basic asm does not get 'bundled' while extended might be.

Bundling for tilegx (as I understand it) is when you explicitly fill
multiple pipelines by doing something like this:

{ add r3,r4,r5 ; add r7,r8,r9 ; lw r10,r11 }

So if you have a basic asm statement, you wouldn't want it
automatically bundled by the compiler, since your asm could be more
than 3 statements (the max?).  Or your asm may do its own bundling. So
it makes sense to never output braces when outputting basic asm.

I know I'm guessing about what this means, but doesn't it seem like
those same limitations would apply to extended?  I wonder if this is a
bug.  I don't see any discussion of bundling (of this sort) in the docs.


I wold like to build a cross compiler, but currently that target seems
to be broken. I have to check
that target anyways, because of my other patch with the memory clobbers.

I see in tilegx_asm_output_opcode, that they do somehow automatically
place braces.
An asm("pseudo":) has a special meaning, and can be replaced with "" or "}".
However the static buf[100] means that any extended asm string > 95
characters, invokes
the gcc_assert in line 5487.  In the moment I would _not_ recommend
changing any asm
statements without very careful testing.


Yes, the handling of extended asm there is very strange.

If bundles can only be 3 instructions, then appending an entire extended 
asm in a single (already in-use) bundle seems odd.  But maybe that's 
just because I don't understand tilegx.


I'm not sure it's just changing basic asm to extended I would be 
concerned about.  I'd worry about using extended asm at all...



ia64:  ASM_INPUT emits stop-bits. extended asm does not the same.
That was already mentioned by Segher.

I already had this one from Segher's email.


Given all this, I'm more convinced than ever of the value of
-Wonly-top-basic-asm.  Basic asm is quirky, has undocumented bits and
its behavior is incompatible with other compilers, even though it uses
the same syntax.  If I had any of this in my projects, I'd sure want
to find it and look it over.

But maybe Bernd is right and it's best to leave the warning disabled
in v6, even by -Wall.  I may ask this question again in the next phase
1...


Aehm, yes, maybe the warning could by then be something more reasonable
like:
"Warning: the semantic of basic asm has changed to include implicit
memory clobber,
if you think that is a problem for you, please convert it to basic asm,
otherwise just relax."


I don't think Jeff wants to pursue changes to basic asm&

doc maintainer questions

2015-12-19 Thread David Wohlferd

I have been discussing adding some content to the basic asm docs. As 
part of this work, I want to add a discussion of "How to convert basic 
asm to extended asm."  However it doesn't seem like this is a good fit 
for the User Guide.  This is both because the UG doesn't generally talk 
about "How To" write code, and because the text may need updates more 
often than the UG gets released.


I'm thinking this might be a better fit for the gcc wiki.  But that 
brings up a few questions:


1) Is it appropriate for the UG to link to sections in the wiki?  I see 
that we do, but should we?

2) How do I get 'edit' access to the wiki?
3) Where in the wiki should it go?

dw

Re: basic asm and memory clobbers - Proposed solution

2015-12-20 Thread David Wohlferd


On 12/20/2015 10:26 AM, Bernd Edlinger wrote:

On 19.12.2015 19:54, David Wohlferd wrote:

mep: mep_interrupt_saved_reg looks for ASM_INPUT in the body, and
saves different registers if found.

I'm trying to follow this code.  A real challenge since I know nothing
about mep.  But what I see is:

- This routine only applies to functions marked as
__attribute__((interrupt)).
- To correctly generate entry/exit, it walks thru each register (up to
FIRST_PSEUDO_REGISTER) to see if it is used by the routine. If there
is any basic asm within the routine (regardless of its contents), the
register is considered 'in use.'

The net result is that every register gets saved/restored by the
entry/exit of this routine if it contains basic asm.  The reason this
is important is that if someone just adds a colon, it would suddenly
*not* save/restore all the registers.  Depending on what the asm does,
this could be bad.

Does that sound about right?


Yes.

Seems like a doc update would be appropriate here too then, if anyone
wanted to volunteer...

To confirm this, I built a cross-compiler, but It was difficult, because
of pr64402.
Yes, a function with __attribute__((interrupt)) spills lots of
registers, when it just
contains asm(""); but almost nothing, if asm("":); is used.  That is
remarkable.


So if you use extended and clobber a couple registers asm("":::"r0", 
"r1"), does it spill just those specific registers?  That would be cool.


I don't write interrupt handlers, and certainly not on mep.  But the 
little I know about them says that performance is an important (and 
sometimes critical) characteristic.  There would be risk in changing 
this to extended (if you used a register but forgot to clobber it), but 
depending on the interrupt, it could be a nice performance 'win.'


If no one else is prepared to step up to write this, I can.  I'm just 
uncomfortable doing so since I can't try it myself.  And I feel weird 
writing a patch for mep given that I know nothing about it.


But since Bernd has tried it, maybe something like this added to the 
'interrupt' attribute on 
https://gcc.gnu.org/onlinedocs/gcc/MeP-Function-Attributes.html

-
Be aware that if the function contains any basic @code{asm} 
(@pxref{Basic Asm}), all registers (whether referenced in the asm or 
not) will be preserved upon entry and restored upon exiting the 
interrupt. More efficient code can be generated by using extended 
@code{asm} (@pxref{Extended Asm}) and explicitly listing only the 
specific registers that need to be preserved (or none if your asm 
preserves any registers it uses).

-


tilegx:  They never wrap {} around inline asm. But extended asm, are
handled differently, probably automatically surrounded by braces.

I know nothing about tilegx either.  I've tried to read the code, and
it seems like basic asm does not get 'bundled' while extended might be.

Bundling for tilegx (as I understand it) is when you explicitly fill
multiple pipelines by doing something like this:

 { add r3,r4,r5 ; add r7,r8,r9 ; lw r10,r11 }

So if you have a basic asm statement, you wouldn't want it
automatically bundled by the compiler, since your asm could be more
than 3 statements (the max?).  Or your asm may do its own bundling. So
it makes sense to never output braces when outputting basic asm.

I know I'm guessing about what this means, but doesn't it seem like
those same limitations would apply to extended?  I wonder if this is a
bug.  I don't see any discussion of bundling (of this sort) in the
docs.


I wold like to build a cross compiler, but currently that target seems
to be broken. I have to check
that target anyways, because of my other patch with the memory clobbers.

I see in tilegx_asm_output_opcode, that they do somehow automatically
place braces.
An asm("pseudo":) has a special meaning, and can be replaced with ""
or "}".
However the static buf[100] means that any extended asm string > 95
characters, invokes
the gcc_assert in line 5487.  In the moment I would _not_ recommend
changing any asm
statements without very careful testing.

Yes, the handling of extended asm there is very strange.

If bundles can only be 3 instructions, then appending an entire
extended asm in a single (already in-use) bundle seems odd.  But maybe
that's just because I don't understand tilegx.

I'm not sure it's just changing basic asm to extended I would be
concerned about.  I'd worry about using extended asm at all...

I also built a tilegx cross compiler, but that was even more difficult,
because of pr68917, which hit me when building the libgcc.

I tried a while, but was unable to find any example of an extended asm
that gets
auto-braces, or which can trigger the mentioned gcc_assert. It looks
like, in all
cases the bundles are closed already before t

"cc" clobber

2016-01-26 Thread David Wohlferd

It is well known that on i386, the "cc" clobber is always set for 
extended asm, whether it is specified or not.  I was wondering how much 
difference it might make if the generated code actually followed what 
the user specified (expectation: not much).  But implementing this 
turned up a different question.


I started by just commenting out the code in ix86_md_asm_adjust that 
unconditionally clobbered the flags.  I figured this would allow the 
'normal' "cc" handling to occur.  But apparently there is no 'normal' 
"cc" handling.


So I went back to ix86_md_asm_adjust and tried to handle the "cc" if it 
was specified in the clobbers argument.  But apparently "cc" doesn't get 
added to that clobbers list.


Hmm.

Tracing back to see how the "memory" clobber (which does get added to 
the clobber list) is handled brings me to expand_asm_stmt() in 
cfgexpand.c.  Following the example set by the memory clobber, it looks 
like I want something like this:


  else if (j == -3)
  {
#if defined(__i386__) || defined(__x86_64__)
rtx x = gen_rtx_REG (CCmode, FLAGS_REG);
clobber_rvec.safe_push (x);

x = gen_rtx_REG (CCFPmode, FPSR_REG);
clobber_rvec.safe_push (x);
#endif
  }

Now I can check for this in the clobbers to ix86_md_asm_adjust and 
SET_HARD_REG_BIT as appropriate.  Tada.


It's working, but can that be right?  Why do I need to do this for 
i386?  How do other platforms handle "cc"?


Other than not rejecting it as an invalid clobber, I can't find any code 
that seems to recognize "cc."  Has "cc" become just an unenforced 
comment on all platforms?  Or did I just miss it?


dw

Re: "cc" clobber

2016-02-01 Thread David Wohlferd


On 1/26/2016 4:31 PM, Bernd Schmidt wrote:

On 01/27/2016 12:12 AM, David Wohlferd wrote:

I started by just commenting out the code in ix86_md_asm_adjust that
unconditionally clobbered the flags.  I figured this would allow the
'normal' "cc" handling to occur.  But apparently there is no 'normal'
"cc" handling.


I have a dim memory that there's a difference between the "cc" and 
"CC" spellings. You might want to check that.


I checked, but my scan of the current code isn't turning up anything for 
"CC" related to clobbers either.


While presumably "cc" did something at one time, apparently now it's 
just an unenforced comment (on extended asm).  Not a problem, just a bit 
of a surprise.


dw

Re: "cc" clobber

2016-02-01 Thread David Wohlferd


On 2/1/2016 6:58 AM, Ulrich Weigand wrote:

I think on many targets a clobber "cc" works because the backend
actually defines a register named "cc" to correspond to the flags.
Therefore the normal handling of clobbering named hard registers
catches this case as well.

This doesn't work on i386 because there the flags register is called
"flags" in the back end.


Doh!  Of course.  This makes perfect sense.  Thanks.

dw

Re: "cc" clobber

2016-02-01 Thread David Wohlferd


On 2/1/2016 12:40 PM, Richard Henderson wrote:

On 02/02/2016 01:58 AM, Ulrich Weigand wrote:

I think on many targets a clobber "cc" works because the backend
actually defines a register named "cc" to correspond to the flags.
Therefore the normal handling of clobbering named hard registers
catches this case as well.


Yes.  C.f. Sparc ADDITIONAL_REGISTER_NAMES.


This doesn't work on i386 because there the flags register is called
"flags" in the back end.


Once upon a time i386 used cc0.  A survey of existing asm showed that 
almost no one clobbered "cc", and that in the process of changing i386 
from cc0 to an explicit flags register we would break almost 
everything that used asm.


The only solution that scaled was to force a clobber of the flags 
register.


That was 1999.  I think you'll buy nothing but pain in trying to 
change this now.


I expect you are right.  After experimenting, the cases where this might 
buy you any benefit are just too uncommon, and the 'benefit' is just too 
small.


The one place where any of this would (sort of) be useful is checking 
for the "cc" clobber conflicting with the output parameters.  This 
didn't used to be a thing, but now that i386 can 'output' flags, it is.  
The compiler currently accepts both of these and they both produce the 
same code:


   asm("": "=@ccc"(x) : : );
   asm("": "=@ccc"(x) : : "cc");

I assert (pr69095) that the second one should give an error (docs: 
"Clobber descriptions may not in any way overlap with an input or output 
operand").  Creating a check for this was more challenging than I 
expected.  I kept assuming that there 'had' to be existing code to 
handle cc and I could tie into it if I could only figure out where it was.


But now that I have this written, I'm still vacillating about whether it 
is useful.  It seems like I could achieve the same result by adding 
"Using @cc overrides the "cc" clobber" to the docs.  But hey, it also 
checks for duplicate "memory" and "cc" clobbers, so there's that...


dw

Bug maintenance

2016-04-28 Thread David Wohlferd

As part of the work I've done on inline asm, I've been looking thru the 
bugs for it.  There appear to be a number that have been fixed or 
overtaken by events over the years, but the bug is still open.


Is closing some of these old bugs of any value?

If so, how do I pursue this?

dw

Re: Bug maintenance

2016-05-08 Thread David Wohlferd


On 4/28/2016 2:01 AM, Richard Biener wrote:

On Thu, Apr 28, 2016 at 9:35 AM, David Wohlferd  wrote:

As part of the work I've done on inline asm, I've been looking thru the bugs
for it.  There appear to be a number that have been fixed or overtaken by
events over the years, but the bug is still open.

Is closing some of these old bugs of any value?

Yes, definitely.


If so, how do I pursue this?

I suppose adding a final comment to them will work, people (like me)
watching gcc-bugs can then do the actual closing.


Perfect.  That gives the OP a chance to respond as well.

Look for my updates to 30527, 39440 & 43319.

dw

Re: Bug maintenance

2016-05-08 Thread David Wohlferd


On 4/28/2016 9:41 AM, Martin Sebor wrote:

On 04/28/2016 01:35 AM, David Wohlferd wrote:

As part of the work I've done on inline asm, I've been looking thru the
bugs for it.  There appear to be a number that have been fixed or
overtaken by events over the years, but the bug is still open.

Is closing some of these old bugs of any value?

If so, how do I pursue this?


There are nearly 10,000 still unresolved bugs in Bugzilla, almost
half of which are New, and a third Unconfirmed, so I'm sure any
effort to help reduce the number is of value and appreciated.


That's exactly what prompted me to ask.  There's such a vast number of 
them, it's hard to believe that 9 year old bugs are still of interest.



I can share with you my own approach to dealing with them (others
might have better suggestions).  In cases where the commit that
fixed a bug is known, I mention it in the comment closing the bug.
I also try to indicate the version in which the bug was fixed (if
I can determine it using the limited number of versions I have
built).  Otherwise, when a test doesn't already exist (finding
out whether or not one does can be tedious), I add one before
closing the bug will help avoid a regression.


I'll see what I can do.

dw

Re: Bug maintenance

2016-05-08 Thread David Wohlferd


On 4/28/2016 12:23 PM, Andrew Pinski wrote:

On Thu, Apr 28, 2016 at 12:35 AM, David Wohlferd  
wrote:

As part of the work I've done on inline asm, I've been looking thru the bugs
for it.  There appear to be a number that have been fixed or overtaken by
events over the years, but the bug is still open.

Is closing some of these old bugs of any value?

Yes it is.  In fact this is how I got my start into GCC.


If so, how do I pursue this?

If you go through the bug reports and have a low rate of false
positives, I (and others) can get you permission to change the bug
reports (I started out with a bug report only account too).


I'll do my best.  But it's not always clear what might trigger a 
debate.  I swear to you, I never expected 24414 to blow up the way it did.


dw

Machine constraints list

2016-05-08 Thread David Wohlferd

Looking at the v6 release criteria 
(https://gcc.gnu.org/gcc-6/criteria.html) there are about a dozen 
supported platforms.


Looking at the Machine Constraints docs 
(https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html), there are 
34 architectures listed.  That's a lot of entries to scroll thru.  If 
these architectures aren't supported anymore, is it time to drop some of 
these from this page?


As a first pass, maybe something like this:

Keep
AArch64 family—config/aarch64/constraints.md
ARM family—config/arm/constraints.md
MIPS—config/mips/constraints.md
PowerPC and IBM RS6000—config/rs6000/constraints.md
S/390 and zSeries—config/s390/s390.h
SPARC—config/sparc/sparc.h
x86 family—config/i386/constraints.md

Drop
ARC —config/arc/constraints.md
AVR family—config/avr/constraints.md
Blackfin family—config/bfin/constraints.md
CR16 Architecture—config/cr16/cr16.h
Epiphany—config/epiphany/constraints.md
FRV—config/frv/frv.h
FT32—config/ft32/constraints.md
Hewlett-Packard PA-RISC—config/pa/pa.h
Intel IA-64—config/ia64/ia64.h
M32C—config/m32c/m32c.c
MeP—config/mep/constraints.md
MicroBlaze—config/microblaze/constraints.md
Motorola 680x0—config/m68k/constraints.md
Moxie—config/moxie/constraints.md
MSP430–config/msp430/constraints.md
NDS32—config/nds32/constraints.md
Nios II family—config/nios2/constraints.md
PDP-11—config/pdp11/constraints.md
RL78—config/rl78/constraints.md
RX—config/rx/constraints.md
SPU—config/spu/spu.h
TI C6X family—config/c6x/constraints.md
TILE-Gx—config/tilegx/constraints.md
TILEPro—config/tilepro/constraints.md
Visium—config/visium/constraints.md
Xstormy16—config/stormy16/stormy16.h
Xtensa—config/xtensa/constraints.md

dw

Re: Machine constraints list

2016-05-09 Thread David Wohlferd


On 5/9/2016 6:42 AM, paul_kon...@dell.com wrote:

On May 8, 2016, at 6:27 PM, David Wohlferd  wrote:

If these architectures aren't supported anymore, is it time to drop some of 
these from this page?

Your theory is quite mistaken.  A lot of the ones you labeled "drop" are 
supported.  Quite possibly all of them.


Ok, I see that.  Spot checking some of the architectures, they are still 
getting periodic checkins.


In my defense, I can't find any official list of which are 'tertiary' 
and which are deprecated (https://gcc.gnu.org/ml/gcc/2016-03/msg00010.html).


That said, there are still a lot of entries on that machine constraint page.

How about if I re-organize the list similar to function attributes 
(https://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html)?  Or at a 
minimum, add @anchors for each architecture so there are links?


dw

Deprecating basic asm in a function - What now?

2016-06-19 Thread David Wohlferd


Perhaps this post should be directed toward port maintainers?

Since several global maintainers have now suggested it, I have created a 
patch that deprecates basic asm when used in a function (attached).  It 
excludes (ie does not deprecate) top level asm, asm in "naked" 
functions, asm with empty instruction strings, and extended asm.


Building gcc using this patch turns up a few places that use this 
feature, so I fixed them.  Where possible, I used builtins to replace 
the asm.  For ease-of-review, these changes are in their own patch 
(attached) and obviously this patch should be checked in first.


But before I send these 2 off to gcc-patches, there's a problem. What 
about platforms other than x86/x64?  I don't speak other assembler 
languages, and have no setup with which to test them.


I could try to provide patches for other platforms, but it would 
probably be faster for platform experts to just make the changes 
themselves, rather than trying to review my efforts.  Especially if they 
also want to move to builtins (which I hope they do).


I could just send the patches and let the chips fall where they may, but 
if there's a less disruptive approach, let me know.


dw

PS I have done a scan for uses of basic asm to get some idea of the 
scope of the remaining work.  My results:


All basic asm in trunk: 1,105 instances.
- Exclude 273 instances with empty strings leaving 832.
- Exclude 271 instances for boehm-gc project leaving 561.
- Exclude 202 instances for testsuite project leaving 359.
- Exclude 282 instances that are (apparently) top-level leaving

~77 instances of basic-asm-in-a-function to be fixed for gcc builds.  
Most of these are in gcc/config or libgcc/config with just a handful per 
platform.  Lists available upon request.


FWIW...
Index: gcc/ada/init.c
===
--- gcc/ada/init.c	(revision 237502)
+++ gcc/ada/init.c	(working copy)
@@ -2141,7 +2141,7 @@
 #if defined (__i386__) && !defined (VTHREADS)
   /* This is used to properly initialize the FPU on an x86 for each
  process thread. Is this needed for x86_64 ???  */
-  asm ("finit");
+  asm ("finit":);
 #endif
 
   /* Similarly for SPARC64.  Achieved by masking bits in the Trap Enable Mask
@@ -2652,7 +2652,7 @@
   /* This is used to properly initialize the FPU on an x86 for each
  process thread.  */
 
-  asm ("finit");
+  asm ("finit":);
 
 #endif  /* Defined __i386__ */
 }
Index: libatomic/config/x86/fenv.c
===
--- libatomic/config/x86/fenv.c	(revision 237502)
+++ libatomic/config/x86/fenv.c	(working copy)
@@ -68,10 +68,10 @@
   if (excepts & FE_DENORM)
 {
   struct fenv temp;
-  asm volatile ("fnstenv\t%0" : "=m" (temp));
-  temp.__status_word |= FE_DENORM;
-  asm volatile ("fldenv\t%0" : : "m" (temp));
-  asm volatile ("fwait");
+  __builtin_ia32_fnstenv (&temp);
+  temp.__status_word |= FE_DENORM;
+  __builtin_ia32_fldenv (&temp);
+  asm ("fwait" ::"m" (temp):); /* Trigger the fp exception.  */
 }
   if (excepts & FE_DIVBYZERO)
 {
@@ -88,18 +88,18 @@
   if (excepts & FE_OVERFLOW)
 {
   struct fenv temp;
-  asm volatile ("fnstenv\t%0" : "=m" (temp));
-  temp.__status_word |= FE_OVERFLOW;
-  asm volatile ("fldenv\t%0" : : "m" (temp));
-  asm volatile ("fwait");
+  __builtin_ia32_fnstenv (&temp);
+  temp.__status_word |= FE_OVERFLOW;
+  __builtin_ia32_fldenv (&temp);
+  asm ("fwait" ::"m" (temp):); /* Trigger the fp exception.  */
 }
   if (excepts & FE_UNDERFLOW)
 {
   struct fenv temp;
-  asm volatile ("fnstenv\t%0" : "=m" (temp));
-  temp.__status_word |= FE_UNDERFLOW;
-  asm volatile ("fldenv\t%0" : : "m" (temp));
-  asm volatile ("fwait");
+  __builtin_ia32_fnstenv (&temp);
+  temp.__status_word |= FE_UNDERFLOW;
+  __builtin_ia32_fldenv (&temp);
+  asm ("fwait" ::"m" (temp):); /* Trigger the fp exception.  */
 }
   if (excepts & FE_INEXACT)
 {
Index: libcilkrts/runtime/config/x86/os-unix-sysdep.c
===
--- libcilkrts/runtime/config/x86/os-unix-sysdep.c	(revision 237502)
+++ libcilkrts/runtime/config/x86/os-unix-sysdep.c	(working copy)
@@ -85,7 +85,11 @@
 _mm_pause();
 #   endif
 #elif defined __i386__ || defined __x86_64
+#ifdef __GNUC__
+  __builtin_ia32_pause ();
+#else
 __asm__("pause");
+#endif
 #else
 #   warning __cilkrts_short_pause empty
 #endif
Index: libgcc/config/i386/sfp-exceptions.c
===
--- libgcc/config/i386/sfp-exceptions.c	(revision 237502)
+++ libgcc/config/i386/sfp-exceptions.c	(working copy)
@@ -59,10 +59,10 @@
   if (_fex & FP_EX_DENORM)
 {
   struct fenv temp;
-  asm volatile ("fnstenv\t%0" : "=m" (temp));
-  temp.__status_word |= FP_EX_DENORM;
-  asm volatile ("fldenv\t%0" : : "m"

Re: Deprecating basic asm in a function - What now?

2016-06-22 Thread David Wohlferd

In the end, my problems with basic-asm-in-functions (BAIF) come down to 
reliably generating correct code.


Optimizations are based on the idea of "you can safely modify x if you 
can prove y."  But given that basic asm are opaque blocks, there is no 
way to prove, well, much of anything.  Adding 'volatile' and now 
'memory' do help.  But the very definition of "opaque blocks" means that 
you absolutely cannot know what weird-ass crap someone might be trying.  
And every time gcc tweaks any of the optimizers, there's a chance that 
this untethered blob could bubble up or down within the code.  There is 
no way for any optimizer to safely decide whether a given position 
violates the intent of the asm, since it can have no idea what that 
intent is.  Can even a one line displacement of arbitrary assembler code 
cause race conditions or data corruption without causing a compiler 
error?  I'll bet it can.


And also, there is the problem of compatibility.  I am told that the 
mips "sync" instruction (used in multi-thread programming) requires a 
memory clobber to work as intended.  With that in mind, when I see this 
code on some blog, is it safe?


   asm("sync");

And of course the answer is that without context, there's no way to 
know.  It might have been safe for the compiler in the blogger's 
environment.  Or maybe he was an idiot.  It's certainly not safe in any 
released version of gcc.  But unless the blog reader knows the blogger's 
level of intelligence, development environment, and all the esoteric 
differences between that environment and their own, they could easily 
copy/paste that right into their own project, where it will compile 
without error, and "nearly always" work just fine...


With Bernd's recent change, people who have this (incorrect) code in 
their gcc project will suddenly have their code start working correctly, 
and that's a good thing.  Well, they will a year or so from now.  If 
they immediately start using v7.0.  And if they prevent everyone from 
trying to compile their code using 6.x, 5.x, 4.x etc where it will never 
work correctly.  So yes, it's a fix. But it can easily be years before 
this change finally solves this problem.  However, changing this to use 
extended:


   asm("sync":::"memory");

means that their very next source code release will immediately work 
correctly on every version of gcc.


You make the point about how people changing this code could easily muck 
things up.  And I absolutely agree, they could.  On the other hand, it's 
possible that their code is already mucked up and they just don't know 
it.  But even if they change this code to asm("sync":::) (forcing it to 
continue working the same way it has for over a decade), the 
programmer's intent is clear (if wrong).  A knowledgeable mips 
programmer could look at that and say "Hey, that's not right."  While 
making that same observation with asm("sync") is all but impossible.


BAIF is fragile.  That, combined with: unmet user expectations, bad user 
code, inconsistent implementations between compilers, changing 
implementations within compilers, years of bad docs, no "best practices" 
guide and an area that is very complex all tell me that this is an area 
filled with unfixable, ticking bombs.  That's why I think it should be 
deprecated.


Other comments inline:


On 20/06/16 18:36, Michael Matz wrote:

I see zero gain by deprecating them and only churn.  What would be the
advantage again?

Correctness.

As said in the various threads about basic asms, all correctness
problems can be solved by making GCC more conservative in handling them
(or better said: not making it less conservative).

If you talk about cases where basic asms diddle registers expecting GCC to
have placed e.g. local variables into specific ones (without using local
reg vars, or extended asm) I won't believe any claims ...


It is very likely that many of these basic asms are not
robust

... of them being very likely without proof.


I don't have a sample of people accessing local variables, but I do have 
one where someone was using 'scratch' registers in BAIF assuming the 
compiler would just "handle it."  And before you call that guy a dummy, 
let me point out that under some compilers, that's a perfectly valid 
assumption for asm.  And even if not, it may have been working "just 
fine" in his implementation for years.  Which doesn't make it right.



They will have stopped
working with every change in compilation options or compiler version.  In
contrast I think those that did survive a couple years in software very
likely _are_ correct, under the then documented (or implicit) assumptions.
Those usually are: clobbers and uses memory, processor state and fixed
registers.


As I was saying, a history of working doesn't make it right.  It just 
means the ticking hasn't finished yet.


How would you know if you have correctly followed the documented rules?  
Don't expect the compiler to flag these violations for you.



in t

Re: Deprecating basic asm in a function - What now?

2016-06-22 Thread David Wohlferd

On 6/21/2016 9:43 AM, Jeff Law wrote:
> I think there's enough resistance to deprecating basic asms within a 
function that we should probably punt that idea.

I don't disagree that there has been pushback.  I just wish less of it 
was of the form "Because I don't wanna."  A few examples of "Here's 
something that has to be in basic asm because..." might have produced a 
more interesting discussion.  I think if people had to defend (even to 
themselves) why they were using BAIF, there might be more converts.

And I *get* that it takes time to re-write this, and people have 
schedules, lives, a need for sleep.  But even under the most insanely 
aggressive schedule I can imagine (if gcc continue to release ~1/year), 
it will be at least a year before there's a release that has the 
(disable-able) warning, and another year before we could even think 
about actually removing this.  So someone who plans to use v8.0 in their 
production code on the day it is released still has a minimum of *two 
years* to get ready.

> I do think we should look to stomp out our own uses of basic asms 
within functions just from a long term maintenance standpoint.

Fixes.patch (from the start of this thread) seems mostly uncontested.  
Send it to patches?

If someone wanted to clean up a bunch of these, they should take a look 
at CRT_CALL_STATIC_FUNCTION in gcc/config.  This is defined for nearly 
every platform, and most of them do it with basic asm.  I gotta wonder 
if there's a better way to do this.  Isn't there an attribute for 'section'?

> Finally I think we should continue to bring the implementation of 
basic asms more in-line with expectations and future proofing them

I believe there are compilers that do safely use inline asm. However it 
appears that they accomplish this trick by parsing the asm.  Not 
something I expect to see added to gcc anytime soon...

> since I'm having a hard time seeing a reasonable path to deprecating 
their use.

Umm.  Hmm.  Seems like the ideal answer here would be something that 
prevents new code from using BAIF, without putting the old-timers in an 
uproar.

So how about the old "empty threat" gambit?  We could *say* we are going 
to deprecate it, put the (disable-able) warning into the code and the 
stern-sounding text in the docs, and then *leave* it like that for a 
decade or so.  The idea being that new code won't use it, but old code 
will still be supported.  Hopefully 10 years from now, there might be so 
little code that uses BAIF that finally removing it may no longer be so 
controversial.  It's not a lie, since deprecate means "express 
disapproval of" and yeah, that's about right.  And since we don't 
specify precisely *when* we intend to remove it...

dw

Unused variable in avx512fintrin.h

2016-08-09 Thread David Wohlferd


I'm looking at gcc/config/i386/avx512fintrin.h, and I see this:

extern __inline __m256i
__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
_mm512_cvtsepi64_epi32 (__m512i __A)
{
  __v8si __O;
  return (__m256i) __builtin_ia32_pmovsqd512_mask ((__v8di) __A,
   (__v8si)
   _mm256_undefined_si256 (),
   (__mmask8) -1);
}

Since r208793, __O appears to be unused?

For someone with checkin permissions (ie not me), removing this seems 
like an "as obvious."


dw

fxsrintrin.h

2016-08-18 Thread David Wohlferd

According to the docs 
(https://gcc.gnu.org/onlinedocs/gcc/x86-Built-in-Functions.html), 
__builtin_ia32_fxsave() has return type 'void.'  Given that, does this 
code (from gcc/config/i386/fxsrintrin.h) make sense?


_fxsave (void *__P)
{
  return __builtin_ia32_fxsave (__P);
}

Returning a void?  Is that a thing?  Similar question for _fxrstor, 
_fxsave64, and _fxrstor64.


And again in xsaveintrin.h for _xsave, _xrstor, _xsave64 and _xrstor64?

dw

Re: fxsrintrin.h

2016-08-18 Thread David Wohlferd

Interesting.  Seems slightly strange, but I've seen stranger.  I guess 
it's seen as "cleaner" than forcing this into 2 statements.


IAC, it seems wrong for headers, since they can be used from either C or 
C++.  Also, seems unnecessary here, since 'return' is implied by the 
fact that the 'next' statement is the end of the routine.


dw


On 8/18/2016 10:50 PM, lhmouse wrote:

Given the `_fxsave()` function returning `void`, it is invalid C but valid C++:

# WG14 N1256 (C99) / N1570 (C11)
6.8.6.4 The return statement
Constraints
1 A return statement with an expression shall not appear in a function
whose return type is void. ...

# WG21 N1804 (C++03)
6.6.3 The return statement [stmt.return]
3 A return statement with an expression of type “cv void” can be used
only in functions with a return type of cv void; the expression
is evaluated just before the function returns to its caller.

# WG21 N4582 (C++1z)
6.6.3 The return statement [stmt.return]
2 ... A return statement with an operand of type void shall be used
only in a function whose return type is cv void. ...


--  
Best regards,
lh_mouse
2016-08-19

-----
发件人：David Wohlferd 
发送日期：2016-08-19 11:51
收件人：gcc@gcc.gnu.org
抄送：
主题：fxsrintrin.h

According to the docs
(https://gcc.gnu.org/onlinedocs/gcc/x86-Built-in-Functions.html),
__builtin_ia32_fxsave() has return type 'void.'  Given that, does this
code (from gcc/config/i386/fxsrintrin.h) make sense?

  _fxsave (void *__P)
  {
return __builtin_ia32_fxsave (__P);
  }

Returning a void?  Is that a thing?  Similar question for _fxrstor,
_fxsave64, and _fxrstor64.

And again in xsaveintrin.h for _xsave, _xrstor, _xsave64 and _xrstor64?

dw

Re: doc maintainer questions

2016-10-02 Thread David Wohlferd


On 9/20/2016 12:36 PM, Gerald Pfeifer wrote:

[ Old e-mail alert ]

On Sat, 19 Dec 2015, David Wohlferd wrote:

I have been discussing adding some content to the basic asm docs. As part of
this work, I want to add a discussion of "How to convert basic asm to extended
asm."  However it doesn't seem like this is a good fit for the User Guide.
This is both because the UG doesn't generally talk about "How To" write code,
and because the text may need updates more often than the UG gets released.

For the latter, could referring to gcc.gnu.org/onlinedocs (in particular
the trunk version there) be a viable option?

That gets updated daily, so not a question of waiting for the next
release.


1) Is it appropriate for the UG to link to sections in the wiki?  I
see that we do, but should we?

Generally, I think that is fine.  It does require Internet access
and you run into the opposite of what you describe above (the Wiki
might have moved to coverage of a later version of GCC), but then
so does what I described at the beginning of this message.

Generally, I also think that we should not spread things across too
many places, and our documentation should be mostly self contained.

(The particular use case you had in mind, seems fine for the Wiki
or the web pages -- and I'm thinking of our porting_to.html pages
that we have had for the last couple of major releases.)


In the time since my email was sent, there has been good news and bad 
news on this subject:


The good news is that the wiki content got created (see 
https://gcc.gnu.org/wiki/ConvertBasicAsmToExtended) and a reference has 
been added to the docs (see 
https://gcc.gnu.org/onlinedocs/gcc/Basic-Asm.html).


The bad news is that the original intent of this doc was that it would 
be included as part of completely deprecating using Basic Asm in 
functions (BAIF), or at least adding a warning to help people find and 
convert at-risk code.  However when I sent the patches to do these 
things, I was unable to convince anyone to sign off on either of them.


Moving on, I am contemplating creating some additional content on a 
related subject, so perhaps you would care to voice your opinion as to 
the value and best location for this content.


The gcc docs are able to "get away" with not documenting a great deal of 
behavior for the compiler.  This is because the compiler is merely 
implementing the C/C++ standards.  If someone wants to know how a 
particular feature is intended to work, they can read the standard and 
assume that's what the compiler does.


However this approach doesn't apply when gcc adds extensions to the 
standard.  In these cases, the gcc docs must provide 100% of the 
information for using the feature.  Anything not listed as explicitly 
supported is (by definition) "undefined" behavior. Having a program's 
function depend upon undefined behavior is risky, and is usually 
described as "a bad thing."


I believe the decision to continue to support BAIF necessitates 
documenting (somewhere) this gcc extension to clearly specify which uses 
are supported and which are not.  As a first cut, I intend to define a 
list of common programming tasks and document which ones are and are not 
supported.  Obviously this list can never be complete, but by explicitly 
listing some of the more common tasks, we can at least begin to steer 
people away from known-bad ideas. Off the top of my head:


- read/write global variables
- read/write function parameters
- read/write local variables
- return value from function
- modify registers (without restoring)
- modify flags (without restoring)
- modify the stack (push/pop/mov to/mov from)
- call a function
- invoke top-level asm
- jump outside single asm
- throw exceptions
- accessing variables declared in asm

That's probably enough for a start.

Unfortunately, anything we don't list remains "undefined behavior." 
Icky, I know, but right now EVERYTHING is undefined.  We are leaving 
people to guess about what might be safe, and hope that "It's worked 
that way for a long time" or "Lots of people do it" implies that it will 
always work that way.  Talk about icky...


I intend to categorize each of these as one of:

A) Unsupported - While this might appear to work, that is merely 
happenstance and cannot be depended upon.  Even if there are no 
circumstances today where this will fail, this behavior is not supported 
and may fail in future releases.


B) Partial - This operation is safe under specific conditions.

C) Supported - This operation is safe and will not be mangled by 
optimizations.


At first read, I see 1(?) for category C, 3 for category B, and the rest 
are A.  However, as "the guy who wants to kill BAIF," perhaps I'm being 
too harsh.


Indeed, the biggest challenge for this project will be finding someone 
who is prepared to sign off on the accura

RFC: Doc update for attribute

2014-05-12 Thread David Wohlferd

After updating gcc's docs about inline asm, I'm trying to improve some 
of the related sections.  One that I feel has problems with clarity is  
__attribute__ naked.


I have attached my proposed update.  Comments/corrections are welcome.

In a related question:

To better understand how this attribute is used, I looked at the Linux 
kernel.  While the existing docs say "only ... asm statements that do 
not have operands" can safely be used, Linux routinely uses asm WITH 
operands.  Some examples:


memory clobber operand: 
http://lxr.free-electrons.com/source/arch/arm/kernel/kprobes.c#L377
Input arguments: 
http://lxr.free-electrons.com/source/arch/arm/mm/copypage-feroceon.c#L17


Since I don't know why "asm with operands" was excluded from the 
existing docs, I'm not sure whether what Linux does here is supported or 
not (maybe with some limitations?).  If someone can clarify, I'll add it 
to this text.


Even without discussing "asm with operands," I believe this text is an 
improvement.


Thanks in advance,
dw
Index: extend.texi
===
--- extend.texi	(revision 210349)
+++ extend.texi	(working copy)
@@ -3330,16 +3330,15 @@
 
 @item naked
 @cindex function without a prologue/epilogue code
-Use this attribute on the ARM, AVR, MCORE, MSP430, NDS32, RL78, RX and SPU
-ports to indicate that the specified function does not need prologue/epilogue
-sequences generated by the compiler.
-It is up to the programmer to provide these sequences. The
-only statements that can be safely included in naked functions are
-@code{asm} statements that do not have operands.  All other statements,
-including declarations of local variables, @code{if} statements, and so
-forth, should be avoided.  Naked functions should be used to implement the
-body of an assembly function, while allowing the compiler to construct
-the requisite function declaration for the assembler.
+This attribute is available on the ARM, AVR, MCORE, MSP430, NDS32, RL78, RX 
+and SPU ports.  It allows the compiler to construct the requisite function 
+declaration, while allowing the body of the function to be assembly code. 
+The specified function will not have prologue/epilogue sequences generated 
+by the compiler; it is up to the programmer to provide these sequences if 
+the function requires them. The expectation is that only Basic @code{asm} 
+statements will be included in naked functions (@pxref{Basic Asm}). While it 
+is discouraged, it is possible to write your own prologue/epilogue code 
+using asm and use ``C'' code in the middle.
 
 @item near
 @cindex functions that do not handle memory bank switching on 68HC11/68HC12

Re: RFC: Doc update for attribute

2014-05-16 Thread David Wohlferd

Thank you for your response.  This is exactly what I wanted to know. One 
last question:



+While it
+is discouraged, it is possible to write your own prologue/epilogue code
+using asm and use ``C'' code in the middle.

I wouldn't remove the last sentence since IMO it's not the intent of the feature
to ever support that and the compiler doesn't guarantee it and may result
in wrong code given that `naked' is a fragile low-level feature.


I'm assuming you meant "would remove."

I wasn't comfortable including that sentence, but I was following the 
existing docs.  Since they said you could "only" use basic asm, 
following that with a warning to "avoid" locals/if/etc was really 
confusing without this text.


Also, as ugly as this is, apparently some people really do this (comment 
6): https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43404#c6


We don't have to doc every crazy thing people try to do with gcc. But 
since it's out there, maybe we should this time?  If only to discourage it.


I'm *slightly* more in favor of keeping it.  But if you still feel it 
should go, it's gone.


Thanks,
dw

Re: RFC: Doc update for attribute

2014-05-20 Thread David Wohlferd

After thinking about this some more, I believe I have some better text.  
Previously I used the word "discouraged" to describe this practice.  The 
existing docs use the term "avoid."  I believe what you want is 
something more like the attached.  Direct and clear, just like docs 
should be.


If you are ok with this, I'll send it to gcc-patches.

dw


+While it
+is discouraged, it is possible to write your own prologue/epilogue 
code

+using asm and use ``C'' code in the middle.
I wouldn't remove the last sentence since IMO it's not the intent of 
the feature
to ever support that and the compiler doesn't guarantee it and may 
result

in wrong code given that `naked' is a fragile low-level feature.


I'm assuming you meant "would remove."

I wasn't comfortable including that sentence, but I was following the 
existing docs.  Since they said you could "only" use basic asm, 
following that with a warning to "avoid" locals/if/etc was really 
confusing without this text.


Also, as ugly as this is, apparently some people really do this 
(comment 6): https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43404#c6


We don't have to doc every crazy thing people try to do with gcc. But 
since it's out there, maybe we should this time?  If only to 
discourage it.


I'm *slightly* more in favor of keeping it.  But if you still feel it 
should go, it's gone.


Index: extend.texi
===
--- extend.texi	(revision 210624)
+++ extend.texi	(working copy)
@@ -3332,16 +3332,15 @@
 
 @item naked
 @cindex function without a prologue/epilogue code
-Use this attribute on the ARM, AVR, MCORE, MSP430, NDS32, RL78, RX and SPU
-ports to indicate that the specified function does not need prologue/epilogue
-sequences generated by the compiler.
-It is up to the programmer to provide these sequences. The
-only statements that can be safely included in naked functions are
-@code{asm} statements that do not have operands.  All other statements,
-including declarations of local variables, @code{if} statements, and so
-forth, should be avoided.  Naked functions should be used to implement the
-body of an assembly function, while allowing the compiler to construct
-the requisite function declaration for the assembler.
+This attribute is available on the ARM, AVR, MCORE, MSP430, NDS32,
+RL78, RX and SPU ports.  It allows the compiler to construct the
+requisite function declaration, while allowing the body of the
+function to be assembly code. The specified function will not have
+prologue/epilogue sequences generated by the compiler. Only Basic
+@code{asm} statements can safely be included in naked functions
+(@pxref{Basic Asm}). While using Extended @code{asm} or a mixture of
+Basic @code{asm} and ``C'' code may appear to work, they cannot be
+depended upon to work reliably and are not supported.
 
 @item near
 @cindex functions that do not handle memory bank switching on 68HC11/68HC12
@@ -6269,6 +6268,8 @@
 efficient code, and in most cases it is a better solution. When writing 
 inline assembly language outside of C functions, however, you must use Basic 
 @code{asm}. Extended @code{asm} statements have to be inside a C function.
+Functions declared with the @code{naked} attribute also require Basic 
+@code{asm} (@pxref{Function Attributes}).
 
 Under certain circumstances, GCC may duplicate (or remove duplicates of) your 
 assembly code when optimizing. This can lead to unexpected duplicate 
@@ -6388,6 +6389,8 @@
 
 Note that Extended @code{asm} statements must be inside a function. Only 
 Basic @code{asm} may be outside functions (@pxref{Basic Asm}).
+Functions declared with the @code{naked} attribute also require Basic 
+@code{asm} (@pxref{Function Attributes}).
 
 While the uses of @code{asm} are many and varied, it may help to think of an 
 @code{asm} statement as a series of low-level instructions that convert input

Question for ARM person re asm_fprintf

2014-07-21 Thread David Wohlferd

I have been looking at asm_fprintf in final.c, and I think there's a 
design flaw.  But since the change affects ARM and since I have no 
access to an ARM system, I need a second opinion.


asm_fprintf allows platforms to add support for new format specifiers by 
using the ASM_FPRINTF_EXTENSIONS macro.  ARM uses this to add support 
for %@ and %r specifiers.  Pretty straight-forward.


However, it isn't enough to add these two items to the case statement in 
asm_fprintf.  Over in c-format.c, there is compile-time checking that is 
done against calls to asm_fprintf to validate the format string.  %@ and 
%r have been added to this checking (see asm_fprintf_char_table), but 
NOT in a platform-specific way.


This means that using %r or %@ will successfully pass the format 
checking on all platforms, but will ICE on non-ARM platforms since there 
are no case statements in asm_fprintf to support them.


Compiling the code in asm_fprintf-1.c (see the patch) with this patch 
correctly reports "unknown conversion type character" for both 'r' and 
'@' in  x86_64-pc-cygwin.  It would be helpful if someone could confirm 
that it still compiles without error under ARM after applying this 
patch.  I'm reluctant to post this to gcc-patches when it has never been 
run.


dw
Index: gcc/c-family/c-format.c
===
--- gcc/c-family/c-format.c	(revision 212900)
+++ gcc/c-family/c-format.c	(working copy)
@@ -637,8 +637,9 @@
   { "I",   0, STD_C89, NOARGUMENTS, "",  "",   NULL },
   { "L",   0, STD_C89, NOARGUMENTS, "",  "",   NULL },
   { "U",   0, STD_C89, NOARGUMENTS, "",  "",   NULL },
-  { "r",   0, STD_C89, { T89_I,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "",  "", NULL },
-  { "@",   0, STD_C89, NOARGUMENTS, "",  "",   NULL },
+#ifdef ASM_FPRINTF_TABLE
+  ASM_FPRINTF_TABLE
+#endif
   { NULL,  0, STD_C89, NOLENGTHS, NULL, NULL, NULL }
 };
 
Index: gcc/config/arm/arm.h
===
--- gcc/config/arm/arm.h	(revision 212900)
+++ gcc/config/arm/arm.h	(working copy)
@@ -888,6 +888,12 @@
 fputs (reg_names [va_arg (ARGS, int)], FILE);	\
 break;
 
+/* Used in c-format.c to add entries to the table used to validate calls 
+   to asm_fprintf. */
+#define ASM_FPRINTF_TABLE \
+  { "r",   0, STD_C89, { T89_I,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "",  "", NULL }, \
+  { "@",   0, STD_C89, NOARGUMENTS, "",  "",   NULL },
+
 /* Round X up to the nearest word.  */
 #define ROUND_UP_WORD(X) (((X) + 3) & ~3)
 
Index: gcc/doc/tm.texi
===
--- gcc/doc/tm.texi	(revision 212900)
+++ gcc/doc/tm.texi	(working copy)
@@ -8611,8 +8611,39 @@
 The varargs input pointer is @var{argptr} and the rest of the format
 string, starting the character after the one that is being switched
 upon, is pointed to by @var{format}.
+See also ASM_FPRINTF_TABLE.
+
+Example:
+@smallexample
+#define ASM_FPRINTF_EXTENSIONS(FILE, ARGS, P)		\
+  case '@':		\
+fputs (ASM_COMMENT_START, FILE);			\
+break;		\
+			\
+  case 'r':		\
+fputs (REGISTER_PREFIX, FILE);			\
+fputs (reg_names [va_arg (ARGS, int)], FILE);	\
+break;
+@end smallexample
 @end defmac
 
+@defmac ASM_FPRINTF_TABLE
+When using ASM_FPRINTF_EXTENSIONS, you must also use this macro to define
+table entries for the printf format checking performed in c-format.c. 
+This macro must contain format_char_info entries for each printf format 
+being added.
+
+Example:
+@smallexample
+#define ASM_FPRINTF_TABLE \
+  @{ "r", 0, STD_C89, \
+ @{ T89_I, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, \
+  BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN @}, \
+  "", "", NULL @}, \
+  @{ "@",   0, STD_C89, NOARGUMENTS, "",  "",   NULL @},
+@end smallexample
+@end defmac
+
 @defmac ASSEMBLER_DIALECT
 If your target supports multiple dialects of assembler language (such as
 different opcodes), define this macro as a C expression that gives the
Index: gcc/doc/tm.texi.in
===
--- gcc/doc/tm.texi.in	(revision 212900)
+++ gcc/doc/tm.texi.in	(working copy)
@@ -6370,8 +6370,39 @@
 The varargs input pointer is @var{argptr} and the rest of the format
 string, starting the character after the one that is being switched
 upon, is pointed to by @var{format}.
+See also ASM_FPRINTF_TABLE.
+
+Example:
+@smallexample
+#define ASM_FPRINTF_EXTENSIONS(FILE, ARGS, P)		\
+  case '@':		\
+fputs (ASM_COMMENT_START, FILE);			\
+break;		\
+			\
+  case 'r':		\
+fputs (REGISTER_PREFIX, FILE);			\
+fputs (reg_names [va_arg (ARGS, int)], FILE);	\
+break;
+@end smallexample
 @end defmac
 
+@defmac ASM_FPRINTF_TABLE
+When using ASM_FPRINTF_EXTENSIONS, you must also use this macro to define
+ta

Re: C as intermediate language, signed integer overflow and -ftrapv

2014-07-23 Thread David Wohlferd

I believe that sometimes gcc is promoting the ints to long longs when 
doing the overflow testing.  If I try to overflow a long long, I get the 
trap as expected.


See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19020

dw

On 7/23/2014 7:56 AM, Thomas Mertes wrote:

C is popular as intermediate language. This means that some compilers
generate C and use a C compiler as backend. Wikipedia lists several
languages, which use C as intermediate language:
Eiffel, Sather, Esterel, some dialects of Lisp (Lush, Gambit),
Haskell (Glasgow Haskell Compiler), Squeak's Smalltalk-subset Slang,
Cython, Seed7 and Vala.

When C is used as backend the features needed from a C compiler are
different from the features that a human programmer of C needs.

One such feature is the detection of signed integer overflow. It is
not hard, to detect signed integer overflow with a generated C
program, but the performance is certainly not optimal. Signed integer
overflow is undefined behavior in C and the access to some overflow
flag of the CPU is machine dependent. So the generated C code must
recogize overflow before it happens or use unsigned computations and
recognize the overflow afterwards. I have doubts that this leads to
optimal performance.

The C compiler is much better suited to do signed integer overflow
checks. The C compiler can do low level tricks, that would be
undefined behavior in C, and the C compiler also knows about overflow
flags and other details of the CPU. Maybe the CPU can be switched to
a mode where it traps signed integer overflow for free.

The gcc compiler option -ftrapv promises to do exactly that, but it
seems broken. At least my test suite shows that both gcc version
4.6.3 and 4.8.1 fail to detect integer overflow with -ftrapv. The
detection fails even for addition and subtraction. I know that 4.6.3
and 4.8.1 are old, but I found nothing in the internet that tells
me that the situation is better now. So for gcc as C compiler backend
-ftrapv cannot be used and overflow checking in the generated C code
is necessary.

Clang supports -ftrapv reliably. Signed integer overflow raises the
signal SIGILL, which can be catched. Btw. SIGILL seems to be a better
choice, because under windows (7) SIGABRT causes some text to be
written to the console. Is it possible to choose a -ftrapv signal?

A sanitizer such as ubsan is good as tool to find errors in C
programs. But I don't think that ubsan is well suited to do overflow
detection with maximum performance. Is just not the goal of this
tool.

The argumantation that nobody uses -ftrapv is self-fulfilling
prophecy. How can someone expect that a buggy feature is used.
The counterexample is clang, where -ftrapv works and is also used
(E.g. by the integer overflow detection of Seed7).

Conclusion:
Signed integer overflow detection with -ftrapv is NOT something that
nobody uses. It is an important feature. Especially when C is used as
intermediate language. When it works it results in a significant
speed up of signed overflow detection. A sanitizer has a different
purpose and cannot be used as replacement.

I can offer some help with this issue:
I have test programs for cases with integer overflow and for cases
where the result is as big or as small as possible without causing an
overflow. The test programs are not written in C, but are licensed
with the GPL, and it would be possible to convert them to C with
reasonable effort. Maybe this is not necessary, because clang must
have some test suites for -ftrapv.

Greetings Thomas Mertes

--
Seed7 Homepage:  http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.

Re: Question for ARM person re asm_fprintf

2014-07-23 Thread David Wohlferd




Not that the following would constitute the actual testing
usually required for a patch, but:

/path/to/toplevel/configure --target=arm-eabi && make all-gcc
# Yay, the compiler-proper for a "bare iron" ARM compiler.

./gcc/xgcc -B./gcc -S test.c

Woot, compiled your first ARM program. :)  Just emitting text
assembly code, and #include's won't work, but a missing-case
leading-to-abort would be prominently noticed as an internal
compiler error.


I just tried this and I did get it to build and run.  And it's true that 
invalid formats do cause an error, even just using -S.  But I'm still 
glad that kyrill tried it too.


So I've sent this to gcc-patches.  Now comes the hard part: Patience...

dw

[RFD] Using the 'memory constraint' trick to avoid memory clobber doesn't work

2014-09-24 Thread David Wohlferd

Hans-Peter Nilsson: I should have listened to you back when you raised 
concerns about this.  My apologies for ever doubting you.


In summary:

- The "trick" in the docs for using an arbitrarily sized struct to force 
register flushes for inline asm does not work.
- Placing the inline asm in a separate routine can sometimes mask the 
problem with the trick not working.
- The sample that has been in the docs forever performs an unhelpful, 
unexpected, and probably unwanted stack allocation + memcpy.


Details:

Here is the text from the docs:

---
One trick to avoid [using the "memory" clobber] is available if the size 
of the memory being accessed is known at compile time. For example, if 
accessing ten bytes of a string, use a memory input like:


"m"( ({ struct { char x[10]; } *p = (void *)ptr ; *p; }) )
---

When I did the re-write of gcc's inline asm docs, I left the description 
for this (essentially) untouched.  I just took it on faith that "magic 
happens" and the right code gets generated.  But reading a recent post 
raised questions for me, so I tried it.  And what I found was that not 
only does this not work, it actually just makes a mess.


I started with some code that I knew required some memory clobbering:

#include 

int main(int argc, char* argv[])
{
  struct
  {
int a;
int b;
  } c;

  c.a = 1;
  c.b = 2;

  int Count = sizeof(c);
  void *Dest;

  __asm__ __volatile__ ("rep; stosb"
   : "=D" (Dest), "+c" (Count)
   : "0" (&c), "a" (0)
   //: "memory"
  );

  printf("%u %u\n", c.a, c.b);
}

As written, this x64 code (compiled with -O2) will print out "1 2", even 
though someone might (incorrectly) expect the asm to overwrite the 
struct with zeros.  Adding the memory clobber allows this code to work 
as expected (printing "0 0").


Now that I have code I can use to see if registers are getting flushed, 
I removed the memory clobber, and tried just 'clobbering' the struct:


#include 

int main(int argc, char* argv[])
{
  struct
  {
int a;
int b;
  } c;

  c.a = 1;
  c.b = 2;

  int Count = sizeof(c);
  void *Dest;

  __asm__ __volatile__ ("rep; stosb"
   : "=D" (Dest), "+c" (Count)
   : "0" (&c), "a" (0),
   "m" ( ({ struct foo { char x[8]; } *p = (struct foo *)&c ; 
*p; }) )

  );

  printf("%u %u\n", c.a, c.b);
}

I'm using a named struct (foo) to avoid some compiler messages, but 
other than that, I believe this is the same as what's in the docs. And 
it doesn't work.  I still get "1 2".


At this point I realized that code I've seen using this trick usually 
has the asm in its own routine.  When I try this, it still fails.  
Unless I start cranking up the size of x from 8 to ~250.  At ~250, 
suddenly it starts working.  Apparently this is because at this point, 
gcc decides not to inline the routine anymore, and flushes the registers 
before calling the non-inline code.


And why does changing the size of the structure we are pointing to 
result in increases in the size of the routine?  Reading the -S output, 
the "*p" at the end of this constraint generates a call to memcpy the 
250 characters onto the stack, which it passes to the asm as %4, which 
is never used.  Argh!


Conclusion:

What I expected when using that sample code from the docs was that any 
registers that contain values from the struct would get flushed to 
memory.  This was intended to be a 'cheaper' alternative to doing a 
full-on "memory" clobber.  What I got instead was an unexpected/unneeded 
stack allocation and memcpy, and STILL didn't get the values flushed.  
Yeah, not exactly the 'cheaper' I was hoping for.


Is the example in the docs just written incorrectly?  Did this get 
broken somewhere along the line?  Or am I just using it wrong?


I'm using gcc version 4.9.0 (x86_64-win32-seh-rev2, Built by MinGW-W64 
project).  Remember to compile these x64 samples with -O2.


dw

Re: [RFD] Using the 'memory constraint' trick to avoid memory clobber doesn't work

2014-09-26 Thread David Wohlferd



On 9/25/2014 12:36 AM, Yury Gribov wrote:

On 09/24/2014 12:31 PM, Richard Biener wrote:
On Wed, Sep 24, 2014 at 9:43 AM, David Wohlferd 
 wrote:

Hans-Peter Nilsson: I should have listened to you back when you raised
concerns about this.  My apologies for ever doubting you.

In summary:

- The "trick" in the docs for using an arbitrarily sized struct to 
force

register flushes for inline asm does not work.
- Placing the inline asm in a separate routine can sometimes mask the
problem with the trick not working.
- The sample that has been in the docs forever performs an unhelpful,
unexpected, and probably unwanted stack allocation + memcpy.

Details:

Here is the text from the docs:

---
One trick to avoid [using the "memory" clobber] is available if the 
size of

the memory being accessed is known at compile time. For example, if
accessing ten bytes of a string, use a memory input like:

 "m"( ({ struct { char x[10]; } *p = (void *)ptr ; *p; }) )


Well - this can't work because you essentially are using a _value_
here (looking at the GIMPLE - I'm not sure if a statement expression
evaluates to an lvalue.

It should work if you simply do this without a stmt expression:

   "m" (*(struct { char x[10]; } *)ptr)

because that's clearly an lvalue (and the GIMPLE correctly says so):

   :
   c.a = 1;
   c.b = 2;
   __asm__ __volatile__("rep; stosb" : "=D" Dest_4, "=c" Count_5 : "0"
&c, "a" 0, "m" MEM[(struct foo *)&c], "1" 8);
   printf ("%u %u\n", 1, 2);

note that we still constant propagated 1 and 2 for the reason that
the asm didn't get any VDEF.  That's because you do not have any
memory output!  So while it keeps 'c' live it doesn't consider it
modified by the asm.  You'd still need to clobber the memory,
but "m" clobbers are not supported, only "memory".

Thus fixed asm:


   __asm__ __volatile__ ("rep; stosb"
: "=D" (Dest), "+c" (Count)
: "0" (&c), "a" (0),
"m" (*( struct foo { char x[8]; } *)&c)
: "memory"
   );

where I'm not 100% sure if the "m" input is now pointless (that is,
if a "memory" clobber also constitutes a use of all memory).


Or maybe even
  __asm__ __volatile__ ("rep; stosb"
   : "=D" (Dest), "+c" (Count), "+m" (*(struct foo { char x[8]; } 
*)&c)

   : "0" (&c), "a" (0)
  );
to avoid the big-hammer memory clobber?

-Y



Thank you both for the responses.  At this point I've started composing 
some replacement text for the docs (below), cuz clearly what's there is 
both inadequate and wrong.  All comments are welcome.


While the code in Richard's response will always produce the correct 
results, the intent here is to use memory constraints to *avoid* the 
performance penalties of the memory clobber.  The existing docs say this 
should work, and I've seen a number of places using it (linux kernel, 
glibc, etc).  If this does work, we should update the docs to say how 
it's done.  If this doesn't work, we should say that too.


Looking at Yury's response, that code does work (in c, not c++).  At 
least it does sometimes.  The problem is that sometimes gcc can "lose 
track" of what a pointer points to.  And when it does, gcc can get 
confused about what to flush.  Here's a simple example that shows this 
(4.9.0, x64, -O2, 'c'):


#include 

inline void *memset( void * Dest, int c, size_t Count) {
   void *dummy;

   __asm__ (
  "rep stosb"
  : "=D" (dummy), "+c" (Count), "=m" (*( struct foo { char x[8]; } 
*)Dest)

  : "0" (Dest), "a" (c)
  : "cc"//, "memory"
   );

   return Dest;
}

int main() {
   struct {
 int a;
   } c;

   void *v;
   asm volatile("#" : "=r" (v) : "0" (&c)  );

   c.a = 0x30303030;

   //memset(&c, 0, sizeof(c));
   memset(v, 0, sizeof(c));
   printf("%x\n", c.a);
}

This code will work if you pass &c to memset.  But it will fail if you 
use v.  Oh, the wonders of aliasing.


And this is why I'm having a problem doc'ing this.  I love the potential 
benefits of using this.  But if you are writing general-purpose 
routines, how can you hope to know whether the code calling you will 
pass you a pointer that will work with this trick? This kind of thing 
can introduce *horribly* hard to track down problems.  Note that the 
memory clobber always works, but potentially with a performance penalty. 



I could simply skip describing the whole problem with aliasing, but that 
just hides the problem.  I hate wh

Re: [RFD] Using the 'memory constraint' trick to avoid memory clobber doesn't work

2014-10-02 Thread David Wohlferd




You want

"=m" (*( struct foo  { char x[8]; } __attribute__((may_alias)) *)Dest)


Thank you.  With your help, that worse-than-useless sample in the docs 
is getting closer to something people can actually use.


Except for one last serious problem:  This trick only works for very 
specific (and very small) sizes of x (something else the existing docs 
fail to mention).


On x64: Sizes of 1/2/4/8/16 correctly clobber Dest (and only Dest) as 
expected.  Trying to use any other values (5, 12, 32, etc) seem to force 
a full memory clobber.  This is the case REGARDLESS of the actual size 
of the underlying structure.  For example clobbering 16 bytes of a 12 
byte struct only clobbers the 12 bytes of Dest (as expected), but 
clobbering 12 bytes of a 12 byte struct performs a full memory clobber.  
This presents a problem for general purpose routines like memset, where 
the actual size of the block cannot be hardcoded into the struct.


glibc (which uses this trick to avoid using the memory clobber) uses a 
size of 0xfff.  Being larger than 16, this always causes a full 
memory clobber, not the "memory constraint" clobber I assume they were 
expecting.  OTOH, since they don't use may_alias, this is may actually 
be a good thing...


While I really like the idea of using memory constraints to avoid all 
out memory clobbers, 16 bytes is a pretty small maximum memory block, 
and x32 only supports a max of 8.  Unless there's some way to use larger 
sizes (say SSIZE_MAX), this feature hardly seems worth documenting.


dw

Re: [RFD] Using the 'memory constraint' trick to avoid memory clobber doesn't work

2014-11-13 Thread David Wohlferd

Sorry for the (very) delayed response.  I'm still looking for feedback 
here so I can fix the docs.


To refresh: The topic of conversation was the (extremely) wrong 
explanation that has been in the docs since forever about how to use 
memory constraints with inline asm to avoid the performance hit of a 
full memory clobber.  Trying to understand how this really works has led 
to some surprising results.


Me:
>> While I really like the idea of using memory constraints to avoid 
all out
>> memory clobbers, 16 bytes is a pretty small maximum memory block, 
and x86
>> only supports a max of 8.  Unless there's some way to use larger 
sizes (say

>> SSIZE_MAX), this feature hardly seems worth documenting.

Richard:
> I wonder how you figured out that a 12 byte clobber performs a full
> memory clobber?

Here's the code (compiled with gcc version 4.9.0 x86_64-win32-seh-rev2, 
using -m64 -O2 -fdump-final-insns):



#include 

#define MYSIZE 3

inline void
__stosb(unsigned char *Dest, unsigned char Data, size_t Count)
{
   struct _reallybigstruct { char x[MYSIZE]; }
  *p = (struct _reallybigstruct *)Dest;

   __asm__ __volatile__ ("rep stos{b|b}"
  : "+D" (Dest), "+c" (Count), "=m" (*p)
  : [Data] "a" (Data)
  //: "memory"
   );
}

int main()
{
   unsigned char buff[100];
   buff[5] = 'A';

   __stosb(buff, 'B', sizeof(buff));
   printf("%c\n", buff[5]);
}


In summary:

   1) Create a 100 byte buffer, and set buff[5] to 'A'.
   2) Call __stosb, which uses inline asm to overwrite all of buff with 
'B'.
   3) Use a memory constraint in __stosb to flush buff.  The size of 
the memory constraint is controlled by a #define.


With this, I have a simple way to test various sizes of memory 
constraints to see if the buffer gets flushed.  If it *is* flushing the 
buffer, printing buff[5] after __stosb will print 'B'.  If it is *not* 
flushing, it will print 'A'.


Results:
   - Since buff[5] is the 6th byte in the buffer, using memory 
constraint sizes of 1, 2 & 4 (not surprisingly) all print 'A', showing 
that no flush was done.
   - Sizes of 8 and 16 print 'B', showing that the flush was done. This 
is also the expected result, since I am now flushing enough of buff to 
include buff[5].
   - The surprise comes from using a size of 3 or 5.  These also print 
'B'.  WTF?  If 4 doesn't flush, why does 3?


I believe the answer comes from reading the RTL.  The difference between 
sizes of 3 and 16 comes here:


  (set (mem/c:TI (plus:DI (reg/f:DI 7 sp)
 (const_int 32 [0x20])) [ MEM[(struct _reallybigstruct *)&buff]+0 
S16 A128])

 (asm_operands/v:TI ("rep stos{b|b}") ("=m") 2 [

   (set (mem/c:BLK (plus:DI (reg/f:DI 7 sp)
 (const_int 32 [0x20])) [ MEM[(struct _reallybigstruct 
*)&buff]+0 S3 A128])

 (asm_operands/v:BLK ("rep stos{b|b}") ("=m") 2 [

While I don't actually speak RTL, TI clearly refers to TIMode. 
Apparently when using a size that exactly matches a machine mode, asm 
memory references (on i386) can flush the exact number of bytes.  But 
for other sizes, gcc seems to falls back to BLK mode, which doesn't.


I don't know the exact meaning of BLK on a "set" or "asm_operands." Does 
it cause a full clobber?  Or just a complete clobber of buff? Attempting 
to answer that question leads us to the second bit of code:



#include 

#define MYSIZE 8

inline void
__stosb(unsigned char *Dest, unsigned char Data, size_t Count)
{
   struct _reallybigstruct { char x[MYSIZE]; }
  *p = (struct _reallybigstruct *)Dest;

   __asm__ __volatile__ ("rep stos{b|b}"
  : "+D" (Dest), "+c" (Count), "=m" (*p)
  : [Data] "a" (Data)
  //: "memory"
   );
}
int main()
{
   unsigned char buff[100], buff2[100];
   buff[5] = 'A';
   buff2[5] = 'M';
   asm("#" : : "r" (buff2));

   __stosb(buff, 'B', sizeof(buff));
   printf("%c %c\n", buff[5], buff2[5]);
}


Here I've added a buff2, and I set buff2[5] to 'M' (aka ascii 77), which 
I also print.  I still perform the memory constraint against buff, then 
I check to see if it is affecting buff2.


I start by compiling this with a size of 8 and look at the -S output.  
If this is NOT doing a full clobber, gcc should be able to just print 
buff2[5] by moving 77 into the appropriate register before calling 
printf.  And indeed, that's what we see.


/APP
 # 17 "mem2.cpp" 1
rep stosb
 # 0 "" 2
/NO_APP
movzbl  37(%rsp), %edx
movl$77, %r8d
leaq.LC0(%rip), %rcx
callprintf

If using a size of 3 *is* causing a full memory clobber, we would expect 
to see the value getting read from memory before calling printf.  And 
indeed, that's also what we see.


/APP
 # 17 "mem2.cpp" 1
rep stosb
 # 0 "" 2
/NO_APP
movzbl  37(%rsp), %edx
leaq.LC0(%rip), %rcx
movzbl  149(%rsp), %r8d

I don't know the internals of gcc well enough to understand exactly why 
this is happening.  But from a user's poin

Re: [RFD] Using the 'memory constraint' trick to avoid memory clobber doesn't work

2014-11-13 Thread David Wohlferd



On 11/13/2014 6:02 AM, Richard Biener wrote:

On Thu, Nov 13, 2014 at 2:53 PM, Hans-Peter Nilsson  wrote:

On Thu, 13 Nov 2014, David Wohlferd wrote:

Sorry for the (very) delayed response.  I'm still looking for feedback here so
I can fix the docs.

Thank you for your diligence.


As I said before, triggering a full memory clobber for anything over 16 bytes
(and most sizes under 16 bytes) makes this feature all but useless.  So if
that's really what's happening, we need to decide what to do next:

1) Can this be "fixed?"
2) Do we want to doc the current behavior?
3) Or do we just remove this section?

I think it could be a nice performance win for inline asm if it could be made
to work right, but I have no idea what might be involved in that.  Failing
that, I guess if it doesn't work and isn't going to work, I'd recommend
removing the text for this feature.

Since all 3 suggestions require a doc change, I'll just say that I'm prepared
to start work on the doc patch as soon as someone lets me know what the plan
is.

Richard?  Hans-Peter?  Your thoughts?

I've forgot if someone mentioned whether we have a test-case in
our test-suite for this feature.


I'm looking thru gcc/testsuite/*.c to see if I can spot anything. It's 
not easy since there is a lot of asm and the people who write these are 
apparently allergic to using comments to describe what they are testing.



If we don't, then 3; removal.
If we do, I guess it's flawed or at least not agreeing with the
documentation?  Then it *might* be worth the effort fixing that
and additional test-coverage (depending on the person stepping
up...) but 3 is IMHO still an arguably sane option.

Well, as what is missing is just an optimization I'd say we should
try to fix it.


While I'd love to be the one to fix this, the fact of the matter is that 
most of gcc is a black box to me.  Even if you told me roughly where to 
start, I'd have no idea of the downstream impacts of anything I changed.


So while I understand that it looks like I'm just finding work for other 
people, fixing something like this is simply beyond me.  That said, I'm 
certainly prepared to outline what I see as the interesting test cases 
and to do some testing if someone else is willing to step up and do this 
optimization.



And surely the docs should not promise that optimization
will happen - it should just mention that doing this might allow
optimization to happen.


I can agree with this.  I am quite confident there will be occasions 
where gcc has no option but to fall back to doing a full clobber to 
ensure correct function (a possibility which the current docs also fails 
to mention).  So yes, the docs should be limited in what it promises here.


Which brings us to the question: what do we do now?  The 15th is fast 
approaching.  Can something like this get done before then? Can it be 
checked in for 5.0 after the 15th?  Or does it need to wait for 6.0?


If it does need to wait for 6.0, what do we want to do with the docs in 
the meantime?  Given how wrong they are currently, I'd hate to ship yet 
another release with that ugly text.  But trying to describe the best 
way to take advantage of optimizations that haven't been written yet 
is... hard.


Since (as I understand it) 5.0 docs *can* be checked in after the 15th, 
my recommendations:


 - If someone is prepared to step up and do this work for v5.0, then 
I'll wait and write the docs when they are done and can describe how it 
works.
 - If this is going to wait for 6.0, then if someone does (at least) 
enough investigative work to be able to describe how this will 
eventually work, I'll update the 5.0 docs in a general way talking about 
ways gcc *may* be able to optimize.  It should be possible to phrase 
this so code people write today will work even better tomorrow.
 - Worst case is if no one has the time to look at this for the 
foreseeable future.  In that case, I'm with Hans-Peter.  Let's take the 
existing text out.  Following the existing text makes things *worse*, 
and the current implementation is so limited that I'd be surprised if 
anyone's code actually uses it successfully.  New text can get added 
when the new code is.


Hmm.  I just had a thought: Is it possible the problem I'm seeing here 
is platform-specific?  Maybe this works perfectly for non-i386 code?  
That would certainly change my recommendations here.


dw

Re: [RFD] Using the 'memory constraint' trick to avoid memory clobber doesn't work

2014-11-15 Thread David Wohlferd




Can you please file a bug in bugzilla if you haven't already done so?
We can still fix bugs during the next three months ;)


Well, I opened one.  Briefly.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63900

dw

Re: organization of optimization options in manual

2015-01-17 Thread David Wohlferd



On 1/17/2015 12:13 PM, Sandra Loosemore wrote:
BTW, as a GCC user I'm also very frustrated by the (lack of) 
organization in the extensions chapter; the information about 
attributes and built-in functions is all mixed up with sections on 
random syntactic extensions like "Dollar Signs in Identifier Names", 
etc. Maybe somebody else could work on a proposal for reordering the 
material in that chapter in parallel?


I actually took a shot at this a while back, but didn't think anyone 
would be interested.  60+ items in an unordered list seemed a bit much 
for human consumption.  Even when you *know* something is on the page it 
can be hard to find.


Let me dig that back out and post it here in a new thread.

dw

organization of C Extensions in manual

2015-01-17 Thread David Wohlferd

The page for Extensions to the C Language Family 
(https://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html) is very long (60+ 
items) and completely unordered.  This makes it hard to find things, 
even when you know they are there.  I have taken a first shot at 
grouping these.  Hopefully it can at least serve as the basis for 
discussion.


These section headings were just the general terms that first occurred 
to me.  Feel free to propose more computer-y terminology. And yes, some 
of these topics are a stretch to be included in the sections in which I 
have placed them.  Still, I believe this is better than what is there now.


Better section names and alternative groupings welcome.

Arrays and Vectors
Designated Inits: Labeling elements of initializers.
Pointers to Arrays: Pointers to arrays with qualifiers work as 
expected.

Subscripting: Any array can be subscripted, even if not an lvalue.
Variable Length: Arrays whose length is computed at run time.
Vector Extensions: Using vector instructions through built-in 
functions.

Zero Length: Zero-length arrays.

Attributes
Attribute Syntax: Formal syntax for attributes.
Function Attributes: Declaring that functions have no side effects, 
or that they can never return.

Inline: Defining inline functions (as fast as macros).
Label Attributes: Specifying attributes on labels.
Thread-Local: Per-thread variables.
Type Attributes: Specifying attributes of types.
Variable Attributes: Specifying attributes of variables.
Volatiles: What constitutes an access to a volatile object.

BuiltIns
__atomic Builtins: Atomic built-in functions with memory model.
__sync Builtins: Legacy built-in functions for atomic memory access.
Cilk Plus Builtins: Built-in functions for the Cilk Plus language 
extension.
Integer Overflow Builtins: Built-in functions to perform 
arithmetics and arithmetic overflow checking.
Object Size Checking: Built-in functions for limited buffer 
overflow checking.

Other Builtins: Other built-in functions.
Pointer Bounds Checker builtins: Built-in functions for Pointer 
Bounds Checker.

Target Builtins: Built-in functions specific to particular targets.
x86 specific memory model extensions for transactional memory: x86 
memory models.


Computed values
Alignment: Inquiring about the alignment of a type or variable.
Conditionals: Omitting the middle operand of a ‘?:’ expression.
Initializers: Non-constant initializers.
Offsetof: Special syntax for implementing offsetof.
Pointer Arith: Arithmetic on void-pointers and function pointers.
Return Address: Getting the return or frame address of a function.
Statement Exprs: Putting statements and declarations inside 
expressions.

Target Format Checks: Format checks specific to particular targets.
Typeof: typeof: referring to the type of an expression.
Variadic Macros: Macros with a variable number of arguments.

Constant Values
Binary constants: Binary constants using the ‘0b’ prefix.
Character Escapes: ‘\e’ stands for the character .
Compound Literals: Compound literals give structures, unions or 
arrays as values.

Escaped Newlines: Slightly looser rules for escaped newlines.
Hex Floats: Hexadecimal floating-point constants.
Labels as Values: Getting pointers to labels, and computed gotos.

Data types
__int128: 128-bit integers---__int128.
Complex: Data types for complex numbers.
Decimal Float: Decimal Floating Types.
Empty Structures: Structures with no members.
Fixed-Point: Fixed-Point Types.
Floating Types: Additional Floating Types.
Half-Precision: Half-Precision Floating Point.
Long Long: Double-word integers---long long int.

Naming
Alternate Keywords: __const__, __asm__, etc., for header files.
Cast to Union: Casting to union type from any member of the union.
Dollar Signs: Dollar sign is allowed in identifiers.
Function Names: Printable strings which are the name of the current 
function.

Incomplete Enums: enum foo;, with details to follow.
Local Labels: Labels local to a block.
Mixed Declarations: Mixing declarations and code.
Named Address Spaces: Named address spaces.
Unnamed Fields: Unnamed struct/union fields within structs/unions.

Misc
C++ Comments: C++ comments are recognized.
Case Ranges: `case 1 ... 9' and such.
Constructing Calls: Dispatching a call to another function.
Function Prototypes: Prototype declarations and old-style definitions.
Nested Functions: As in Algol and Pascal, lexical scoping of functions.
Pragmas: Pragmas accepted by GCC.
Using Assembly Language with C: Instructions and extensions for 
interfacing C with assembler.


In a related question, should we change the "Extensions" page to only 
contain these section headers ("Data types", "Builtins" etc) and make 
submenus (ie new pages) out of the items under them?  From a str

Re: organization of C Extensions in manual

2015-01-25 Thread David Wohlferd


> H, looking at your list, I think it is better to leave things that
> are hard to categorize where they are, as top-level sections within the
> chapter, rather than trying to group them arbitrarily with other things
> that are hard to categorize; the latter only makes things things harder
> to find by burying them.

I understand what you are saying.  I have said uncomplimentary things 
about other documentation that had the information I was looking for in 
the 'wrong' place.


OTOH, leaving everything ungrouped is what we have now, and that's also 
a mess.  An unordered list with more than about a dozen entries also 
makes it "hard to find things by burying them."


While I have disagreed with some of your suggestions (see comments 
below), some of them made good sense.  In an attempt to keep this post 
readable, I have included a complete, updated list of groupings at the 
end, rather than piecemeal inline.


>> Arrays and Vectors
>>  Designated Inits: Labeling elements of initializers.
>
> I think this one might better be placed in a section on initializers.

Since the only place 'Designated Inits' can be used is on arrays, 
placing this in the 'Arrays and Vectors' section seems appropriate.


>>  Vector Extensions: Using vector instructions through built-in 
functions.

>
> I'm not sure where this one belongs, but it doesn't seem to go with
> array extensions.  It's not really about builtins, either.

The section is entitled 'Arrays and Vectors.'  I don't see why a topic 
about vectors doesn't fit.


>
Extracting your next comments, you object to Inline, Thread-Local and 
Volatiles being grouped under 'Attributes.'  Honestly, I don't have a 
problem implying that these are attributes.  But since you do, I pored 
over some standards docs, and it seems like they refer to these things 
as 'Specifiers.'  I have created a new section and moved these three there.


>> Computed values
>> Constant Values

> I don't really like this division.  To me, it would be more useful to
> have a section on purely syntactic extensions (the binary constants,
> character escapes, dollar signs in identifiers), and one on expression
> values.

I don't know what the term "syntactic extensions" would mean in this 
context.  Not knowing what it means makes it difficult to know what 
items to place under it.


> I don't think the variadic macros section belongs in either of those
> groups and should be left separate for now.

Ok.

>> Naming

> A lot of these things don't have anything to do with naming, and
> burying them here would just make them hard to find.  How about
> moving the purely syntactic things to that section and leave the
> others at top-level for now, unless some other grouping suggests
> itself appropriate with the leftovers.

Ok, here you've got me.  Forcing some of these items into 'Naming' 
required a bit of imagination.  I could walk you thru the contortions I 
used to justify them, but finding a better grouping seems like a more 
sensible plan (see below).


>> Misc
> The first two items might go into the syntactic extensions category.
> Pragmas and inline asm definitely should stay at top level. Not
> sure about the others.

It is not clear to me that moving them back to top level provides any 
advantages over putting them in the 'Misc' group.


Folding in your changes (especially re-working Naming and 
Constants/Computed) gives me this.  I can already guess which one is 
going to make you wince, but perhaps you can come up with a better 
section name.


Arrays and Vectors
Designated Inits: Labeling elements of initializers.
Pointers to Arrays: Pointers to arrays with qualifiers work as 
expected.

Subscripting: Any array can be subscripted, even if not an lvalue.
Variable Length: Arrays whose length is computed at run time.
Vector Extensions: Using vector instructions through built-in 
functions.

Zero Length: Zero-length arrays.

Assignments
Cast to Union: Casting to union type from any member of the union.
Conditionals: Omitting the middle operand of a ‘?:’ expression.
Compound Literals: Compound literals give structures, unions or 
arrays as values.

Initializers: Non-constant initializers.
Statement Exprs: Putting statements and declarations inside 
expressions.


Attributes
Attribute Syntax: Formal syntax for attributes.
Function Attributes: Declaring that functions have no side effects, 
or that they can never return.

Label Attributes: Specifying attributes on labels.
Type Attributes: Specifying attributes of types.
Variable Attributes: Specifying attributes of variables.

BuiltIns
__atomic Builtins: Atomic built-in functions with memory model.
__sync Builtins: Legacy built-in functions for atomic memory access.
Cilk Plus Builtins: Built-in functions for the Cilk Plus language 
extension.
Integer Overflow Builtins: Built-in functions to perform 
arithmetics and arithmetic overflow checking.
Object Size

Re: organization of C Extensions in manual

2015-01-25 Thread David Wohlferd



On 1/22/2015 1:23 PM, Joseph Myers wrote:

On Thu, 22 Jan 2015, Jeff Law wrote:


  Inline: Defining inline functions (as fast as macros).

Doesn't seem to belong here.

Given that inline isn't really an extension anymore, one could argue this
isn't relevant anymore.

Well, we need to document -std=gnu89 / __gnu_inline__ attribute semantics.
And the fact that a C99 feature is accepted as an extension in C89 mode is
something that needs documenting.  Though for C extensions maybe the
documentation would better describe exceptions - extensions from newer C
standards that are *not* enabled in older standard modes, or that are only
enabled when you use alternative keywords such as __inline.  Then you
wouldn't need to go into the details e.g. of long long, just that it's an
extension accepted in C89 / C++98 mode.


I can't speak to whether removing this topic is a good idea or not. For 
the moment, I have moved volatile, inline and thread-local from the 
Attributes section to their own section (Specifiers).  If it is time to 
remove inline from this page (or just re-work it), someone else will 
need to pursue this.




(For C++ there may be more features than for C that require -std=c++11 to
enable them, so listing exceptions may not be the right approach for C++.)

Ways in which GNU C features go beyond the corresponding standard feature
do also need documenting.  (So VLA documentation needs to refer to VLAs in
structures, and to the interaction with alloca, and to parameter forward
declarations, even if the basic feature is described in terms of a C99
feature also supported in C89 mode and for C++ and there's less need to
describe the C99 semantics.)

inline asm clobbers

2015-03-11 Thread David Wohlferd


Why does gcc allow you to specify clobbers using numbers:

   asm ("" : : "r" (var) : "0"); // i386: clobbers eax

How is this better than using register names?

This makes even less sense when you realize that (apparently) the 
indices of registers aren't fixed.  Which means there is no way to know 
which register you have clobbered in order to use it in the template.


Having just seen someone trying (unsuccessfully) to use this, it seems 
like there is no practical way you can.


Which makes me wonder why it's there.  And whether it still should be.

dw

Re: inline asm clobbers

2015-03-11 Thread David Wohlferd



On 3/11/2015 4:19 PM, Ian Lance Taylor wrote:

On Wed, Mar 11, 2015 at 3:58 PM, David Wohlferd  wrote:

Why does gcc allow you to specify clobbers using numbers:

asm ("" : : "r" (var) : "0"); // i386: clobbers eax

How is this better than using register names?

This makes even less sense when you realize that (apparently) the indices of
registers aren't fixed.  Which means there is no way to know which register
you have clobbered in order to use it in the template.

Having just seen someone trying (unsuccessfully) to use this, it seems like
there is no practical way you can.

Which makes me wonder why it's there.  And whether it still should be.

I don't know why it works.  It should be consistent, though.  It's
simply GCC's internal hard register number, which doesn't normally
change.


The reason I believe the order can change is this comment from i386.h:

/* Order in which to allocate registers.  Each register must be
   listed once, even those in FIXED_REGISTERS.  List frame pointer
   late and fixed registers last.  Note that, in general, we prefer
   registers listed in CALL_USED_REGISTERS, keeping the others
   available for storage of persistent values.

   The ADJUST_REG_ALLOC_ORDER actually overwrite the order,
   so this is just empty initializer for array.  */

My attempts to follow ADJUST_REG_ALLOC_ORDER were not particularly 
successful, but it did take me to this comment in ira.c:


/* This is called every time when register related information is
   changed.  */


I would agree that one should avoid it.  I'd be wary of removing it
from GCC at this point since it might break working code.


I hear you on this.  Removing existing functionality is definitely 
risky, so I agree with your caution.  And of course changing anything is 
much less important if the register order here really is fixed.


On the other hand, what if my fear about register order changing is 
correct?  In that case people who assume (as you have) that they don't 
change are clobbering random registers.  Also, the fellow I saw trying 
to use this (incorrectly) assumed that "0" was referring to the same 
thing as the template's "%0."


If people don't (or if in fact it is impossible to) use this safely, 
"breaking" their code by forcing them to use register names might be the 
best way to fix it.


dw

Re: inline asm clobbers

2015-03-11 Thread David Wohlferd




On 3/11/2015 4:41 PM, paul_kon...@dell.com wrote:

On Mar 11, 2015, at 7:19 PM, Ian Lance Taylor  wrote:

On Wed, Mar 11, 2015 at 3:58 PM, David Wohlferd  wrote:

Why does gcc allow you to specify clobbers using numbers:

   asm ("" : : "r" (var) : "0"); // i386: clobbers eax

How is this better than using register names?

This makes even less sense when you realize that (apparently) the indices of
registers aren't fixed.  Which means there is no way to know which register
you have clobbered in order to use it in the template.

Having just seen someone trying (unsuccessfully) to use this, it seems like
there is no practical way you can.

Which makes me wonder why it's there.  And whether it still should be.

I don't know why it works.  It should be consistent, though.  It's
simply GCC's internal hard register number, which doesn't normally
change.

I would agree that one should avoid it.  I'd be wary of removing it
from GCC at this point since it might break working code.

It certainly would.  It’s not all that common, but I have seen this done in 
production code.  Come to think of it, this certainly makes sense in machines 
where some instructions act on fixed registers.


Really?  While I've seen much code that uses clobbers, I have never 
(until this week) see anyone attempt to clobber by index.  Since I'm 
basically an i386 guy, maybe this is a platform thing?  Do you have some 
examples/links?



Register names would be nice as an additional capability.


Every example I've ever seen uses register names.  Perhaps that what 
you've seen before?


dw

Re: inline asm clobbers

2015-03-12 Thread David Wohlferd


Resending due to bounced email.

On 3/11/2015 6:19 PM, Ian Lance Taylor wrote:

On Wed, Mar 11, 2015 at 5:51 PM, David Wohlferd  wrote:

The reason I believe the order can change is this comment from i386.h:

/* Order in which to allocate registers.  Each register must be
listed once, even those in FIXED_REGISTERS.  List frame pointer
late and fixed registers last.  Note that, in general, we prefer
registers listed in CALL_USED_REGISTERS, keeping the others
available for storage of persistent values.

The ADJUST_REG_ALLOC_ORDER actually overwrite the order,
so this is just empty initializer for array.  */

That is REG_ALLOC_ORDER.  The index that appears in an asm statement
is the hard register number.  REG_ALLOC_ORDER is an array holding hard
register numbers.  The hard register numbers can change in principle,
by changing the source code, but I actually can't recall that ever
happening.

To wrap this up:

Like Ian said, the order of registers here apparently never changes.  I 
read more into that comment than I should have.  For good luck, I 
experimented with -fomit-frame-pointer, -ffixed-, etc, and nothing has 
any impact here.  The list is the list.


In fact, it turns out you can use this same format with register variables:

register int x asm("3"); // i386: ebx

So while I find it ugly, unnecessarily complex, and lacking in 
self-documenting-ness, it is not inherently buggy the way I feared it 
was, so I can't think of any good arguments for pulling it out.


Thanks to Ian and Paul for straightening me out.

dw

Re: inline asm clobbers

2015-03-12 Thread David Wohlferd




On 3/12/2015 7:24 AM, paul_kon...@dell.com wrote:

On Mar 11, 2015, at 8:53 PM, David Wohlferd  wrote:


...
I would agree that one should avoid it.  I'd be wary of removing it
from GCC at this point since it might break working code.

It certainly would.  It’s not all that common, but I have seen this done in 
production code.  Come to think of it, this certainly makes sense in machines 
where some instructions act on fixed registers.

Really?  While I've seen much code that uses clobbers, I have never (until this 
week) see anyone attempt to clobber by index.  Since I'm basically an i386 guy, 
maybe this is a platform thing?  Do you have some examples/links?

The example I remember was not in open code.  It may have been cleaned up by 
now, but as supplied to us by the vendor, there were some bits of assembly code 
that needed a scratch register and used a fixed register (t0 == %8) for that 
purpose rather than having GCC deal with temporaries.  So there was a clobber 
with “8” in it.  Obviously there’s a better way in that instance, but if GCC 
had removed the feature before we found and cleaned up that code, we would have 
had a failure on our hands.

An example of hardwired registers I remember is some VAX instructions (string 
instructions).  You could write those by name, of course, but if you didn’t 
know that GCC supports names, you might just use numbers.  On machines like VAX 
where the register names are really just numbers (R0, R1, etc.) that isn’t such 
a strange thing to do.


Register names would be nice as an additional capability.

Every example I've ever seen uses register names.  Perhaps that what you've 
seen before?

No; I didn’t know that gcc supports register names.  The examples I had seen 
all use numbers.  Or more often, preprocessor symbols.  It may be that’s 
because the most common asm code I run into is MIPS coprocessor references, and 
while general registers may be known by name, coprocessor registers may not be. 
 Or it may just be a case of lack of awareness.

paul


Ahh.  So perhaps as I suspected: this is more commonly used on non-i386 
platforms.  So clearly removing this is out of the question.


This brings us to the question of documentation.  Right now the docs 
only refer to register names.  I expect it would be helpful for people 
to understand what it means when they come across code that uses 
indices.  A few words in the 'clobbers' section and the two Reg Vars 
sections would probably cover it.


Perhaps some variation of:


In addition to specifying registers by name, it is also possible to use 
a register index (ie "3" to refer to the 3rd register).  The list of 
registers and their order is platform specific.  See the REGISTER_NAMES 
defined for your platform in the gcc source.



I'm not excited about pointing people vaguely toward the source, but 
that's the only place I know to find this info.


I realize this may seem a bit redundant for people who are used to 
registers named R0,R1,R2..., but on the i386, the order is: 
ax,dx,cx,bx,si...


dw

Optimization breaks inline asm code w/ptrs

2017-08-12 Thread David Wohlferd


Environment:
gcc 6.1
compiling for 64bit i386
optimizations: -O2

Consider this simple bit of code (from 
https://stackoverflow.com/a/45656087/2189500):


#include 

int getStringLength(const char *pStr){

int len;

__asm__  (
"repne scasb\n\t"
"not %%ecx\n\t"
"dec %%ecx"
:"=c" (len), "+D"(pStr)
:"c"(-1), "a"(0)
);

return len;
}

int main()
{
   char buff[50] = "hello world";
   int a = getStringLength(buff);
   printf("%s: %d\n", buff, a);
}

This code works as expected and prints out 11.  Yay.

However, if you add "buff[4] = 0;" before the call to getStringLength, 
it STILL prints out 11 (when optimizations are enabled), when it should 
print 4.


I would expect this kind of behavior if the asm were in 'main.' But it 
has always been my understanding that function calls performed an 
implicit memory clobber.  The fact that this clobber goes away during 
inlining means that code can stop working any time the compiler makes a 
different decision about whether or not to inline a function.  Ouch.


And before somebody asks:  Adding "+m"(pStr) does not help.

The result is that (apparently) you can NEVER safely pass a buffer 
pointer to inline asm without using the memory clobber.  If this is 
true, I don't believe it is widely known.


Given how 'heavy' memory clobbers are, I would hope that only pointers 
that have 'escaped' the function would get flushed before a function 
call.  But not flushing *anything* seems very bad.


dw

Re: Optimization breaks inline asm code w/ptrs

2017-08-13 Thread David Wohlferd


On 8/12/2017 10:14 PM, Andrew Pinski wrote:

On Sat, Aug 12, 2017 at 10:08 PM, Andrew Pinski  wrote:

On Sat, Aug 12, 2017 at 9:21 PM, David Wohlferd  wrote:

Environment:
gcc 6.1
compiling for 64bit i386
optimizations: -O2

Consider this simple bit of code (from
https://stackoverflow.com/a/45656087/2189500):

#include 

int getStringLength(const char *pStr){

 int len;

 __asm__  (
 "repne scasb\n\t"
 "not %%ecx\n\t"
 "dec %%ecx"
 :"=c" (len), "+D"(pStr)
 :"c"(-1), "a"(0)
 );

 return len;
}

int main()
{
char buff[50] = "hello world";
int a = getStringLength(buff);
printf("%s: %d\n", buff, a);
}

This code works as expected and prints out 11.  Yay.

However, if you add "buff[4] = 0;" before the call to getStringLength, it
STILL prints out 11 (when optimizations are enabled), when it should print
4.

I would expect this kind of behavior if the asm were in 'main.' But it has
always been my understanding that function calls performed an implicit
memory clobber.  The fact that this clobber goes away during inlining means
that code can stop working any time the compiler makes a different decision
about whether or not to inline a function.  Ouch.

And before somebody asks:  Adding "+m"(pStr) does not help.

But does adding:
"+m"(*pStr)

Help?


"+m"(pStr)  Just says pStr variable changes, not what it points to.


Using "+m"(*pStr) gives a compile error (read-only location used as output).

Using "m"(*pStr) as an (unused) input parameter has no effect.

Using "m"(*pStr) as a used input parameter can work.

But let's not get off track here.  The purpose of this question isn't 
"how do I make this code work?"  I can already give you at least 3 
different work-arounds.  The goal here is to understand how function 
parameters can (safely) be passed to inline asm.



I should ask why are you using inline-asm for this?  strlen will have
the best optimized version for your processor anyways.


I don't actually care about this particular example.  It's just a 
question I was answering for a user on SO (see link above) when someone 
pointed out the subtle, but more serious problem.  So it just serves as 
a 'minimal' example to illustrate the issue I'm asking about, which is:



The result is that (apparently) you can NEVER safely pass a buffer pointer
to inline asm without using the memory clobber.  If this is true, I don't
believe it is widely known.


This seems like a "by definition" thing.  By definition when writing a 
function, you should assume one of these two things:


1) It is perfectly reasonable for a function to assume that on entry all 
pointers to memory that the function can access have been clobbered and 
can thus those pointers can be safely passed to inline asm.


2) It is never (never never never) reasonable for a function to assume 
that on entry (or indeed at any time) that a pointer to memory can be 
used to read that memory via inline asm unless either a memory 
constraint is used, or a memory clobber is included.


Observation suggests that despite my expectations, #1 is false, which 
implies #2 is correct.  I don't know that this is generally understood.  
Passing function parameters (including pointers) to inline asm is not 
uncommon.  If #2 is the rule, I expect I'm not the first person to break it.


Next question: Is this by design?  Just because it does behave this way 
doesn't mean that it should.


As I say, I expected that calling a function always does an implied 
clobber (at least for escaped pointers if not a complete clobber). And 
indeed, if getStringLength isn't inlined (ie via attribute), the code 
always works as expected.  If there is an implicit clobber for noinline, 
why wouldn't there be one for inline?  Wouldn't this be something 
inherent in the definition of "declaring and calling a function?"


And perhaps more significantly: This exact same code compiled with -m32 
instead of -m64 works correctly (not sure about other architectures).  
Could this behavior be an unintended consequence of some overly 
aggressive optimization related to 64bit inlining?


How SHOULD this work?


Given how 'heavy' memory clobbers are, I would hope that only pointers that
have 'escaped' the function would get flushed before a function call.  But
not flushing *anything* seems very bad.


dw

99 matches

Mail list logo