Using the asm suffix
As a followup to my update to the inline asm docs, I'm cleaning up the docs for 'Asm Labels.' The changes I want to make are pretty straightforward (attached; comments welcome). But then I came across this line of code (from https://github.com/rschmukler/cs537-p5/blob/master/xv6/proc.h#L38): extern struct proc *proc asm("%gs:4"); This x86 code says that 'proc' is located at an offset of 4 bytes from the gs register. There isn't any description of using asm like this in the current Asm Labels docs. But 'gs:4' isn't really a label. There also isn't any description of it in the Explicit Reg Vars section. But 'gs:4' isn't really a register either. So apparently using asm like this isn't documented anywhere. Which makes me wonder: Is this not doc'ed because using 'asm' like this isn't supported? Or is there a supported feature here that needs to be doc'ed? dw Index: extend.texi === --- extend.texi (revision 226751) +++ extend.texi (working copy) @@ -8367,8 +8367,14 @@ You can specify the name to be used in the assembler code for a C function or variable by writing the @code{asm} (or @code{__asm__}) -keyword after the declarator as follows: +keyword after the declarator. +It is up to you to make sure that the assembler names you choose do not +conflict with any other assembler symbols. +@subsubheading Assembler names for data: + +This sample shows how to specify the assembler name for data: + @smallexample int foo asm ("myfoo") = 2; @end smallexample @@ -8379,33 +8385,30 @@ @samp{_foo}. On systems where an underscore is normally prepended to the name of a C -function or variable, this feature allows you to define names for the +variable, this feature allows you to define names for the linker that do not start with an underscore. It does not make sense to use this feature with a non-static local variable since such variables do not have assembler names. If you are trying to put the variable in a particular register, see @ref{Explicit -Reg Vars}. 
GCC presently accepts such code with a warning, but will -probably be changed to issue an error, rather than a warning, in the -future. +Reg Vars}. -You cannot use @code{asm} in this way in a function @emph{definition}; but -you can get the same effect by writing a declaration for the function -before its definition and putting @code{asm} there, like this: +@subsubheading Assembler names for functions: +To specify the assember name for functions, write a declaration for the +function before its definition and put @code{asm} there, like this: + @smallexample -extern func () asm ("FUNC"); - -func (x, y) - int x, y; -/* @r{@dots{}} */ +extern int func (int x, int y) asm ("MYFUNC"); + +int func (int x, int y) +@{ + /* @r{@dots{}} */ @end smallexample -It is up to you to make sure that the assembler names you choose do not -conflict with any other assembler symbols. Also, you must not use a -register name; that would produce completely invalid assembler code. GCC -does not as yet have the ability to store static variables in registers. -Perhaps that will be added. +@noindent +This specifies that the name to be used for the function @code{func} in +the assembler code should be @code{MYFUNC}. @node Explicit Reg Vars @subsection Variables in Specified Registers
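To make the data case in the patch concrete, here is a minimal sketch in C. It assumes GCC (or a compatible compiler) on an ELF target such as x86-64 Linux, where C names do not get a leading underscore. The second declaration and the helpers `read_myfoo` and `check_data_label` are hypothetical names added purely for illustration; they are not part of the patch.

```c
#include <assert.h>

/* From the patch: the C variable 'foo' is emitted under the
   assembler name 'myfoo' instead of 'foo'.  The __asm__ spelling
   is used so the sketch also builds under strict -std= modes. */
int foo __asm__ ("myfoo") = 2;

/* Assumption (illustration only): on a target that does not
   prepend an underscore, a plain declaration of 'myfoo' resolves
   to that same symbol at link time. */
extern int myfoo;

/* Hypothetical helper: reads the object through its assembler name. */
int read_myfoo (void)
{
  return myfoo;
}

/* Hypothetical self-check: both names should see the initializer. */
void check_data_label (void)
{
  assert (foo == 2);
  assert (read_myfoo () == 2);
}
```

On a target that does prepend underscores, the plain `extern int myfoo;` would refer to `_myfoo` instead, so this sketch would probably fail to link there. That difference is what the "names for the linker that do not start with an underscore" sentence in the patch is about.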
Re: Using the asm suffix
There isn't any description of using asm like this in the current Asm Labels docs. And there shouldn't be. It's a hack. Ok, good. After experimenting with this, I wasn't looking forward to trying to describe what did and didn't work. dw
Re: Using the asm suffix
Thank you for the review and comments. On 8/17/2015 3:41 AM, Segher Boessenkool wrote: On Sun, Aug 16, 2015 at 06:33:40PM -0700, David Wohlferd wrote: On systems where an underscore is normally prepended to the name of a C -function or variable, this feature allows you to define names for the +variable, this feature allows you to define names for the linker that do not start with an underscore. Why remove this? This doc section (Controlling Names Used in Assembler Code) describes how the asm suffix affects both data and functions. However, it jumbles the two descriptions together. My intent here is to break this clearly into two @subsubheadings: 'Assembler names for data' and 'Assembler names for functions'. Since data is the first section, I removed the word 'function' here. It does not make sense to use this feature with a non-static local variable since such variables do not have assembler names. If you are trying to put the variable in a particular register, see @ref{Explicit -Reg Vars}. GCC presently accepts such code with a warning, but will -probably be changed to issue an error, rather than a warning, in the -future. +Reg Vars}. And this? Vague statements about possible changes that may or may not ever be written are not helpful in docs. In this case the statement is particularly unhelpful since even the warning appears to be gone. +To specify the assember name for functions, write a declaration for the ^ typo Some typos are more embarrassing than others. +function before its definition and put @code{asm} there, like this: + @smallexample -extern func () asm ("FUNC"); - -func (x, y) - int x, y; -/* @r{@dots{}} */ +extern int func (int x, int y) asm ("MYFUNC"); + +int func (int x, int y) +@{ + /* @r{@dots{}} */ @end smallexample If you want to modernise the code, drop "extern" as well? :-) Ok. -Also, you must not use a -register name; that would produce completely invalid assembler code. 
GCC -does not as yet have the ability to store static variables in registers. -Perhaps that will be added. And why remove these? Again with the vague statements about possible future changes. Also, whether or not static variables can be stored in registers has nothing to do with Asm Labels. If this is still true, it belongs in Explicit Reg Vars. Update attached. dw Index: extend.texi === --- extend.texi (revision 226751) +++ extend.texi (working copy) @@ -8367,8 +8367,14 @@ You can specify the name to be used in the assembler code for a C function or variable by writing the @code{asm} (or @code{__asm__}) -keyword after the declarator as follows: +keyword after the declarator. +It is up to you to make sure that the assembler names you choose do not +conflict with any other assembler symbols. +@subsubheading Assembler names for data: + +This sample shows how to specify the assembler name for data: + @smallexample int foo asm ("myfoo") = 2; @end smallexample @@ -8379,33 +8385,30 @@ @samp{_foo}. On systems where an underscore is normally prepended to the name of a C -function or variable, this feature allows you to define names for the +variable, this feature allows you to define names for the linker that do not start with an underscore. It does not make sense to use this feature with a non-static local variable since such variables do not have assembler names. If you are trying to put the variable in a particular register, see @ref{Explicit -Reg Vars}. GCC presently accepts such code with a warning, but will -probably be changed to issue an error, rather than a warning, in the -future. +Reg Vars}. 
-You cannot use @code{asm} in this way in a function @emph{definition}; but -you can get the same effect by writing a declaration for the function -before its definition and putting @code{asm} there, like this: +@subsubheading Assembler names for functions: +To specify the assembler name for functions, write a declaration for the +function before its definition and put @code{asm} there, like this: + @smallexample -extern func () asm ("FUNC"); - -func (x, y) - int x, y; -/* @r{@dots{}} */ +int func (int x, int y) asm ("MYFUNC"); + +int func (int x, int y) +@{ + /* @r{@dots{}} */ @end smallexample -It is up to you to make sure that the assembler names you choose do not -conflict with any other assembler symbols. Also, you must not use a -register name; that would produce completely invalid assembler code. GCC -does not as yet have the ability to store static variables in registers. -Perhaps that will be added. +@noindent +This specifies that the name to be used for the function @code{func} in +the assembler code should be @code{MYFUNC}. @node Explicit Reg Vars @subsection Variables in Specified Registers
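The function case from the revised patch can be sketched the same way. This is only an illustration under stated assumptions: GCC or a compatible compiler, and an ELF target without underscore prefixing. The function body, the extra declaration of `MYFUNC`, and the helpers `call_by_label` and `check_function_label` are hypothetical additions, not part of the patch.

```c
#include <assert.h>

/* Declaration carrying the assembler name, as in the revised patch
   (spelled __asm__ so it also compiles under strict -std= modes). */
int func (int x, int y) __asm__ ("MYFUNC");

/* The definition itself carries no asm suffix; it picks up the
   name from the earlier declaration. */
int func (int x, int y)
{
  return x + y;   /* body invented for illustration */
}

/* Assumption (illustration only): with no underscore prefixing, a
   plain declaration of 'MYFUNC' refers to the same symbol. */
extern int MYFUNC (int x, int y);

/* Hypothetical helper: calls the function through its assembler name. */
int call_by_label (int x, int y)
{
  return MYFUNC (x, y);
}

/* Hypothetical self-check: both spellings reach the same code. */
void check_function_label (void)
{
  assert (func (2, 3) == 5);
  assert (call_by_label (2, 3) == 5);
}
```

The point of the declaration-then-definition shape is the one the patch makes: the asm suffix goes on a declaration that precedes the definition, not on the definition itself.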
Re: Using the asm suffix
(Resending due to email glitch) Thanks again for your comments. On 8/18/2015 2:23 AM, Segher Boessenkool wrote: On Mon, Aug 17, 2015 at 09:55:48PM -0700, David Wohlferd wrote: On systems where an underscore is normally prepended to the name of a C -function or variable, this feature allows you to define names for the +variable, this feature allows you to define names for the linker that do not start with an underscore. Why remove this? This doc section (Controlling Names Used in Assembler Code) describes how the asm suffix affects both data and functions. However, it jumbles the two descriptions together. Probably because they are the same thing... My intent here is to break this clearly into two @subsubheadings: 'Assembler names for data' and 'Assembler names for functions'. Since data is the first section, I removed the word 'function' here. I missed that, sorry. Or, did you forget to add the same text to the "function" description? This patch would be much easier to review if you did one change per patch. I'm not sure what 'one change' might mean in this context. This topic is all a single texi node (ie it all ends up on a single html page). This page isn't that big. Would looking at the output help clarify this: Current: https://gcc.gnu.org/onlinedocs/gcc/Asm-Labels.html Proposed: http://limegreensocks.com/gcc/Asm-Labels.html It does not make sense to use this feature with a non-static local variable since such variables do not have assembler names. If you are trying to put the variable in a particular register, see @ref{Explicit -Reg Vars}. GCC presently accepts such code with a warning, but will -probably be changed to issue an error, rather than a warning, in the -future. +Reg Vars}. And this? Vague statements about possible changes that may or may not ever be written are not helpful in docs. In this case the statement is particularly unhelpful since even the warning appears to be gone. 
I don't agree it is a vague statement about possible future changes; it is more like a statement of intent. Saying this will "probably" change at some (unspecified) time "in the future" seems the very definition of vague. Especially when you consider that the docs have been threatening this change for over 15 years now. It tells the reader "don't write code like this". If this is intended just for emphasis, how about replacing the existing text ("It does not make sense to use this feature with a non-static local variable since such variables do not have assembler names.") with "Do not use this feature with a non-static local variable." or maybe "It is not supported to use this feature with a non-static local variable since such variables do not have assembler names." Saying "It does not make sense" to do something doesn't mean the same thing (to me) as "Do not do this" or "It is not supported to..." And the warning is still there ("ignoring asm-specifier for non-static local variable"). Huh. I was unable to get gcc to produce this warning. Is there a trick? int main() { int a asm("asdf"); a = 13; return a; } I have tried 4.9.2 and 5, with -O2 and -O0, and I'm getting no warning (or error) messages. -Also, you must not use a -register name; that would produce completely invalid assembler code. GCC -does not as yet have the ability to store static variables in registers. -Perhaps that will be added. And why remove these? Again with the vague statements about possible future changes. Also, whether or not static variables can be stored in registers has nothing to do with Asm Labels. If this is still true, it belongs in Explicit Reg Vars. The first part ("must not use a register name") is an important warning. Clarifying this is a good idea. Although limiting it to only saying "don't use register names" seems a little, well, limiting. Who knows what kind of offsets or asm qualifiers they might try to cram in here? 
How about: "Only label names valid for your assembler are permitted within the asm." This would go right after the warning about conflicts. The second part (about statics) might well be better moved, but it should not be _re_moved just like that! And it is still true (gives an error, "multiple storage classes"). I have already started work on the changes I think are needed for Explicit Reg Vars (the last section of gcc docs I'm planning on doing). But I want to finish the (relatively easy) Asm Labels stuff first. Should I put this change in there? Or would someone just tell me all the Asm Labels stuff should go in its own patch? Also, I'm not seeing the multiple storage classes message either: int main() { static int a asm("asdf"); a = 5; return a; } dw
Re: Using the asm suffix
[snip] how about replacing the existing text ("It does not make sense to use this feature with a non-static local variable since such variables do not have assembler names.") with "Do not use this feature with a non-static local variable." or maybe "It is not supported to use this feature with a non-static local variable since such variables do not have assembler names." "You cannot use this feature ..." etc.? Keep the part about not having assembler names, it is useful. Due to the quirks of the English language, I'm not sure 'cannot' is the right word here. More correct would be 'cannot reliably' but I don't want to be that wishy-washy. And I'm a little iffy about the 'since such variables do not have assembler names,' as it seemed a bit bold to make assertions about the implementation details for all assemblers (past, present and future) for all platforms. But you are right, it does convey a bit of the 'why' for this limitation, so keeping it is a good idea. How about: "gcc does not support using this feature with a non-static local variable since typically such variables do not have assembler names." BTW, the trick for getting the "ignoring asm-specifier for non-static local variable" message was renaming my file from sta5.cpp to sta5.c. Seems like this should apply to both, but whatever. The first part ("must not use a register name") is an important warning. Clarifying this is a good idea. Although limiting it to only saying "don't use register names" seems a little, well, limiting. Who knows what kind of offsets or asm qualifiers they might try to cram in here? Register names is the common case to hurt you... "r0" etc. ;-) But as we have seen (%gs:4), people are willing to try other things. And rather than try to list all the things that don't work (register, registers with offsets, etc), I'm hoping we can find a way to specify the one thing that is supported. How about: "Only label names valid for your assembler are permitted within the asm." 
The assembler calls it "symbol names" (labels end in a colon). But it won't help: e.g. "r0" is a perfectly fine name for the assembler, too! It should say something like "if the string you put here is seen as something else than a symbol name by the assembler, anywhere the compiler puts it, you're on your own", but that is pretty vague as well. Ok, how about "Only symbol names that define labels are permitted within the asm." A bit awkward, but I believe it conveys the intent. You forgot to make it "register", so it is not a register variable. Ahh, true. I suppose we could say something about don't use 'register' with 'asm labels', but it doesn't seem worth the effort. dw
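To put the local-variable cases from this exchange side by side, here is a small sketch in C. It assumes GCC or a compatible compiler. The function `bump`, the assembler name `bump_hits`, and the helper `check_bump` are hypothetical names chosen for illustration.

```c
#include <assert.h>

/* A static local has static storage, so it does get an assembler
   name; here it is emitted as 'bump_hits' rather than a
   compiler-generated name. */
int bump (void)
{
  static int hits __asm__ ("bump_hits");
  hits += 1;
  return hits;
}

/* For contrast, the non-static form tried in the thread:

     int a asm ("asdf");

   has no assembler name to rename.  The thread reports that GCC
   prints "ignoring asm-specifier for non-static local variable"
   for it when the file is compiled as C, but not as C++. */

/* Hypothetical self-check: the counter keeps its value between
   calls, as any static local does. */
void check_bump (void)
{
  int before = bump ();
  assert (bump () == before + 1);
}
```

The asm suffix on the static local only changes the symbol name; it does not place the variable in a register. That needs the `register` keyword, as noted above.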
Re: Using the asm suffix
In order for the doc maintainers to approve this patch, I need to have someone sign off on the technical accuracy. Now that I have included the points we have discussed (attached), hopefully we are there. Original text: https://gcc.gnu.org/onlinedocs/gcc/Asm-Labels.html Proposed text: http://limegreensocks.com/gcc/Asm-Labels.html Still pending is the line I removed about 'static variables in registers' that belongs in the Reg Vars section. I have additional changes I want to make to Reg Vars sections, so once this patch is accepted, I'll post that work. dw Index: extend.texi === --- extend.texi (revision 226751) +++ extend.texi (working copy) @@ -8367,8 +8367,14 @@ You can specify the name to be used in the assembler code for a C function or variable by writing the @code{asm} (or @code{__asm__}) -keyword after the declarator as follows: +keyword after the declarator. +It is up to you to make sure that the assembler names you choose do not +conflict with any other assembler symbols, or reference registers. +@subsubheading Assembler names for data: + +This sample shows how to specify the assembler name for data: + @smallexample int foo asm ("myfoo") = 2; @end smallexample @@ -8379,33 +8385,30 @@ @samp{_foo}. On systems where an underscore is normally prepended to the name of a C -function or variable, this feature allows you to define names for the +variable, this feature allows you to define names for the linker that do not start with an underscore. -It does not make sense to use this feature with a non-static local -variable since such variables do not have assembler names. If you are -trying to put the variable in a particular register, see @ref{Explicit -Reg Vars}. GCC presently accepts such code with a warning, but will -probably be changed to issue an error, rather than a warning, in the -future. +GCC does not support using this feature with a non-static local variable +since such variables do not have assembler names. 
If you are +trying to put the variable in a particular register, see +@ref{Explicit Reg Vars}. -You cannot use @code{asm} in this way in a function @emph{definition}; but -you can get the same effect by writing a declaration for the function -before its definition and putting @code{asm} there, like this: +@subsubheading Assembler names for functions: +To specify the assembler name for functions, write a declaration for the +function before its definition and put @code{asm} there, like this: + @smallexample -extern func () asm ("FUNC"); - -func (x, y) - int x, y; -/* @r{@dots{}} */ +int func (int x, int y) asm ("MYFUNC"); + +int func (int x, int y) +@{ + /* @r{@dots{}} */ @end smallexample -It is up to you to make sure that the assembler names you choose do not -conflict with any other assembler symbols. Also, you must not use a -register name; that would produce completely invalid assembler code. GCC -does not as yet have the ability to store static variables in registers. -Perhaps that will be added. +@noindent +This specifies that the name to be used for the function @code{func} in +the assembler code should be @code{MYFUNC}. @node Explicit Reg Vars @subsection Variables in Specified Registers
Proposed doc update for Explicit Reg Vars 1/3
Having updated the docs for Basic asm, Extended asm, and Asm Labels, I am now sending my patches for the last of the inline asm sections: Explicit Reg Vars. My first attempt to update this got postponed (see https://gcc.gnu.org/ml/gcc-patches/2014-06/msg02369.html). This patch addresses the previous concerns. Note that there is nothing actually "wrong" with the existing text. It does not provide inaccurate information or miss key details. The problem is that (from a compiler user's point of view) the text is hard to follow. It reads as though people have just dropped in new text as it occurred to them, without ever going back to put things in context, add formatting, etc. Using Explicit Register variables is a feature that doesn't get a lot of attention (why should it?), which probably explains why no one has taken time to polish it up. But now it has annoyed someone who is willing to work on it. There are 3 web pages associated with Explicit Reg Vars (Menu, Global, and Local), so I am sending this as 3 patches. The first patch (attached) is the menu page. The current menu page has a couple of flaws: 1) It tries to condense the entire contents of the other 2 pages into a single paragraph each. This loses a lot of detail and nuance, as well as introducing unnecessary duplication of information with the subpages. 2) What the heck is a "reg var"? Why can't we just call this a "register variable"? Instead of trying to fix/clarify/update the duplicated information, this patch removes it, then provides enough information to differentiate the two types of register variables, and finally directs people to the appropriate subpages which already contain all the information from this page (and more). It also changes the section names (and refs within the docs) to avoid the unnecessary abbreviations. As part of this, it also uses @anchor to ensure that any external links to the old names will still resolve correctly. Is this the standard? Or do we just let them fail? 
For people who find the HTML easier to review: Here's the current text: https://gcc.gnu.org/onlinedocs/gcc/Explicit-Reg-Vars.html And here's the new: http://limegreensocks.com/gcc/Explicit-Register-Variables.html dw Index: extend.texi === --- extend.texi (revision 228690) +++ extend.texi (working copy) @@ -7254,7 +7254,8 @@ * Extended Asm:: Inline assembler with operands. * Constraints::Constraints for @code{asm} operands * Asm Labels:: Specifying the assembler name to use for a C symbol. -* Explicit Reg Vars:: Defining variables residing in specified registers. +* Explicit Register Variables:: Defining variables residing in specified + registers. * Size of an asm:: How GCC calculates the size of an @code{asm} block. @end menu @@ -7774,7 +7775,8 @@ the optimizers to produce the best possible code. If you must use a specific register, but your Machine Constraints do not provide sufficient control to select the specific register you want, -local register variables may provide a solution (@pxref{Local Reg Vars}). +local register variables may provide a solution (@pxref{Local Register +Variables}). @item cvariablename Specifies a C lvalue expression to hold the output, typically a variable name. @@ -8004,7 +8006,8 @@ the compiler chooses the most efficient one based on the current context. If you must use a specific register, but your Machine Constraints do not provide sufficient control to select the specific register you want, -local register variables may provide a solution (@pxref{Local Reg Vars}). +local register variables may provide a solution (@pxref{Local Register +Variables}). Input constraints can also be digits (for example, @code{"0"}). This indicates that the specified input must be in the same place as the output constraint @@ -8086,7 +8089,8 @@ Clobber descriptions may not in any way overlap with an input or output operand. 
For example, you may not have an operand describing a register class with one member when listing that register in the clobber list. Variables -declared to live in specific registers (@pxref{Explicit Reg Vars}) and used +declared to live in specific registers (@pxref{Explicit Register +Variables}) and used as @code{asm} input or output operands must have no part mentioned in the clobber description. In particular, there is no way to specify that input operands get modified without also specifying them as output operands. @@ -8442,7 +8446,7 @@ GCC does not support using this feature with a non-static local variable since such variables do not have assembler names. If you are trying to put the variable in a particular register, see -@ref{Explicit Reg Vars}. +@ref{Explicit Register Variables}. @subsubheading Assembler names for functions: @@ -8461,50 +8465,34 @@ This specifies that the name to be used f
Proposed doc update for Explicit Reg Vars 2/3
Patch 2/3 is the update for the Global Register Variables page (attached). Reviewing this patch is going to be difficult. It's a lot easier to review a patch that just has a few lines of text being added. However, this leads to 'chunky' docs with a bunch of disjointed paragraphs (which is what this page has now). Eventually, someone needs to take the whole thing, and rearrange, reformat and reorganize it. Which is what this patch does. So there isn't any new content (except the part recently removed from asm labels about statics), just the old content rephrased and reformatted. There is, however, some decidedly unhelpful text that got removed (paraphrasing): - There are unsourced, unsubstantiated reports that on some platforms, certain things might or might not work. - Eventually the compiler may work differently than it does now. Argh! Who thinks this stuff is helpful? And while it was painful, I kept in the part that says "when specifying register names, make sure they are really register names." As for the review, perhaps reading the html is easier than reading the patch? Here's the current text: https://gcc.gnu.org/onlinedocs/gcc/Global-Reg-Vars.html And here's the new: http://limegreensocks.com/gcc/Global-Register-Variables.html dw Index: extend.texi === --- extend.texi (revision 228690) +++ extend.texi (working copy) @@ -8506,7 +8506,8 @@ @cindex global register variables @cindex registers, global variables in -You can define a global register variable in GNU C like this: +You can define a global register variable and associate it with a specified register +like this: @smallexample register int *foo asm ("a5"); @@ -8513,62 +8514,77 @@ @end smallexample @noindent -Here @code{a5} is the name of the register that should be used. Choose a -register that is normally saved and restored by function calls on your -machine, so that library routines will not clobber it. +Here @code{a5} is the name of the register that should be used. 
Note that +this is the same syntax used for defining local register variables, but for +a global variable the declaration appears outside a function. The +@code{register} keyword is required, and cannot be combined with +@code{static}. The register name must be a valid register name for the +target platform. -Naturally the register name is CPU-dependent, so you need to -conditionalize your program according to CPU type. The register -@code{a5} is a good choice on a 68000 for a variable of pointer -type. On machines with register windows, be sure to choose a ``global'' -register that is not affected magically by the function call mechanism. +Registers can be a limited resource on some systems and allowing the +compiler to manage their usage usually results in the best code. However, +under special circumstances it can make sense to reserve some globally. +For example this may be useful in programs such as programming language +interpreters that have a couple of global variables that are accessed +very often. -In addition, different operating systems on the same CPU may differ in how they -name the registers; then you need additional conditionals. For -example, some 68000 operating systems call this register @code{%a5}. +After defining a global register variable, for the duration of +the current compilation: -Eventually there may be a way of asking the compiler to choose a register -automatically, but first we need to figure out how it should choose and -how to enable you to guide the choice. No solution is evident. +@itemize @bullet +@item The register is reserved entirely for this use, and will not be +allocated for any other purpose. +@item The register is not saved and restored by any functions. +@item Stores into this register are never deleted even if they appear to be +dead, but references may be deleted, moved or simplified. 
+@end itemize -Defining a global register variable in a certain register reserves that -register entirely for this use, at least within the current compilation. -The register is not allocated for any other purpose in the functions -in the current compilation, and is not saved and restored by -these functions. Stores into this register are never deleted even if they -appear to be dead, but references may be deleted or moved or -simplified. +Note that these points @emph{only} apply to code that is compiled with the +definition. The behavior of code that is merely linked in (for example +code from libraries) is not affected. -It is not safe to access the global register variables from signal -handlers, or from more than one thread of control, because the system -library routines may temporarily use the register for other things (unless -you recompile them specially for the task at hand). +If you want to recompile source files that do not actually use your global +register variable so they do not use the specified reg
Proposed doc update for Explicit Reg Vars 3/3
Patch 3/3 is the update for the Local Register Variables page (attached). This patch starts with a question. Looking at bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64951 (register variable with template function) is this a bug that will be fixed? Or a limitation that should be doc'ed? Both the current docs and the patch ignore this bug. As with patch #2, this is primarily about reformatting/reorganizing. Although it also adds the limitation from asm labels re statics. I was hoping to modify the text to say that local register variables can "only" be used to call Extended asm. This would greatly simplify this section. But there has been pushback on this (despite the fact that no one has really suggested any other reasonable use). So I have softened this, and listed things from the existing docs that are explicitly not supported. For people who find the HTML easier to review: Here's the current text: https://gcc.gnu.org/onlinedocs/gcc/Local-Reg-Vars.html And here's the new: http://limegreensocks.com/gcc/Local-Register-Variables.html dw Index: extend.texi === --- extend.texi (revision 228690) +++ extend.texi (working copy) @@ -8604,7 +8604,7 @@ @cindex specifying registers for local variables @cindex registers for local variables -You can define a local register variable with a specified register +You can define a local register variable and associate it with a specified register like this: @smallexample @@ -8614,45 +8614,19 @@ @noindent Here @code{a5} is the name of the register that should be used. Note that this is the same syntax used for defining global register -variables, but for a local variable it appears within a function. +variables, but for a local variable the declaration appears within a +function. The @code{register} keyword is required, and cannot be combined +with @code{static}. The register name must be a valid register name for the +target platform. 
-Naturally the register name is CPU-dependent, but this is not a -problem, since specific registers are most often useful with explicit -assembler instructions (@pxref{Extended Asm}). Both of these things -generally require that you conditionalize your program according to -CPU type. - -In addition, operating systems on one type of CPU may differ in how they -name the registers; then you need additional conditionals. For -example, some 68000 operating systems call this register @code{%a5}. - -Defining such a register variable does not reserve the register; it -remains available for other uses in places where flow control determines -the variable's value is not live. - -This option does not guarantee that GCC generates code that has -this variable in the register you specify at all times. You may not -code an explicit reference to this register in the assembler -instruction template part of an @code{asm} statement and assume it -always refers to this variable. -However, using the variable as an input or output operand to the @code{asm} -guarantees that the specified register is used for that operand. -@xref{Extended Asm}, for more information. - -Stores into local register variables may be deleted when they appear to be dead -according to dataflow analysis. References to local register variables may -be deleted or moved or simplified. - -As with global register variables, it is recommended that you choose a -register that is normally saved and restored by function calls on -your machine, so that library routines will not clobber it. - -Sometimes when writing inline @code{asm} code, you need to make an operand be a -specific register, but there's no matching constraint letter for that -register. To force the operand into that register, create a local variable +The intended use for this feature is to specify registers +for input and output operands when calling Extended @code{asm} +(@pxref{Extended Asm}). 
This may be necessary if the constraints for a +particular machine don't provide sufficient control to select the desired +register. To force an operand into a register, create a local variable and specify the register in the variable's declaration. Then use the local -variable for the asm operand and specify any constraint letter that matches -the register: +variable for the @code{asm} operand and specify any constraint letter that +matches the register: @smallexample register int *p1 asm ("r0") = @dots{}; @@ -8661,11 +8635,11 @@ asm ("sysint" : "=r" (result) : "0" (p1), "r" (p2)); @end smallexample -@emph{Warning:} In the above example, be aware that a register (for example r0) can be -call-clobbered by subsequent code, including function calls and library calls -for arithmetic operators on other variables (for example the initialization -of p2). In this case, use temporary variables for expressions between the -register assignments: +@emph{Warning:} In the above example, be aware that a register
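For readers following the patch text, the technique it describes (tying an asm operand to a particular register through a local register variable, and evaluating other expressions into temporaries first so nothing clobbers the register in between) might look roughly like the sketch below. It assumes x86-64 and a GCC-compatible compiler; the names sum_via_pinned_regs and twice, and the choice of r12/r13, are illustrative and do not come from the patch. Other targets take a plain C fallback.

```c
#include <stdint.h>

/* Hypothetical helper standing in for "expressions between the register
   assignments" that might involve a function call. */
static int64_t twice(int64_t v)
{
    return 2 * v;
}

/* Illustrative only: returns 2*a + 2*b. */
int64_t sum_via_pinned_regs(int64_t a, int64_t b)
{
    /* Evaluate anything that may call a function first, into ordinary
       temporaries, so no call sits between the register assignments. */
    int64_t t1 = twice(a);
    int64_t t2 = twice(b);
#if defined(__x86_64__) && defined(__GNUC__)
    register int64_t p1 __asm__("r12") = t1;
    register int64_t p2 __asm__("r13") = t2;
    /* Using the variables as operands is what ties them to r12 and r13;
       the "r" constraint matches either register. */
    __asm__("addq %1, %0" : "+r"(p1) : "r"(p2) : "cc");
    return p1;
#else
    return t1 + t2;   /* assumption: portable fallback on other targets */
#endif
}
```

Nothing here relies on r12 or r13 holding the values outside the asm statement itself, which is in line with the use the patch describes as intended.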
Re: Proposed doc update for Explicit Reg Vars 1/3
About the patches themselves... Hard to review again, sigh... I know, and I'm sorry. I just can't see any way to completely re-org the text without the patch becoming a nightmare. I was hoping the html links would make that easier, but I guess not. On the plus side, Explicit reg vars is the last section I plan to do this to. I appreciate you taking the time. The current menu page has a couple of flaws: 1) It tries to condense the entire contents of the other 2 pages into a single paragraph each. This loses a lot of detail and nuance, as well as introducing unnecessary duplication of information with the subpages. It provides an introduction, which is quite helpful in this case. Without some blurb the two-entry menu looks silly too. Can you move the intro to the separate pages instead of losing it altogether? I did keep a small amount of intro on the menu page. If you feel there's more that we should keep, I'm certainly willing to re-visit this. Perhaps after we resolve the local/global stuff so we know what we really want to say. Jeff has already checked in the patch for this menu page (but not the Local or Global subpages), so you can see what I've left here on the gcc website (https://gcc.gnu.org/onlinedocs/gcc/Explicit-Register-Variables.html). Two spaces after a full stop Oops. Again. You can probably just automatically add this to every review you send me. It's just so automatic for me to type this way. In my (feeble) defense, the original text had this too. Lastly, if some external website is linking to "Explicit Reg Vars", what do we want to have happen now that we have renamed that to "Explicit Register Variables"? Should the link just fail? I've added the @anchor so it doesn't, but I'm not sure that's the standard for gcc. Who should I be asking? dw
Re: Proposed doc update for Explicit Reg Vars 2/3
- Eventually the compiler may work differently than it does now. That is helpful. It's a way of signaling that things may change and that depending on the precise syntax and semantics may be unwise. From time to time, particularly with GCC extensions, it has been necessary to declare certain usage as invalid or to change the implementation in significantly visible ways. We never like doing that, but it is sometimes unavoidable. As a general rule, the more an extension exposes how GCC works internally, the more likely it has been to need significant changes over time. (asms being the easiest example to cite). So please consider keeping something which signals the semantics might change. The quote in question is: Eventually there may be a way of asking the compiler to choose a register automatically, but first we need to figure out how it should choose and how to enable you to guide the choice. No solution is evident. I struggle (and fail) to find anything in this text worth keeping. So for the example, "a5" is a particularly bad choice these days on the m68k as it's the PIC register. It may be advisable to just use r<N> for some value of N and try to be processor agnostic here. "a5" is what the original text used, but I have no preference. I've changed it to r12, which works on my x64. dw
Re: Proposed doc update for Explicit Reg Vars 2/3
Line too long. I know quite a bit of doc does that, but that's no excuse :-) Reduced to < 79. +Registers can be a limited resource on some systems and allowing the They are a limited resource on almost all systems. "Scarce resource"? "Scarce" it is. I've left the rest alone for the moment, but how would you feel about: "Registers are a scarce resource on most systems and allowing the" +After defining a global register variable, for the duration of +the current compilation: It's probably better to say "for the current compilation unit"? There now is LTO and whatnot. Changed to "for the current compilation unit". +All global register variable declarations must precede all function +definitions. If such a declaration appears after function definitions, +the declaration would be too late to prevent the register from being used +for other purposes in the preceding functions. This isn't true anymore, not even with -fno-toplevel-reorder or -O0. I'm going to interpret this as a recommendation to remove this text, rather than just an FYI. Done. +When selecting a register, choose one that is normally saved and +restored by function calls on your machine. This ensures that code +which is unaware of this reservation (such as library routines) will +restore it before returning. The compiler also warns, possibly for the unlikely case that the user has not read the documentation. I'm going to interpret this comment as just an FYI, and NOT something that should be added to the docs. I've attached the new patch for the Globals. For review purposes, you can just diff it with the previous one. Viewed that way, the changes are pretty minor. dw Index: extend.texi === --- extend.texi (revision 229108) +++ extend.texi (working copy) @@ -8473,7 +8473,7 @@ GNU C allows you to associate specific hardware registers with C variables. In almost all cases, allowing the compiler to assign registers produces the best code. 
However under certain unusual -circumstances, more precise control over the variable storage is +circumstances, more precise control over the variable storage is required. Both global and local variables can be associated with a register. The @@ -8492,69 +8492,80 @@ @cindex registers, global variables in @cindex registers, global allocation -You can define a global register variable in GNU C like this: +You can define a global register variable and associate it with a specified +register like this: @smallexample -register int *foo asm ("a5"); +register int *foo asm ("r12"); @end smallexample @noindent -Here @code{a5} is the name of the register that should be used. Choose a -register that is normally saved and restored by function calls on your -machine, so that library routines will not clobber it. +Here @code{r12} is the name of the register that should be used. Note that +this is the same syntax used for defining local register variables, but for +a global variable the declaration appears outside a function. The +@code{register} keyword is required, and cannot be combined with +@code{static}. The register name must be a valid register name for the +target platform. -Naturally the register name is CPU-dependent, so you need to -conditionalize your program according to CPU type. The register -@code{a5} is a good choice on a 68000 for a variable of pointer -type. On machines with register windows, be sure to choose a ``global'' -register that is not affected magically by the function call mechanism. +Registers can be a scarce resource on some systems and allowing the +compiler to manage their usage usually results in the best code. However, +under special circumstances it can make sense to reserve some globally. +For example this may be useful in programs such as programming language +interpreters that have a couple of global variables that are accessed +very often. 
-In addition, different operating systems on the same CPU may differ in how they -name the registers; then you need additional conditionals. For -example, some 68000 operating systems call this register @code{%a5}. +After defining a global register variable, for the current compilation +unit: -Eventually there may be a way of asking the compiler to choose a register -automatically, but first we need to figure out how it should choose and -how to enable you to guide the choice. No solution is evident. +@itemize @bullet +@item The register is reserved entirely for this use, and will not be +allocated for any other purpose. +@item The register is not saved and restored by any functions. +@item Stores into this register are never deleted even if they appear to be +dead, but references may be deleted, moved or simplified. +@end itemize -Defining a global register variable in a certain register reserves that -register entirely for this use, at least within the current compilation. -The register is not alloca
Re: Proposed doc update for Explicit Reg Vars 3/3
I'm trying to sum up what was discussed here. What I'm hearing is (quoting Jeff): > the technical reality is I can't see a use outside the extended asm. Andrew has discussed some other uses, but as Jeff observed: > Given the way the optimizers and register allocation work, > I don't think we can make guarantees around [Andrew's] use > of the feature. It happens to still work and may work > forever, but I'm not going to set it in stone. If the only usage we are prepared to "set in stone" is extended asm, then it follows that that is the only supported usage. Areas of "non-guaranteed behavior" are things the docs should actively discourage people from using. Given this, I'm going to go ahead and re-work the local register variables page (probably tomorrow) stating extended asm is the only supported usage. Although I also think it's important to mention Andrew's point. If someone sees it in code somewhere, at least the docs will give them some idea what is going on. Stop me if I've got this all wrong... > dealing with any blow-back we get along the way Since no one is proposing any code changes here, it's not like people's programs are going to stop working tomorrow. It's possible that some future code change to the Local Register Variables feature (perhaps provoked by this doc change) will break something. But if the broken use case is deemed valid, it's the *code* change that should get the blow-back, not the doc change. Hopefully when that happens, the new use case will then be added to the Local Register Variables docs. Probably as a chunky block at the end of the page, but still... Or did I miss your point? Nobody has commented on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64951 (register variable with template function). While I'm updating the page, is this a limitation of Local Register Variables that should be doc'ed? dw
Re: Proposed doc update for Explicit Reg Vars 2/3
I installed this patch from David with an update to use the "Registers are a scarce resource ..." text. 2 down, 1 to go. Thanks. dw
Re: Proposed doc update for Explicit Reg Vars 3/3
Given this, I'm going to go ahead and re-work the local register variables page (probably tomorrow) stating extended asm is the only supported usage. Although I also think it's important to mention Andrew's point. If someone sees it in code somewhere, at least the docs will give them some idea what is going on. Stop me if I've got this all wrong... Seems reasonable. An updated Local Register Variables patch is attached with the changes discussed. It also includes removing the extra space after '.' that Segher has been giving me grief about and Jeff's request re Globals: > Signaling that this stuff may change and that we'd like better solutions for certain issues is, IMHO, worth keeping. Please keep it. I remain unconvinced that this text is useful for compiler-users, and it seems odd to keep gcc's todo list in the user documentation. But I trust your judgement, so I have restored the verbatim text to the global page (at the bottom). Nobody has commented on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64951 (register variable with template function). While I'm updating the page, is this a limitation of Local Register Variables that should be doc'ed? Your call -- folks will be switching from a development to a bugfixing mindset shortly as the gcc6 development window closes. Hopefully someone will take a look at this particular issue at that time. Ahh. I assumed that with your Jedi mind powers, you would just 'know' whether this was fixable. I'll just wait to see what happens. So, this pretty much finishes all the code and doc changes I wanted to do to gcc. While doing this work, I have found a number of bugzilla entries where it looks like the needed work has already been done (sometimes by the doc changes I've been sending), but the bug hasn't yet been resolved. While I can add comments, bugzilla won't let me resolve them or I would update them myself. When people are in bug-fixing mode seems like the right time to pursue this. How will I know when it's time? 
Is there a web page? dw Index: extend.texi === --- extend.texi (revision 229159) +++ extend.texi (working copy) @@ -8471,12 +8471,12 @@ @cindex specified registers GNU C allows you to associate specific hardware registers with C -variables. In almost all cases, allowing the compiler to assign -registers produces the best code. However under certain unusual +variables. In almost all cases, allowing the compiler to assign +registers produces the best code. However under certain unusual circumstances, more precise control over the variable storage is required. -Both global and local variables can be associated with a register. The +Both global and local variables can be associated with a register. The consequences of performing this association are very different between the two, as explained in the sections below. @@ -8579,6 +8579,10 @@ variables, and to restore them in a @code{longjmp}. This way, the same thing happens regardless of what @code{longjmp} does. +Eventually there may be a way of asking the compiler to choose a register +automatically, but first we need to figure out how it should choose and +how to enable you to guide the choice. No solution is evident. + @node Local Register Variables @subsubsection Specifying Registers for Local Variables @anchor{Local Reg Vars} @@ -8586,56 +8590,34 @@ @cindex specifying registers for local variables @cindex registers for local variables -You can define a local register variable with a specified register -like this: +You can define a local register variable and associate it with a specified +register like this: @smallexample -register int *foo asm ("a5"); +register int *foo asm ("r12"); @end smallexample @noindent -Here @code{a5} is the name of the register that should be used. Note -that this is the same syntax used for defining global register -variables, but for a local variable it appears within a function. +Here @code{r12} is the name of the register that should be used. 
Note +that this is the same syntax used for defining global register variables, +but for a local variable the declaration appears within a function. The +@code{register} keyword is required, and cannot be combined with +@code{static}. The register name must be a valid register name for the +target platform. -Naturally the register name is CPU-dependent, but this is not a -problem, since specific registers are most often useful with explicit -assembler instructions (@pxref{Extended Asm}). Both of these things -generally require that you conditionalize your program according to -CPU type. +As with global register variables, it is recommended that you choose +a register that is normally saved and restored by function calls on your +machine, so that calls to library routines will not clobber it. -In addition, operating systems on one type of CPU may differ in how they -name the r
inline asm and multi-alternative constraints
Does gcc's inline asm support multi-alternative constraints? Or are they only supported for md? The fact that it is doc'ed with the other constraints (https://gcc.gnu.org/onlinedocs/gcc/Constraints.html) says it works for inline. But https://gcc.gnu.org/bugzilla/show_bug.cgi?id=10396#c17 says it only works for md. I've got a patch ready to remove this section from the non-md docs (attached). But there probably needs to be more support than an 11-year-old comment to approve it. Dropping a supported feature is always controversial. But if it doesn't work, perhaps less so. After all, doc'ing something that doesn't work is just as bad. dw PS If it *is* supported, then the docs need some work. Index: extend.texi === --- extend.texi (revision 229293) +++ extend.texi (working copy) @@ -7902,9 +7902,6 @@ When supported, the target will define the preprocessor symbol @code{__GCC_ASM_FLAG_OUTPUTS__}. -Because of the special nature of the flag output operands, the constraint -may not include alternatives. - Most often, the target has only one flags register, and thus is an implied operand of many instructions. In this case, the operand should not be referenced within the assembler template via @code{%0} etc, as there's Index: md.texi === --- md.texi (revision 229293) +++ md.texi (working copy) @@ -1093,7 +1093,6 @@ @ifclear INTERNALS @menu * Simple Constraints:: Basic use of constraints. -* Multi-Alternative:: When an insn has two alternative constraint-patterns. * Modifiers:: More precise control over effects of constraints. * Machine Constraints:: Special constraints for some particular machines. @end menu @@ -1450,6 +1449,7 @@ @code{sign_extend}. @end ifset +@ifset INTERNALS @node Multi-Alternative @subsection Multiple Alternative Constraints @cindex multiple alternative constraints @@ -1530,6 +1530,8 @@ the first, 1 for the second alternative, etc.). @xref{Output Statement}. 
@end ifset +@end ifset + @ifset INTERNALS @node Class Preferences @subsection Register Class Preferences
Re: inline asm and multi-alternative constraints
On 10/29/2015 1:47 PM, Richard Henderson wrote: On 10/27/2015 02:05 PM, Jeff Law wrote: On 10/25/2015 09:41 PM, David Wohlferd wrote: Does gcc's inline asm support multi-alternative constraints? Or are they only supported for md? dw PS If it *is* supported, then the docs need some work. I think Richard corrected me last I spoke on this topic :-) They *are* supported. ie, something like this should work on a ciscy target. asm("add %0,%1" : "=r,m"(x) : "rim,ri"(y)) Correct. They are supported, so long as the assembly can use the same text for all alternatives. Thus multiple alternatives in inline asm is basically useless for RISC targets, but occasionally useful for CISC targets. Aha. Thank you for the info. Ok, I have discarded the other patch. I'm not sure what this means for 10396, since that seemed to be the solution there. I have updated the non-md text with (most of) the changes I think it needs (attached). These changes are pleasantly minor, mostly just adding some example text and a bit of formatting. However. Trying to actually use the information on this page is turning up some problems. First, it could use a bit more clarity in the text that describes how gcc chooses among the alternatives. There are apparently 3 criteria that affect this decision: # of statements needed to copy params, order of alternatives, and flags. But it appears that they aren't all weighted equally. For example no number of '?' seem to be able to override an alternative that causes a reload (change the example below to use eax instead of ebx). But before I try to re-phrase this paragraph, I'm hoping someone can provide more details. What exactly are the rules here? Second, attempting to use ! and $ isn't working the way the docs led me to expect (actually, I can't make them work at all). Starting with this (contrived) i386 code, gcc selects the second alternative (using -O2 for x64). This makes sense to me. 
int main() { int x = 3; int y = 17; asm("or %1,%0" : "+r,b"(x) : "r,i"(y)); return x; } It took two '?' in front of the 'b' to convince gcc to use the first alternative. This seems reasonable, and shows that the first alternative is indeed viable. The problem started when I tried to replace the '??' with '!'. Being "severe," I figured one '!' should do the work of two '?'. But even using multiple '!' doesn't cause it to switch to the first alternative. Parsing the current docs: "! - Disparage severely the alternative that the '!' appears in. This alternative can still be used if it fits without reloading, but if reloading is needed, some other alternative will be used. " The alternative I am trying to disparage uses ebx. Using that register causes a push/pop. That seems to me like "reloading is needed," so I expected that by definition the other alternative would be used. Even if reloading isn't a factor (or if I don't understand it correctly), the alternative '!' is applied to should still be "severely" disparaged (ie at least as much as 2 '?'). But apparently it's not. Using '^' does change the selection if I use two of them. But according to the docs, '^' *only* applies if "the operand with the '^' needs a reload." Since the '^' is having an effect, doesn't that imply that there is a reload associated with the second alternative? But if there is, why didn't '!' work as expected? '$' also doesn't affect the selection, probably for the same reasons as '!'. I don't know if ! and $ are broken, or if the docs just aren't explaining them well enough. But something isn't working here. Lastly, while it is a core concept to compiler-writers, 'reloads' isn't really a compiler-user concept. Perhaps some of these flags aren't user manual appropriate (I'm looking at you !^$). 
dw PS Current text: https://gcc.gnu.org/onlinedocs/gcc/Multi-Alternative.html Proposed text: http://limegreensocks.com/gcc/Multi_002dAlternative.html Index: md.texi === --- md.texi (revision 229470) +++ md.texi (working copy) @@ -1465,6 +1465,8 @@ constraint for an operand is made from the letters for this operand from the first alternative, a comma, the letters for this operand from the second alternative, a comma, and so on until the last alternative. +All operands for a single instruction must have the same number of +alternatives. @ifset INTERNALS Here is how it is done for fullword logical-or on the 68000: @@ -1483,8 +1485,20 @@ @samp{%} in the cons
Re: inline asm and multi-alternative constraints
On 11/2/2015 3:43 PM, Sandra Loosemore wrote: On 11/02/2015 04:06 PM, Jeff Law wrote: On 10/30/2015 09:09 PM, David Wohlferd wrote: I have updated the non-md text with (most of) the changes I think it needs (attached). These changes are pleasantly minor, mostly just adding some example text and a bit of formatting. However. Trying to actually use the information on this page is turning up some problems. I think the fundamental problem here is we ought not be exposing those modifiers to the user. They're inherently tied to the details of the register allocation and reloading passes. This is what I'm thinking as well. I agree. The only reason I didn't delete them before is that removing support for an existing feature can be contentious. But I see no practical way to doc this for compiler-users. And even as an inline-asm aficionado, I can't think of any use for them anyway. Since there seems to be consensus, they're gone. Why would a user even need multi-alternative constraints in inline asm? An insn template might be instantiated in many different contexts and need to deal with different flavors of operands, but inline asm code is generally unique and the programmer writing it knows very well what the operands are supposed to be (this one is a register, that one is an address, etc). I wouldn't go that far. There are libraries that use inline asm (sometimes in headers), and a library cannot know how it may be used. The only practical approach for them is to list all options so gcc can do its best. Choosing the most efficient form of a logical-or instruction is hardly a good motivating example, either -- nobody writes inline asm to do that. However, it does make a good example for machine definitions, which also @includes this file. And while perhaps not an optimal sample for inline asm, it is a reasonable one, since other platforms (including x86) have a similar instruction which has similar limitations. 
I'm under the impression that the primary uses of inline asm are either to access machine instructions not exposed by builtins, or to provide a block of highly tuned code replacing all/most of a C function body. I have attached a new patch. It removes the flags and the paragraph that tries to describe how the alternative is chosen from the non-md docs. Now it just says the compiler will choose the most efficient alternative. Other than the line about "All operands for a single instruction must have the same number of alternatives", the 'internals' docs (which also includes this file) should be unaffected. dw Current text: https://gcc.gnu.org/onlinedocs/gcc/Multi-Alternative.html Proposed text: http://limegreensocks.com/gcc/Multi_002dAlternative.html Index: md.texi === --- md.texi (revision 229470) +++ md.texi (working copy) @@ -1465,6 +1465,8 @@ constraint for an operand is made from the letters for this operand from the first alternative, a comma, the letters for this operand from the second alternative, a comma, and so on until the last alternative. +All operands for a single instruction must have the same number of +alternatives. @ifset INTERNALS Here is how it is done for fullword logical-or on the 68000: @@ -1482,9 +1484,7 @@ @samp{0} for operand 1, and @samp{dmKs} for operand 2. The @samp{=} and @samp{%} in the constraints apply to all the alternatives; their meaning is explained in the next section (@pxref{Class Preferences}). -@end ifset -@c FIXME Is this ? and ! stuff of use in asm()? If not, hide unless INTERNAL If all the operands fit any one alternative, the instruction is valid. Otherwise, for each alternative, the compiler counts how many instructions must be added to copy the operands so that that alternative applies. @@ -1521,7 +1521,6 @@ the alternative only if the operand with the @samp{$} needs a reload. 
@end table -@ifset INTERNALS When an insn pattern has multiple alternatives in its constraints, often the appearance of the assembler code is determined mostly by which alternative was matched. When this is so, the C code for writing the @@ -1529,7 +1528,22 @@ the ordinal number of the alternative that was actually satisfied (0 for the first, 1 for the second alternative, etc.). @xref{Output Statement}. @end ifset +@ifclear INTERNALS +So the first alternative for the 68000's logical-or could be written as +@code{"+m" (output) : "ir" (input)}. The second could be @code{"+r" +(output): "irm" (input)}. However, the fact that two memory locations +cannot be used in a single instruction prevents simply using @code{"+rm" +(output) : "irm" (input)}. Using multi-alternatives, this might be +written as @code{"+m,r" (output) : "ir,irm" (input)}. This describes +all the available altern
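The constraint strings proposed in this patch, "+m,r" for the output and "ir,irm" for the input, can be tried directly in inline asm. The sketch below is an assumption-laden illustration, not part of the patch: it uses the x86 orl instruction (the thread notes x86 has a similar instruction with a similar no-two-memory-operands limit), it takes the asm path only on GCC proper targeting x86, and the function name or_bits is made up for the example.

```c
/* Illustrative only: computes x | y. */
int or_bits(int x, int y)
{
#if (defined(__x86_64__) || defined(__i386__)) \
    && defined(__GNUC__) && !defined(__clang__)
    /* Alternative 1: destination in memory, source immediate or register.
       Alternative 2: destination in a register, source immediate,
       register, or memory.
       Neither alternative allows two memory operands, which a plain
       "+rm" / "irm" pair would not rule out. */
    __asm__("orl %1, %0" : "+m,r"(x) : "ir,irm"(y) : "cc");
    return x;
#else
    return x | y;   /* assumption: portable fallback elsewhere */
#endif
}
```

The same template text serves both alternatives, which is the condition Richard Henderson gives earlier in the thread for multi-alternative constraints to be usable in inline asm.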
Re: inline asm and multi-alternative constraints
On 11/6/2015 4:46 PM, Segher Boessenkool wrote: On Fri, Nov 06, 2015 at 03:29:43PM -0700, Jeff Law wrote: It's never easy to predict whether or not something like this will be contentious. Worst case is you post, it's contentious, we iterate a bit and reach some kind of resolution (ok, worst case is no resolution is reached, but that doesn't happen too often). In this case I simply don't see a way to sensibly document those modifiers without bringing in the implementation details of register class preferencing, reload, IRA & LRA. And once those details are brought into the picture, everyone loses. This very point is what made it clear to me that these flags should be removed. If there is no practical way to describe to the target audience how they work or when to use them, they don't belong here. I'll take partial credit for asking the questions that highlighted the problem, but Jeff taking the time to respond to them is what got us to the right answer here (thanks Jeff!). I'm sure there's someone out there using '?' and '!' in a multi-alternative asm constraint. They may even read the docs and complain and we can try to educate them why those modifiers are no longer documented. Another reason why we shouldn't document such things is that it makes it harder to change (anything about) those things later, although they really are implementation details. True. In this case, they were already documented and the change was to remove them, something I hesitate to do. But I think what Jeff checked in (thanks Jeff!) gives us the right answer for this page. The same goes for some constraints and almost all output modifiers. Are you suggesting more doc changes? Looking thru the pages you reference: - Starting with 'modifiers', "=+&" and (reluctantly) "%" seem reasonable for inline asm. But both "#*" seem sketchy. - Under 'simple constraints', "mringX" all (more or less) make sense to me. But "oV<>sp" are not things I can envision using. 
- The 'machine constraints' for i386 (the only machine I know) all seem reasonable. However for platforms that support autoincrement (powerpc?), apparently using "m" needs more docs (per https://gcc.gnu.org/ml/gcc/2008-03/msg01079.html). Are these the things to which you are referring? I've always assumed the parts that seem obscure here were due to my i386-centric view of the world. Are some of them actually md-only? There are other minor changes I'd make on some of these pages. But mostly they are not worth it unless I'm doing something else there too. So if there's something here you think needs changing, let me know and I'll take a crack at it. Other than that, I'll keep working my way thru the doc issues in the inline-asm bugs. I've done what I can for 10396. dw
basic asm and memory clobbers
It seems like a doc update is what is needed to close PR24414 (Old-style asms don't clobber memory). I'm working on this now (phase 1) in the unlikely event that someone is inspired to make a code change here instead. Like Richard Henderson, I rather expected basic (or "old-style") asm to perform a memory clobber (it doesn't). This bug is now over a decade old, so presumably the question of how this is going to work is settled. IAC, the docs should reflect the current behavior. Based on this issue plus what I have learned since last updating this page, I'm proposing the attached patch.

dw

CCing the commenters from the bug.

Index: extend.texi
===
--- extend.texi (revision 229910)
+++ extend.texi (working copy)
@@ -7353,7 +7353,8 @@
 @end itemize

 Safely accessing C data and calling functions from basic @code{asm} is more
-complex than it may appear. To access C data, it is better to use extended
+complex than it may appear. To access C data (including both local and
+global register variables), use extended
 @code{asm}.

 Do not expect a sequence of @code{asm} statements to remain perfectly
@@ -7376,6 +7377,12 @@
 visibility of any symbols it references. This may result in GCC discarding
 those symbols as unreferenced.

+Basic @code{asm} statements are not treated as though they used a "memory"
+clobber, although they do implicitly perform a clobber of the flags
+(@pxref{Clobbers}). Also, there is no implicit clobbering of registers,
+so any registers changed must be restored to their original value before
+exiting the @code{asm}.
+
 The compiler copies the assembler instructions in a basic @code{asm}
 verbatim to the assembly language output file, without processing dialects
 or any of the @samp{%} operators that are available with
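Since the proposed text says a basic asm is not treated as a memory clobber, code that wants that effect has to ask for it with extended asm. A minimal sketch of what that looks like follows; the variable shared_flag and the function publish_and_read are hypothetical names for illustration, and the empty template means no instruction is emitted, so the statement acts only as a compiler-level barrier.

```c
/* Hypothetical global used only for this illustration. */
static int shared_flag;

/* Illustrative only: stores value, then reads it back across a barrier. */
int publish_and_read(int value)
{
    shared_flag = value;

    /* Extended asm with an explicit "memory" clobber: the compiler must
       assume memory may have been read or written here, so the store
       above is completed before this point and the load below is
       performed after it. Per the patch text, a basic asm such as
       asm("") carries no such guarantee. */
    __asm__ __volatile__("" : : : "memory");

    return shared_flag;
}
```

This only constrains the compiler; it makes no claim about hardware memory ordering.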
Re: inline asm and multi-alternative constraints
On 11/9/2015 1:52 PM, Jeff Law wrote: On 11/07/2015 12:50 AM, David Wohlferd wrote: - Starting with 'modifiers', "=+&" and (reluctantly) "%" seem reasonable for inline asm. But both "#*" seem sketchy. Right. =+& are no-brainer yes, as are the constants 0-9. % is probably OK as well. #* are similar to !? in that they are inherently tied into the register class preferencing implementation and documenting them would be inadvisable. Actually, #* are already doc'ed in the user guide. Are you advising they be removed? If so, the attached patch does this. It also removes references to define_peephole2 and define_splits from the user guide version of this page. There are other parts of this page that are more md than ug, but these are the ones that annoyed me the most.

dw

Original: https://gcc.gnu.org/onlinedocs/gcc/Modifiers.html
Proposed: http://limegreensocks.com/gcc/Modifiers.html

Index: md.texi
===
--- md.texi (revision 229910)
+++ md.texi (working copy)
@@ -1646,7 +1646,9 @@
 GCC can only handle one commutative pair in an asm; if you use more,
 the compiler may fail. Note that you need not use the modifier if
 the two alternatives are strictly identical; this would only waste
-time in the reload pass. The modifier is not operational after
+time in the reload pass.
+@ifset INTERNALS
+The modifier is not operational after
 register allocation, so the result of @code{define_peephole2}
 and @code{define_split}s performed after reload cannot rely on
 @samp{%} to make the intended insn match.
@@ -1665,7 +1667,6 @@
 @samp{*} additionally disparages slightly the alternative if the following
 character matches the operand.

-@ifset INTERNALS
 Here is an example: the 68000 has an instruction to sign-extend a
 halfword in a data register, and can also sign-extend a value by
 copying it into an address register. While either kind of register is
Re: inline asm and multi-alternative constraints
On 11/9/2015 2:03 AM, Richard Earnshaw wrote: On 09/11/15 09:57, Richard Earnshaw wrote: On 07/11/15 09:23, Segher Boessenkool wrote: On Fri, Nov 06, 2015 at 11:50:40PM -0800, David Wohlferd wrote: The same goes for some constraints and almost all output modifiers. Are you suggesting more doc changes? Looking thru the pages you reference: - Starting with 'modifiers', "=+&" and (reluctantly) "%" seem reasonable for inline asm. But both "#*" seem sketchy. Output modifiers, not constraint modifiers -- things like "%X0" in the output template. Many are only useful in the machine description, but some (like that 'X' for rs6000) are vital for asm as well. They're not just useful, they're essential on AArch64 and ARM. They're needed, for example, to get the 32/64-bit register sizing correct. On the other hand, we have %S on ARM which cannot ever be used from inline assembler: it matches the result of a match_operator rule with complex internal structure that could never be generated from user code. R. I don't know enough about ARM to be comfortable making a change here myself. But dropping this out of the docs is simple enough. All you need to do is wrap it with @ifset INTERNALS @end ifset Then it will still show up in the Machine Description section, but not the users' guide. dw
Re: basic asm and memory clobbers
On 11/9/2015 1:32 AM, Segher Boessenkool wrote: On Sun, Nov 08, 2015 at 04:10:01PM -0800, David Wohlferd wrote: It seems like a doc update is what is needed to close PR24414 (Old-style asms don't clobber memory). What is needed to close the bug is to make the compiler work properly. The question of course is, what does 'properly' mean? My assertion is that 10 years on, 'properly' means whatever it's doing now. Changing it at this point will probably break more than it fixes, and (as you said) there is a plausible work-around using extended asm. So while this bug could be resolved as 'invalid' (since the compiler is behaving 'properly'), I'm thinking to split the difference and 'fix' it with a doc patch that describes the supported behavior. Whether that means clobbering memory or not, I don't much care -- with the status quo, if you want your asm to clobber memory you have to use extended asm; if basic asm is made to clobber memory, if you want your asm to *not* clobber memory you have to use extended asm (which you can with no operands by writing e.g. asm("bork" : ); ). So both behaviours are available whether we make a change or not. But changing things now will likely break user code. Safely accessing C data and calling functions from basic @code{asm} is more -complex than it may appear. To access C data, it is better to use extended +complex than it may appear. To access C data (including both local and +global register variables), use extended @code{asm}. I don't think this makes things clearer. Register vars are described elsewhere already; The docs for local register variables describe this limitation. But globals does not. Whether this information belongs in local register, global register, basic asm, or all 3 depends on which section of the docs users will be reading when they need to know this information. if you really think it needs mentioning here, put it at the end (in its own sentence), don't break up this sentence. Ok. (dot space space). 
+Basic @code{asm} statements are not treated as though they used a "memory" +clobber, although they do implicitly perform a clobber of the flags +(@pxref{Clobbers}). They do not clobber the flags. Observe: Ouch. i386 shows the same thing for basic asm. Having to preserve the flags is ugly, but since that's the behavior, let's write it down. === void f(int a) { a = a >> 2; if (a <= 0) asm("OHAI"); if (a >= 0) asm("OHAI2"); } === Compiling this for powerpc gives (-m32, edited): f: srawi. 9,3,2# this sets cr0 ble 0,.L5 # this uses cr0 .L2: OHAI2 blr .p2align 4,,15 .L5: OHAI bnelr 0 # this uses cr0 b .L2 which shows that CR0 (which is "cc") is live over the asm. So are all other condition regs. It is true for cc0 targets I guess, but there aren't many of those left. Also, there is no implicit clobbering of registers, +so any registers changed must be restored to their original value before +exiting the @code{asm}. One of the important uses of asm is to set registers GCC does not know about, so you might want to phrase this differently. Ahh, good point. What would you say to "general purpose registers?" Update attached. dw Index: extend.texi === --- extend.texi (revision 229910) +++ extend.texi (working copy) @@ -7353,8 +7353,9 @@ @end itemize Safely accessing C data and calling functions from basic @code{asm} is more -complex than it may appear. To access C data, it is better to use extended -@code{asm}. +complex than it may appear. To access C data use extended @code{asm}. Do +not attempt to directly access local or global register variables from +within basic @code{asm} (@pxref{Explicit Register Variables}). Do not expect a sequence of @code{asm} statements to remain perfectly consecutive after compilation. If certain instructions need to remain @@ -7376,6 +7377,11 @@ visibility of any symbols it references. This may result in GCC discarding those symbols as unreferenced. 
+Basic @code{asm} statements are not treated as though they used a "memory" +clobber (@pxref{Clobbers}). Also, neither the flags nor the general-purpose +registers are clobbered, so any changes must be restored to their original +value before exiting the @code{asm}. + The compiler copies the assembler instructions in a basic @code{asm} verbatim to the assembly language output file, without processing dialects or any of the @samp{%} operators that are available with
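To make the three forms discussed above concrete, here is a minimal sketch that puts them side by side: basic asm, extended asm with no operands (the `asm("bork" : )` form mentioned earlier), and extended asm with an explicit "memory" clobber. The function and variable names are hypothetical, and the templates are left empty so the sketch does not depend on any particular target's instruction set.

```c
/* Shared state that the compiler may otherwise keep in a register. */
static int shared_counter;

/* Basic asm: no colon, so there are no operands and no clobber list. */
static int bump_with_basic_asm(void)
{
    shared_counter++;
    __asm__("");
    return shared_counter;
}

/* Extended asm with no operands: the single colon makes it extended. */
static int bump_with_extended_asm(void)
{
    shared_counter++;
    __asm__("" : );
    return shared_counter;
}

/* Extended asm that explicitly tells the compiler memory may change. */
static int bump_with_memory_clobber(void)
{
    shared_counter++;
    __asm__ __volatile__("" : : : "memory");
    return shared_counter;
}
```

Only the third form states its memory requirement in the source, which is the point being argued in this thread: the reader does not need to know what the compiler assumes for the first two.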
Re: basic asm and memory clobbers
On 11/16/2015 1:29 PM, Jeff Law wrote: On 11/15/2015 06:23 PM, David Wohlferd wrote: On 11/9/2015 1:32 AM, Segher Boessenkool wrote: On Sun, Nov 08, 2015 at 04:10:01PM -0800, David Wohlferd wrote: It seems like a doc update is what is needed to close PR24414 (Old-style asms don't clobber memory). What is needed to close the bug is to make the compiler work properly. The question of course is, what does 'properly' mean? My assertion is that 10 years on, 'properly' means whatever it's doing now. Changing it at this point will probably break more than it fixes, and (as you said) there is a plausible work-around using extended asm. So while this bug could be resolved as 'invalid' (since the compiler is behaving 'properly'), I'm thinking to split the difference and 'fix' it with a doc patch that describes the supported behavior. I'd disagree. A traditional asm has to be considered an opaque blob that read/write/clobber any register or memory location. When I first encountered basic asm, my expectation was that of course it clobbers. It HAS to, right? But that said, let me give my best devil's advocate impersonation and ask: Why? - There is no standard that says it must do this. - I'm only aware of 1 person who has ever asked for this change. And the request has been deemed so unimportant it has languished for a very long time. - There is a plausible work-around with extended asm, which (mostly) has clear semantics regarding clobbers. - While the change probably won't introduce bad code, if it does it will be in ways that are going to be difficult to track down, in an area where few have the expertise to debug. - Existing code that currently does things 'right' (ie push/pop any modified registers) will suddenly be doing things 'wrong,' or at least wastefully. 
- Other than top-level asm, it seems like every existing basic asm will (probably) get a new performance penalty (memory usage + code size + cycles) to allow for situations they may already be handling correctly or that don't apply. True, these aren't particularly compelling reasons to not make the change. But I also don't see any compelling benefits to offset them. For existing users, presumably they have already found whatever solution they need and will just be annoyed that they have to revisit their code to see the impact of this change. Will they need to #if to ensure consistent performance/function between gcc versions? For future users, they will have the docs telling them the behavior, and pointing them to the (now well documented) extended asm. Where's the benefit? If someone were proposing basic asm as a new feature, I'd absolutely be arguing that it should clobber everything. Or I might argue that basic asm should only be allowed at top-level (where I don't believe clobbering matters?) and everything else should be extended asm so we KNOW what to clobber (hmm...). But changing this so gcc tries (probably futilely) to emulate other implementations of asm... That seems like a weak case to support a change to this long-time behavior. Unless there are other benefits I'm just not seeing? -- Ok, that's my best shot. You have way more expertise and experience here than I do, so I expect that after you think it over, you'll make the right call. And despite my attempt here to defend the opposite side, I'm not entirely sure what the right call is. But these seem like the right questions. Either way, let me know if I can help. It's also the case that assuming an old style asm can read or clobber any memory location is the safe, conservative thing to do. Well, safe-r. Even if you make this change, embedding basic asm in C routines still seems risky. Well, riskier than extended which is risky enough. 
So the right thing in my mind is to ensure that behaviour The right thing in my mind is to find ways to prod people into using extended asm instead of basic. Then they explicitly specify their requirements rather than depending on clunky all-or-nothing defaults. Maybe to the extent of gcc deprecating (non-top level) basic over time (-fallow-basic-asm=[none|top|any] where v6 defaults to 'any' and v7 defaults to 'top'). I'd be surprised if gcc went this way, but that doesn't mean it wouldn't be better. and document it. and to document it. Andrew's logic is just plain wrong in that BZ. Whether that means clobbering memory or not, I don't much care -- with the status quo, if you want your asm to clobber memory you have to use extended asm; if basic asm is made to clobber memory, if you want your asm to *not* clobber memory you have to use extended asm (which you can with no operands by writing e.g. asm("bork" : ); ). So both behaviours a
Re: basic asm and memory clobbers
Unless there are other benefits I'm just not seeing? When we fix 24414 by honoring the "uses/clobbers all hard registers and memory" semantics for old-style asms, those old-style asms will be *less* likely to cause problems in the presence of ever-improving optimization techniques. Ok, this is a good point. In fact, it may resolve existing problems that people don't know they have. However I still have concerns that people might be surprised by the change in behavior. Looking thru the linux kernel source (a significant collection of inline asm containing both basic (~878) and extended (4833) statements), it seems there are places where they really are going to want the "clobber nothing" semantics. For that reason, I'd like to propose adding 2 new clobbers to extended asm as part of this work: "clobberall" - This gives extended the same semantics as whatever the new basic asm will be using. "clobbernone" - This gives the same semantics as the current basic asm. Clobbernone may seem redundant, since not specifying any clobbers should do the same thing. But actually it doesn't, at least on i386. At present, there is no way for extended asm to not clobber "cc". I don't know if other platforms have similar issues. When basic asm changes, I expect that having a way to "just do what it used to do" is going to be useful for some people. Either way, let me know if I can help. About the only immediate task would be to ensure that the documentation for traditional asms clearly documents the desired semantics and somehow note that there are known bugs in the implementation (ie 24414, handling of flags registers, and probably other oddities) Given that gcc is at phase 3, I'm guessing this work won't be in v6? Or would this be considered "general bugfixing?" The reason I ask is I want to clearly document what the current behavior is as well as informing them about what's coming. If this isn't changing until v7, the text can be updated then to reflect the new behavior. 
And I suspect it's still a lot less intrusive than you might think. I tried to picture the most basic case I can think of that uses something clobber-able: for (int x=0; x < 1000; x++) asm("#stuff"); This generates very simple and highly performant code: movl$1000, %eax .L2: #stuff subl$1, %eax jne .L2 Using extended asm to simulate the clobberall gives: movl$1000, 44(%rsp) .L2: #stuff subl$1, 44(%rsp) jne .L2 It allocates an extra 4 bytes, and changed everything to memory accesses instead of using a register. Obviously not a huge performance impact on this tiny sample, but it does suggest to me that sometimes there could be. My point being simply that people may want the old behavior, so we need to be sure there's a way to get it (ie "clobbernone"). +Basic @code{asm} statements are not treated as though they used a "memory" +clobber, although they do implicitly perform a clobber of the flags +(@pxref{Clobbers}). They do not clobber the flags. Observe: Ouch. i386 shows the same thing for basic asm. Sadly, I suspect this isn't consistent across targets. Bigger ouch. I'll follow up on this after the discussion about changing basic asm is complete (which may render this moot). It likely depends on how the target models the flags. I'm not quite sure how to proceed here. I'm pretty sure no one wants me to write "basic asm doesn't clobber flags, except that maybe it does on some (unspecified) platforms." I've tried to follow the code, but without any particular success. I was hoping to see decode_reg_name_and_count (or decode_reg_name) being called from platform-specific routines and handling -3, but not so much. Using users as beta testers is normally frowned upon (outside of Microsoft), but perhaps the solution here is to just say that it doesn't clobber flags (currently the most common case?), and update the docs if and when people complain? Yes, that's bad, but saying nothing at all isn't any better. And we know it's true for at least 2 platforms. dw
Re: basic asm and memory clobbers
On 11/20/2015 2:17 AM, Andrew Haley wrote:
> On 20/11/15 01:23, David Wohlferd wrote:
>> I tried to picture the most basic case I can think of that uses something clobber-able:
>>
>>     for (int x=0; x < 1000; x++)
>>        asm("#stuff");
>>
>> This generates very simple and highly performant code:
>>
>>         movl    $1000, %eax
>>     .L2:
>>         #stuff
>>         subl    $1, %eax
>>         jne     .L2
>>
>> Using extended asm to simulate the clobberall gives:
>>
>>         movl    $1000, 44(%rsp)
>>     .L2:
>>         #stuff
>>         subl    $1, 44(%rsp)
>>         jne     .L2
>>
>> It allocates an extra 4 bytes, and changed everything to memory accesses instead of using a register.
>
> Can you show us your code? I get
>
>     xx:
>         movl    $1000, %eax
>     .L2:
>         #stuff
>         subl    $1, %eax
>         jne     .L2
>         rep; ret
>
> for
>
>     void xx() {
>       for (int x=0; x < 1000; x++)
>         asm volatile("#stuff" : : : "memory");
>     }
>
> What you're describing looks like a bug: x doesn't have its address taken.

The intent for 24414 is to change basic asm such that it will become (quoting jeff) "an opaque blob that read/write/clobber any register or memory location." Such being the case, "memory" is not sufficient:

    #define CLOBBERALL "eax", "ebx", "ecx", "edx", "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15", "edi", "esi", "ebp", "cc", "memory"

    int main() {
      for (int x=0; x < 1000; x++)
        asm("#":::CLOBBERALL);
    }

dw
Re: basic asm and memory clobbers
On 11/19/2015 7:14 PM, Segher Boessenkool wrote: On Thu, Nov 19, 2015 at 05:23:55PM -0800, David Wohlferd wrote: For that reason, I'd like to propose adding 2 new clobbers to extended asm as part of this work: "clobberall" - This gives extended the same semantics as whatever the new basic asm will be using. "clobbernone" - This gives the same semantics as the current basic asm. I don't think this is necessary or useful. They are also awful names: "clobberall" cannot clobber everything (think of the stack pointer), I'm not emotionally attached to the names. But providing the same capability to extended that we are proposing for basic doesn't seem so odd. Shouldn't extended be able to do (at least) everything basic does? My first thought is that it allows people to incrementally start migrating from (new) basic to extended (something I think we should encourage). Or use it as a debug tool to see if the failure you are experiencing from your asm is due to a missing clobber. Since the capability will already be implemented for basic, providing a way to access it from extended seems trivial (if we can agree on a name). As you say, clobbering the stack pointer presents special challenges (although gcc has a specific way of dealing with stack register clobbers, see 52813). This is why I described the feature as having "the same semantics as whatever the new basic asm will be using." and "clobbernone" does clobber some (those clobbered by any asm), Seems like a quibble. Those other things (I assume you mean things like pipelining?) most users aren't even aware of (or they wouldn't be so eager to use inline asm in the first place). Would it be more palatable if we called it "v5BasicAsmMode"? "ClobberMin"? Clobbernone may seem redundant, since not specifying any clobbers should do the same thing. But actually it doesn't, at least on i386. At present, there is no way for extended asm to not clobber "cc". I don't know if other platforms have similar issues. Some do. 
The purpose is to stay compatible with asm written for older versions of the compiler. Backward compatibility is important. I understand that due to the cc0 change in x86, existing code may have broken without always clobbering cc. This was seen as the safest way to ensure that didn't happen. However no solution was/is available for people who correctly knew whether their asm clobbers the flags. Mostly I'm ok with that. All the ways that I can think of to try to re-allow people to start using the cc clobber are just not worth it. I simply can't believe there are many cases where there's going to be a benefit. But as I said: backward compatibility is important. Providing a way for people who need/want the old basic asm semantics seems useful. And I don't believe we can (quite) do that without clobbernone. When basic asm changes, I expect that having a way to "just do what it used to do" is going to be useful for some people. 24414 says the documented behaviour hasn't been true for at least fourteen years. It isn't likely anyone is relying on that behaviour. ? To my knowledge, there was no documentation of any sort about what basic asm clobbered until I added it. But what people are (presumably) relying on is that whatever it did in the last version, it's going to continue to do that in the next. And albeit with good intentions, we are planning on changing that. but perhaps the solution here is to just say that it doesn't clobber flags (currently the most common case?), and update the docs if and when people complain? Yes, that's bad, but saying nothing at all isn't any better. And we know it's true for at least 2 platforms. Saying nothing at all at least is *correct*. We don't know that saying "it doesn't clobber flags" is wrong either. All we know is that jeff said "I suspect this isn't consistent across targets." But that's neither here nor there. The real question is, if we can't say that, what can we say? 
- If 24414 is going in v6, then we can doc that it does the clobber and be vague about the old behavior. - If 24414 isn't going in v6, then what? I suppose we can say that it can vary by platform. We could even provide your sample code as a means for people to discover their platform's behavior. It isn't necessary for users to know what registers the compiler considers to be clobbered by an asm, unless they actually clobber something in the assembler code themselves. I'm not sure I follow. If someone has code that uses a register, currently they must restore the value before exiting the asm or risk disaster. So they might write asm("push eax ; DoSomethingWith eax ; pop eax").
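As a point of comparison with the push/pop pattern above, here is a minimal sketch of the extended asm alternative: instead of saving and restoring the register by hand, the asm lists it as clobbered and the compiler works around it. The function name is hypothetical, the x86 branch assumes the default AT&T assembler dialect, and other targets fall back to an empty template so the sketch still compiles there.

```c
/* Hypothetical example: the asm overwrites eax, and says so in its clobber list. */
static int double_then_add_one(int input)
{
    int result = input * 2;

#if defined(__i386__) || defined(__x86_64__)
    /* xorl zeroes eax and changes the flags, so both are declared clobbered.
       The compiler keeps 'result' out of eax across this statement. */
    __asm__ __volatile__("xorl %%eax, %%eax" : : : "eax", "cc");
#else
    /* Other targets: empty template, nothing is actually modified. */
    __asm__ __volatile__("" : : : "memory");
#endif

    return result + 1;
}
```

With the clobber declared, no push/pop is needed inside the template, and the surrounding code stays correct even if the optimizer would otherwise have kept a live value in that register.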
Re: basic asm and memory clobbers
On 11/20/2015 3:14 AM, Andrew Haley wrote:
> On 20/11/15 10:37, David Wohlferd wrote:
>> The intent for 24414 is to change basic asm such that it will become (quoting jeff) "an opaque blob that read/write/clobber any register or memory location." Such being the case, "memory" is not sufficient:
>>
>>     #define CLOBBERALL "eax", "ebx", "ecx", "edx", "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15", "edi", "esi", "ebp", "cc", "memory"
>
> Hmm. I would not be at all surprised to see this cause reload failures. You certainly shouldn't clobber the frame pointer on any machine which needs one.

If I don't clobber ebp, gcc just uses it:

        movl    $1000, %ebp
    .L2:
        #
        subl    $1, %ebp
        jne     .L2

The original purpose of this code was to attempt to show that this kind of "clobbering everything" behavior (the proposed new behavior for basic asm) could have non-trivial impact on existing routines. While I've been told that changing the existing "clobber nothing" approach to this kind of "clobber everything" is "less intrusive than you might think," I'm struggling to believe it. It seems to me that one asm("nop") thrown into a driver routine to fix a timing problem could end up making a real mess.

But actually we're kind of past that. When Jeff, Segher, (other) Andrew and Richard all say "this is how it's going to work," it's time for me to set aside my reservations and move on. So now I'm just trying my best to make sure that if it *is* an issue, people have a viable solution readily available. And to make sure it's all correctly doc'ed (which is what started this whole mess).

dw
Re: basic asm and memory clobbers
On 11/20/2015 8:14 AM, Richard Henderson wrote: On 11/20/2015 04:34 PM, Jakub Jelinek wrote: Isn't that going to break too much code though? I mean, e.g. including libgcc... I don't know. My suspicion is very little. But that's actually what I'd like to know before we start adjusting code in other ways wrt basic asms. I can provide a little data here. In an effort to gain some perspective, I've been looking at inline asm usage in the linux kernel (4.3). Clearly this isn't "typical usage," but it is probably one of the biggest users of inline asm, and likely has the best justifications for doing so (being an OS and all). There are ~5,711 instances of inline asm in use. Of those, ~4,833 are extended and ~878 are basic. I don't have any numbers about how many are top level vs in function, but let me see what I can do. A quick look at libgcc shows that there are 109 extended and 45 basic asm statements. I'll see how many end up being top-level, but it looks like most of them. dw
Re: basic asm and memory clobbers
On 11/20/2015 3:55 PM, David Wohlferd wrote: On 11/20/2015 8:14 AM, Richard Henderson wrote: On 11/20/2015 04:34 PM, Jakub Jelinek wrote: Isn't that going to break too much code though? I mean, e.g. including libgcc... I don't know. My suspicion is very little. But that's actually what I'd like to know before we start adjusting code in other ways wrt basic asms. I can provide a little data here. In an effort to gain some perspective, I've been looking at inline asm usage in the linux kernel (4.3). Clearly this isn't "typical usage," but it is probably one of the biggest users of inline asm, and likely has the best justifications for doing so (being an OS and all). There are ~5,678 instances of inline asm in use. Of those, ~4,833 are extended and ~845 are basic. I don't have any numbers about how many are top level vs in function, but let me see what I can do. Ok, the news here is mixed. Of those 845: - Only 50 of them look like top level asm. I was hoping for more. - 457 are in 6 files in the lib/raid6 directory, so there's a bunch that can be done quickly. - That leaves 338 miscellaneous other uses spread throughout some 200 files across multiple platforms. That seems like a lot. Despite the concerns expressed by Jeff about the difficulties in changing from basic to extended, it looks to me like they don't need any conversion (other than s/%/%%/). Adding the trailing colon should be sufficient to provide the semantics they have now, which apparently is deemed sufficient. A quick look at libgcc shows that there are 109 extended and 45 basic asm statements. I'll see how many end up being top-level, but it looks like most of them. Of the 45 basic asm statements, only 9 aren't top-level. They all appear to be trivial to change to extended. To sum up: - Some projects (like libgcc) are going to be simple to update. Maybe an hour's work all told. - Some projects (like testsuite) are going to take longer. 
While the changes are mostly straight-forward, the number of files involved will be a factor. - I took a look at the Mingw-w64 project. It has ~20 non-top level asms, so also pretty simple to update. - But some projects (like linux kernel) are going to be more challenging. Not so much making the changes (although that will take a while) as convincing yourself that the change was harmless and still compiles on all supported platforms. Yes, this represents a very limited sample. And one weighted towards projects that are more likely to use asm. But it does give some sense of the scale. So, what now? While I'd like to take the big step and start kicking out warnings for non-top-level right now, that may be too bold for phase 3. A more modest step for v6 would just provide a way to find them (maybe something like -Wnon-top-basic-asm or -Wonly-top-basic-asm) and doc the current behavior as well as the upcoming change. Adding the warning is not something I can do. But I'll create the doc patch once someone confirms that this is the plan. dw
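The s/%/%%/ step of the basic-to-extended conversion mentioned above can be sketched as a small string helper. This is only an illustration of the substitution on a template string, not a tool from the thread; `escape_percent` is a hypothetical name, and it assumes that doubling '%' is the only template change required (other characters may need attention on targets that support multiple assembler dialects).

```c
#include <stddef.h>

/* Hypothetical helper: copy a basic asm template into dst, doubling every '%'
   so the text can be used as an extended asm template.
   Returns the length the full result needs (excluding the terminating NUL);
   output is truncated, but still NUL-terminated, if dst_size is too small. */
static size_t escape_percent(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;

    for (; *src != '\0'; src++) {
        int copies = (*src == '%') ? 2 : 1;
        for (int i = 0; i < copies; i++) {
            /* Only store while there is still room for the final NUL. */
            if (dst_size > 0 && out + 1 < dst_size)
                dst[out] = *src;
            out++;
        }
    }

    if (dst_size > 0)
        dst[out < dst_size ? out : dst_size - 1] = '\0';

    return out;
}
```

For example, a basic template such as "movl %eax, %ebx" would become "movl %%eax, %%ebx", after which the statement only needs the trailing colon to be treated as extended asm.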
Re: basic asm and memory clobbers
On 11/19/2015 5:53 PM, Sandra Loosemore wrote: On 11/19/2015 06:23 PM, David Wohlferd wrote: About the only immediate task would be to ensure that the documentation for traditional asms clearly documents the desired semantics and somehow note that there are known bugs in the implementation (ie 24414, handling of flags registers, and probably other oddities) Given that gcc is at phase 3, I'm guessing this work won't be in v6? Or would this be considered "general bugfixing?" The reason I ask is I want to clearly document what the current behavior is as well as informing them about what's coming. If this isn't changing until v7, the text can be updated then to reflect the new behavior. Documentation fixes are accepted all the way through Stage 4, since there's less risk of introducing regressions in user programs from accidental documentation mistakes than code errors. The code change isn't yet finalized. I'm hoping to doc something vaguely like: "basic asm (other than at top level) is being deprecated because blah> potentially unsafe due to optimizations . You can locate the statements that will no longer be supported using -Wonly-top-basic-asm. Change them to use extended asm instead." What's your take on having the user guide link to the gcc wiki? If we do make this change, I'd kinda like to create a "how to convert basic asm to extended." But it doesn't seem like a good fit for the user docs. But if the user docs don't reference the wiki, I doubt anyone would ever find it. OTOH, I'd discourage adding anything to the docs about anticipated changes in future releases, except possibly to note that certain features or behavior are deprecated and may be removed in future releases (with a suggestion about what you should do instead). I'd love to see the doc folks make a pass and remove every "some day this won't work" text that doesn't include this. If there is no way for users to prepare, you aren't helping. 
And remove all the "some day there might be a new feature" stuff too. It just wastes users' time trying to figure out if "some day" has arrived yet. And it makes them cry when the new feature, which is exactly what they need, isn't there yet. We've already got too many "maybe someday this will be fixed" notes in the manual that are not terribly useful to users. You'd get my vote to remove them all. If I got a vote. dw
Re: basic asm and memory clobbers
On 11/23/2015 2:04 AM, Andrew Haley wrote: On 21/11/15 12:56, David Wohlferd wrote: So, what now? While I'd like to take the big step and start kicking out warnings for non-top-level right now, that may be too bold for phase 3. A more modest step for v6 would just provide a way to find them (maybe something like -Wnon-top-basic-asm or -Wonly-top-basic-asm) and doc the current behavior as well as the upcoming change. Warnings would be good. Richard's suggestion was: > I'm suggesting that we don't accept [basic asm] at all inside a function. One must audit the source and make a conscious decision to write asm("bla" : ); instead. Accepting basic asm outside of a function is perfectly ok. I'm really not a compiler-writer, but I've taken a shot at implementing this. While Richard is talking about completely deprecating this feature (a direction I support), I've started by emitting warnings, and by having the warnings disabled by default. This allows people to experiment with the new direction without getting clobbered by it. My intent is something like this: -Wonly-top-basic-asm Warn if basic @code{asm} statements are used inside a function (ie not at file scope/top level). Due to the potential for unsafe optimizations, always use extended instead of basic asm inside functions. This warning is disabled by default and is not enabled by -Wall or -Wextra. I probably won't include the bits about Wall or Wextra in the actual doc patch. They're here more to provoke comments in case someone thinks this behavior should change. I'm open to suggestions about alternate names, too. I've got this working for both c and c++. It doesn't affect other places that use "asm" like "explicit register variables," "asm labels," "extended asm" or "top level basic asm." It is also pleasantly small. As written, it should be useful to find places in current code that are at risk. However (there's always a 'however'), it doesn't correctly handle "naked" functions (ie __attribute__((naked)) ). 
By definition, naked functions can *only* include basic asm (https://gcc.gnu.org/ml/gcc/2014-05/msg00172.html). So generating a warning for them is incorrect. I'll need help fixing that. I don't know if it is possible from within the parsers (where my current code is being added) to walk back up and get the attributes for the function. I assume not. In that case, I'll need some help finding some place up the call stack where you can. Suggestions welcome. The patch is at http://www.LimeGreenSocks.com/gcc/24414f.zip and includes test code. My warning still holds: there are modes of compilation on some machines where you can't clobber all registers without causing reload failures. This is why Jeff didn't fix this in 1999. So, if we really do want to clobber "all" registers in basic asm it'll take a lot of work. I was always reluctant to see this change made. In addition to the issues you mention, I had questions about the impact on the surrounding code. I like Richard's direction much better. We can start with a disabled warning, then upgrade as seems warranted. dw
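To illustrate the distinction the proposed warning is meant to draw, here is a minimal sketch contrasting basic asm at file scope with asm inside a function. The names are hypothetical and the templates are empty so that the sketch is not tied to one target; whether and how a compiler warns about the in-function case depends on the option discussed above, which this sketch does not assume exists.

```c
/* File scope (top level): basic asm is the only form available here,
   and there is no surrounding function whose registers could be affected. */
__asm__("");

static int answer = 42;

/* Inside a function: extended asm lets the requirements be spelled out,
   here a "memory" clobber, instead of relying on basic asm defaults. */
static int read_answer(void)
{
    __asm__ __volatile__("" : : : "memory");
    return answer;
}
```

Under the direction described in this message, only a basic asm placed inside `read_answer` would be a candidate for the warning; the file-scope statement would not.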
Re: basic asm and memory clobbers
On 11/23/2015 12:37 PM, Jeff Law wrote: On 11/23/2015 03:04 AM, Andrew Haley wrote: On 21/11/15 12:56, David Wohlferd wrote: So, what now? While I'd like to take the big step and start kicking out warnings for non-top-level right now, that may be too bold for phase 3. A more modest step for v6 would just provide a way to find them (maybe something like -Wnon-top-basic-asm or -Wonly-top-basic-asm) and doc the current behavior as well as the upcoming change. Warnings would be good. My warning still holds: there are modes of compilation on some machines where you can't clobber all registers without causing reload failures. This is why Jeff didn't fix this in 1999. So, if we really do want to clobber "all" registers in basic asm it'll take a lot of work. Exactly. In retrospect, I probably should have generated more tests for those conditions back in '99. Essentially they'd document a class of problems we'd like to fix over time. I know some have been addressed in various forms, but it hasn't been systematic. My recommendation here is to: 1. Note in the docs what the behaviour should be. This guides where we want to go from an implementation standpoint. I think it'd be fine to *suggest* only using old style asms at the toplevel, but I'm less convinced that mandating that restriction is wise. I hear your concerns about mandating this. Perhaps starting by providing an option to find them, then (someday) enabling that option by default? 2. As we come across failures to adhere to the desired behaviour, fix or document them as known inconsistencies. If we find that some are inherently un-fixable, then we'll need to tighten the docs around those. It's your expectation that extended asm won't be sufficient to resolve these issues? The more I think about it, I'm just not keen on forcing all those old-style asms to change. If you mean you aren't keen to change them to "clobber all," I'm with you. 
If you are worried about changing them from basic to extended, what kinds of problems do you foresee? I've been reading a lot of basic asm lately, and it seems to me that most of it would be fine with a simple colon. Certainly no worse than the current behavior. dw
Re: basic asm and memory clobbers
On 11/23/2015 1:44 PM, paul_kon...@dell.com wrote: On Nov 23, 2015, at 4:36 PM, David Wohlferd wrote: ... The more I think about it, I'm just not keen on forcing all those old-style asms to change. If you mean you aren't keen to change them to "clobber all," I'm with you. If you are worried about changing them from basic to extended, what kinds of problems do you foresee? I've been reading a lot of basic asm lately, and it seems to me that most of it would be fine with a simple colon. Certainly no worse than the current behavior. I'm not sure. I have some asm("sync") which I think assume that this means asm("sync"::"memory") Another excellent reason to nudge people towards using extended asm. If you saw asm("sync":::"memory"), you would *know* what it did, without having to read the docs (which don't say anyway). I'm pretty confident that asm("") doesn't clobber memory on i386, but maybe that behavior is platform-specific. Since i386 doesn't have "sync", I assume you are on something else? If you have a chance to experiment, I'd love confirmation from other platforms that asm("blah") is the same as asm("blah":). Feel free to email me off list to discuss. dw
Re: basic asm and memory clobbers
On 11/24/2015 8:58 AM, paul_kon...@dell.com wrote: On Nov 23, 2015, at 8:39 PM, David Wohlferd wrote: On 11/23/2015 1:44 PM, paul_kon...@dell.com wrote: On Nov 23, 2015, at 4:36 PM, David Wohlferd wrote: ... The more I think about it, I'm just not keen on forcing all those old-style asms to change. If you mean you aren't keen to change them to "clobber all," I'm with you. If you are worried about changing them from basic to extended, what kinds of problems do you foresee? I've been reading a lot of basic asm lately, and it seems to me that most of it would be fine with a simple colon. Certainly no worse than the current behavior. I'm not sure. I have some asm("sync") which I think assume that this means asm("sync"::"memory") Another excellent reason to nudge people towards using extended asm. If you saw asm("sync":::"memory"), you would *know* what it did, without having to read the docs (which don't say anyway). I'm pretty confident that asm("") doesn't clobber memory on i386, but maybe that behavior is platform-specific. Since i386 doesn't have "sync", I assume you are on something else? Yes, MIPS. If you have a chance to experiment, I'd love confirmation from other platforms that asm("blah") is the same as asm("blah":). Feel free to email me off list to discuss. I'm really concerned with loosening the meaning of basic asm. I wish I could find the documentation that says, or implies, that it is a memory clobber. And/or that it is implicitly volatile. The problem is that it's clear from existing code that this assumption was made, and that defining it otherwise would break such code. For example, the code I quoted clearly won't work if stores are moved across the asm("sync"). Given the ever improving optimizers, these things are time bombs -- code that worked for years might suddenly break when the compiler is upgraded. If such breakage is to be done, it must at least come with a warning (which must default to ON). 
But I'd prefer to see the more conservative approach (more clobbers) taken. It looks like Ian has already addressed most of your concerns. Just to emphasize one point: The current behavior of basic asm is to NOT clobber memory. So your existing code that performs asm("sync") is already at risk. The 'fix' I am proposing is to give warnings for every use of basic asm inside functions (top-level asm is not a problem). Users should change all such code to use extended so that they can (must) explicitly specify what their asm clobbers (if anything). Armed with this information, the optimizers can do their work safely. And people maintaining the code can finally be clear about what the asm really does. And the code change should be simple. They can get the same "clobber nothing" behavior that basic asm has always performed by simply adding a colon on the end ( asm("sync":) ). Or they can use any of extended asm's features to get different behavior ( asm("sync":::"memory") ). Or to put that another way, correctly written basic asm can be converted to extended by just adding a colon. Incorrect code may require a bit more work. dw
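As a rough illustration of the conversion described above, here is a minimal C sketch. The function and variable names are made up for the example, and the empty template stands in for a real instruction (such as the MIPS "sync" mentioned earlier) so that it builds on any GCC target.

```c
/* Hypothetical example: the names are illustrative, not from any project. */
static int shared_flag;

int publish_flag(void)
{
    shared_flag = 1;

    /* Extended asm with no operands and no clobbers. Per the thread,
       adding the colon asks for the same "clobber nothing" behavior
       that basic asm has today. */
    __asm__ __volatile__ ("" :);

    shared_flag = 2;

    /* Extended asm with an explicit "memory" clobber. This tells the
       compiler the asm may read or write memory, so it must not keep
       memory values cached in registers across the statement. */
    __asm__ __volatile__ ("" ::: "memory");

    shared_flag = 3;
    return shared_flag;
}
```

The second form is the one a reader can understand without consulting the docs: the clobber list states what the asm is assumed to affect.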
basic asm and memory clobbers - Proposed solution
I have solved the problem with my previous patch. Here's the update (feedback welcome): http://www.LimeGreenSocks.com/gcc/24414g.zip Based on my understanding from the previous thread, this patch now does what it needs to do (code-wise) to resolve this "basic asm and memory clobbers" issue. As mentioned previously, this patch introduces a new warning (-Wonly-top-basic-asm), which is disabled by default. When enabled, it triggers a warning for any basic asm inside a function, unless the function has the "naked" attribute. An argument can be made that the default for this warning should be 'enabled.' Yes, this will break builds that use basic asm and -Werror, but it can easily be disabled with -Wno-only-top-basic-asm. And if we don't enable it, no one is going to know the check is even available. Then hidden problems like the one Paul was just describing still won't be found, and optimizations will continue to have unexpected side effects. OTOH, I can also see changing this to 'enabled' as more appropriate in the next phase 1. Now that I'm done with the code fix, I'm working on an update to the docs. Obviously they should be checked in as part of the code fix. I'm planning to actually use the word "deprecated" when describing the use of basic asm within functions. Seems like a big step. But there's no point in my proceeding any further until someone in authority agrees that this is the desired solution. I'm not actually sure who that is, but further work is a waste of time if no one is prepared to approve it. If you are that person, the questions to be answered are: 1) Is the idea of changing basic asm to "clobber everything" dead? 2) Is creating a warning for the use of "basic asm inside a function" the solution for this issue? 3) Should the warning be enabled by default in v6? 4) Should the warning be enabled by Wall or Wextra? 5) Should the v6 docs explicitly describe using "basic asm inside a function" as deprecated? 
If you're looking for my recommendations, I would say: Yes, Yes, (reluctantly) No, No and Yes. With this information in hand, I'll take a shot at finishing this off. For something that started out as a simple 3 sentence doc patch, this sure has turned into a project... dw
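To make the scope of the proposed warning concrete, here is a small sketch of which statements it would and would not flag, going by the description above. -Wonly-top-basic-asm exists only in the patch under discussion, so this file is written to build without it; the function name is hypothetical, and "nop" is assumed to be a valid instruction on the target.

```c
/* Top-level basic asm: not a target of the proposed warning. */
__asm__ ("");

/* Hypothetical function used only to show the two spellings. */
int inside_function(void)
{
    /* Basic asm inside a function: the proposed warning would fire here. */
    __asm__ ("nop");

    /* Extended asm (note the colon): the proposed warning stays quiet. */
    __asm__ ("nop" :);

    return 0;
}
```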
Re: basic asm and memory clobbers
On 11/26/2015 8:26 AM, Hans-Peter Nilsson wrote: On Thu, 26 Nov 2015, Segher Boessenkool wrote: On Thu, Nov 26, 2015 at 05:30:48AM -0500, Hans-Peter Nilsson wrote: On Fri, 20 Nov 2015, Richard Henderson wrote: I'd be perfectly happy to deprecate and later completely remove basic asm within functions.

We've explicitly promised (directed to kernel people IIRC) that the empty basic asm; 'asm ("")', has forward-compatible outlining magic, so people would not have to keep adding ever-new attributes like noinline,noclone to avoid finding their functions "spilling over", so please exclude that. To wit, from extend.texi:

@item noinline
@cindex @code{noinline} function attribute
This function attribute prevents a function from being considered for inlining.
@c Don't enumerate the optimizations by name here; we try to be
@c future-compatible with this mechanism.
If the function does not have side-effects, there are optimizations other than inlining that cause function calls to be optimized away, although the function call is live. To keep such calls from being optimized away, put
@smallexample
asm ("");
@end smallexample
@noindent
(@pxref{Extended Asm}) in the called function, to serve as a special side-effect.

Any asm without outputs (including basic asm) is volatile, so it can not be optimised away. Putting an empty asm in a noinline function should always work as you want, it's not because it is basic asm.

I know, the point is that we've promised the above and we shouldn't back down on this mechanism to instead deprecate and warn about it, as seems to be the direction.

That is indeed the direction I am heading unless someone stops me. I was not aware of this magic. Thanks for pointing it out. To be clear, wouldn't asm("":) have the same effect? Since we have committed to this, we don't want to change it lightly. Also, looking at existing inline asm, this is probably one of the most common uses.
Allowing this exception will probably ease the migration, even if it makes the docs a little clunkier. So how about this: 1) Change the docs to say asm("":), so future coders will do "the right thing." 2) Allow the exception for v6. 3) Re-evaluate if-and-when we continue with the deprecation process. This happens to be expressed as a basic asm and was chosen because the right things happen regarding backwards- compatibility. More care has to be taken for forward- compatibility. I'm a bit worried that it still works only because of happenstance. I know what (I hope) you're thinking but it's hard to produce a test-case that fails when a future yet unknown optimization causes "spill-over", but I can imagine a quick return-path seeming inlinable in LTO. dw
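For reference, the idiom from the noinline docs looks like this in a complete function. This is only a sketch: the function names are hypothetical, and the second function shows the extended spelling suggested above without claiming it carries the same promise (that is the open question here).

```c
/* Hypothetical function: the documented idiom, an empty basic asm
   placed in a noinline function to act as a special side effect. */
__attribute__ ((noinline)) int add_one(int v)
{
    __asm__ ("");
    return v + 1;
}

/* The same function using the extended spelling proposed for the docs.
   Whether this gives the same forward-compatible guarantee is not
   settled in this thread. */
__attribute__ ((noinline)) int add_one_extended(int v)
{
    __asm__ ("" :);
    return v + 1;
}
```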
Re: basic asm and memory clobbers
>> To be clear, wouldn't asm("":) have the same effect?
>
> That does not matter. It'd require source-code changes to
> users' code.

My suggestion was to allow the exception to the "basic asm in a function" warning, but change the docs to show using the new syntax. This does not require any user code change. But it kinda does matter whether or not it will work. Copy/pasting my suggestion:

1) Change the docs to say asm("":), so future coders will do "the right thing."
2) Allow the exception for v6.
3) Re-evaluate if-and-when we continue with the deprecation process.

It may not be clear from this, but I don't expect step 3 to happen until at least v7. And saying "the right thing" may be a bit flip. But if deprecating basic asm is the path we are choosing, then telling users to use this syntax is how the text should read. Assuming it works.

> BTW, does that syntax work for really olden gcc
> (say 2.95 era)?

The oldest docs I can find are 2.95.3, and they talk about the extended asm syntax, so I assume so: https://gcc.gnu.org/onlinedocs/gcc-2.95.3/gcc_4.html#SEC93

> Either way, we promised 'asm("")'. I'm pretty
> sure the empty string is identifiable.

It is. I've already updated my code to support this exception.

dw
Re: basic asm and memory clobbers - Proposed solution
On 11/24/2015 6:50 PM, David Wohlferd wrote: I have solved the problem with my previous patch. Here's the update (feedback welcome): http://www.LimeGreenSocks.com/gcc/24414g.zip Based on my understanding from the previous thread, this patch now does what it needs to do (code-wise) to resolve this "basic asm and memory clobbers" issue. As mentioned previously, this patch introduces a new warning (-Wonly-top-basic-asm), which is disabled by default. When enabled, it triggers a warning for any basic asm inside a function, unless the function has the "naked" attribute. An argument can be made that the default for this warning should be 'enabled.' Yes, this will break builds that use basic asm and -Werror, but it can easily be disabled with -Wno-only-top-basic-asm. And if we don't enable it, no one is going to know the check is even available. Then hidden problems like the one Paul was just describing still won't be found, and optimizations will continue to have unexpected side effects. OTOH, I can also see changing this to 'enabled' as more appropriate in the next phase 1. Now that I'm done with the code fix, I'm working on an update to the docs. Obviously they should be checked in as part of the code fix. I'm planning to actually use the word "deprecated" when describing the use of basic asm within functions. Seems like a big step. But there's no point in my proceeding any further until someone in authority agrees that this is the desired solution. I'm not actually sure who that is, but further work is a waste of time if no one is prepared to approve it. If you are that person, the questions to be answered are: 1) Is the idea of changing basic asm to "clobber everything" dead? 2) Is creating a warning for the use of "basic asm inside a function" the solution for this issue? 3) Should the warning be enabled by default in v6? 4) Should the warning be enabled by Wall or Wextra? 5) Should the v6 docs explicitly describe using "basic asm inside a function" as deprecated? 
If you're looking for my recommendations, I would say: Yes, Yes, (reluctantly) No, No and Yes. With this information in hand, I'll take a shot at finishing this off. For something that started out as a simple 3 sentence doc patch, this sure has turned into a project...

I have incorporated the feedback from Hans-Peter Nilsson, who pointed out that the 'noinline' function attribute explicitly documents behavior related to using asm(""). The patch now includes an exception for this. Given how often this bit of code is used in various projects, allowing this exception will surely ease the transition.

I'm told that some people won't review patches unless they are included as attachments, so this time the patch is attached. The patch includes first cut docs for the warning message, but I'm still hoping to hear from someone before trying to update the basic asm docs.

dw

Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt (revision 230734)
+++ gcc/c-family/c.opt (working copy)
@@ -585,6 +585,10 @@
 C++ ObjC++ Var(warn_namespaces) Warning
 Warn on namespace definition.
 
+Wonly-top-basic-asm
+C C++ Var(warn_only_top_basic_asm) Warning
+Warn on unsafe uses of basic asm.
+
 Wsized-deallocation
 C++ ObjC++ Var(warn_sized_deallocation) Warning EnabledBy(Wextra)
 Warn about missing sized deallocation functions.
Index: gcc/c/c-parser.c
===
--- gcc/c/c-parser.c (revision 230734)
+++ gcc/c/c-parser.c (working copy)
@@ -5880,7 +5880,18 @@
   labels = NULL_TREE;
 
   if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN) && !is_goto)
+    {
+      /* Warn on basic asm used inside of functions,
+         EXCEPT when in naked functions. Also allow asm(""). */
+      if (warn_only_top_basic_asm && (TREE_STRING_LENGTH (str) != 1) )
+        if (lookup_attribute ("naked",
+            DECL_ATTRIBUTES (current_function_decl))
+            == NULL_TREE)
+          warning_at(asm_loc, OPT_Wonly_top_basic_asm,
+            "asm statement in function does not use extended syntax");
+
     goto done_asm;
+    }
 
   /* Parse each colon-delimited section of operands. */
   nsections = 3 + is_goto;
Index: gcc/cp/parser.c
===
--- gcc/cp/parser.c (revision 230734)
+++ gcc/cp/parser.c (working copy)
@@ -17594,6 +17594,8 @@
   bool goto_p = false;
   required_token missing = RT_NONE;
 
+  location_t asm_loc = cp_lexer_peek_token (parser->lexer)->location;
+
   /* Look for the `asm' keyword. */
   cp_parser_require_keyword (parser, RID_ASM, RT_ASM);
 
@@ -17752,6 +17754,16 @@
 /
Re: AW: basic asm and memory clobbers - Proposed solution
On 11/27/2015 11:02 PM, Bernd Edlinger wrote: Hi, I just found this in the docs:

The compiler copies the assembler instructions in a basic @code{asm} verbatim to the assembly language output file, without processing dialects or any of the @samp{%} operators that are available with extended @code{asm}. This results in minor differences between basic @code{asm} strings and extended @code{asm} templates. For example, to refer to registers you might use @samp{%eax} in basic @code{asm} and @samp{%%eax} in extended @code{asm}.

So it might be good to warn about % in asm statements too, because changing anything on the asm syntax can be quite dangerous.

There are a few differences to be aware of between basic and extended, and yes that is one of them. I'm putting together a "How to convert basic asm to extended asm" guide. Not quite sure where to put it yet. Doesn't really belong in the User Guide, but I worry no one would ever see it in the wiki.

And I wonder what the exact equivalence of asm("") is in extended asm ("":::) or asm volatile ("":::) ?

From the docs: "asm statements that have no output operands, including asm goto statements, are implicitly volatile." In other words, those 2 are the same.

Well, I start to think that Jeff is right, and we should treat an asm ("") as if it were asm volatile ("" ::: )

I believe Segher is already looking at this: https://gcc.gnu.org/ml/gcc/2015-11/msg00196.html

but for an asm ("nonempty with optional %") we should treat it as asm volatile ("nonempty with optional %%" ::: "memory"). Our docs should say that explicitly, and the implementation should follow that direction. Bernd.
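The % versus %% difference quoted from the docs can be seen in a short sketch. It assumes an x86-64 target with the default AT&T syntax; on any other target the function does nothing. The function name is hypothetical, and the instruction is a register-to-itself move chosen because it changes no register contents or flags.

```c
/* Hypothetical demo: returns 1 where the x86-64 example was compiled
   in, 0 on other targets. */
int percent_demo(void)
{
#if defined(__x86_64__)
    /* Basic asm: the string is copied verbatim, so a register is
       written with a single %. */
    __asm__ ("movq %rax, %rax");

    /* Extended asm: % introduces an operand reference, so a literal
       register name needs %%. */
    __asm__ ("movq %%rax, %%rax" :);

    return 1;
#else
    return 0;
#endif
}
```

This is why "just add a colon" is not sufficient for a basic asm string that contains %: the register names need doubling at the same time.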
Re: basic asm and memory clobbers - Proposed solution
On 11/28/2015 10:30 AM, paul_kon...@dell.com wrote: On Nov 28, 2015, at 2:02 AM, Bernd Edlinger wrote: ... Well, I start to think that Jeff is right, and we should treat an asm ("") as if it were asm volatile ("" ::: ) but for an asm ("nonempty with optional %") we should treat it as asm volatile ("nonempty with optional %%" ::: "memory").

I agree. Even if that goes beyond the letter of what the manual has promised before, it is the cautious answer, and it matches expectations of a lot of existing code.

Trying to guess what people might have been expecting is a losing game. There is a way for people to be clear about what they want to clobber, and that's to use extended asm. The way to clear up the ambiguity is to start deprecating basic asm, not to add to the confusion by changing its behavior after all these years.

And the first step to do that is to provide a means of finding them. That's what the patch at https://gcc.gnu.org/ml/gcc/2015-11/msg00198.html does. Once they are located, people can decide for themselves what to do. If they favor the 'cautious' approach, they can change their asms to use :::"memory" (or start clobbering registers too, to be *really* safe). For people who require maximum backward compatibility and/or minimum impact, they can use :::.

Have you tried that patch? How many warnings does it kick out for your projects?

dw
Re: basic asm and memory clobbers - Proposed solution
On 12/1/2015 8:08 AM, Richard Earnshaw wrote:
> Formatting nit: the '== NULL_TREE)' should line up with the start of
> 'lookup_attribute'.
> Same here.

Ok.

Other than that, how do we proceed here? When pursuing a course to "deprecate and later completely remove basic asm within functions," I assume I need a global maintainer or two to sign off on this? While Richard Henderson's post to that effect may have gotten lost in all the discussion (and my ultra-slow-motion roll out plan may have confused things further), that's what's meant by #5 on my "List of questions for a person in authority":

1) Is the idea of changing basic asm to clobber things dead?
2) Is creating a warning for the use of "basic asm inside a function" the solution for this issue?
3) Should the warning be enabled by default in v6?
4) Should the warning be enabled by -Wall or -Wextra?
5) Should the v6 docs explicitly describe using "basic asm inside a function" as deprecated?

Saying it's dead in the docs is the first step to making it dead in the code. This patch just implements an optional warning (unless #3,4 crank it up to a default warning), but the intent is that eventually (v7? v8?) this turns into a fatal error. One off-hand comment by someone (even a gm) doesn't seem quite enough to approve this. And some guidance about how quickly we want to get there would also be useful.

I've been trying to do the work, but I could use some direction from someone who understands the gcc vision.

dw
Re: basic asm and memory clobbers - Proposed solution
On 12/1/2015 10:10 AM, Bernd Edlinger wrote:
> And a test case is missing too.
>
> I think this warning concentrates now only on basic asm.
> And people will probably fix it in the easiest way,
> by just adding a colon.

Probably true. At least I hope it's that easy for most people.

> But IMHO asm("bla":) isn't any better than asm("bla").
> I think _any_ asm with non-empty assembler string, that
> claims to clobber _nothing_ is highly suspicious, and worth
> warning about. I don't see any exceptions from this rule.

There's one right now in the basic asm docs: asm("int $3"); And I've seen others: asm volatile ("nop"), asm(".byte 0xf1\n"). I've seen a bunch more, but you get the idea.

dw
Re: basic asm and memory clobbers - Proposed solution
On 11/30/2015 4:01 AM, Andrew Haley wrote:
>> There is a way for people to be clear about what they want to clobber,
>> and that's to use extended asm. The way to clear up the ambiguity is to
>> start deprecating basic asm, not to add to the confusion by changing its
>> behavior after all these years.
>
> Well, I disagree. The warning is good, but so is the memory clobber.
> They're not exclusive.

As Richard Henderson put it: "I'd be perfectly happy to deprecate and later completely remove basic asm within functions." So my intent is that today's optional warning becomes tomorrow's default warning which eventually turns into a fatal error. We could speed that up a bit by making it a default warning now and a fatal error as soon as we re-enter phase 1. I just thought it might be better to update the docs now and delay the change to give more of a heads up.

Yes, we could pursue both clobbering and completely removing at the same time, but I don't think we should. Since basic asm (currently) never has clobbers, describing how to convert basic asm to extended is fairly simple: To get the same behavior in extended as you used to get from basic, just add a colon. If some versions of gcc perform clobbers, then knowing how to convert gets way harder.

dw
Re: basic asm and memory clobbers - Proposed solution
On 12/1/2015 7:56 PM, Segher Boessenkool wrote: On Tue, Dec 01, 2015 at 08:41:22PM -0700, Jeff Law wrote: Isn't "asm" conditionally supported for ISO C++? In which case it's not mandatory and semantics are implementation defined.

Yes.

My strong preference is still to document the desired semantics for GCC and treat anything that does not adhere to those semantics as a bug.

I agree, but I think the semantics should be what they currently are. If we want a "clobbers everything" version we could make a new syntax for that, so that not all current asm-without-operands has to pay the hefty price.

I think a non-default warning is fine. A default warning (or -Wall/-Wextra) is probably undesirable, though I'm still willing to be convinced either way on that.

I don't think the warning is a good idea until we have decided what direction we want this to go in.

If we are headed toward removing "basic asm in a function," then the warning (combined with deprecation in the docs) is a logical first step. It lets people find and begin to fix this code with only a trivial change to v6. But if that's not where we're headed, I don't see who would ever use it.

I also agree that we shouldn't change the semantics of basic asm to "clobber everything." The coding for the change is not immediately clear, the backward compatibility issues are a challenge, and future moves to extended become much more complicated. I understand that clobbering may fix problems (that no one is asking us to fix), but at a performance, maintenance or 'new bugs' price that I can't see how to justify.

I think requiring people to specify what their asm affects (aka extended asm) is the right answer. But if removing "basic asm in a function" is not the direction we want to go, my vote is to just doc the current behavior.

dw
Re: AW: basic asm and memory clobbers - Proposed solution
On 12/2/2015 3:34 AM, Bernd Edlinger wrote: Hi, Surely in code like that, you would make "x" volatile? Memory clobbers are not a substitute for correct use of volatile accesses.

No, it is as I wrote: a memory clobber is the only way to guarantee that the asm statement is not moved somewhere else. I changed the example to use volatile and compiled it with gcc-Version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)

    volatile int x;
    void test()
    {
      x = 1;
      asm volatile("nop");
      x = 0;
    }

gcc -S -O2 test.c gives:

    test:
    .LFB0:
            .cfi_startproc
            movl    $1, x(%rip)
            movl    $0, x(%rip)
    #APP
    # 6 "test.c" 1
            nop
    # 0 "" 2
    #NO_APP
            ret
            .cfi_endproc

While it works with asm volatile("nop" ::: "memory"). Likewise for "cli" and "sti" if you try to implement critical sections. Although these instructions do not touch any memory, we need the memory clobber to prevent code motion.

If the goal is to order things wrt x, why wouldn't you just reference x?

    x = 1;
    asm volatile("nop":"+m"(x));
    x = 0;

If you have a dependency, stating it explicitly seems a much better approach than hoping that the implied semantics of a memory clobber might get you what you want. Not only is it crystal clear for the optimizers what the ordering needs to be, people maintaining the code can better understand your intent as well.

In summary: Yes, inline asm can float. Clobbers might help. But I don't believe this is related to the "remove basic asm in functions" work. If a warning here is merited (and I'm not yet convinced), a separate case should be made for it.

However it does seem like a good fit for a section in a "How to convert basic asm to extended asm" doc. The extended docs don't mention this need or how extended can be used to address it. It's a good reason for basic asm users to switch.

dw
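Written out as complete functions, the two ways of pinning the asm between the stores look like the sketch below. The function names are hypothetical, and "nop" is assumed to be a valid instruction on the target, as in the example above.

```c
volatile int x;

/* Names x as a read-write memory operand, so the asm has an explicit
   dependency on x: it needs the value stored before it, and the store
   after it must follow the asm's (nominal) write. */
int test_with_operand(void)
{
    x = 1;
    __asm__ __volatile__ ("nop" : "+m" (x));
    x = 0;
    return x;
}

/* Uses a "memory" clobber instead: the compiler must assume the asm
   may read or write any memory, not just x. */
int test_with_clobber(void)
{
    x = 1;
    __asm__ __volatile__ ("nop" ::: "memory");
    x = 0;
    return x;
}
```

The operand form states the one dependency that matters; the clobber form is broader and applies to every memory access around the statement.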
Re: basic asm and memory clobbers - Proposed solution
On 12/1/2015 7:41 PM, Jeff Law wrote:
> My strong preference is still to document the desired semantics for GCC and treat anything that does not adhere to those semantics as a bug.

Despite nearly 100 posts over 2 threads, we don't seem to be reaching either a consensus or a conclusion. How do we move this from discussion to decision?

If the decision were mine to make, I'd just deprecate "basic asm in a function" and be done with it. But it's not. I don't know how the GCC project makes its decisions on issues like this, but the closest we've gotten is a "strong preference" for an approach with which I disagree.

I'm going to try to sum up what we've discussed, along with my take on the pluses and minuses of each proposed solution. If after reading this the global reviewers make a final decision (even one I disagree with), I'll try to help move it forward. Otherwise, I guess the discussion fades away and 24414 just sits for another 10 years.

There are three problems related to basic asm that we are trying to solve:

#1 The existing (and historical) docs don't describe basic asm's behavior regarding clobbers (registers, memory, etc).
#2 People have written basic asm code based on incorrect assumptions about its behavior (easy to do given #1).
#3 Because basic asm has no dependencies (except 'volatile'), improvements to optimizers can move it in unexpected ways, sometimes breaking existing projects.

And here are the three solutions that have been proposed:

Solution 1: Just document the current behavior of basic asm. People who have written incorrect code can be pointed at the text and told to fix their code. People whose code is damaged by optimizations can rewrite using extended to add dependencies (possibly doc this?).

Solution 2: Change the docs to say that basic asm clobbers everything (memory, all registers, etc) or perhaps just memory (some debate here), but that due to bugs that will eventually be addressed, it doesn't currently work this way. Eventually (presumably in a future phase 1) modify the code to implement this. People who have written their code incorrectly may have some hard-to-find problems solved for them. This is particularly valuable for projects that are no longer being maintained. And while clobbers aren't the best solution to dependencies, they can help.

Solution 3: Deprecate (and eventually disallow) the use of basic asm within functions (perhaps excluding asm("") and naked functions). Do this by emitting warnings (and eventually fatal errors). Doc this plan. Unless GCC adds a built-in assembler, basic asm is always going to be a problem. The whole point of extended asm is to provide the information GCC needs to properly interface C code with asm. Warnings can point out where people have failed to provide that information, allowing them to correct it.

Here's my take on the pros/cons of each:

Solution 1: Document the current behavior of basic asm.

Pro:
- Simplest to implement.
- Most backward compatible.

Con:
- Doesn't solve the problems for people who wrote their code wrong, except to let them know that they have.
- Doesn't help with dependencies. Mentioning the problem and pointing them to extended might help. A bit.
- GCC will need to continue to tweak optimizers to work around problems.

Solution 2: Change basic to clobber memory and/or everything.

Pro:
- Docing the current and intended behavior lets people know what to expect going forward, even though we aren't prepared to implement the change in phase 3.
- When the code change is checked in, long-time problems (of a kind that are REALLY hard to find) may be fixed.
- Adding a memory clobber (or clobber all) helps with dependencies. A bit.

Con:
- We haven't agreed exactly what the implementation details are. Docing them before writing the implementation is risky.
- Docing "someday" fixes is risky in general since someday may never come.
- Implementation may be harder than it sounds (for example how to handle frame pointers).
- New users of basic asm won't be able to depend on any specific behavior (since we're explicitly saying the behavior will change). To be sure they'll always get the behavior they want, they'll have to use extended.
- Existing users who realize they should have used the memory clobber won't want to wait for a future version of the compiler to fix this. They'll have to use extended too.
- Existing users who want to future-proof their code to avoid having clobbers thrust upon them will also have to use extended asm.
- While changing the "clobber nothing" semantics to "clobber everything" is WAY safer than doing the reverse, it's still not 100% safe. By definition basic asm is a fragile area. Any changes can conceivably result in failures.
- Even without failures, adding memory/re
Re: basic asm and memory clobbers - Proposed solution
On 12/12/2015 1:51 AM, Andrew Haley wrote: Solution 2: Change the docs to say that basic asm clobbers everything (memory, all registers, etc) or perhaps just memory (some debate here), but that due to bugs that will eventually be addressed, it doesn't currently work this way. You've missed the most practical solution, which meets most common usage: clobber memory, but not registers. Actually, that's intended to be part of Solution #2 ("or perhaps just memory"). Sorry if that wasn't clear. That allows most of the effects that people intuitively want and expect, The nice thing about guessing what people want and expect is that it can be molded however we need to suit our point of view. I could make a similar case that what people want is the behavior from other compilers that have built-in assemblers. Those compilers can 'see' what registers are being used by the asm and adapt. Since GCC can't provide that capability, people might 'expect' it to clobber every register if that's what is needed to support the same behavior. I'm not trying to advocate for clobbering registers. My point is simply that I don't see how we can *know* what people want and expect. You can say they expect memory. Jeff can say they expect memory+registers. I can say they expect it to work the same way tomorrow that it did yesterday. Who's right? I'm not prepared to say. But I don't think any of us are wrong. but avoids the breakage of register clobbers. You are right "just memory clobber" does remove some performance issues, and the breakage that could result from clobbering registers. Also, while I have never written nor read an optimizer, my guess is that the biggest 'win' as far as positioning the basic asm comes from the memory clobber (compared to clobbering registers). So this does get us most of the benefits of "clobber everything" with less effort and fewer consequences. However breakage and performance issues can still result solely from adding memory clobbers. 
And as I mentioned, "just memory clobber" may not be the behavior people expect. And if we aren't solving that, might there be a second update later to add registers? Talk about confusing semantics... It allows basic asm to be used in a sensible way by pushing and popping all used registers. If I were using basic asm, this would indeed seem like a sensible approach. However, it is not the most efficient. If I can clobber registers, push/pop are just wasted cycles. This just seems like another argument for deprecating basic asm and pushing people to extended. -- Imagine for a moment: If the only way right now to do inline asm in gcc was extended, and I proposed adding basic, how could I justify this 'new' feature? Other than 'top-level,' I can't think of a single benefit that it would provide. Pointless push/pops, problems with positioning and the other headaches it causes have no upside. Contra-wise, what if starting today all new inline asm were written using extended? How would that be a bad thing? What would be the downside? Yes, people can still write bad code with extended, but its semantics are well-understood and have been for a long time (certainly compared to basic). Additionally, the abilities it provides allow people to solve problems that basic never will. Which means (to me), the only real justification for the continued existence of basic asm is backward compatibility. Which makes the arguments for changing its behavior (whether a little or a lot) kinda weird. I still vote for doing everything we can think of to discourage people from using basic and begin using extended: - Change the docs to flat out deprecate basic (excluding top-level). - Add the warning so people's integrated dev environments will show the suspect lines. - Make the warning a default, but overridable (-Wno-only-top-basic-asm), so people who HAVE to support the old syntax still can. - Make sure the docs for the warning describe (link to?) 
how to change asm from basic to extended and why. - Any bugs people report or posts people make involving basic asm get resolved as 'Try it with extended.' Every year these items are in place would make basic asm less important as people slowly move to extended. Maybe by the time we get around to flagging it as a fatal error (say in 2025), no one will care. True, this won't magically solve people's problems for them if they've been using basic asm wrong. But it does point out that there are (potential) problems, where those problems are and motivates people to fix them (which seems like quite reasonable behavior for a compiler). More importantly, we haven't broken or degraded anything that was already working. I'm just afraid that instead of pursuing any of these solutions, we are going to pursue Solution #0: Do nothing. A rather unsatisfying outcome after all this effort. dw PS I was surprised by how many people
Re: basic asm and memory clobbers - Proposed solution
Is there a decision maker still teetering on the edge of making a call here? Or have they all moved on and we are just talking among ourselves? I keep worrying that if I don't reply, someone will swoop in, read the last message in the thread, and charge off to make changes based on that. So... On 12/13/2015 1:25 AM, Bernd Edlinger wrote: That is also my preferred solution, I'm really not sure what the point of doing the memory clobber is. Yes, I see that it is easier to code than "clobber everything." And yes, it might help with certain types of bugs for people who have or might someday misuse asm. But if we are trying to give users what they expect, shouldn't we be doing the "clobber everything" Jeff suggested (https://gcc.gnu.org/ml/gcc/2015-11/msg00081.html)? Surely that's the "safest" answer, and it's as likely to be what people expect as anything else. Adding the memory clobber will already break backward compatibility and risk breaking existing code. Why do all that for half a solution? Yes it's harder, but please don't tell me we plan to change things yet again some day to do the rest? > The rationale for this is: there are lots of ways to write basic asm > statements, that may appear to work, if they clobber memory. > because you can use all global values simply by name, users will > expect that to be supported. > people who know next > to nothing about assembler will be encouraged to "fix" this warning in a > part of the code which they probably do not understand at all. So in the end, we are proposing a potentially breaking change, not because we are violating a standard, not to give users what (some people think) they expect, and not even because someone has actually reported a problem. But because someone might use it wrong, and if we just point the statement out to them, they might not know how to correct it. That seems like a lot of risk for so very little gain. And if we do end up changing this, using basic asm will become more dangerous than ever. 
In addition to all the old problems (that will still exist in earlier versions), now its behavior VARIES between versions. Programmers who want specific behavior (rather than our assumption-du-jour) will be forced to CHANGE THEIR CODE to account for this variability. So my solution highlights at-risk statements that must be changed to use extended asm. Your solution is that people will need to realize for themselves that the statements must change to extended in order to work the same way in newer versions, and then must find them all for themselves. Further, if people (correctly) fix the warnings I'm proposing, then they get correct behavior for all supported versions of the compiler. If people do the nothing you are proposing, they may still have problems even after the fix (if they assume registers get clobbered) or may have new problems (performance or otherwise), and will still have problems using any version of the compiler earlier than the one expected to be released sometime late next year. I prepared a patch for next stage1, the details are still under discussion, but it is certainly do-able with not too much effort. See https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00938.html I was unaware of this conversation. Unfortunate that you didn't think to cc me on this, as there are several points I would have made regarding the proposed code and doc changes. But mostly, I would have said what Bernd Schmidt said: "I'm not sure there was any consensus in that other thread" and "I think it would be best if we could deprecate basic asms in functions, or at least warn about them in -Wall." I have been talking myself blue in the fingers on this topic since early November and I honestly don't know if I have convinced anyone of anything. At this point, I really don't know what more I can say. It's pretty much all in https://gcc.gnu.org/ml/gcc/2015-12/msg00092.html FWIW. dw
Re: basic asm and memory clobbers - Proposed solution
On 12/14/2015 1:53 AM, Andrew Haley wrote: > This just seems like another argument for deprecating basic asm and pushing people to extended. Yes. I am not arguing against deprecation. We should do that. You know, there are several people who seem to generally support this direction. Not enough to call it a consensus, but perhaps the beginning of one: - Andrew Haley - David Wohlferd - Richard Henderson - Segher Boessenkool - Bernd Schmidt Anyone else want to add their name here? Maybe it's the implementation details that have other people concerned. My thought is that for v6 we change the docs to say something like: - Unlike top level, using basic asm within a function is deprecated. No new code should use this feature, but should use extended asm instead. Existing code should begin replacing such usage. Instances of affected code can be found using -Wonly-top-basic-asm. For help making this conversion, see "How to convert Basic asm to Extended asm." - With this, we only need to add the warning as a non-default. This will (hopefully) stop new users from using this, and can begin the work of removing it from existing code. The problem with that approach is that (I'm told) some people don't read the docs. Imagine that. Such being the case, they won't even know this is happening. That's why I think that in the next phase 1 (v7?), we should change the warning to 'on' by default (or maybe as part of -Wall?). People will still be able to override it with no-, but at least it will raise awareness and chase more of it out of people's code. What about the final step of actually removing support for basic asm in a function as some people propose? Should we really do this? If so, when? That's a more difficult question. If we make the warning a default in v7, then it would be *at least* v8 before this should be considered. Trying to make any plan that far ahead seems... optimistic. Perhaps by then we'll have more feedback upon which to make a decision. 
We *could* be more aggressive and start right off with making the warning 'on' by default. But to give people the best chance to prepare, perhaps starting with the non-default is best. Even if we don't all agree about _removing_ "basic asm in a function," can we find consensus that having less of it is a good thing? Because this approach gets us that. dw
Re: basic asm and memory clobbers - Proposed solution
On 12/15/2015 12:42 PM, paul_kon...@dell.com wrote: In the codebase for the product I work on, I see about 200 of them. Many of those are the likes of asm("sync") for MIPS, which definitely wants to be treated as if it were asm ("sync" : : : "memory"). That's right, I meant to ask you about this last time you mentioned this. Now that you are aware that this is a problem, what do you intend to do about it? Jeff is saying that this may not be fixed until at least v7, so waiting for a compiler fix may take a while. Will you be updating your source? Are you just finding these with grep, or have you tried the -Wonly-top-basic-asm patch? That's not counting the hundreds I see in gdb/stubs -- those are "outside a function" flavor. Fortunately, these aren't a problem for memory or register clobbers. dw
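To make the asm("sync") versus asm("sync" : : : "memory") distinction concrete, here is a minimal sketch. It is not code from anyone's product: the names payload, ready, publish_basic and publish_extended are hypothetical, and an empty template stands in for the MIPS sync instruction so the example builds on any target. It assumes the historical "clobber nothing" semantics for basic asm that this thread describes.

```c
/* Hypothetical shared state, standing in for data that another CPU or a
   device is expected to observe in order. */
int payload;
int ready;

/* Basic asm: no operands and no clobber list. Under the historical
   semantics discussed in this thread, the compiler is told nothing about
   memory, so it may keep 'payload' in a register or reorder the stores. */
int publish_basic(int value)
{
    payload = value;
    __asm__ __volatile__("");                 /* stand-in for asm("sync") */
    ready = 1;
    return payload;
}

/* Extended asm with an explicit "memory" clobber: the compiler must assume
   the statement reads and writes arbitrary memory, so the store to
   'payload' is emitted before it and 'ready' is stored after it. */
int publish_extended(int value)
{
    payload = value;
    __asm__ __volatile__("" : : : "memory");  /* stand-in for asm("sync" : : : "memory") */
    ready = 1;
    return payload;
}
```

Both functions return the same values in a single-threaded test; the difference is only in what the optimizer is allowed to assume around the asm statement. The __asm__ spelling is used so the sketch also compiles with -std=c99 or -std=c11.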
Re: basic asm and memory clobbers - Proposed solution
On 12/15/2015 1:13 PM, Jeff Law wrote: Sadly, I'm putting most of this discussion into my gcc-7 queue anyway. Fair enough. If "clobbers" is what we're going to do, that sounds more like a phase 1 thing. That said, some people who have this problem may prefer to fix it sooner rather than later. If we soften the text for the docs to remove the word "deprecate," would just adding -Wonly-top-basic-asm (non-default) be useful for v6? dw
Re: basic asm and memory clobbers - Proposed solution
On 12/15/2015 5:01 PM, paul_kon...@dell.com wrote: On Dec 15, 2015, at 5:22 PM, David Wohlferd wrote: On 12/14/2015 1:53 AM, Andrew Haley wrote: This just seems like another argument for deprecating basic asm and pushing people to extended. Yes. I am not arguing against deprecation. We should do that. You know, there are several people who seem to generally support this direction. Not enough to call it a consensus, but perhaps the beginning of one: - Andrew Haley - David Wohlferd - Richard Henderson - Segher Boessenkool - Bernd Schmidt Anyone else want to add their name here? No, but I want to speak in opposition. Fair enough. "Deprecate" means two things: warn now, remove later. Yup. That's what I'm proposing. Although "later" could be a decade down the road. That's how long 24414 has been sitting. For reasons stated by others, I object to "remove later". So "warn now, remove never" I would support, but not "deprecate". So how about: - Update the basic asm docs to describe basic asm's current (and historical) semantics (ie clobber nothing). - Emphasize how that might be different from users' expectations or the behavior of other compilers. - Warn that this could change in future versions of gcc. To avoid impacts from this change, use extended. - Mention -Wonly-top-basic-asm as a way to locate affected statements. Would that be something you could support? What's your take on making -Wonly-top-basic-asm a default (either now or v7)? Is making it a non-default a waste of time because no one will ever see it? Or is making it a default too aggressive? What about adding it to -Wall? dw
Re: basic asm and memory clobbers - Proposed solution
On 12/15/2015 2:43 PM, Joseph Myers wrote: On Tue, 15 Dec 2015, David Wohlferd wrote: Unlike top level, using basic asm within a function is deprecated. No new code should use this feature, but should use extended asm instead. Existing code should begin replacing such usage. Instances of affected code can be found using -Wonly-top-basic-asm. For help making this conversion, see "How to convert Basic asm to Extended asm." I think the typical use of basic asm is: you want to manipulate I/O registers or other such state unknown to the compiler (not any registers the compiler might use itself), and you want to do it in a way that is maximally compatible with as many compilers as possible (hence limiting yourself to the syntax subset that's in the C++ standard, for example). Compatibility with a wide range of other compilers is the critical thing here; this is not a GCC-invented feature, and considerations for deprecating an externally defined feature are completely different from considerations for GCC-invented features. Do you have evidence that it is now unusual for compilers to support basic asm without supporting GCC-compatible extended asm, or that other compiler providers generally consider basic asm deprecated? On the contrary, I would be surprised to learn that there are ANY compilers (other than clang) that support gcc's extended asm format. And although there is no standard that seems to require it, I'm certainly not prepared to say that basic asm is "generally deprecated." But the fact that there is no standard may make "doing what other compilers do" challenging. For example quoting from Bernd's email regarding the windriver diab compiler: "non-scratch registers must be preserved." Implying that scratch registers (which they apparently only list for ARM) are considered clobbered. That seems like a sensible approach (and avoids the frame pointer problem). But I'm not prepared to extrapolate from that how all compilers do or should handle basic asm. 
However it does mean that the suggestion being proposed here to have basic asm only clobber memory would not be compatible with windriver's approach to basic asm. And Jeff's proposal for gcc to clobber "all registers" wouldn't be compatible with them either. I agree that "oh, surprise! gcc does this differently than compiler X!" is a bad thing. But without standards, trying to be compatible with how "everyone else" does it may not be practical. You'll note that windriver made no particular effort to be compatible with gcc. And of course any change will make gcc v7 work differently than gcc v4, v5, v6. If compatibility with other compilers is an important criterion when determining how gcc should handle basic asm, someone's going to need to do some research and some prioritizing. In the meantime, raising awareness of this issue via docs and warnings seems a low-cost way to start. dw
Re: basic asm and memory clobbers - Proposed solution
On 12/17/2015 6:03 AM, Jeff Law wrote: On 12/17/2015 03:39 AM, Andrew Haley wrote: On 17/12/15 01:41, David Wohlferd wrote: On the contrary, I would be surprised to learn that there are ANY compilers (other than clang) that support gcc's extended asm format. Prepare to be surprised: Sun Studio compilers seem to support it just fine. And Intel's ICC. Ok, I admit it: I'm surprised. dw
Re: AW: basic asm and memory clobbers - Proposed solution
On 12/17/2015 11:30 AM, Bernd Edlinger wrote: On Thu, 17 Dec 2015 15:13:07, Bernd Schmidt wrote: What's your take on making -Wonly-top-basic-asm a default (either now or v7)? Is making it a non-default a waste of time because no one will ever see it? Or is making it a default too aggressive? What about adding it to -Wall? Depends if anyone has one in system headers I guess. We could try to add it to -Wall. Sorry, but if I have to object. Adding this warning to -Wall is too quickly and will bring the ia64, tilegx and mep ports into trouble. It doesn't look to me like adding the warnings will affect gcc itself. But I do see how it could have an impact on people using gcc on those platforms, if the warning causes them to convert to extended asm. Each of them invokes some special semantics for basic asm: I'm collecting these for my "How to convert basic asm to extended asm" document. This may need to go in the gcc wiki instead of the user guide, since people may find important conversion tips like these asynchronous to gcc's releases. mep: mep_interrupt_saved_reg looks for ASM_INPUT in the body, and saves different registers if found. I'm trying to follow this code. A real challenge since I know nothing about mep. But what I see is: - This routine only applies to functions marked as __attribute__((interrupt)). - To correctly generate entry/exit, it walks thru each register (up to FIRST_PSEUDO_REGISTER) to see if it is used by the routine. If there is any basic asm within the routine (regardless of its contents), the register is considered 'in use.' The net result is that every register gets saved/restored by the entry/exit of this routine if it contains basic asm. The reason this is important is that if someone just adds a colon, it would suddenly *not* save/restore all the registers. Depending on what the asm does, this could be bad. Does that sound about right? This is certainly worth mentioning in the 'convert' doc. 
I wonder how often this 'auto-clobber' feature is used, though. I don't see it mentioned in the 'interrupt' attribute docs for mep, and it's not in the basic asm docs either. If your interrupt doesn't need many registers, it seems like you'd want to know this and possibly use extended. And you'd really want to know if you are doing a (redundant) push/pop in your interrupt. tilegx: They never wrap {} around inline asm. But extended asm, are handled differently, probably automatically surrounded by braces. I know nothing about tilegx either. I've tried to read the code, and it seems like basic asm does not get 'bundled' while extended might be. Bundling for tilegx (as I understand it) is when you explicitly fill multiple pipelines by doing something like this: { add r3,r4,r5 ; add r7,r8,r9 ; lw r10,r11 } So if you have a basic asm statement, you wouldn't want it automatically bundled by the compiler, since your asm could be more than 3 statements (the max?). Or your asm may do its own bundling. So it makes sense to never output braces when outputting basic asm. I know I'm guessing about what this means, but doesn't it seem like those same limitations would apply to extended? I wonder if this is a bug. I don't see any discussion of bundling (of this sort) in the docs. ia64: ASM_INPUT emits stop-bits. extended asm does not the same. That was already mentioned by Segher. I already had this one from Segher's email. Given all this, I'm more convinced than ever of the value of -Wonly-top-basic-asm. Basic asm is quirky, has undocumented bits and its behavior is incompatible with other compilers, even though it uses the same syntax. If I had any of this in my projects, I'd sure want to find it and look it over. But maybe Bernd is right and it's best to leave the warning disabled in v6, even by -Wall. I may ask this question again in the next phase 1... 
With that in mind, how do you feel about the basic plan: - Update the basic asm docs to describe basic asm's current (and historical) semantics (ie clobber nothing). - Emphasize how that might be different from users' expectations or the behavior of other compilers. - Warn that this could change in future versions of gcc. To avoid impacts from this change, use extended. - Reference the "How to convert from basic asm to extended asm" guide (where ever it ends up living). - Mention -Wonly-top-basic-asm as a way to locate affected statements. - -Wonly-top-basic-asm is disabled by default and not enabled by -Wall or -Wextra. Does this seem like a safe and useful change for v6? dw
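As a rough idea of what a "basic to extended" conversion looks like in practice, here is a minimal sketch. It is only an illustration under stated assumptions: the names counter and bump_counter are hypothetical, the asm is x86-only (guarded by a preprocessor check with a plain C fallback elsewhere), and the basic form is shown only in a comment because naming a global directly in the template is exactly the fragile pattern being replaced.

```c
/* Hypothetical global that the asm updates. */
int counter;

/* Before (basic asm, shown only as a comment):
 *
 *     asm("incl counter");
 *
 * The compiler is not told that 'counter' is modified or that the flags
 * change, and the symbol name in the template must match the platform's
 * name mangling (for example a leading underscore on some systems).
 */

/* After (extended asm): the variable is passed as a read/write memory
   operand, and the flags clobber is spelled out even though i386 implies it. */
int bump_counter(void)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("incl %0"
                         : "+m"(counter)   /* output: read and written */
                         :                 /* no pure inputs */
                         : "cc");          /* flags are modified */
#else
    counter++;                             /* fallback for non-x86 targets */
#endif
    return counter;
}
```

Because the operand and clobber are explicit, the extended form does not depend on whatever a given compiler version assumes a basic asm statement clobbers.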
Re: basic asm and memory clobbers - Proposed solution
On 12/18/2015 11:55 AM, Bernd Edlinger wrote: On 18.12.2015 10:27, David Wohlferd wrote: On 12/17/2015 11:30 AM, Bernd Edlinger wrote: Adding this warning to -Wall is too quickly and will bring the ia64, tilegx and mep ports into trouble. It doesn't look to me like adding the warnings will affect gcc itself. But I do see how it could have an impact on people using gcc on those platforms, if the warning causes them to convert to extended asm. At least we should not start a panic until we have really understood all the details, how to do that. Phase 1 is a better place to start a panic. mep: mep_interrupt_saved_reg looks for ASM_INPUT in the body, and saves different registers if found. I'm trying to follow this code. A real challenge since I know nothing about mep. But what I see is: - This routine only applies to functions marked as __attribute__((interrupt)). - To correctly generate entry/exit, it walks thru each register (up to FIRST_PSEUDO_REGISTER) to see if it is used by the routine. If there is any basic asm within the routine (regardless of its contents), the register is considered 'in use.' The net result is that every register gets saved/restored by the entry/exit of this routine if it contains basic asm. The reason this is important is that if someone just adds a colon, it would suddenly *not* save/restore all the registers. Depending on what the asm does, this could be bad. Does that sound about right? Yes. Seems like a doc update would be appropriate here too then, if anyone wanted to volunteer... This is certainly worth mentioning in the 'convert' doc. I wonder how often this 'auto-clobber' feature is used, though. I don't see it mentioned in the 'interrupt' attribute docs for mep, and it's not in the basic asm docs either. If your interrupt doesn't need many registers, it seems like you'd want to know this and possibly use extended. And you'd really want to know if you are doing a (redundant) push/pop in your interrupt. 
tilegx: They never wrap {} around inline asm. But extended asm, are handled differently, probably automatically surrounded by braces. I know nothing about tilegx either. I've tried to read the code, and it seems like basic asm does not get 'bundled' while extended might be. Bundling for tilegx (as I understand it) is when you explicitly fill multiple pipelines by doing something like this: { add r3,r4,r5 ; add r7,r8,r9 ; lw r10,r11 } So if you have a basic asm statement, you wouldn't want it automatically bundled by the compiler, since your asm could be more than 3 statements (the max?). Or your asm may do its own bundling. So it makes sense to never output braces when outputting basic asm. I know I'm guessing about what this means, but doesn't it seem like those same limitations would apply to extended? I wonder if this is a bug. I don't see any discussion of bundling (of this sort) in the docs. I wold like to build a cross compiler, but currently that target seems to be broken. I have to check that target anyways, because of my other patch with the memory clobbers. I see in tilegx_asm_output_opcode, that they do somehow automatically place braces. An asm("pseudo":) has a special meaning, and can be replaced with "" or "}". However the static buf[100] means that any extended asm string > 95 characters, invokes the gcc_assert in line 5487. In the moment I would _not_ recommend changing any asm statements without very careful testing. Yes, the handling of extended asm there is very strange. If bundles can only be 3 instructions, then appending an entire extended asm in a single (already in-use) bundle seems odd. But maybe that's just because I don't understand tilegx. I'm not sure it's just changing basic asm to extended I would be concerned about. I'd worry about using extended asm at all... ia64: ASM_INPUT emits stop-bits. extended asm does not the same. That was already mentioned by Segher. I already had this one from Segher's email. 
Given all this, I'm more convinced than ever of the value of -Wonly-top-basic-asm. Basic asm is quirky, has undocumented bits and its behavior is incompatible with other compilers, even though it uses the same syntax. If I had any of this in my projects, I'd sure want to find it and look it over. But maybe Bernd is right and it's best to leave the warning disabled in v6, even by -Wall. I may ask this question again in the next phase 1... Aehm, yes, maybe the warning could by then be something more reasonable like: "Warning: the semantic of basic asm has changed to include implicit memory clobber, if you think that is a problem for you, please convert it to basic asm, otherwise just relax." I don't think Jeff wants to pursue changes to basic asm&
doc maintainer questions
I have been discussing adding some content to the basic asm docs. As part of this work, I want to add a discussion of "How to convert basic asm to extended asm." However it doesn't seem like this is a good fit for the User Guide. This is both because the UG doesn't generally talk about "How To" write code, and because the text may need updates more often than the UG gets released. I'm thinking this might be a better fit for the gcc wiki. But that brings up a few questions: 1) Is it appropriate for the UG to link to sections in the wiki? I see that we do, but should we? 2) How do I get 'edit' access to the wiki? 3) Where in the wiki should it go? dw
Re: basic asm and memory clobbers - Proposed solution
On 12/20/2015 10:26 AM, Bernd Edlinger wrote: On 19.12.2015 19:54, David Wohlferd wrote: mep: mep_interrupt_saved_reg looks for ASM_INPUT in the body, and saves different registers if found. I'm trying to follow this code. A real challenge since I know nothing about mep. But what I see is: - This routine only applies to functions marked as __attribute__((interrupt)). - To correctly generate entry/exit, it walks thru each register (up to FIRST_PSEUDO_REGISTER) to see if it is used by the routine. If there is any basic asm within the routine (regardless of its contents), the register is considered 'in use.' The net result is that every register gets saved/restored by the entry/exit of this routine if it contains basic asm. The reason this is important is that if someone just adds a colon, it would suddenly *not* save/restore all the registers. Depending on what the asm does, this could be bad. Does that sound about right? Yes. Seems like a doc update would be appropriate here too then, if anyone wanted to volunteer... To confirm this, I built a cross-compiler, but It was difficult, because of pr64402. Yes, a function with __attribute__((interrupt)) spills lots of registers, when it just contains asm(""); but almost nothing, if asm("":); is used. That is remarkable. So if you use extended and clobber a couple registers asm("":::"r0", "r1"), does it spill just those specific registers? That would be cool. I don't write interrupt handlers, and certainly not on mep. But the little I know about them says that performance is an important (and sometimes critical) characteristic. There would be risk in changing this to extended (if you used a register but forgot to clobber it), but depending on the interrupt, it could be a nice performance 'win.' If no one else is prepared to step up to write this, I can. I'm just uncomfortable doing so since I can't try it myself. And I feel weird writing a patch for mep given that I know nothing about it. 
But since Bernd has tried it, maybe something like this added to the 'interrupt' attribute on https://gcc.gnu.org/onlinedocs/gcc/MeP-Function-Attributes.html - Be aware that if the function contains any basic @code{asm} (@pxref{Basic Asm}), all registers (whether referenced in the asm or not) will be preserved upon entry and restored upon exiting the interrupt. More efficient code can be generated by using extended @code{asm} (@pxref{Extended Asm}) and explicitly listing only the specific registers that need to be preserved (or none if your asm preserves any registers it uses). - tilegx: They never wrap {} around inline asm. But extended asm, are handled differently, probably automatically surrounded by braces. I know nothing about tilegx either. I've tried to read the code, and it seems like basic asm does not get 'bundled' while extended might be. Bundling for tilegx (as I understand it) is when you explicitly fill multiple pipelines by doing something like this: { add r3,r4,r5 ; add r7,r8,r9 ; lw r10,r11 } So if you have a basic asm statement, you wouldn't want it automatically bundled by the compiler, since your asm could be more than 3 statements (the max?). Or your asm may do its own bundling. So it makes sense to never output braces when outputting basic asm. I know I'm guessing about what this means, but doesn't it seem like those same limitations would apply to extended? I wonder if this is a bug. I don't see any discussion of bundling (of this sort) in the docs. I wold like to build a cross compiler, but currently that target seems to be broken. I have to check that target anyways, because of my other patch with the memory clobbers. I see in tilegx_asm_output_opcode, that they do somehow automatically place braces. An asm("pseudo":) has a special meaning, and can be replaced with "" or "}". However the static buf[100] means that any extended asm string > 95 characters, invokes the gcc_assert in line 5487. 
In the moment I would _not_ recommend changing any asm statements without very careful testing. Yes, the handling of extended asm there is very strange. If bundles can only be 3 instructions, then appending an entire extended asm in a single (already in-use) bundle seems odd. But maybe that's just because I don't understand tilegx. I'm not sure it's just changing basic asm to extended I would be concerned about. I'd worry about using extended asm at all... I also built a tilegx cross compiler, but that was even more difficult, because of pr68917, which hit me when building the libgcc. I tried a while, but was unable to find any example of an extended asm that gets auto-braces, or which can trigger the mentioned gcc_assert. It looks like, in all cases the bundles are closed already before t
"cc" clobber
It is well known that on i386, the "cc" clobber is always set for extended asm, whether it is specified or not. I was wondering how much difference it might make if the generated code actually followed what the user specified (expectation: not much). But implementing this turned up a different question. I started by just commenting out the code in ix86_md_asm_adjust that unconditionally clobbered the flags. I figured this would allow the 'normal' "cc" handling to occur. But apparently there is no 'normal' "cc" handling. So I went back to ix86_md_asm_adjust and tried to handle the "cc" if it was specified in the clobbers argument. But apparently "cc" doesn't get added to that clobbers list. Hmm. Tracing back to see how the "memory" clobber (which does get added to the clobber list) is handled brings me to expand_asm_stmt() in cfgexpand.c. Following the example set by the memory clobber, it looks like I want something like this:

    else if (j == -3)
      {
    #if defined(__i386__) || defined(__x86_64__)
        rtx x = gen_rtx_REG (CCmode, FLAGS_REG);
        clobber_rvec.safe_push (x);
        x = gen_rtx_REG (CCFPmode, FPSR_REG);
        clobber_rvec.safe_push (x);
    #endif
      }

Now I can check for this in the clobbers to ix86_md_asm_adjust and SET_HARD_REG_BIT as appropriate. Tada. It's working, but can that be right? Why do I need to do this for i386? How do other platforms handle "cc"? Other than not rejecting it as an invalid clobber, I can't find any code that seems to recognize "cc." Has "cc" become just an unenforced comment on all platforms? Or did I just miss it? dw
Re: "cc" clobber
On 1/26/2016 4:31 PM, Bernd Schmidt wrote: On 01/27/2016 12:12 AM, David Wohlferd wrote: I started by just commenting out the code in ix86_md_asm_adjust that unconditionally clobbered the flags. I figured this would allow the 'normal' "cc" handling to occur. But apparently there is no 'normal' "cc" handling. I have a dim memory that there's a difference between the "cc" and "CC" spellings. You might want to check that. I checked, but my scan of the current code isn't turning up anything for "CC" related to clobbers either. While presumably "cc" did something at one time, apparently now it's just an unenforced comment (on extended asm). Not a problem, just a bit of a surprise. dw
Re: "cc" clobber
On 2/1/2016 6:58 AM, Ulrich Weigand wrote:
> I think on many targets a clobber "cc" works because the backend actually defines a register named "cc" to correspond to the flags. Therefore the normal handling of clobbering named hard registers catches this case as well. This doesn't work on i386 because there the flags register is called "flags" in the back end.

Doh! Of course. This makes perfect sense. Thanks.

dw
Re: "cc" clobber
On 2/1/2016 12:40 PM, Richard Henderson wrote:
> On 02/02/2016 01:58 AM, Ulrich Weigand wrote:
>> I think on many targets a clobber "cc" works because the backend actually defines a register named "cc" to correspond to the flags. Therefore the normal handling of clobbering named hard registers catches this case as well.
>
> Yes. C.f. Sparc ADDITIONAL_REGISTER_NAMES.
>
>> This doesn't work on i386 because there the flags register is called "flags" in the back end.
>
> Once upon a time i386 used cc0. A survey of existing asm showed that almost no one clobbered "cc", and that in the process of changing i386 from cc0 to an explicit flags register we would break almost everything that used asm. The only solution that scaled was to force a clobber of the flags register. That was 1999. I think you'll buy nothing but pain in trying to change this now.

I expect you are right. After experimenting, the cases where this might buy you any benefit are just too uncommon, and the 'benefit' is just too small.

The one place where any of this would (sort of) be useful is checking for the "cc" clobber conflicting with the output parameters. This didn't used to be a thing, but now that i386 can 'output' flags, it is. The compiler currently accepts both of these and they both produce the same code:

    asm("": "=@ccc"(x) : : );
    asm("": "=@ccc"(x) : : "cc");

I assert (pr69095) that the second one should give an error (docs: "Clobber descriptions may not in any way overlap with an input or output operand").

Creating a check for this was more challenging than I expected. I kept assuming that there 'had' to be existing code to handle cc and I could tie into it if I could only figure out where it was. But now that I have this written, I'm still vacillating about whether it is useful. It seems like I could achieve the same result by adding "Using @cc overrides the "cc" clobber" to the docs. But hey, it also checks for duplicate "memory" and "cc" clobbers, so there's that...

dw
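To make the flag-output case from this thread concrete, here is a minimal sketch of an x86 extended asm that hands the carry flag back through "=@ccc" without listing a "cc" clobber. The helper name add_carries and the portable fallback branch are assumptions added for illustration; the asm path is only compiled where the compiler defines __GCC_ASM_FLAG_OUTPUTS__.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if a + b carries out of 32 bits. */
static int add_carries(uint32_t a, uint32_t b)
{
#if (defined(__i386__) || defined(__x86_64__)) && defined(__GCC_ASM_FLAG_OUTPUTS__)
    int carry;
    /* "=@ccc" asks the compiler to read the carry flag after the asm.
       No "cc" clobber is listed: the flags are an output here. */
    __asm__ ("addl %2, %1"
             : "=@ccc" (carry), "+r" (a)
             : "r" (b));
    return carry;
#else
    /* Portable fallback: unsigned wraparound implies a carry. */
    return (uint32_t)(a + b) < a;
#endif
}
```

Since i386 treats the flags as clobbered by every extended asm anyway, adding "cc" to the clobber list of such a statement should not change the generated code, which is the behavior described above.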
Bug maintenance
As part of the work I've done on inline asm, I've been looking thru the bugs for it. There appear to be a number that have been fixed or overtaken by events over the years, but the bug is still open. Is closing some of these old bugs of any value? If so, how do I pursue this? dw
Re: Bug maintenance
On 4/28/2016 2:01 AM, Richard Biener wrote:
> On Thu, Apr 28, 2016 at 9:35 AM, David Wohlferd wrote:
>> As part of the work I've done on inline asm, I've been looking thru the bugs for it. There appear to be a number that have been fixed or overtaken by events over the years, but the bug is still open. Is closing some of these old bugs of any value?
>
> Yes, definitely.
>
>> If so, how do I pursue this?
>
> I suppose adding a final comment to them will work, people (like me) watching gcc-bugs can then do the actual closing.

Perfect. That gives the OP a chance to respond as well. Look for my updates to 30527, 39440 & 43319.

dw
Re: Bug maintenance
On 4/28/2016 9:41 AM, Martin Sebor wrote:
> On 04/28/2016 01:35 AM, David Wohlferd wrote:
>> As part of the work I've done on inline asm, I've been looking thru the bugs for it. There appear to be a number that have been fixed or overtaken by events over the years, but the bug is still open. Is closing some of these old bugs of any value? If so, how do I pursue this?
>
> There are nearly 10,000 still unresolved bugs in Bugzilla, almost half of which are New, and a third Unconfirmed, so I'm sure any effort to help reduce the number is of value and appreciated.

That's exactly what prompted me to ask. There's such a vast number of them, it's hard to believe that 9 year old bugs are still of interest.

> I can share with you my own approach to dealing with them (others might have better suggestions). In cases where the commit that fixed a bug is known, I mention it in the comment closing the bug. I also try to indicate the version in which the bug was fixed (if I can determine it using the limited number of versions I have built). Otherwise, when a test doesn't already exist (finding out whether or not one does can be tedious), I add one before closing the bug, which will help avoid a regression.

I'll see what I can do.

dw
Re: Bug maintenance
On 4/28/2016 12:23 PM, Andrew Pinski wrote:
> On Thu, Apr 28, 2016 at 12:35 AM, David Wohlferd wrote:
>> As part of the work I've done on inline asm, I've been looking thru the bugs for it. There appear to be a number that have been fixed or overtaken by events over the years, but the bug is still open. Is closing some of these old bugs of any value?
>
> Yes it is. In fact this is how I got my start into GCC.
>
>> If so, how do I pursue this?
>
> If you go through the bug reports and have a low rate of false positives, I (and others) can get you permission to change the bug reports (I started out with a bug report only account too).

I'll do my best. But it's not always clear what might trigger a debate. I swear to you, I never expected 24414 to blow up the way it did.

dw
Machine constraints list
Looking at the v6 release criteria (https://gcc.gnu.org/gcc-6/criteria.html) there are about a dozen supported platforms. Looking at the Machine Constraints docs (https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html), there are 34 architectures listed. That's a lot of entries to scroll thru. If these architectures aren't supported anymore, is it time to drop some of these from this page? As a first pass, maybe something like this:

Keep
  AArch64 family - config/aarch64/constraints.md
  ARM family - config/arm/constraints.md
  MIPS - config/mips/constraints.md
  PowerPC and IBM RS6000 - config/rs6000/constraints.md
  S/390 and zSeries - config/s390/s390.h
  SPARC - config/sparc/sparc.h
  x86 family - config/i386/constraints.md

Drop
  ARC - config/arc/constraints.md
  AVR family - config/avr/constraints.md
  Blackfin family - config/bfin/constraints.md
  CR16 Architecture - config/cr16/cr16.h
  Epiphany - config/epiphany/constraints.md
  FRV - config/frv/frv.h
  FT32 - config/ft32/constraints.md
  Hewlett-Packard PA-RISC - config/pa/pa.h
  Intel IA-64 - config/ia64/ia64.h
  M32C - config/m32c/m32c.c
  MeP - config/mep/constraints.md
  MicroBlaze - config/microblaze/constraints.md
  Motorola 680x0 - config/m68k/constraints.md
  Moxie - config/moxie/constraints.md
  MSP430 - config/msp430/constraints.md
  NDS32 - config/nds32/constraints.md
  Nios II family - config/nios2/constraints.md
  PDP-11 - config/pdp11/constraints.md
  RL78 - config/rl78/constraints.md
  RX - config/rx/constraints.md
  SPU - config/spu/spu.h
  TI C6X family - config/c6x/constraints.md
  TILE-Gx - config/tilegx/constraints.md
  TILEPro - config/tilepro/constraints.md
  Visium - config/visium/constraints.md
  Xstormy16 - config/stormy16/stormy16.h
  Xtensa - config/xtensa/constraints.md

dw
Re: Machine constraints list
On 5/9/2016 6:42 AM, paul_kon...@dell.com wrote:
> On May 8, 2016, at 6:27 PM, David Wohlferd wrote:
>> If these architectures aren't supported anymore, is it time to drop some of these from this page?
>
> Your theory is quite mistaken. A lot of the ones you labeled "drop" are supported. Quite possibly all of them.

Ok, I see that. Spot checking some of the architectures, they are still getting periodic checkins. In my defense, I can't find any official list of which are 'tertiary' and which are deprecated (https://gcc.gnu.org/ml/gcc/2016-03/msg00010.html).

That said, there are still a lot of entries on that machine constraint page. How about if I re-organize the list similar to function attributes (https://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html)? Or at a minimum, add @anchors for each architecture so there are links?

dw
Deprecating basic asm in a function - What now?
Perhaps this post should be directed toward port maintainers?

Since several global maintainers have now suggested it, I have created a patch that deprecates basic asm when used in a function (attached). It excludes (ie does not deprecate) top level asm, asm in "naked" functions, asm with empty instruction strings, and extended asm.

Building gcc using this patch turns up a few places that use this feature, so I fixed them. Where possible, I used builtins to replace the asm. For ease-of-review, these changes are in their own patch (attached) and obviously this patch should be checked in first.

But before I send these 2 off to gcc-patches, there's a problem. What about platforms other than x86/x64? I don't speak other assembler languages, and have no setup with which to test them. I could try to provide patches for other platforms, but it would probably be faster for platform experts to just make the changes themselves, rather than trying to review my efforts. Especially if they also want to move to builtins (which I hope they do).

I could just send the patches and let the chips fall where they may, but if there's a less disruptive approach, let me know.

dw

PS I have done a scan for uses of basic asm to get some idea of the scope of the remaining work. My results:

All basic asm in trunk: 1,105 instances.
- Exclude 273 instances with empty strings leaving 832.
- Exclude 271 instances for boehm-gc project leaving 561.
- Exclude 202 instances for testsuite project leaving 359.
- Exclude 282 instances that are (apparently) top-level leaving ~77 instances of basic-asm-in-a-function to be fixed for gcc builds.

Most of these are in gcc/config or libgcc/config with just a handful per platform. Lists available upon request. FWIW...

Index: gcc/ada/init.c
===
--- gcc/ada/init.c (revision 237502)
+++ gcc/ada/init.c (working copy)
@@ -2141,7 +2141,7 @@
 #if defined (__i386__) && !defined (VTHREADS)
   /* This is used to properly initialize the FPU on an x86 for each
      process thread. Is this needed for x86_64 ??? */
-  asm ("finit");
+  asm ("finit":);
 #endif
 
   /* Similarly for SPARC64. Achieved by masking bits in the Trap Enable Mask
@@ -2652,7 +2652,7 @@
   /* This is used to properly initialize the FPU on an x86 for each
      process thread. */
-  asm ("finit");
+  asm ("finit":);
 #endif /* Defined __i386__ */
 }
Index: libatomic/config/x86/fenv.c
===
--- libatomic/config/x86/fenv.c (revision 237502)
+++ libatomic/config/x86/fenv.c (working copy)
@@ -68,10 +68,10 @@
   if (excepts & FE_DENORM)
     {
       struct fenv temp;
-      asm volatile ("fnstenv\t%0" : "=m" (temp));
-      temp.__status_word |= FE_DENORM;
-      asm volatile ("fldenv\t%0" : : "m" (temp));
-      asm volatile ("fwait");
+      __builtin_ia32_fnstenv (&temp);
+      temp.__status_word |= FE_DENORM;
+      __builtin_ia32_fldenv (&temp);
+      asm ("fwait" ::"m" (temp):); /* Trigger the fp exception. */
     }
   if (excepts & FE_DIVBYZERO)
     {
@@ -88,18 +88,18 @@
   if (excepts & FE_OVERFLOW)
     {
       struct fenv temp;
-      asm volatile ("fnstenv\t%0" : "=m" (temp));
-      temp.__status_word |= FE_OVERFLOW;
-      asm volatile ("fldenv\t%0" : : "m" (temp));
-      asm volatile ("fwait");
+      __builtin_ia32_fnstenv (&temp);
+      temp.__status_word |= FE_OVERFLOW;
+      __builtin_ia32_fldenv (&temp);
+      asm ("fwait" ::"m" (temp):); /* Trigger the fp exception. */
     }
   if (excepts & FE_UNDERFLOW)
     {
       struct fenv temp;
-      asm volatile ("fnstenv\t%0" : "=m" (temp));
-      temp.__status_word |= FE_UNDERFLOW;
-      asm volatile ("fldenv\t%0" : : "m" (temp));
-      asm volatile ("fwait");
+      __builtin_ia32_fnstenv (&temp);
+      temp.__status_word |= FE_UNDERFLOW;
+      __builtin_ia32_fldenv (&temp);
+      asm ("fwait" ::"m" (temp):); /* Trigger the fp exception. */
     }
   if (excepts & FE_INEXACT)
     {
Index: libcilkrts/runtime/config/x86/os-unix-sysdep.c
===
--- libcilkrts/runtime/config/x86/os-unix-sysdep.c (revision 237502)
+++ libcilkrts/runtime/config/x86/os-unix-sysdep.c (working copy)
@@ -85,7 +85,11 @@
     _mm_pause();
 # endif
 #elif defined __i386__ || defined __x86_64
+#ifdef __GNUC__
+    __builtin_ia32_pause ();
+#else
     __asm__("pause");
+#endif
 #else
 # warning __cilkrts_short_pause empty
 #endif
Index: libgcc/config/i386/sfp-exceptions.c
===
--- libgcc/config/i386/sfp-exceptions.c (revision 237502)
+++ libgcc/config/i386/sfp-exceptions.c (working copy)
@@ -59,10 +59,10 @@
   if (_fex & FP_EX_DENORM)
     {
       struct fenv temp;
-      asm volatile ("fnstenv\t%0" : "=m" (temp));
-      temp.__status_word |= FP_EX_DENORM;
-      asm volatile ("fldenv\t%0" : : "m"
Re: Deprecating basic asm in a function - What now?
In the end, my problems with basic-asm-in-functions (BAIF) come down to reliably generating correct code. Optimizations are based on the idea of "you can safely modify x if you can prove y." But given that basic asm are opaque blocks, there is no way to prove, well, much of anything. Adding 'volatile' and now 'memory' do help. But the very definition of "opaque blocks" means that you absolutely cannot know what weird-ass crap someone might be trying. And every time gcc tweaks any of the optimizers, there's a chance that this untethered blob could bubble up or down within the code. There is no way for any optimizer to safely decide whether a given position violates the intent of the asm, since it can have no idea what that intent is. Can even a one line displacement of arbitrary assembler code cause race conditions or data corruption without causing a compiler error? I'll bet it can. And also, there is the problem of compatibility. I am told that the mips "sync" instruction (used in multi-thread programming) requires a memory clobber to work as intended. With that in mind, when I see this code on some blog, is it safe? asm("sync"); And of course the answer is that without context, there's no way to know. It might have been safe for the compiler in the blogger's environment. Or maybe he was an idiot. It's certainly not safe in any released version of gcc. But unless the blog reader knows the blogger's level of intelligence, development environment, and all the esoteric differences between that environment and their own, they could easily copy/paste that right into their own project, where it will compile without error, and "nearly always" work just fine... With Bernd's recent change, people who have this (incorrect) code in their gcc project will suddenly have their code start working correctly, and that's a good thing. Well, they will a year or so from now. If they immediately start using v7.0. 
And if they prevent everyone from trying to compile their code using 6.x, 5.x, 4.x etc where it will never work correctly. So yes, it's a fix. But it can easily be years before this change finally solves this problem. However, changing this to use extended: asm("sync":::"memory"); means that their very next source code release will immediately work correctly on every version of gcc. You make the point about how people changing this code could easily muck things up. And I absolutely agree, they could. On the other hand, it's possible that their code is already mucked up and they just don't know it. But even if they change this code to asm("sync":::) (forcing it to continue working the same way it has for over a decade), the programmer's intent is clear (if wrong). A knowledgeable mips programmer could look at that and say "Hey, that's not right." While making that same observation with asm("sync") is all but impossible. BAIF is fragile. That, combined with: unmet user expectations, bad user code, inconsistent implementations between compilers, changing implementations within compilers, years of bad docs, no "best practices" guide and an area that is very complex all tell me that this is an area filled with unfixable, ticking bombs. That's why I think it should be deprecated. Other comments inline: On 20/06/16 18:36, Michael Matz wrote: I see zero gain by deprecating them and only churn. What would be the advantage again? Correctness. As said in the various threads about basic asms, all correctness problems can be solved by making GCC more conservative in handling them (or better said: not making it less conservative). If you talk about cases where basic asms diddle registers expecting GCC to have placed e.g. local variables into specific ones (without using local reg vars, or extended asm) I won't believe any claims ... It is very likely that many of these basic asms are not robust ... of them being very likely without proof. 
I don't have a sample of people accessing local variables, but I do have one where someone was using 'scratch' registers in BAIF assuming the compiler would just "handle it." And before you call that guy a dummy, let me point out that under some compilers, that's a perfectly valid assumption for asm. And even if not, it may have been working "just fine" in his implementation for years. Which doesn't make it right. They will have stopped working with every change in compilation options or compiler version. In contrast I think those that did survive a couple years in software very likely _are_ correct, under the then documented (or implicit) assumptions. Those usually are: clobbers and uses memory, processor state and fixed registers. As I was saying, a history of working doesn't make it right. It just means the ticking hasn't finished yet. How would you know if you have correctly followed the documented rules? Don't expect the compiler to flag these violations for you. in t
Re: Deprecating basic asm in a function - What now?
On 6/21/2016 9:43 AM, Jeff Law wrote:
> I think there's enough resistance to deprecating basic asms within a function that we should probably punt that idea.

I don't disagree that there has been pushback. I just wish less of it was of the form "Because I don't wanna." A few examples of "Here's something that has to be in basic asm because..." might have produced a more interesting discussion. I think if people had to defend (even to themselves) why they were using BAIF, there might be more converts.

And I *get* that it takes time to re-write this, and people have schedules, lives, a need for sleep. But even under the most insanely aggressive schedule I can imagine (if gcc continues to release ~1/year), it will be at least a year before there's a release that has the (disable-able) warning, and another year before we could even think about actually removing this. So someone who plans to use v8.0 in their production code on the day it is released still has a minimum of *two years* to get ready.

> I do think we should look to stomp out our own uses of basic asms within functions just from a long term maintenance standpoint.

Fixes.patch (from the start of this thread) seems mostly uncontested. Send it to patches?

If someone wanted to clean up a bunch of these, they should take a look at CRT_CALL_STATIC_FUNCTION in gcc/config. This is defined for nearly every platform, and most of them do it with basic asm. I gotta wonder if there's a better way to do this. Isn't there an attribute for 'section'?

> Finally I think we should continue to bring the implementation of basic asms more in-line with expectations and future proofing them

I believe there are compilers that do safely use inline asm. However it appears that they accomplish this trick by parsing the asm. Not something I expect to see added to gcc anytime soon...

> since I'm having a hard time seeing a reasonable path to deprecating their use.

Umm. Hmm.
Seems like the ideal answer here would be something that prevents new code from using BAIF, without putting the old-timers in an uproar. So how about the old "empty threat" gambit? We could *say* we are going to deprecate it, put the (disable-able) warning into the code and the stern-sounding text in the docs, and then *leave* it like that for a decade or so. The idea being that new code won't use it, but old code will still be supported. Hopefully 10 years from now, there might be so little code that uses BAIF that finally removing it may no longer be so controversial. It's not a lie, since deprecate means "express disapproval of" and yeah, that's about right. And since we don't specify precisely *when* we intend to remove it... dw
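The difference between basic and extended asm that this thread keeps returning to (the asm("sync") versus asm("sync":::"memory") example) can be sketched in a target-neutral way. This is a minimal illustration, not code from the patch: the names ready, payload, compiler_barrier and publish are hypothetical, and the empty template makes this a compiler-level barrier only. A real hardware barrier such as the mips "sync" would need the instruction itself in the template.

```c
/* Hypothetical shared state, for illustration only. */
static int ready;
static int payload;

/* Extended asm with an empty template and an explicit "memory" clobber.
   The clobber states the intent in the source: the compiler must not
   cache memory values across this statement or move memory accesses
   past it. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__ ("" ::: "memory");
}

static int publish(int value)
{
    payload = value;
    compiler_barrier();   /* compiler keeps the payload store above this point */
    ready = 1;
    return ready;
}
```

The point made in the thread is about readability as much as code generation: with the clobber list written out, a reviewer can see what the author believed the asm affects, which is not possible with the basic form.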
Unused variable in avx512fintrin.h
I'm looking at gcc/config/i386/avx512fintrin.h, and I see this:

    extern __inline __m256i
    __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
    _mm512_cvtsepi64_epi32 (__m512i __A)
    {
      __v8si __O;
      return (__m256i) __builtin_ia32_pmovsqd512_mask ((__v8di) __A,
                                                      (__v8si) _mm256_undefined_si256 (),
                                                      (__mmask8) -1);
    }

Since r208793, __O appears to be unused? For someone with checkin permissions (ie not me), removing this seems like an "as obvious."

dw
fxsrintrin.h
According to the docs (https://gcc.gnu.org/onlinedocs/gcc/x86-Built-in-Functions.html), __builtin_ia32_fxsave() has return type 'void.' Given that, does this code (from gcc/config/i386/fxsrintrin.h) make sense?

    _fxsave (void *__P)
    {
      return __builtin_ia32_fxsave (__P);
    }

Returning a void? Is that a thing?

Similar question for _fxrstor, _fxsave64, and _fxrstor64. And again in xsaveintrin.h for _xsave, _xrstor, _xsave64 and _xrstor64?

dw
Re: fxsrintrin.h
Interesting. Seems slightly strange, but I've seen stranger. I guess it's seen as "cleaner" than forcing this into 2 statements. IAC, it seems wrong for headers, since they can be used from either C or C++. Also, seems unnecessary here, since 'return' is implied by the fact that the 'next' statement is the end of the routine.

dw

On 8/18/2016 10:50 PM, lhmouse wrote:
> Given the `_fxsave()` function returning `void`, it is invalid C but valid C++:
>
> # WG14 N1256 (C99) / N1570 (C11)
> 6.8.6.4 The return statement
> Constraints
> 1 A return statement with an expression shall not appear in a function whose return type is void. ...
>
> # WG21 N1804 (C++03)
> 6.6.3 The return statement [stmt.return]
> 3 A return statement with an expression of type “cv void” can be used only in functions with a return type of cv void; the expression is evaluated just before the function returns to its caller.
>
> # WG21 N4582 (C++1z)
> 6.6.3 The return statement [stmt.return]
> 2 ... A return statement with an operand of type void shall be used only in a function whose return type is cv void. ...
>
> --
> Best regards,
> lh_mouse
> 2016-08-19
>
> -----
> From: David Wohlferd
> Sent: 2016-08-19 11:51
> To: gcc@gcc.gnu.org
> Cc:
> Subject: fxsrintrin.h
>
> According to the docs (https://gcc.gnu.org/onlinedocs/gcc/x86-Built-in-Functions.html), __builtin_ia32_fxsave() has return type 'void.' Given that, does this code (from gcc/config/i386/fxsrintrin.h) make sense? _fxsave (void *__P) { return __builtin_ia32_fxsave (__P); } Returning a void? Is that a thing? Similar question for _fxrstor, _fxsave64, and _fxrstor64. And again in xsaveintrin.h for _xsave, _xrstor, _xsave64 and _xrstor64?
>
> dw
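For comparison, the form that is valid in both C and C++ simply drops the 'return' keyword. The sketch below uses hypothetical stand-ins (do_save, save_wrapper, save_count) rather than the real intrinsics, so that it compiles on any target:

```c
/* Hypothetical counter so the effect of the call is observable. */
static int save_count;

/* Stand-in for a builtin whose return type is void. */
static void do_save(void *p)
{
    (void)p;
    ++save_count;
}

/* Valid in both C and C++: call the void function as a statement.
   Writing "return do_save (p);" here instead would violate the
   C99/C11 6.8.6.4 constraint quoted above, although C++ allows it. */
static void save_wrapper(void *p)
{
    do_save(p);
}
```

As far as I know, GCC accepts the 'return expr;' form in C as an extension and only diagnoses it under -Wpedantic, which would explain why the headers compile cleanly.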
Re: doc maintainer questions
On 9/20/2016 12:36 PM, Gerald Pfeifer wrote:
> [ Old e-mail alert ]
>
> On Sat, 19 Dec 2015, David Wohlferd wrote:
>> I have been discussing adding some content to the basic asm docs. As part of this work, I want to add a discussion of "How to convert basic asm to extended asm." However it doesn't seem like this is a good fit for the User Guide. This is both because the UG doesn't generally talk about "How To" write code, and because the text may need updates more often than the UG gets released.
>
> For the latter, could referring to gcc.gnu.org/onlinedocs (in particular the trunk version there) be a viable option? That gets updated daily, so not a question of waiting for the next release.
>
>> 1) Is it appropriate for the UG to link to sections in the wiki? I see that we do, but should we?
>
> Generally, I think that is fine. It does require Internet access and you run into the opposite of what you describe above (the Wiki might have moved to coverage of a later version of GCC), but then so does what I described at the beginning of this message.
>
> Generally, I also think that we should not spread things across too many places, and our documentation should be mostly self contained. (The particular use case you had in mind, seems fine for the Wiki or the web pages -- and I'm thinking of our porting_to.html pages that we have had for the last couple of major releases.)

In the time since my email was sent, there has been good news and bad news on this subject:

The good news is that the wiki content got created (see https://gcc.gnu.org/wiki/ConvertBasicAsmToExtended) and a reference has been added to the docs (see https://gcc.gnu.org/onlinedocs/gcc/Basic-Asm.html).

The bad news is that the original intent of this doc was that it would be included as part of completely deprecating using Basic Asm in functions (BAIF), or at least adding a warning to help people find and convert at-risk code.
However when I sent the patches to do these things, I was unable to convince anyone to sign off on either of them.

Moving on, I am contemplating creating some additional content on a related subject, so perhaps you would care to voice your opinion as to the value and best location for this content.

The gcc docs are able to "get away" with not documenting a great deal of behavior for the compiler. This is because the compiler is merely implementing the C/C++ standards. If someone wants to know how a particular feature is intended to work, they can read the standard and assume that's what the compiler does.

However this approach doesn't apply when gcc adds extensions to the standard. In these cases, the gcc docs must provide 100% of the information for using the feature. Anything not listed as explicitly supported is (by definition) "undefined" behavior. Having a program's function depend upon undefined behavior is risky, and is usually described as "a bad thing."

I believe the decision to continue to support BAIF necessitates documenting (somewhere) this gcc extension to clearly specify which uses are supported and which are not. As a first cut, I intend to define a list of common programming tasks and document which ones are and are not supported. Obviously this list can never be complete, but by explicitly listing some of the more common tasks, we can at least begin to steer people away from known-bad ideas. Off the top of my head:

- read/write global variables
- read/write function parameters
- read/write local variables
- return value from function
- modify registers (without restoring)
- modify flags (without restoring)
- modify the stack (push/pop/mov to/mov from)
- call a function
- invoke top-level asm
- jump outside single asm
- throw exceptions
- accessing variables declared in asm

That's probably enough for a start. Unfortunately, anything we don't list remains "undefined behavior." Icky, I know, but right now EVERYTHING is undefined. We are leaving people to guess about what might be safe, and hope that "It's worked that way for a long time" or "Lots of people do it" implies that it will always work that way. Talk about icky...

I intend to categorize each of these as one of:

A) Unsupported - While this might appear to work, that is merely happenstance and cannot be depended upon. Even if there are no circumstances today where this will fail, this behavior is not supported and may fail in future releases.
B) Partial - This operation is safe under specific conditions.
C) Supported - This operation is safe and will not be mangled by optimizations.

At first read, I see 1(?) for category C, 3 for category B, and the rest are A. However, as "the guy who wants to kill BAIF," perhaps I'm being too harsh. Indeed, the biggest challenge for this project will be finding someone who is prepared to sign off on the accura
RFC: Doc update for attribute
After updating gcc's docs about inline asm, I'm trying to improve some of the related sections. One that I feel has problems with clarity is __attribute__ naked. I have attached my proposed update. Comments/corrections are welcome.

In a related question: To better understand how this attribute is used, I looked at the Linux kernel. While the existing docs say "only ... asm statements that do not have operands" can safely be used, Linux routinely uses asm WITH operands. Some examples:

memory clobber operand: http://lxr.free-electrons.com/source/arch/arm/kernel/kprobes.c#L377
Input arguments: http://lxr.free-electrons.com/source/arch/arm/mm/copypage-feroceon.c#L17

Since I don't know why "asm with operands" was excluded from the existing docs, I'm not sure whether what Linux does here is supported or not (maybe with some limitations?). If someone can clarify, I'll add it to this text. Even without discussing "asm with operands," I believe this text is an improvement.

Thanks in advance,
dw

Index: extend.texi
===
--- extend.texi (revision 210349)
+++ extend.texi (working copy)
@@ -3330,16 +3330,15 @@
 @item naked
 @cindex function without a prologue/epilogue code
-Use this attribute on the ARM, AVR, MCORE, MSP430, NDS32, RL78, RX and SPU
-ports to indicate that the specified function does not need prologue/epilogue
-sequences generated by the compiler.
-It is up to the programmer to provide these sequences. The
-only statements that can be safely included in naked functions are
-@code{asm} statements that do not have operands. All other statements,
-including declarations of local variables, @code{if} statements, and so
-forth, should be avoided. Naked functions should be used to implement the
-body of an assembly function, while allowing the compiler to construct
-the requisite function declaration for the assembler.
+This attribute is available on the ARM, AVR, MCORE, MSP430, NDS32, RL78, RX
+and SPU ports. It allows the compiler to construct the requisite function
+declaration, while allowing the body of the function to be assembly code.
+The specified function will not have prologue/epilogue sequences generated
+by the compiler; it is up to the programmer to provide these sequences if
+the function requires them. The expectation is that only Basic @code{asm}
+statements will be included in naked functions (@pxref{Basic Asm}). While it
+is discouraged, it is possible to write your own prologue/epilogue code
+using asm and use ``C'' code in the middle.
 
 @item near
 @cindex functions that do not handle memory bank switching on 68HC11/68HC12
Re: RFC: Doc update for attribute
Thank you for your response. This is exactly what I wanted to know. One last question:

>> +While it
>> +is discouraged, it is possible to write your own prologue/epilogue code
>> +using asm and use ``C'' code in the middle.
>
> I wouldn't remove the last sentence since IMO it's not the intent of the feature to ever support that and the compiler doesn't guarantee it and may result in wrong code given that `naked' is a fragile low-level feature.

I'm assuming you meant "would remove." I wasn't comfortable including that sentence, but I was following the existing docs. Since they said you could "only" use basic asm, following that with a warning to "avoid" locals/if/etc was really confusing without this text.

Also, as ugly as this is, apparently some people really do this (comment 6): https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43404#c6

We don't have to doc every crazy thing people try to do with gcc. But since it's out there, maybe we should this time? If only to discourage it.

I'm *slightly* more in favor of keeping it. But if you still feel it should go, it's gone.

Thanks,
dw
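For readers who have not seen the attribute in use, here is a minimal sketch of a naked function in the shape the proposed text expects: the body is a single Basic asm statement that supplies its own return. The ports listed in the patch do not include x86; this sketch assumes a later compiler (GCC 8 or newer, or clang) on x86-64 where naked is also accepted, and falls back to plain C everywhere else. The function name forty_two is hypothetical.

```c
#if defined(__x86_64__) && (defined(__clang__) || (defined(__GNUC__) && __GNUC__ >= 8))
/* No prologue or epilogue is generated, so the asm must place the
   return value in %eax and execute the 'ret' itself. */
__attribute__((naked)) int forty_two(void)
{
    __asm__ ("movl $42, %eax\n\t"
             "ret");
}
#else
/* Fallback for targets or compilers where the sketch above does not apply. */
int forty_two(void)
{
    return 42;
}
#endif
```

Note that there are no locals, no C statements, and no asm operands in the naked variant, which matches the restriction discussed in this thread.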
Re: RFC: Doc update for attribute
After thinking about this some more, I believe I have some better text. Previously I used the word "discouraged" to describe this practice. The existing docs use the term "avoid." I believe what you want is something more like the attached. Direct and clear, just like docs should be. If you are ok with this, I'll send it to gcc-patches. dw +While it +is discouraged, it is possible to write your own prologue/epilogue code +using asm and use ``C'' code in the middle. I wouldn't remove the last sentence since IMO it's not the intent of the feature to ever support that and the compiler doesn't guarantee it and may result in wrong code given that `naked' is a fragile low-level feature. I'm assuming you meant "would remove." I wasn't comfortable including that sentence, but I was following the existing docs. Since they said you could "only" use basic asm, following that with a warning to "avoid" locals/if/etc was really confusing without this text. Also, as ugly as this is, apparently some people really do this (comment 6): https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43404#c6 We don't have to doc every crazy thing people try to do with gcc. But since it's out there, maybe we should this time? If only to discourage it. I'm *slightly* more in favor of keeping it. But if you still feel it should go, it's gone. Index: extend.texi === --- extend.texi (revision 210624) +++ extend.texi (working copy) @@ -3332,16 +3332,15 @@ @item naked @cindex function without a prologue/epilogue code -Use this attribute on the ARM, AVR, MCORE, MSP430, NDS32, RL78, RX and SPU -ports to indicate that the specified function does not need prologue/epilogue -sequences generated by the compiler. -It is up to the programmer to provide these sequences. The -only statements that can be safely included in naked functions are -@code{asm} statements that do not have operands. All other statements, -including declarations of local variables, @code{if} statements, and so -forth, should be avoided. 
Naked functions should be used to implement the -body of an assembly function, while allowing the compiler to construct -the requisite function declaration for the assembler. +This attribute is available on the ARM, AVR, MCORE, MSP430, NDS32, +RL78, RX and SPU ports. It allows the compiler to construct the +requisite function declaration, while allowing the body of the +function to be assembly code. The specified function will not have +prologue/epilogue sequences generated by the compiler. Only Basic +@code{asm} statements can safely be included in naked functions +(@pxref{Basic Asm}). While using Extended @code{asm} or a mixture of +Basic @code{asm} and ``C'' code may appear to work, they cannot be +depended upon to work reliably and are not supported. @item near @cindex functions that do not handle memory bank switching on 68HC11/68HC12 @@ -6269,6 +6268,8 @@ efficient code, and in most cases it is a better solution. When writing inline assembly language outside of C functions, however, you must use Basic @code{asm}. Extended @code{asm} statements have to be inside a C function. +Functions declared with the @code{naked} attribute also require Basic +@code{asm} (@pxref{Function Attributes}). Under certain circumstances, GCC may duplicate (or remove duplicates of) your assembly code when optimizing. This can lead to unexpected duplicate @@ -6388,6 +6389,8 @@ Note that Extended @code{asm} statements must be inside a function. Only Basic @code{asm} may be outside functions (@pxref{Basic Asm}). +Functions declared with the @code{naked} attribute also require Basic +@code{asm} (@pxref{Function Attributes}). While the uses of @code{asm} are many and varied, it may help to think of an @code{asm} statement as a series of low-level instructions that convert input
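A minimal sketch of the rule in the proposed text: the body of a naked function is nothing but Basic asm, and that asm has to do everything, including the return. Assumptions not taken from the thread: the names `forty_two` and `self_check` are hypothetical, the assembly is x86-64 AT&T syntax, and `naked` on x86 is assumed to need GCC 8 or later (or clang), which is newer than the port list in the patch. Every other target gets a plain C fallback so the file still builds.

```c
#include <assert.h>

#if defined(__x86_64__) && (defined(__clang__) || (defined(__GNUC__) && __GNUC__ >= 8))
/* Naked: the compiler emits no prologue/epilogue, so the Basic asm
   below must place the result in EAX and return to the caller itself. */
__attribute__((naked)) int forty_two(void)
{
    __asm__("movl $42, %eax\n\t"
            "ret");
}
#else
/* Fallback for targets or compilers where the sketch above does not apply. */
int forty_two(void)
{
    return 42;
}
#endif

/* Hypothetical helper showing how the function is called from ordinary C. */
void self_check(void)
{
    assert(forty_two() == 42);
}
```

Note that the naked body declares no locals and uses no operands; mixing in either is exactly the case the new wording calls unsupported.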
Question for ARM person re asm_fprintf
I have been looking at asm_fprintf in final.c, and I think there's a design flaw. But since the change affects ARM and since I have no access to an ARM system, I need a second opinion. asm_fprintf allows platforms to add support for new format specifiers by using the ASM_FPRINTF_EXTENSIONS macro. ARM uses this to add support for %@ and %r specifiers. Pretty straight-forward. However, it isn't enough to add these two items to the case statement in asm_fprintf. Over in c-format.c, there is compile-time checking that is done against calls to asm_fprintf to validate the format string. %@ and %r have been added to this checking (see asm_fprintf_char_table), but NOT in a platform-specific way. This means that using %r or %@ will successfully pass the format checking on all platforms, but will ICE on non-ARM platforms since there are no case statements in asm_fprintf to support them. Compiling the code in asm_fprintf-1.c (see the patch) with this patch correctly reports "unknown conversion type character" for both 'r' and '@' in x86_64-pc-cygwin. It would be helpful if someone could confirm that it still compiles without error under ARM after applying this patch. I'm reluctant to post this to gcc-patches when it has never been run. 
dw Index: gcc/c-family/c-format.c === --- gcc/c-family/c-format.c (revision 212900) +++ gcc/c-family/c-format.c (working copy) @@ -637,8 +637,9 @@ { "I", 0, STD_C89, NOARGUMENTS, "", "", NULL }, { "L", 0, STD_C89, NOARGUMENTS, "", "", NULL }, { "U", 0, STD_C89, NOARGUMENTS, "", "", NULL }, - { "r", 0, STD_C89, { T89_I, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN }, "", "", NULL }, - { "@", 0, STD_C89, NOARGUMENTS, "", "", NULL }, +#ifdef ASM_FPRINTF_TABLE + ASM_FPRINTF_TABLE +#endif { NULL, 0, STD_C89, NOLENGTHS, NULL, NULL, NULL } }; Index: gcc/config/arm/arm.h === --- gcc/config/arm/arm.h (revision 212900) +++ gcc/config/arm/arm.h (working copy) @@ -888,6 +888,12 @@ fputs (reg_names [va_arg (ARGS, int)], FILE); \ break; +/* Used in c-format.c to add entries to the table used to validate calls + to asm_fprintf. */ +#define ASM_FPRINTF_TABLE \ + { "r", 0, STD_C89, { T89_I, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN }, "", "", NULL }, \ + { "@", 0, STD_C89, NOARGUMENTS, "", "", NULL }, + /* Round X up to the nearest word. */ #define ROUND_UP_WORD(X) (((X) + 3) & ~3) Index: gcc/doc/tm.texi === --- gcc/doc/tm.texi (revision 212900) +++ gcc/doc/tm.texi (working copy) @@ -8611,8 +8611,39 @@ The varargs input pointer is @var{argptr} and the rest of the format string, starting the character after the one that is being switched upon, is pointed to by @var{format}. +See also ASM_FPRINTF_TABLE. + +Example: +@smallexample +#define ASM_FPRINTF_EXTENSIONS(FILE, ARGS, P) \ + case '@': \ +fputs (ASM_COMMENT_START, FILE); \ +break; \ + \ + case 'r': \ +fputs (REGISTER_PREFIX, FILE); \ +fputs (reg_names [va_arg (ARGS, int)], FILE); \ +break; +@end smallexample @end defmac +@defmac ASM_FPRINTF_TABLE +When using ASM_FPRINTF_EXTENSIONS, you must also use this macro to define +table entries for the printf format checking performed in c-format.c. 
+This macro must contain format_char_info entries for each printf format +being added. + +Example: +@smallexample +#define ASM_FPRINTF_TABLE \ + @{ "r", 0, STD_C89, \ + @{ T89_I, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, \ + BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN @}, \ + "", "", NULL @}, \ + @{ "@", 0, STD_C89, NOARGUMENTS, "", "", NULL @}, +@end smallexample +@end defmac + @defmac ASSEMBLER_DIALECT If your target supports multiple dialects of assembler language (such as different opcodes), define this macro as a C expression that gives the Index: gcc/doc/tm.texi.in === --- gcc/doc/tm.texi.in (revision 212900) +++ gcc/doc/tm.texi.in (working copy) @@ -6370,8 +6370,39 @@ The varargs input pointer is @var{argptr} and the rest of the format string, starting the character after the one that is being switched upon, is pointed to by @var{format}. +See also ASM_FPRINTF_TABLE. + +Example: +@smallexample +#define ASM_FPRINTF_EXTENSIONS(FILE, ARGS, P) \ + case '@': \ +fputs (ASM_COMMENT_START, FILE); \ +break; \ + \ + case 'r': \ +fputs (REGISTER_PREFIX, FILE); \ +fputs (reg_names [va_arg (ARGS, int)], FILE); \ +break; +@end smallexample @end defmac +@defmac ASM_FPRINTF_TABLE +When using ASM_FPRINTF_EXTENSIONS, you must also use this macro to define +ta
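The mismatch described in this message can be modelled in a few lines. This is only a sketch under stated assumptions, not the GCC code: `MY_FORMAT_TABLE`, `MY_FORMAT_CASES`, `checker_accepts`, `formatter_handles` and `consistent` are all hypothetical names. One macro extends the table used by the checker, the other extends the switch used by the formatter, and a target defines both or neither.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical target hooks: a port that wants %r would define both
   macros together in its target header, for example
     #define MY_FORMAT_TABLE "r"
     #define MY_FORMAT_CASES case 'r': return "register";
   They are left undefined here to model a target without extensions. */

/* Conversion characters the checker accepts (stands in for the
   format table consulted at compile time). */
static const char accepted[] =
    "ds"
#ifdef MY_FORMAT_TABLE
    MY_FORMAT_TABLE
#endif
    ;

/* Models the format check: is this conversion character known? */
int checker_accepts(char conv)
{
    return conv != '\0' && strchr(accepted, conv) != NULL;
}

/* Models the formatter's switch: returns NULL where the real code
   would hit a missing case and abort. */
const char *formatter_handles(char conv)
{
    switch (conv) {
    case 'd': return "decimal";
    case 's': return "string";
#ifdef MY_FORMAT_CASES
    MY_FORMAT_CASES
#endif
    default:  return NULL;
    }
}

/* The invariant the patch is after: checker and formatter agree. */
int consistent(char conv)
{
    return checker_accepts(conv) == (formatter_handles(conv) != NULL);
}
```

With neither macro defined, 'r' is rejected by the checker and unhandled by the formatter, so the two sides agree. The bug in the message corresponds to listing "r" in `accepted` unconditionally while the `case 'r'` stays target-specific.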
Re: C as intermediate language, signed integer overflow and -ftrapv
I believe that sometimes gcc is promoting the ints to long longs when doing the overflow testing. If I try to overflow a long long, I get the trap as expected. See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19020

dw

On 7/23/2014 7:56 AM, Thomas Mertes wrote:

C is popular as intermediate language. This means that some compilers generate C and use a C compiler as backend. Wikipedia lists several languages, which use C as intermediate language: Eiffel, Sather, Esterel, some dialects of Lisp (Lush, Gambit), Haskell (Glasgow Haskell Compiler), Squeak's Smalltalk-subset Slang, Cython, Seed7 and Vala.

When C is used as backend the features needed from a C compiler are different from the features that a human programmer of C needs. One such feature is the detection of signed integer overflow. It is not hard to detect signed integer overflow with a generated C program, but the performance is certainly not optimal. Signed integer overflow is undefined behavior in C and the access to some overflow flag of the CPU is machine dependent. So the generated C code must recognize overflow before it happens or use unsigned computations and recognize the overflow afterwards. I have doubts that this leads to optimal performance.

The C compiler is much better suited to do signed integer overflow checks. The C compiler can do low level tricks, that would be undefined behavior in C, and the C compiler also knows about overflow flags and other details of the CPU. Maybe the CPU can be switched to a mode where it traps signed integer overflow for free.

The gcc compiler option -ftrapv promises to do exactly that, but it seems broken. At least my test suite shows that both gcc version 4.6.3 and 4.8.1 fail to detect integer overflow with -ftrapv. The detection fails even for addition and subtraction. I know that 4.6.3 and 4.8.1 are old, but I found nothing on the internet that tells me that the situation is better now.
So for gcc as C compiler backend -ftrapv cannot be used and overflow checking in the generated C code is necessary.

Clang supports -ftrapv reliably. Signed integer overflow raises the signal SIGILL, which can be caught. Btw. SIGILL seems to be a better choice, because under windows (7) SIGABRT causes some text to be written to the console. Is it possible to choose a -ftrapv signal?

A sanitizer such as ubsan is good as tool to find errors in C programs. But I don't think that ubsan is well suited to do overflow detection with maximum performance. It is just not the goal of this tool.

The argumentation that nobody uses -ftrapv is a self-fulfilling prophecy. How can someone expect that a buggy feature is used? The counterexample is clang, where -ftrapv works and is also used (E.g. by the integer overflow detection of Seed7).

Conclusion: Signed integer overflow detection with -ftrapv is NOT something that nobody uses. It is an important feature. Especially when C is used as intermediate language. When it works it results in a significant speed up of signed overflow detection. A sanitizer has a different purpose and cannot be used as replacement.

I can offer some help with this issue: I have test programs for cases with integer overflow and for cases where the result is as big or as small as possible without causing an overflow. The test programs are not written in C, but are licensed with the GPL, and it would be possible to convert them to C with reasonable effort. Maybe this is not necessary, because clang must have some test suites for -ftrapv.

Greetings
Thomas Mertes

--
Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements and operators, abstract data types, templates without special syntax, OO with interfaces and multiple dispatch, statically typed, interpreted or compiled, portable, runs under linux/unix/windows.
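The two strategies the quoted message says generated C code has to fall back on (recognize overflow before it happens, or compute unsigned and recognize it afterwards) might look roughly like this for addition. The function names are hypothetical, and the second variant assumes a two's complement `int` with no padding bits, so that `unsigned` has the same width.

```c
#include <assert.h>
#include <limits.h>

/* Approach 1: recognize overflow before it happens, using only
   comparisons that cannot themselves overflow. */
int add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    if (b < 0)
        return a < INT_MIN - b;
    return 0;
}

/* Approach 2: add as unsigned (wraparound is well defined) and look at
   the sign bits afterwards. Overflow happened if and only if the result's
   sign differs from the sign of both operands. */
int add_did_overflow(int a, int b)
{
    unsigned ua = (unsigned)a;
    unsigned ub = (unsigned)b;
    unsigned sum = ua + ub;
    unsigned sign = ~(UINT_MAX >> 1);   /* only the top bit set */

    return ((ua ^ sum) & (ub ^ sum) & sign) != 0;
}

/* Hypothetical helper showing the two checks agree on the edge cases. */
void self_check(void)
{
    assert(add_would_overflow(INT_MAX, 1) && add_did_overflow(INT_MAX, 1));
    assert(add_would_overflow(INT_MIN, -1) && add_did_overflow(INT_MIN, -1));
    assert(!add_would_overflow(INT_MAX, -1) && !add_did_overflow(INT_MAX, -1));
}
```

Either way every addition costs extra compares or bit operations in the generated code, which is the overhead a working -ftrapv would be expected to reduce.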
Re: Question for ARM person re asm_fprintf
Not that the following would constitute the actual testing usually required for a patch, but: /path/to/toplevel/configure --target=arm-eabi && make all-gcc # Yay, the compiler-proper for a "bare iron" ARM compiler. ./gcc/xgcc -B./gcc -S test.c Woot, compiled your first ARM program. :) Just emitting text assembly code, and #include's won't work, but a missing-case leading-to-abort would be prominently noticed as an internal compiler error. I just tried this and I did get it to build and run. And it's true that invalid formats do cause an error, even just using -S. But I'm still glad that kyrill tried it too. So I've sent this to gcc-patches. Now comes the hard part: Patience... dw
[RFD] Using the 'memory constraint' trick to avoid memory clobber doesn't work
Hans-Peter Nilsson: I should have listened to you back when you raised concerns about this. My apologies for ever doubting you.

In summary:

- The "trick" in the docs for using an arbitrarily sized struct to force register flushes for inline asm does not work.
- Placing the inline asm in a separate routine can sometimes mask the problem with the trick not working.
- The sample that has been in the docs forever performs an unhelpful, unexpected, and probably unwanted stack allocation + memcpy.

Details:

Here is the text from the docs:

---
One trick to avoid [using the "memory" clobber] is available if the size of the memory being accessed is known at compile time. For example, if accessing ten bytes of a string, use a memory input like:

"m"( ({ struct { char x[10]; } *p = (void *)ptr ; *p; }) )
---

When I did the re-write of gcc's inline asm docs, I left the description for this (essentially) untouched. I just took it on faith that "magic happens" and the right code gets generated. But reading a recent post raised questions for me, so I tried it. And what I found was that not only does this not work, it actually just makes a mess.

I started with some code that I knew required some memory clobbering:

   #include <stdio.h>

   int main(int argc, char* argv[])
   {
      struct { int a; int b; } c;

      c.a = 1;
      c.b = 2;

      int Count = sizeof(c);
      void *Dest;

      __asm__ __volatile__ ("rep; stosb"
           : "=D" (Dest), "+c" (Count)
           : "0" (&c), "a" (0)
           //: "memory"
      );

      printf("%u %u\n", c.a, c.b);
   }

As written, this x64 code (compiled with -O2) will print out "1 2", even though someone might (incorrectly) expect the asm to overwrite the struct with zeros. Adding the memory clobber allows this code to work as expected (printing "0 0").
Now that I have code I can use to see if registers are getting flushed, I removed the memory clobber, and tried just 'clobbering' the struct:

   #include <stdio.h>

   int main(int argc, char* argv[])
   {
      struct { int a; int b; } c;

      c.a = 1;
      c.b = 2;

      int Count = sizeof(c);
      void *Dest;

      __asm__ __volatile__ ("rep; stosb"
           : "=D" (Dest), "+c" (Count)
           : "0" (&c), "a" (0),
             "m" ( ({ struct foo { char x[8]; } *p = (struct foo *)&c ; *p; }) )
      );

      printf("%u %u\n", c.a, c.b);
   }

I'm using a named struct (foo) to avoid some compiler messages, but other than that, I believe this is the same as what's in the docs. And it doesn't work. I still get "1 2".

At this point I realized that code I've seen using this trick usually has the asm in its own routine. When I try this, it still fails. Unless I start cranking up the size of x from 8 to ~250. At ~250, suddenly it starts working. Apparently this is because at this point, gcc decides not to inline the routine anymore, and flushes the registers before calling the non-inline code.

And why does changing the size of the structure we are pointing to result in increases in the size of the routine? Reading the -S output, the "*p" at the end of this constraint generates a call to memcpy the 250 characters onto the stack, which it passes to the asm as %4, which is never used. Argh!

Conclusion:

What I expected when using that sample code from the docs was that any registers that contain values from the struct would get flushed to memory. This was intended to be a 'cheaper' alternative to doing a full-on "memory" clobber. What I got instead was an unexpected/unneeded stack allocation and memcpy, and STILL didn't get the values flushed. Yeah, not exactly the 'cheaper' I was hoping for.

Is the example in the docs just written incorrectly? Did this get broken somewhere along the line? Or am I just using it wrong?

I'm using gcc version 4.9.0 (x86_64-win32-seh-rev2, Built by MinGW-W64 project). Remember to compile these x64 samples with -O2.

dw
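For comparison, here is a sketch of the baseline this message reports as working: the same `rep stosb`, but with the full "memory" clobber, wrapped in a helper. The names `zero_fill` and `self_check` are hypothetical, and the asm path is assumed to apply only to x86-64 with a GCC-compatible compiler; other targets take a plain `memset` fallback so the sketch builds anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fill n bytes starting at dest with zero bytes. */
void zero_fill(void *dest, size_t n)
{
#if defined(__x86_64__) && defined(__GNUC__)
    /* "+D" and "+c": rep stosb advances RDI and counts RCX down to zero,
       so both are read and written by the asm.
       "memory": the asm may write any memory, so the compiler must not
       keep using values it loaded or constant-propagated beforehand. */
    __asm__ __volatile__ ("rep stosb"
        : "+D" (dest), "+c" (n)
        : "a" (0)
        : "memory");
#else
    memset(dest, 0, n);   /* portable fallback for other targets */
#endif
}

/* Hypothetical helper mirroring the struct test from the message. */
void self_check(void)
{
    struct { int a; int b; } c;

    c.a = 1;
    c.b = 2;
    zero_fill(&c, sizeof(c));
    assert(c.a == 0 && c.b == 0);
}
```

This is the "big hammer" the docs trick was meant to avoid: it is correct, at the possible cost of forcing unrelated values back to memory around the asm.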
Re: [RFD] Using the 'memory constraint' trick to avoid memory clobber doesn't work
On 9/25/2014 12:36 AM, Yury Gribov wrote: On 09/24/2014 12:31 PM, Richard Biener wrote: On Wed, Sep 24, 2014 at 9:43 AM, David Wohlferd wrote: Hans-Peter Nilsson: I should have listened to you back when you raised concerns about this. My apologies for ever doubting you. In summary: - The "trick" in the docs for using an arbitrarily sized struct to force register flushes for inline asm does not work. - Placing the inline asm in a separate routine can sometimes mask the problem with the trick not working. - The sample that has been in the docs forever performs an unhelpful, unexpected, and probably unwanted stack allocation + memcpy. Details: Here is the text from the docs: --- One trick to avoid [using the "memory" clobber] is available if the size of the memory being accessed is known at compile time. For example, if accessing ten bytes of a string, use a memory input like: "m"( ({ struct { char x[10]; } *p = (void *)ptr ; *p; }) ) Well - this can't work because you essentially are using a _value_ here (looking at the GIMPLE - I'm not sure if a statement expression evaluates to an lvalue. It should work if you simply do this without a stmt expression: "m" (*(struct { char x[10]; } *)ptr) because that's clearly an lvalue (and the GIMPLE correctly says so): : c.a = 1; c.b = 2; __asm__ __volatile__("rep; stosb" : "=D" Dest_4, "=c" Count_5 : "0" &c, "a" 0, "m" MEM[(struct foo *)&c], "1" 8); printf ("%u %u\n", 1, 2); note that we still constant propagated 1 and 2 for the reason that the asm didn't get any VDEF. That's because you do not have any memory output! So while it keeps 'c' live it doesn't consider it modified by the asm. You'd still need to clobber the memory, but "m" clobbers are not supported, only "memory". 
Thus fixed asm:

   __asm__ __volatile__ ("rep; stosb"
        : "=D" (Dest), "+c" (Count)
        : "0" (&c), "a" (0), "m" (*( struct foo { char x[8]; } *)&c)
        : "memory"
   );

where I'm not 100% sure if the "m" input is now pointless (that is, if a "memory" clobber also constitutes a use of all memory).

Or maybe even

   __asm__ __volatile__ ("rep; stosb"
        : "=D" (Dest), "+c" (Count), "+m" (*(struct foo { char x[8]; } *)&c)
        : "0" (&c), "a" (0)
   );

to avoid the big-hammer memory clobber?

-Y

Thank you both for the responses. At this point I've started composing some replacement text for the docs (below), cuz clearly what's there is both inadequate and wrong. All comments are welcome.

While the code in Richard's response will always produce the correct results, the intent here is to use memory constraints to *avoid* the performance penalties of the memory clobber. The existing docs say this should work, and I've seen a number of places using it (linux kernel, glibc, etc). If this does work, we should update the docs to say how it's done. If this doesn't work, we should say that too.

Looking at Yury's response, that code does work (in c, not c++). At least it does sometimes. The problem is that sometimes gcc can "lose track" of what a pointer points to. And when it does, gcc can get confused about what to flush. Here's a simple example that shows this (4.9.0, x64, -O2, 'c'):

   #include <stdio.h>

   inline void *memset( void * Dest, int c, size_t Count)
   {
      void *dummy;

      __asm__ (
         "rep stosb"
         : "=D" (dummy), "+c" (Count), "=m" (*( struct foo { char x[8]; } *)Dest)
         : "0" (Dest), "a" (c)
         : "cc"//, "memory"
      );

      return Dest;
   }

   int main()
   {
      struct { int a; } c;
      void *v;

      asm volatile("#" : "=r" (v) : "0" (&c) );

      c.a = 0x30303030;
      //memset(&c, 0, sizeof(c));
      memset(v, 0, sizeof(c));
      printf("%x\n", c.a);
   }

This code will work if you pass &c to memset. But it will fail if you use v. Oh, the wonders of aliasing.

And this is why I'm having a problem doc'ing this. I love the potential benefits of using this.
But if you are writing general-purpose routines, how can you hope to know whether the code calling you will pass you a pointer that will work with this trick? This kind of thing can introduce *horribly* hard to track down problems. Note that the memory clobber always works, but potentially with a performance penalty. I could simply skip describing the whole problem with aliasing, but that just hides the problem. I hate wh
Re: [RFD] Using the 'memory constraint' trick to avoid memory clobber doesn't work
You want "=m" (*( struct foo { char x[8]; } __attribute__((may_alias)) *)Dest)

Thank you. With your help, that worse-than-useless sample in the docs is getting closer to something people can actually use.

Except for one last serious problem: This trick only works for very specific (and very small) sizes of x (something else the existing docs fail to mention). On x64: Sizes of 1/2/4/8/16 correctly clobber Dest (and only Dest) as expected. Trying to use any other value (5, 12, 32, etc) seems to force a full memory clobber.

This is the case REGARDLESS of the actual size of the underlying structure. For example clobbering 16 bytes of a 12 byte struct only clobbers the 12 bytes of Dest (as expected), but clobbering 12 bytes of a 12 byte struct performs a full memory clobber.

This presents a problem for general purpose routines like memset, where the actual size of the block cannot be hardcoded into the struct. glibc (which uses this trick to avoid using the memory clobber) uses a size of 0xfff. Being larger than 16, this always causes a full memory clobber, not the "memory constraint" clobber I assume they were expecting. OTOH, since they don't use may_alias, this may actually be a good thing...

While I really like the idea of using memory constraints to avoid all out memory clobbers, 16 bytes is a pretty small maximum memory block, and x32 only supports a max of 8. Unless there's some way to use larger sizes (say SSIZE_MAX), this feature hardly seems worth documenting.

dw
Re: [RFD] Using the 'memory constraint' trick to avoid memory clobber doesn't work
Sorry for the (very) delayed response. I'm still looking for feedback here so I can fix the docs.

To refresh: The topic of conversation was the (extremely) wrong explanation that has been in the docs since forever about how to use memory constraints with inline asm to avoid the performance hit of a full memory clobber. Trying to understand how this really works has led to some surprising results.

Me:
>> While I really like the idea of using memory constraints to avoid all out
>> memory clobbers, 16 bytes is a pretty small maximum memory block, and x86
>> only supports a max of 8. Unless there's some way to use larger sizes (say
>> SSIZE_MAX), this feature hardly seems worth documenting.

Richard:
> I wonder how you figured out that a 12 byte clobber performs a full
> memory clobber?

Here's the code (compiled with gcc version 4.9.0 x86_64-win32-seh-rev2, using -m64 -O2 -fdump-final-insns):

   #include <stdio.h>

   #define MYSIZE 3

   inline void __stosb(unsigned char *Dest, unsigned char Data, size_t Count)
   {
      struct _reallybigstruct { char x[MYSIZE]; } *p = (struct _reallybigstruct *)Dest;

      __asm__ __volatile__ ("rep stos{b|b}"
         : "+D" (Dest), "+c" (Count), "=m" (*p)
         : [Data] "a" (Data)
         //: "memory"
      );
   }

   int main()
   {
      unsigned char buff[100];
      buff[5] = 'A';

      __stosb(buff, 'B', sizeof(buff));
      printf("%c\n", buff[5]);
   }

In summary:

1) Create a 100 byte buffer, and set buff[5] to 'A'.
2) Call __stosb, which uses inline asm to overwrite all of buff with 'B'.
3) Use a memory constraint in __stosb to flush buff. The size of the memory constraint is controlled by a #define.

With this, I have a simple way to test various sizes of memory constraints to see if the buffer gets flushed. If it *is* flushing the buffer, printing buff[5] after __stosb will print 'B'. If it is *not* flushing, it will print 'A'.

Results:

- Since buff[5] is the 6th byte in the buffer, using memory constraint sizes of 1, 2 & 4 (not surprisingly) all print 'A', showing that no flush was done.
- Sizes of 8 and 16 print 'B', showing that the flush was done. This is also the expected result, since I am now flushing enough of buff to include buff[5].
- The surprise comes from using a size of 3 or 5. These also print 'B'. WTF? If 4 doesn't flush, why does 3?

I believe the answer comes from reading the RTL. The difference between sizes of 3 and 16 comes here:

   (set (mem/c:TI (plus:DI (reg/f:DI 7 sp)
            (const_int 32 [0x20])) [ MEM[(struct _reallybigstruct *)&buff]+0 S16 A128])
      (asm_operands/v:TI ("rep stos{b|b}") ("=m") 2 [

   (set (mem/c:BLK (plus:DI (reg/f:DI 7 sp)
            (const_int 32 [0x20])) [ MEM[(struct _reallybigstruct *)&buff]+0 S3 A128])
      (asm_operands/v:BLK ("rep stos{b|b}") ("=m") 2 [

While I don't actually speak RTL, TI clearly refers to TIMode. Apparently when using a size that exactly matches a machine mode, asm memory references (on i386) can flush the exact number of bytes. But for other sizes, gcc seems to fall back to BLK mode, which doesn't.

I don't know the exact meaning of BLK on a "set" or "asm_operands." Does it cause a full clobber? Or just a complete clobber of buff? Attempting to answer that question leads us to the second bit of code:

   #include <stdio.h>

   #define MYSIZE 8

   inline void __stosb(unsigned char *Dest, unsigned char Data, size_t Count)
   {
      struct _reallybigstruct { char x[MYSIZE]; } *p = (struct _reallybigstruct *)Dest;

      __asm__ __volatile__ ("rep stos{b|b}"
         : "+D" (Dest), "+c" (Count), "=m" (*p)
         : [Data] "a" (Data)
         //: "memory"
      );
   }

   int main()
   {
      unsigned char buff[100], buff2[100];
      buff[5] = 'A';
      buff2[5] = 'M';

      asm("#" : : "r" (buff2));

      __stosb(buff, 'B', sizeof(buff));
      printf("%c %c\n", buff[5], buff2[5]);
   }

Here I've added a buff2, and I set buff2[5] to 'M' (aka ascii 77), which I also print. I still perform the memory constraint against buff, then I check to see if it is affecting buff2. I start by compiling this with a size of 8 and look at the -S output.
If this is NOT doing a full clobber, gcc should be able to just print buff2[5] by moving 77 into the appropriate register before calling printf. And indeed, that's what we see.

   /APP
    # 17 "mem2.cpp" 1
           rep stosb
    # 0 "" 2
   /NO_APP
           movzbl  37(%rsp), %edx
           movl    $77, %r8d
           leaq    .LC0(%rip), %rcx
           call    printf

If using a size of 3 *is* causing a full memory clobber, we would expect to see the value getting read from memory before calling printf. And indeed, that's also what we see.

   /APP
    # 17 "mem2.cpp" 1
           rep stosb
    # 0 "" 2
   /NO_APP
           movzbl  37(%rsp), %edx
           leaq    .LC0(%rip), %rcx
           movzbl  149(%rsp), %r8d

I don't know the internals of gcc well enough to understand exactly why this is happening. But from a user's poin
Re: [RFD] Using the 'memory constraint' trick to avoid memory clobber doesn't work
On 11/13/2014 6:02 AM, Richard Biener wrote: On Thu, Nov 13, 2014 at 2:53 PM, Hans-Peter Nilsson wrote: On Thu, 13 Nov 2014, David Wohlferd wrote: Sorry for the (very) delayed response. I'm still looking for feedback here so I can fix the docs. Thank you for your diligence. As I said before, triggering a full memory clobber for anything over 16 bytes (and most sizes under 16 bytes) makes this feature all but useless. So if that's really what's happening, we need to decide what to do next: 1) Can this be "fixed?" 2) Do we want to doc the current behavior? 3) Or do we just remove this section? I think it could be a nice performance win for inline asm if it could be made to work right, but I have no idea what might be involved in that. Failing that, I guess if it doesn't work and isn't going to work, I'd recommend removing the text for this feature. Since all 3 suggestions require a doc change, I'll just say that I'm prepared to start work on the doc patch as soon as someone lets me know what the plan is. Richard? Hans-Peter? Your thoughts? I've forgot if someone mentioned whether we have a test-case in our test-suite for this feature. I'm looking thru gcc/testsuite/*.c to see if I can spot anything. It's not easy since there is a lot of asm and the people who write these are apparently allergic to using comments to describe what they are testing. If we don't, then 3; removal. If we do, I guess it's flawed or at least not agreeing with the documentation? Then it *might* be worth the effort fixing that and additional test-coverage (depending on the person stepping up...) but 3 is IMHO still an arguably sane option. Well, as what is missing is just an optimization I'd say we should try to fix it. While I'd love to be the one to fix this, the fact of the matter is that most of gcc is a black box to me. Even if you told me roughly where to start, I'd have no idea of the downstream impacts of anything I changed. 
So while I understand that it looks like I'm just finding work for other people, fixing something like this is simply beyond me. That said, I'm certainly prepared to outline what I see as the interesting test cases and to do some testing if someone else is willing to step up and do this optimization.

And surely the docs should not promise that optimization will happen - it should just mention that doing this might allow optimization to happen.

I can agree with this. I am quite confident there will be occasions where gcc has no option but to fall back to doing a full clobber to ensure correct function (a possibility which the current docs also fail to mention). So yes, the docs should be limited in what they promise here.

Which brings us to the question: what do we do now? The 15th is fast approaching. Can something like this get done before then? Can it be checked in for 5.0 after the 15th? Or does it need to wait for 6.0? If it does need to wait for 6.0, what do we want to do with the docs in the meantime? Given how wrong they are currently, I'd hate to ship yet another release with that ugly text. But trying to describe the best way to take advantage of optimizations that haven't been written yet is... hard.

Since (as I understand it) 5.0 docs *can* be checked in after the 15th, my recommendations:

- If someone is prepared to step up and do this work for v5.0, then I'll wait and write the docs when they are done and can describe how it works.
- If this is going to wait for 6.0, then if someone does (at least) enough investigative work to be able to describe how this will eventually work, I'll update the 5.0 docs in a general way talking about ways gcc *may* be able to optimize. It should be possible to phrase this so code people write today will work even better tomorrow.
- Worst case is if no one has the time to look at this for the foreseeable future. In that case, I'm with Hans-Peter. Let's take the existing text out.
Following the existing text makes things *worse*, and the current implementation is so limited that I'd be surprised if anyone's code actually uses it successfully. New text can get added when the new code is. Hmm. I just had a thought: Is it possible the problem I'm seeing here is platform-specific? Maybe this works perfectly for non-i386 code? That would certainly change my recommendations here. dw
Re: [RFD] Using the 'memory constraint' trick to avoid memory clobber doesn't work
Can you please file a bug in bugzilla if you haven't already done so? We can still fix bugs during the next three months ;) Well, I opened one. Briefly. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63900 dw
Re: organization of optimization options in manual
On 1/17/2015 12:13 PM, Sandra Loosemore wrote: BTW, as a GCC user I'm also very frustrated by the (lack of) organization in the extensions chapter; the information about attributes and built-in functions is all mixed up with sections on random syntactic extensions like "Dollar Signs in Identifier Names", etc. Maybe somebody else could work on a proposal for reordering the material in that chapter in parallel? I actually took a shot at this a while back, but didn't think anyone would be interested. 60+ items in an unordered list seemed a bit much for human consumption. Even when you *know* something is on the page it can be hard to find. Let me dig that back out and post it here in a new thread. dw
organization of C Extensions in manual
The page for Extensions to the C Language Family (https://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html) is very long (60+ items) and completely unordered. This makes it hard to find things, even when you know they are there.

I have taken a first shot at grouping these. Hopefully it can at least serve as the basis for discussion. These section headings were just the general terms that first occurred to me. Feel free to propose more computer-y terminology.

And yes, some of these topics are a stretch to be included in the sections in which I have placed them. Still, I believe this is better than what is there now. Better section names and alternative groupings welcome.

Arrays and Vectors
    Designated Inits: Labeling elements of initializers.
    Pointers to Arrays: Pointers to arrays with qualifiers work as expected.
    Subscripting: Any array can be subscripted, even if not an lvalue.
    Variable Length: Arrays whose length is computed at run time.
    Vector Extensions: Using vector instructions through built-in functions.
    Zero Length: Zero-length arrays.

Attributes
    Attribute Syntax: Formal syntax for attributes.
    Function Attributes: Declaring that functions have no side effects, or that they can never return.
    Inline: Defining inline functions (as fast as macros).
    Label Attributes: Specifying attributes on labels.
    Thread-Local: Per-thread variables.
    Type Attributes: Specifying attributes of types.
    Variable Attributes: Specifying attributes of variables.
    Volatiles: What constitutes an access to a volatile object.

BuiltIns
    __atomic Builtins: Atomic built-in functions with memory model.
    __sync Builtins: Legacy built-in functions for atomic memory access.
    Cilk Plus Builtins: Built-in functions for the Cilk Plus language extension.
    Integer Overflow Builtins: Built-in functions to perform arithmetics and arithmetic overflow checking.
    Object Size Checking: Built-in functions for limited buffer overflow checking.
    Other Builtins: Other built-in functions.
    Pointer Bounds Checker builtins: Built-in functions for Pointer Bounds Checker.
    Target Builtins: Built-in functions specific to particular targets.
    x86 specific memory model extensions for transactional memory: x86 memory models.

Computed values
    Alignment: Inquiring about the alignment of a type or variable.
    Conditionals: Omitting the middle operand of a ‘?:’ expression.
    Initializers: Non-constant initializers.
    Offsetof: Special syntax for implementing offsetof.
    Pointer Arith: Arithmetic on void-pointers and function pointers.
    Return Address: Getting the return or frame address of a function.
    Statement Exprs: Putting statements and declarations inside expressions.
    Target Format Checks: Format checks specific to particular targets.
    Typeof: typeof: referring to the type of an expression.
    Variadic Macros: Macros with a variable number of arguments.

Constant Values
    Binary constants: Binary constants using the ‘0b’ prefix.
    Character Escapes: ‘\e’ stands for the character <ESC>.
    Compound Literals: Compound literals give structures, unions or arrays as values.
    Escaped Newlines: Slightly looser rules for escaped newlines.
    Hex Floats: Hexadecimal floating-point constants.
    Labels as Values: Getting pointers to labels, and computed gotos.

Data types
    __int128: 128-bit integers---__int128.
    Complex: Data types for complex numbers.
    Decimal Float: Decimal Floating Types.
    Empty Structures: Structures with no members.
    Fixed-Point: Fixed-Point Types.
    Floating Types: Additional Floating Types.
    Half-Precision: Half-Precision Floating Point.
    Long Long: Double-word integers---long long int.

Naming
    Alternate Keywords: __const__, __asm__, etc., for header files.
    Cast to Union: Casting to union type from any member of the union.
    Dollar Signs: Dollar sign is allowed in identifiers.
    Function Names: Printable strings which are the name of the current function.
    Incomplete Enums: enum foo;, with details to follow.
    Local Labels: Labels local to a block.
    Mixed Declarations: Mixing declarations and code.
    Named Address Spaces: Named address spaces.
    Unnamed Fields: Unnamed struct/union fields within structs/unions.

Misc
    C++ Comments: C++ comments are recognized.
    Case Ranges: `case 1 ... 9' and such.
    Constructing Calls: Dispatching a call to another function.
    Function Prototypes: Prototype declarations and old-style definitions.
    Nested Functions: As in Algol and Pascal, lexical scoping of functions.
    Pragmas: Pragmas accepted by GCC.
    Using Assembly Language with C: Instructions and extensions for interfacing C with assembler.

In a related question, should we change the "Extensions" page to only contain these section headers ("Data types", "Builtins" etc) and make submenus (ie new pages) out of the items under them? From a str
Re: organization of C Extensions in manual
> H, looking at your list, I think it is better to leave things that
> are hard to categorize where they are, as top-level sections within the
> chapter, rather than trying to group them arbitrarily with other things
> that are hard to categorize; the latter only makes things harder
> to find by burying them.

I understand what you are saying. I have said uncomplimentary things about other documentation that had the information I was looking for in the 'wrong' place. OTOH, leaving everything ungrouped is what we have now, and that's also a mess. An unordered list with more than about a dozen entries also makes it "hard to find things by burying them."

While I have disagreed with some of your suggestions (see comments below), some of them made good sense. In an attempt to keep this post readable, I have included a complete, updated list of groupings at the end, rather than piecemeal inline.

>> Arrays and Vectors
>> Designated Inits: Labeling elements of initializers.
>
> I think this one might better be placed in a section on initializers.

Since the only place 'Designated Inits' can be used is on arrays, placing this in the 'Arrays and Vectors' section seems appropriate.

>> Vector Extensions: Using vector instructions through built-in functions.
>
> I'm not sure where this one belongs, but it doesn't seem to go with
> array extensions. It's not really about builtins, either.

The section is entitled 'Arrays and Vectors.' I don't see why a topic about vectors doesn't fit.

Extracting your next comments, you object to Inline, Thread-Local and Volatiles being grouped under 'Attributes.' Honestly, I don't have a problem implying that these are attributes. But since you do, I pored over some standards docs, and it seems like they refer to these things as 'Specifiers.' I have created a new section and moved these three there.

>> Computed values
>> Constant Values
> I don't really like this division. To me, it would be more useful to
> have a section on purely syntactic extensions (the binary constants,
> character escapes, dollar signs in identifiers), and one on expression
> values.

I don't know what the term "syntactic extensions" would mean in this context. Not knowing what it means makes it difficult to know what items to place under it.

> I don't think the variadic macros section belongs in either of those
> groups and should be left separate for now.

Ok.

>> Naming
> A lot of these things don't have anything to do with naming, and
> burying them here would just make them hard to find. How about
> moving the purely syntactic things to that section and leave the
> others at top-level for now, unless some other grouping suggests
> itself appropriate with the leftovers.

Ok, here you've got me. Forcing some of these items into 'Naming' required a bit of imagination. I could walk you thru the contortions I used to justify them, but finding a better grouping seems like a more sensible plan (see below).

>> Misc
> The first two items might go into the syntactic extensions category.
> Pragmas and inline asm definitely should stay at top level. Not
> sure about the others.

It is not clear to me that moving them back to top level provides any advantages over putting them in the 'Misc' group.

Folding in your changes (especially re-working Naming and Constants/Computed) gives me this. I can already guess which one is going to make you wince, but perhaps you can come up with a better section name.

Arrays and Vectors
    Designated Inits: Labeling elements of initializers.
    Pointers to Arrays: Pointers to arrays with qualifiers work as expected.
    Subscripting: Any array can be subscripted, even if not an lvalue.
    Variable Length: Arrays whose length is computed at run time.
    Vector Extensions: Using vector instructions through built-in functions.
    Zero Length: Zero-length arrays.

Assignments
    Cast to Union: Casting to union type from any member of the union.
    Conditionals: Omitting the middle operand of a ‘?:’ expression.
    Compound Literals: Compound literals give structures, unions or arrays as values.
    Initializers: Non-constant initializers.
    Statement Exprs: Putting statements and declarations inside expressions.

Attributes
    Attribute Syntax: Formal syntax for attributes.
    Function Attributes: Declaring that functions have no side effects, or that they can never return.
    Label Attributes: Specifying attributes on labels.
    Type Attributes: Specifying attributes of types.
    Variable Attributes: Specifying attributes of variables.

BuiltIns
    __atomic Builtins: Atomic built-in functions with memory model.
    __sync Builtins: Legacy built-in functions for atomic memory access.
    Cilk Plus Builtins: Built-in functions for the Cilk Plus language extension.
    Integer Overflow Builtins: Built-in functions to perform arithmetics and arithmetic overflow checking.
    Object Size
Re: organization of C Extensions in manual
On 1/22/2015 1:23 PM, Joseph Myers wrote: On Thu, 22 Jan 2015, Jeff Law wrote: Inline: Defining inline functions (as fast as macros). Doesn't seem to belong here. Given that inline isn't really an extension anymore, one could argue this isn't relevant anymore. Well, we need to document -std=gnu89 / __gnu_inline__ attribute semantics. And the fact that a C99 feature is accepted as an extension in C89 mode is something that needs documenting. Though for C extensions maybe the documentation would better describe exceptions - extensions from newer C standards that are *not* enabled in older standard modes, or that are only enabled when you use alternative keywords such as __inline. Then you wouldn't need to go into the details e.g. of long long, just that it's an extension accepted in C89 / C++98 mode. I can't speak to whether removing this topic is a good idea or not. For the moment, I have moved volatile, inline and thread-local from the Attributes section to their own section (Specifiers). If it is time to remove inline from this page (or just re-work it), someone else will need to pursue this. (For C++ there may be more features than for C that require -std=c++11 to enable them, so listing exceptions may not be the right approach for C++.) Ways in which GNU C features go beyond the corresponding standard feature do also need documenting. (So VLA documentation needs to refer to VLAs in structures, and to the interaction with alloca, and to parameter forward declarations, even if the basic feature is described in terms of a C99 feature also supported in C89 mode and for C++ and there's less need to describe the C99 semantics.)
inline asm clobbers
Why does gcc allow you to specify clobbers using numbers: asm ("" : : "r" (var) : "0"); // i386: clobbers eax How is this better than using register names? This makes even less sense when you realize that (apparently) the indices of registers aren't fixed. Which means there is no way to know which register you have clobbered in order to use it in the template. Having just seen someone trying (unsuccessfully) to use this, it seems like there is no practical way you can. Which makes me wonder why it's there. And whether it still should be. dw
Re: inline asm clobbers
On 3/11/2015 4:19 PM, Ian Lance Taylor wrote: On Wed, Mar 11, 2015 at 3:58 PM, David Wohlferd wrote: Why does gcc allow you to specify clobbers using numbers: asm ("" : : "r" (var) : "0"); // i386: clobbers eax How is this better than using register names? This makes even less sense when you realize that (apparently) the indices of registers aren't fixed. Which means there is no way to know which register you have clobbered in order to use it in the template. Having just seen someone trying (unsuccessfully) to use this, it seems like there is no practical way you can. Which makes me wonder why it's there. And whether it still should be. I don't know why it works. It should be consistent, though. It's simply GCC's internal hard register number, which doesn't normally change. The reason I believe the order can change is this comment from i386.h: /* Order in which to allocate registers. Each register must be listed once, even those in FIXED_REGISTERS. List frame pointer late and fixed registers last. Note that, in general, we prefer registers listed in CALL_USED_REGISTERS, keeping the others available for storage of persistent values. The ADJUST_REG_ALLOC_ORDER actually overwrite the order, so this is just empty initializer for array. */ My attempts to follow ADJUST_REG_ALLOC_ORDER were not particularly successful, but it did take me to this comment in ira.c: /* This is called every time when register related information is changed. */ I would agree that one should avoid it. I'd be wary of removing it from GCC at this point since it might break working code. I hear you on this. Removing existing functionality is definitely risky, so I agree with your caution. And of course changing anything is much less important if the register order here really is fixed. On the other hand, what if my fear about register order changing is correct? In that case people who assume (as you have) that they don't change are clobbering random registers. 
Also, the fellow I saw trying to use this (incorrectly) assumed that "0" was referring to the same thing as the template's "%0." If people don't (or if in fact it is impossible to) use this safely, "breaking" their code by forcing them to use register names might be the best way to fix it. dw
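To make the "0" versus "%0" confusion concrete, here is a minimal sketch, assuming an x86 target and GCC's default AT&T assembler syntax. In the template, %0 names operand 0; in the clobber list the scratch register is given by name. The function add_one is a hypothetical example (it is not code from this thread), and the non-x86 fallback is only there so the sketch compiles elsewhere.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical example: returns x + 1, using eax as a scratch register. */
int add_one(int x)
{
    assert(x < INT_MAX);            /* keep the addition free of overflow */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ (
        "movl $1, %%eax\n\t"        /* eax is used as scratch, so it is clobbered */
        "addl %%eax, %0"            /* %0 is operand 0 (x), in whatever register it got */
        : "+r" (x)
        :
        : "eax", "cc"               /* clobber by name; "0" here would mean hard register 0 */
    );
    return x;
#else
    return x + 1;                   /* portable fallback, assumed only for non-x86 builds */
#endif
}
```

Because eax appears in the clobber list, the compiler should not pick eax for x, so the scratch write cannot trample the operand.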
Re: inline asm clobbers
On 3/11/2015 4:41 PM, paul_kon...@dell.com wrote: On Mar 11, 2015, at 7:19 PM, Ian Lance Taylor wrote: On Wed, Mar 11, 2015 at 3:58 PM, David Wohlferd wrote:

Why does gcc allow you to specify clobbers using numbers: asm ("" : : "r" (var) : "0"); // i386: clobbers eax How is this better than using register names? This makes even less sense when you realize that (apparently) the indices of registers aren't fixed. Which means there is no way to know which register you have clobbered in order to use it in the template. Having just seen someone trying (unsuccessfully) to use this, it seems like there is no practical way you can. Which makes me wonder why it's there. And whether it still should be.

I don't know why it works. It should be consistent, though. It's simply GCC's internal hard register number, which doesn't normally change. I would agree that one should avoid it. I'd be wary of removing it from GCC at this point since it might break working code.

It certainly would. It’s not all that common, but I have seen this done in production code. Come to think of it, this certainly makes sense in machines where some instructions act on fixed registers.

Really? While I've seen much code that uses clobbers, I have never (until this week) seen anyone attempt to clobber by index. Since I'm basically an i386 guy, maybe this is a platform thing? Do you have some examples/links?

Register names would be nice as an additional capability.

Every example I've ever seen uses register names. Perhaps that's what you've seen before? dw
Re: inline asm clobbers
Resending due to bounced email. On 3/11/2015 6:19 PM, Ian Lance Taylor wrote: On Wed, Mar 11, 2015 at 5:51 PM, David Wohlferd wrote: The reason I believe the order can change is this comment from i386.h: /* Order in which to allocate registers. Each register must be listed once, even those in FIXED_REGISTERS. List frame pointer late and fixed registers last. Note that, in general, we prefer registers listed in CALL_USED_REGISTERS, keeping the others available for storage of persistent values. The ADJUST_REG_ALLOC_ORDER actually overwrite the order, so this is just empty initializer for array. */ That is REG_ALLOC_ORDER. The index that appears in an asm statement is the hard register number. REG_ALLOC_ORDER is an array holding hard register numbers. The hard register numbers can change in principle, by changing the source code, but I actually can't recall that ever happening. To wrap this up: Like Ian said, the order of registers here apparently never changes. I read more into that comment than I should have. For good luck, I experimented with -fomit-frame-pointer, -ffixed-, etc, and nothing has any impact here. The list is the list. In fact, it turns out you can use this same format with register variables: register int x asm("3"); // i386: ebx So while I find it ugly, unnecessarily complex, and lacking in self-documenting-ness, it is not inherently buggy the way I feared it was, so I can't think of any good arguments for pulling it out. Thanks to Ian and Paul for straightening me out. dw
Re: inline asm clobbers
On 3/12/2015 7:24 AM, paul_kon...@dell.com wrote: On Mar 11, 2015, at 8:53 PM, David Wohlferd wrote: ...

I would agree that one should avoid it. I'd be wary of removing it from GCC at this point since it might break working code.

It certainly would. It’s not all that common, but I have seen this done in production code. Come to think of it, this certainly makes sense in machines where some instructions act on fixed registers.

Really? While I've seen much code that uses clobbers, I have never (until this week) seen anyone attempt to clobber by index. Since I'm basically an i386 guy, maybe this is a platform thing? Do you have some examples/links?

The example I remember was not in open code. It may have been cleaned up by now, but as supplied to us by the vendor, there were some bits of assembly code that needed a scratch register and used a fixed register (t0 == %8) for that purpose rather than having GCC deal with temporaries. So there was a clobber with “8” in it. Obviously there’s a better way in that instance, but if GCC had removed the feature before we found and cleaned up that code, we would have had a failure on our hands. An example of hardwired registers I remember is some VAX instructions (string instructions). You could write those by name, of course, but if you didn’t know that GCC supports names, you might just use numbers. On machines like VAX where the register names are really just numbers (R0, R1, etc.) that isn’t such a strange thing to do.

Register names would be nice as an additional capability.

Every example I've ever seen uses register names. Perhaps that's what you've seen before?

No; I didn’t know that gcc supports register names. The examples I had seen all use numbers. Or more often, preprocessor symbols. It may be that’s because the most common asm code I run into is MIPS coprocessor references, and while general registers may be known by name, coprocessor registers may not be. Or it may just be a case of lack of awareness. paul

Ahh.
So perhaps as I suspected: this is more commonly used on non-i386 platforms. So clearly removing this is out of the question.

This brings us to the question of documentation. Right now the docs only refer to register names. I expect it would be helpful for people to understand what it means when they come across code that uses indices. A few words in the 'clobbers' section and the two Reg Vars sections would probably cover it. Perhaps some variation of:

In addition to specifying registers by name, it is also possible to use a register index (ie "3" to refer to the register at index 3, counting from 0). The list of registers and their order is platform specific. See the REGISTER_NAMES defined for your platform in the gcc source.

I'm not excited about pointing people vaguely toward the source, but that's the only place I know to find this info. I realize this may seem a bit redundant for people who are used to registers named R0,R1,R2..., but on the i386, the order is: ax,dx,cx,bx,si... dw
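For comparison with the index form, here is a minimal sketch of a by-name clobber around an instruction that acts on fixed registers, assuming an x86 target and AT&T syntax. The function div_u32 is a hypothetical example; on other targets the sketch falls back to plain C so that it still compiles.

```c
#include <assert.h>

/* Hypothetical example: unsigned 32-bit division via the x86 divl instruction. */
unsigned div_u32(unsigned n, unsigned d)
{
    assert(d != 0);                 /* divl faults on a zero divisor */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned q;
    __asm__ (
        "xorl %%edx, %%edx\n\t"     /* divl divides edx:eax, so zero the high half */
        "divl %2"                   /* quotient -> eax, remainder -> edx */
        : "=a" (q)
        : "0" (n), "r" (d)
        : "edx", "cc"               /* edx is overwritten; named rather than numbered */
    );
    return q;
#else
    return n / d;                   /* portable fallback, assumed only for non-x86 builds */
#endif
}
```

Going by the i386 order quoted above (ax,dx,cx,bx,si...), the index spelling of this clobber would presumably be "1", which is a good deal harder to read than "edx".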
Optimization breaks inline asm code w/ptrs
Environment: gcc 6.1 compiling for 64bit i386
optimizations: -O2

Consider this simple bit of code (from https://stackoverflow.com/a/45656087/2189500):

#include <stdio.h>

int getStringLength(const char *pStr){
    int len;

    __asm__ (
        "repne scasb\n\t"
        "not %%ecx\n\t"
        "dec %%ecx"
        :"=c" (len), "+D"(pStr)
        :"c"(-1), "a"(0)
    );

    return len;
}

int main()
{
    char buff[50] = "hello world";
    int a = getStringLength(buff);

    printf("%s: %d\n", buff, a);
}

This code works as expected and prints out 11. Yay.

However, if you add "buff[4] = 0;" before the call to getStringLength, it STILL prints out 11 (when optimizations are enabled), when it should print 4.

I would expect this kind of behavior if the asm were in 'main.' But it has always been my understanding that function calls performed an implicit memory clobber. The fact that this clobber goes away during inlining means that code can stop working any time the compiler makes a different decision about whether or not to inline a function. Ouch.

And before somebody asks: Adding "+m"(pStr) does not help.

The result is that (apparently) you can NEVER safely pass a buffer pointer to inline asm without using the memory clobber. If this is true, I don't believe it is widely known.

Given how 'heavy' memory clobbers are, I would hope that only pointers that have 'escaped' the function would get flushed before a function call. But not flushing *anything* seems very bad. dw
Re: Optimization breaks inline asm code w/ptrs
On 8/12/2017 10:14 PM, Andrew Pinski wrote: On Sat, Aug 12, 2017 at 10:08 PM, Andrew Pinski wrote: On Sat, Aug 12, 2017 at 9:21 PM, David Wohlferd wrote:

Environment: gcc 6.1 compiling for 64bit i386 optimizations: -O2 Consider this simple bit of code (from https://stackoverflow.com/a/45656087/2189500): #include <stdio.h> int getStringLength(const char *pStr){ int len; __asm__ ( "repne scasb\n\t" "not %%ecx\n\t" "dec %%ecx" :"=c" (len), "+D"(pStr) :"c"(-1), "a"(0) ); return len; } int main() { char buff[50] = "hello world"; int a = getStringLength(buff); printf("%s: %d\n", buff, a); } This code works as expected and prints out 11. Yay. However, if you add "buff[4] = 0;" before the call to getStringLength, it STILL prints out 11 (when optimizations are enabled), when it should print 4. I would expect this kind of behavior if the asm were in 'main.' But it has always been my understanding that function calls performed an implicit memory clobber. The fact that this clobber goes away during inlining means that code can stop working any time the compiler makes a different decision about whether or not to inline a function. Ouch. And before somebody asks: Adding "+m"(pStr) does not help.

But does adding: "+m"(*pStr) Help? "+m"(pStr) Just says pStr variable changes, not what it points to.

Using "+m"(*pStr) gives a compile error (read-only location used as output). Using "m"(*pStr) as an (unused) input parameter has no effect. Using "m"(*pStr) as a used input parameter can work. But let's not get off track here. The purpose of this question isn't "how do I make this code work?" I can already give you at least 3 different work-arounds. The goal here is to understand how function parameters can (safely) be passed to inline asm.

I should ask why are you using inline-asm for this? strlen will have the best optimized version for your processor anyways.

I don't actually care about this particular example.
It's just a question I was answering for a user on SO (see link above) when someone pointed out the subtle, but more serious problem. So it just serves as a 'minimal' example to illustrate the issue I'm asking about, which is:

The result is that (apparently) you can NEVER safely pass a buffer pointer to inline asm without using the memory clobber. If this is true, I don't believe it is widely known.

This seems like a "by definition" thing. By definition when writing a function, you should assume one of these two things:

1) It is perfectly reasonable for a function to assume that on entry all pointers to memory that the function can access have been clobbered, and thus those pointers can be safely passed to inline asm.

2) It is never (never never never) reasonable for a function to assume that on entry (or indeed at any time) that a pointer to memory can be used to read that memory via inline asm unless either a memory constraint is used, or a memory clobber is included.

Observation suggests that despite my expectations, #1 is false, which implies #2 is correct. I don't know that this is generally understood. Passing function parameters (including pointers) to inline asm is not uncommon. If #2 is the rule, I expect I'm not the first person to break it.

Next question: Is this by design? Just because it does behave this way doesn't mean that it should. As I say, I expected that calling a function always does an implied clobber (at least for escaped pointers if not a complete clobber). And indeed, if getStringLength isn't inlined (ie via attribute), the code always works as expected. If there is an implicit clobber for noinline, why wouldn't there be one for inline? Wouldn't this be something inherent in the definition of "declaring and calling a function?"

And perhaps more significantly: This exact same code compiled with -m32 instead of -m64 works correctly (not sure about other architectures).
Could this behavior be an unintended consequence of some overly aggressive optimization related to 64bit inlining? How SHOULD this work? Given how 'heavy' memory clobbers are, I would hope that only pointers that have 'escaped' the function would get flushed before a function call. But not flushing *anything* seems very bad. dw
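For anyone who lands here looking for one of the work-arounds, below is a minimal sketch of the memory-input approach, assuming an x86 target and AT&T syntax. The array cast on the "m" operand follows an idiom I recall from GCC's Extended Asm documentation; it is meant to tell the compiler that the asm reads the whole buffer, without a full "memory" clobber. The name string_length and the strlen fallback for other targets are hypothetical additions, not code from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example: length of a NUL-terminated string via repne scasb. */
int string_length(const char *s)
{
    assert(s != NULL);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int len;
    __asm__ (
        "repne scasb\n\t"           /* scan for al (0), decrementing the count register */
        "not %%ecx\n\t"
        "dec %%ecx"
        : "=c" (len), "+D" (s)
        : "m" (*(const char (*)[]) s),  /* dummy input: the asm reads the buffer */
          "0" (-1), "a" (0)
        : "cc"
    );
    return len;
#else
    return (int) strlen(s);         /* portable fallback, assumed only for non-x86 builds */
#endif
}
```

With the buffer declared as an input, an earlier store such as "buff[4] = 0;" should have to be visible in memory before the asm runs, whether or not the function ends up inlined.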