Re: Cron sh /home/gccadmin/scripts/update_version_git

2020-11-25 Thread Martin Liška

On 11/25/20 2:58 AM, H.J. Lu via Gcc wrote:

On Tue, Nov 24, 2020 at 5:19 PM Joseph Myers  wrote:


On Wed, 25 Nov 2020, (Cron Daemon) via Gccadmin wrote:


=== Working on: master ===
branch pulled and checked out
61 revisions since last Daily bump
Traceback (most recent call last):
  File "../gcc-changelog/git_update_version.py", line 143, in <module>
    update_current_branch()
  File "../gcc-changelog/git_update_version.py", line 100, in update_current_branch
    % (commit.hexsha, head.hexsha))
  File "/tmp/tmp.yGFjw60tQz/gcc-changelog/git_repository.py", line 76, in parse_git_revisions
    commit_to_info_hook=commit_to_info)
  File "/tmp/tmp.yGFjw60tQz/gcc-changelog/git_commit.py", line 281, in __init__
    self.info = self.commit_to_info_hook(self.revert_commit)
  File "/tmp/tmp.yGFjw60tQz/gcc-changelog/git_repository.py", line 37, in commit_to_info
    c = repo.commit(commit)
  File "/home/gccadmin/.local/lib/python3.6/site-packages/git/repo/base.py", line 480, in commit
    return self.rev_parse(str(rev) + "^0")
  File "/home/gccadmin/.local/lib/python3.6/site-packages/git/repo/fun.py", line 213, in rev_parse
    obj = name_to_object(repo, rev[:start])
  File "/home/gccadmin/.local/lib/python3.6/site-packages/git/repo/fun.py", line 147, in name_to_object
    raise BadName(name)
gitdb.exc.BadName: Ref 'c4fa3728ab4f78984a549894e0e8c4d6a253e540,' did not resolve to an object


I don't know where that comma after the commit id came from, but something
appears to have broken the update_version_git cron job.


Could it be

commit ce2d9549f2b2bcb70a1a6f8f4e776e1ed427546b
Author: Ulrich Weigand 
Date:   Tue Nov 24 19:30:01 2020 +0100

 Revert: "Fix -ffast-math flags handling inconsistencies"

 This reverts commit c4fa3728ab4f78984a549894e0e8c4d6a253e540,
^^^
 which caused a regression in the default for flag_excess_precision.






Hello.

I fixed that in d3e763efcb85d2b5967aeb3178567e435e796420, and Jakub will run the
cron job manually.

Thanks for the heads-up.
Martin
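
For illustration only, here is a minimal sketch of how a changelog script
could pull the reverted hash out of a commit message while tolerating
trailing punctuation such as the comma in the traceback above.  It is not the
actual change made in d3e763efcb85d2b5967aeb3178567e435e796420; the names
REVERT_PREFIX, HASH_RE and reverted_commit are hypothetical.

```python
import re

# Hypothetical names; this is not the code in gcc-changelog/git_commit.py.
REVERT_PREFIX = "This reverts commit "
# Accept abbreviated or full hex object names and stop at the first
# character that is not a hex digit (",", ".", whitespace, ...).
HASH_RE = re.compile(r"[0-9a-f]{7,40}")


def reverted_commit(message):
    """Return the hash named on a 'This reverts commit' line, or None."""
    for line in message.splitlines():
        line = line.strip()
        if line.startswith(REVERT_PREFIX):
            match = HASH_RE.match(line[len(REVERT_PREFIX):])
            if match:
                return match.group(0)
    return None
```

Matching only hex digits, rather than taking the rest of the line, means a
stray "," or "." after the hash never reaches the repository lookup.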


Re: DWARF64 gcc/clang flag discussion

2020-11-25 Thread Richard Biener via Gcc
On Tue, Nov 24, 2020 at 7:38 PM David Blaikie  wrote:
>
> On Tue, Nov 24, 2020 at 3:11 AM Jakub Jelinek  wrote:
> >
> > On Tue, Nov 24, 2020 at 12:04:45PM +0100, Mark Wielaard wrote:
> > > Hi,
> > >
> > > On Tue, 2020-11-24 at 08:50 +0100, Richard Biener wrote:
> > > > On Tue, Nov 24, 2020 at 8:45 AM Jakub Jelinek  wrote:
> > > > > I agree with Richard and I'd lean towards -gdwarf32/-gdwarf64, even
> > > > > when DWARF 32 is released in 81 years from now or how many, it would
> > > > > use -gdwarf-32.
> > > >
> > > > Works for me.  Let's go with -gdwarf32/64.
> > >
> > > I don't have a strong opinion, so if that is the consensus, lets go
> > > with that. The only open question (which I wanted to avoid by picking
> > > -f...) is whether it enables generating debuginfo as is normal when
> > > using any -goption, or whether you need another -goption to explicitly
> > > turn on debuginfo generation when using -gdwarf32/64? My preference
> > > would be that any option starting with -g enables debuginfo generation
> > > and no additional -g is needed to keep things consistent.
> >
> > I think we lost that consistency already, I think -gsplit-dwarf has been
> > changed quite recently not to imply -g.
>
> My understanding was that that change hasn't gone in at this point, in
> part because of the issue of changing the semantics of an existing
> flag and discussions around whether -g implies debug info. Could you
> confirm if this change has been made in GCC? as it may be important to
> make a similar change in Clang for consistency.
>
> Not that Split DWARF would be the first example of -g flags that don't
> imply -g. (-ggnu-pubnames, I think, comes to mind)
>
> > That said, for -gdwarf32/64, I think it is more sensible to enable debug
> > info than not to.
>
> Given my (& I think others on both GCC and Clang from what I gathered
> from the previous threads) fairly strong desire to allow selecting
> features without enabling debug info - perhaps it'd make sense for
> Clang to implement -fdwarf32/64 and then can implement -gdwarf32/64
> for compatibility whenever GCC does (without implementing -gdwarf32/64
> with potentially differing semantics than GCC re: enabling debug info)
>
> Seems these conversations end up with a bunch of different
> perspectives which is compounding the inconsistencies/variety in
> flags.
>
> If there's general agreement that -g* flags should imply -g, maybe we
> could carveout the possibility then that -f flags can affect debug
> info generation but don't enable it? For users who want to be able to
> change build-wide settings while having potentially
> per-library/per-file customization. (eg: I want to turn down the debug
> info emission on this file (to, say, -gmlt) but I don't want to force
> debug info on for this file regardless of build settings)

I don't think that all -g switches have to enable debuginfo generation.
Historically the -g flags selecting a debuginfo format did and I guess
we need to continue to do that for backward compatibility (-gdwarf,
-gstabs, etc.).  All other -g flags should not enable debug and some
clearly don't, like -gcolumn-info which is even enabled by default.
Also -gno-pubnames does not disable debug.

From looking at the source, the following options enable debug:

-g
-gN
-gdwarf
-gdwarf-N
-ggdb
-gstabs
-gstabs+
-gvms
-gxcoff
-gxcoff+

all others do not.  And yes, the -gsplit-dwarf change went in.

Richard.

> - Dave
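
As a rough aid for build scripts, the list above can be written down as a
small lookup.  This is only a sketch of the list as given in this message (it
is not derived from the GCC sources), and the names EXACT, PATTERNS and
enables_debug are hypothetical.

```python
import re

# Options that enable debug info, copied from the list in this message.
EXACT = {"-g", "-gdwarf", "-ggdb", "-gstabs", "-gstabs+",
         "-gvms", "-gxcoff", "-gxcoff+"}
# The "-gN" and "-gdwarf-N" entries, where N is a number.
PATTERNS = [re.compile(r"-g\d+"), re.compile(r"-gdwarf-\d+")]


def enables_debug(option):
    """Return True if the option is on the list of flags that enable debug."""
    return option in EXACT or any(p.fullmatch(option) for p in PATTERNS)
```

With this table, enables_debug("-gsplit-dwarf") and
enables_debug("-gcolumn-info") are False, while enables_debug("-gdwarf-5") is
True.  The proposed -gdwarf32/-gdwarf64 are deliberately absent because their
behaviour is still under discussion in this thread.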


Re: Reassociation and trapping operations

2020-11-25 Thread Richard Biener via Gcc
On Wed, Nov 25, 2020 at 8:15 AM Marc Glisse  wrote:
>
> On Wed, 25 Nov 2020, Ilya Leoshkevich via Gcc wrote:
>
> > I have a C floating point comparison (a <= b && a >= b), which
> > test_for_singularity turns into (a <= b && a == b) and vectorizer turns
> > into ((a <= b) & (a == b)).  So far so good.
> >
> > eliminate_redundant_comparison, however, turns it into just (a == b).
> > I don't think this is correct, because (a <= b) traps and (a == b)
> > doesn't.
>
>
> Hello,
>
> let me just mention the old
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53805
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53806
>
> There has been some debate about the exact meaning of -ftrapping-math, but
> don't let that stop you.

My interpretation has been that GCC considers traps not observable
unless you compile with -fnon-call-exceptions, which means that GCC
happily elides them.  That's usually in line with user expectations with
respect to optimization - they do not expect us to do less optimization
just because an operation might trap.  Of course we do have to be careful
not to introduce traps where there were none.

In particular for say

 a <= b;
 foo ();

you cannot rely on foo () never being called when a <= b traps because
its effect on control flow is not modeled in the IL (we also happily
DCE any such possibly trapping operation - the traps are not considered
unmodeled side-effects).
Even with -fnon-call-exceptions when the possible exception is not caught within
the function there are probably similar issues with respect to code motion.

Richard.

> --
> Marc Glisse


Re: Reassociation and trapping operations

2020-11-25 Thread Ilya Leoshkevich via Gcc
On Wed, 2020-11-25 at 10:53 +0100, Richard Biener wrote:
> On Wed, Nov 25, 2020 at 8:15 AM Marc Glisse 
> wrote:
> > On Wed, 25 Nov 2020, Ilya Leoshkevich via Gcc wrote:
> > 
> > > > I have a C floating point comparison (a <= b && a >= b), which
> > > > test_for_singularity turns into (a <= b && a == b) and vectorizer
> > > > turns into ((a <= b) & (a == b)).  So far so good.
> > > >
> > > > eliminate_redundant_comparison, however, turns it into just
> > > > (a == b).  I don't think this is correct, because (a <= b) traps
> > > > and (a == b) doesn't.
> > 
> > Hello,
> > 
> > let me just mention the old
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53805
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53806
> > 
> > > There has been some debate about the exact meaning of
> > > -ftrapping-math, but don't let that stop you.
>
> My interpretation has been that GCC considers traps not observable
> unless you compile with -fnon-call-exceptions, which means that GCC
> happily elides them.  That's usually in line with user expectations
> with respect to optimization - they do not expect us to do less
> optimization just because an operation might trap.  Of course we do
> have to be careful not to introduce traps where there were none.
>
> In particular for say
>
>  a <= b;
>  foo ();
>
> you cannot rely on foo () never being called when a <= b traps
> because its effect on control flow is not modeled in the IL (we also
> happily DCE any such possibly trapping operation - the traps are not
> considered unmodeled side-effects).
> Even with -fnon-call-exceptions when the possible exception is not
> caught within the function there are probably similar issues with
> respect to code motion.
> 
> Richard.
> 
> > --
> > Marc Glisse

Thanks for the explanation, that's good to know.  I'll rather need to
adjust my test expectations then.

Best regards,
Ilya



Re: PETITION TO REMOVE -fexec-charset in GCC. That is purely garbage and undefined behavior.

2020-11-25 Thread Zack Weinberg
> printf(“Hello World\n”); is UB under -fexec-charset= EBCDIC. WTF WTF!!!

It's not undefined behavior.  It does, however, appear to trip various
bugs in GCC.

$ cat test.c
#include <stdio.h>
int main(void) { printf("hello world\n"); }

$ gcc-9 --version | head -n1
gcc-9 (Debian 9.3.0-18) 9.3.0
$ gcc-9 -fexec-charset=EBCDIC-US test.c
during GIMPLE pass: printf-return-value
test.c: In function ‘main’:
test.c:2: internal compiler error: converting to execution character
set: Invalid or incomplete multibyte or wide character
2 | int main(void) { printf("hello world\n"); }

$ gcc-10 --version | head -n1
gcc (Debian 10.2.0-18) 10.2.0

$ gcc-10 -fexec-charset=EBCDIC-US -O2 test.c
during GIMPLE pass: strlen
test.c: In function ‘main’:
test.c:2: internal compiler error: converting to execution character
set: Invalid or incomplete multibyte or wide character
2 | int main(void) { printf("hello world\n"); }

But if you manage to avoid all the bugs, it works the way it's supposed to:

$ gcc-10 -fexec-charset=EBCDIC-US -O0 test.c
$ ./a.out | iconv -f EBCDIC-US -t UTF-8
hello world

"Internal compiler error" means "there is a bug in the compiler".  It
is not the same as "undefined behavior," which means something more
like "there is a bug in your code that the compiler is not obliged to
diagnose."

If this is not the problem you encountered, please describe in
excruciating detail what your problem actually was.

zw

p.s. I agree with you that the C "locale" mechanism and the C
standard's concept of "execution character set" are poorly designed
and one is usually better off writing code that avoids depending on
them.  But please understand that it's almost impossible to remove
_anything_ from the C standard, because the main thing C has going for
it anymore is backward compatibility all the way to the 1980s.  We
will not be dropping -fexec-charset as long as it's a feature of the C
standard.
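
The round trip through iconv shown above can be imitated outside the
compiler.  The sketch below is only an approximation: Python has no codec
called EBCDIC-US, so it assumes cp037 (IBM EBCDIC US/Canada) as a stand-in,
and the helper names are hypothetical.

```python
# Assumption: cp037 stands in for iconv's EBCDIC-US; the two tables are
# not guaranteed to be identical.
CODEC = "cp037"


def to_exec_charset(text, codec=CODEC):
    """Encode a source string the way an EBCDIC execution charset would."""
    return text.encode(codec)


def from_exec_charset(data, codec=CODEC):
    """Decode program output back to text, like `iconv -f ... -t UTF-8`."""
    return data.decode(codec)


if __name__ == "__main__":
    raw = to_exec_charset("hello world\n")
    print(raw.hex())                      # bytes differ from ASCII
    print(from_exec_charset(raw), end="")
```

The encoded bytes have the same length as the ASCII string but different
values, which is why the output of ./a.out has to be piped through a
converter before it is readable on an ASCII or UTF-8 terminal.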


Re: unnormal Intel 80-bit long doubles and isnanl

2020-11-25 Thread Joseph Myers
On Wed, 25 Nov 2020, Siddhesh Poyarekar wrote:

> Would you agree to treating unnormals as NaNs and consequently have glibc
> provide that guarantee in the library instead of either declaring it undefined
> or maintaining the status quo, i.e. keeping it unspecified?

I think it would be a pain to maintain test coverage for unnormals (and 
presumably all the other kinds of unsupported operands, and you'd need to 
work out what semantics you want for pseudo-denormals as well since those 
are the one kind of such representation the processor doesn't raise 
"invalid" for) for all the functions with floating-point arguments - and 
claiming to handle those consistently requires having such test coverage 
(there are only a few tests for such format-specific representations in 
sysdeps/ieee754/ldbl-96 at present).

But maybe you could set up some mechanism by which, when gen-libm-test.py 
processes a test using snan_value or snan_value_ld (but not 
snan_value_pl), and the relevant format is one of the format variants that 
has these representations, it automatically generates tests for all those 
variants (that the processor raises "invalid" for when handling as 
operands, i.e. treats much like sNaN).  I'm not sure if it's actually 
possible to generate a static initializer for a long double value with one 
of those representations, or only for a union containing a long double 
where another member is initialized; if a union type needs to be used in 
the tables of test inputs, that further complicates things.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Reassociation and trapping operations

2020-11-25 Thread Joseph Myers
On Wed, 25 Nov 2020, Richard Biener via Gcc wrote:

> > Hello,
> >
> > let me just mention the old
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53805
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53806
> >
> > There has been some debate about the exact meaning of -ftrapping-math, but
> > don't let that stop you.
> 
> My interpretation has been that GCC considers traps not observable
> unless you compile with -fnon-call-exceptions which means that GCC

-ftrapping-math is primarily about raising exception flags (the only form 
of floating-point exception handling supported in ISO C), not traps in the 
sense of change of control flow.  But it's true that GCC tends to do more 
moving flag raising (by virtue of not modelling the side effects) than 
eliminating it, and more eliminating it (at least in the case of 
apparently dead code, again since the side effects are not modelled) than 
introducing extra flag raising that doesn't occur in the abstract machine.

-- 
Joseph S. Myers
jos...@codesourcery.com
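
To make the distinction concrete: the rewrites discussed in this thread
preserve the value of the expression, including for NaN operands; what
changes is only whether the IEEE "invalid" flag is raised, since an ordered
comparison such as <= signals on a quiet NaN while == does not.  The sketch
below checks the value equivalence over a few sample inputs.  Plain Python
does not expose the floating-point exception flags, so the flag difference
itself is not shown; the function names are hypothetical.

```python
import itertools
import math


def original(a, b):
    """The source-level comparison from the report."""
    return a <= b and a >= b


def simplified(a, b):
    """What the comparison is reduced to after the rewrites."""
    return a == b


SAMPLES = [0.0, -0.0, 1.0, -1.0, math.inf, -math.inf, math.nan]


def mismatches():
    """Return the sample pairs for which the two forms disagree in value."""
    return [(a, b) for a, b in itertools.product(SAMPLES, repeat=2)
            if original(a, b) != simplified(a, b)]
```

mismatches() returns an empty list for these samples, so any observable
difference between the two forms has to come from the exception flags (or
traps), not from the result.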


Re: DWARF64 gcc/clang flag discussion

2020-11-25 Thread David Blaikie via Gcc
On Wed, Nov 25, 2020 at 1:22 AM Richard Biener
 wrote:
>
> On Tue, Nov 24, 2020 at 7:38 PM David Blaikie  wrote:
> >
> > On Tue, Nov 24, 2020 at 3:11 AM Jakub Jelinek  wrote:
> > >
> > > On Tue, Nov 24, 2020 at 12:04:45PM +0100, Mark Wielaard wrote:
> > > > Hi,
> > > >
> > > > On Tue, 2020-11-24 at 08:50 +0100, Richard Biener wrote:
> > > > > On Tue, Nov 24, 2020 at 8:45 AM Jakub Jelinek  
> > > > > wrote:
> > > > > > I agree with Richard and I'd lean towards -gdwarf32/-gdwarf64, even
> > > > > > when DWARF 32 is released in 81 years from now or how many, it would
> > > > > > use -gdwarf-32.
> > > > >
> > > > > Works for me.  Let's go with -gdwarf32/64.
> > > >
> > > > I don't have a strong opinion, so if that is the consensus, lets go
> > > > with that. The only open question (which I wanted to avoid by picking
> > > > -f...) is whether it enables generating debuginfo as is normal when
> > > > using any -goption, or whether you need another -goption to explicitly
> > > > turn on debuginfo generation when using -gdwarf32/64? My preference
> > > > would be that any option starting with -g enables debuginfo generation
> > > > and no additional -g is needed to keep things consistent.
> > >
> > > I think we lost that consistency already, I think -gsplit-dwarf has been
> > > changed quite recently not to imply -g.
> >
> > My understanding was that that change hasn't gone in at this point, in
> > part because of the issue of changing the semantics of an existing
> > flag and discussions around whether -g implies debug info. Could you
> > confirm if this change has been made in GCC? as it may be important to
> > make a similar change in Clang for consistency.
> >
> > Not that Split DWARF would be the first example of -g flags that don't
> > imply -g. (-ggnu-pubnames, I think, comes to mind)
> >
> > > That said, for -gdwarf32/64, I think it is more sensible to enable debug
> > > info than not to.
> >
> > Given my (& I think others on both GCC and Clang from what I gathered
> > from the previous threads) fairly strong desire to allow selecting
> > features without enabling debug info - perhaps it'd make sense for
> > Clang to implement -fdwarf32/64 and then can implement -gdwarf32/64
> > for compatibility whenever GCC does (without implementing -gdwarf32/64
> > with potentially differing semantics than GCC re: enabling debug info)
> >
> > Seems these conversations end up with a bunch of different
> > perspectives which is compounding the inconsistencies/variety in
> > flags.
> >
> > If there's general agreement that -g* flags should imply -g, maybe we
> > could carveout the possibility then that -f flags can affect debug
> > info generation but don't enable it? For users who want to be able to
> > change build-wide settings while having potentially
> > per-library/per-file customization. (eg: I want to turn down the debug
> > info emission on this file (to, say, -gmlt) but I don't want to force
> > debug info on for this file regardless of build settings)
>
> I don't think that all -g switches have to enable debuginfo generation.

Any thoughts on this case - whether -gdwarf32/-gdwarf64 should imply -g?

> Historically the -g flags selecting a debuginfo format did and I guess
> we need to continue to do that for backward compatibility (-gdwarf,
> -gstabs, etc.).

-gdwarf-N sort of falls under this category, at least for backwards
compatibility - though whether it "selects a debuginfo format" might
be a bit open to interpretation. Where does -gdwarf32/-gdwarf64 fall
on that spectrum for you? I guess the important part is compatibility,
not whether it selects a debug info format or does something else.
There's no need for mechanical compatibility (though possibly for
human compatibility - having -gdwarf-4 enable -g but -gdwarf32 not
enable -g seems fairly subtle to me) here, but some folks on this
thread suggest -gdwarf32 should enable -g (Jakub and Jeff).

> All other -g flags should not enable debug and some
> clearly don't, like -gcolumn-info which is even enabled by default.
> Also -gno-pubnames does not disable debug.
>
> From looking at the source the following options enable debug:
>
> -g
> -gN
> -gdwarf
> -gdwarf-N
> -ggdb
> -gstabs
> -gstabs+
> -gvms
> -gxcoff
> -gxcoff+
>
> all others do not.  And yes, the -gsplit-dwarf change went in.

Oh. Seems a pity from a backwards (& sidewards with clang - though
we'll probably update ours to match to reduce that problem)
compatibility standpoint, but good to know!

- Dave


Why there are no macro to access the name of current exec-charset and fwide-exec-charset??????

2020-11-25 Thread sotrdg sotrdg via Gcc
I have to make filenames work correctly under a non-UTF-8 exec charset. Do not
tell me to use locale, since locale is not thread-safe and configuring the
locale is a huge issue.

I do not use any C stdio or C++ iostream facilities, since I wrote my own I/O
library from scratch. However, this is something I would like to handle
correctly, since none of the libc implementations does the right thing: they
all assume the exec-charset is UTF-8, and that silently creates format string
vulnerabilities.

If you cannot support this feature correctly, please just remove that
toggle, just like removing trigraphs. Clang does not support non-UTF-8
exec-charsets and no one complains about it.




Re: unnormal Intel 80-bit long doubles and isnanl

2020-11-25 Thread Siddhesh Poyarekar

On 11/26/20 12:57 AM, Joseph Myers wrote:

I think it would be a pain to maintain test coverage for unnormals (and
presumably all the other kinds of unsupported operands, and you'd need to
work out what semantics you want for pseudo-denormals as well since those
are the one kind of such representation the processor doesn't raise
"invalid" for) for all the functions with floating-point arguments - and
claiming to handle those consistently requires having such test coverage
(there are only a few tests for such format-specific representations in
sysdeps/ieee754/ldbl-96 at present).


pseudo-denormals are still considered valid, so I'm admittedly punting 
them for later since the processor manual still claims to handle them 
correctly.



But maybe you could set up some mechanism by which, when gen-libm-test.py
processes a test using snan_value or snan_value_ld (but not
snan_value_pl), and the relevant format is one of the format variants that
has these representations, it automatically generates tests for all those
variants (that the processor raises "invalid" for when handling as
operands, i.e. treats much like sNaN).  I'm not sure if it's actually
possible to generate a static initializer for a long double value with one
of those representations, or only for a union containing a long double
where another member is initialized; if a union type needs to be used in
the tables of test inputs, that further complicates things.

It would have to either be a union type with various bit patterns or a
bit string copied into a long double; the CPU will never generate any of 
the pseudo numbers on its own.


Siddhesh
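
For reference, the representations being discussed can be told apart from the
raw bits alone.  The sketch below classifies an x87 80-bit value from its
15-bit exponent and 64-bit significand (explicit integer bit in bit 63) and
packs a pattern into the 10 bytes one would copy into a long double.  It is
not glibc code; classify and encode are hypothetical names.

```python
import struct


def classify(exponent, significand):
    """Classify an x87 80-bit pattern (15-bit exponent, 64-bit significand)."""
    integer_bit = (significand >> 63) & 1
    fraction = significand & ((1 << 63) - 1)
    if exponent == 0:
        if integer_bit:
            return "pseudo-denormal"
        return "denormal" if fraction else "zero"
    if exponent == 0x7FFF:
        if not integer_bit:
            return "pseudo-NaN" if fraction else "pseudo-infinity"
        if fraction == 0:
            return "infinity"
        # Bit 62 (top fraction bit) distinguishes quiet from signaling NaN.
        return "quiet NaN" if fraction >> 62 else "signaling NaN"
    # Any other exponent requires the integer bit to be set.
    return "normal" if integer_bit else "unnormal"


def encode(sign, exponent, significand):
    """Return the 10 little-endian bytes of the 80-bit pattern."""
    return struct.pack("<QH", significand, (sign << 15) | exponent)
```

For example, classify(0x3FFF, 1 << 62) is "unnormal" (nonzero exponent with
the integer bit clear), and encode() yields the byte string that a test table
would have to carry alongside, or instead of, a long double initializer.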


Re: PETITION TO REMOVE -fexec-charset in GCC. That is purely garbage and undefined behavior.

2020-11-25 Thread Martin Sebor via Gcc

On 11/25/20 8:15 AM, Zack Weinberg wrote:

printf(“Hello World\n”); is UB under -fexec-charset= EBCDIC. WTF WTF!!!


It's not undefined behavior.  It does, however, appear to trip various
bugs in GCC.

$ cat test.c
#include <stdio.h>
int main(void) { printf("hello world\n"); }

$ gcc-9 --version | head -n1
gcc-9 (Debian 9.3.0-18) 9.3.0
$ gcc-9 -fexec-charset=EBCDIC-US test.c
during GIMPLE pass: printf-return-value
test.c: In function ‘main’:
test.c:2: internal compiler error: converting to execution character
set: Invalid or incomplete multibyte or wide character
 2 | int main(void) { printf("hello world\n"); }

$ gcc-10 --version | head -n1
gcc (Debian 10.2.0-18) 10.2.0

$ gcc-10 -fexec-charset=EBCDIC-US -O2 test.c
during GIMPLE pass: strlen
test.c: In function ‘main’:
test.c:2: internal compiler error: converting to execution character
set: Invalid or incomplete multibyte or wide character
 2 | int main(void) { printf("hello world\n"); }

But if you manage to avoid all the bugs, it works the way it's supposed to:

$ gcc-10 -fexec-charset=EBCDIC-US -O0 test.c
$ ./a.out | iconv -f EBCDIC-US -t UTF-8
hello world

"Internal compiler error" means "there is a bug in the compiler".  It
is not the same as "undefined behavior," which means something more
like "there is a bug in your code that the compiler is not obliged to
diagnose."


I suspect this is due to the same problem as:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82700

The EBCDIC-US charset doesn't define all the characters GCC
expects (the bug above says it's missing the left bracket
'[') and the GCC charset APIs don't make it possible to diagnose
this condition in a friendlier way.  (I mentioned this in response
to the duplicate bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97620)

The printf pass that fails with this error may not actually need
the left bracket so if that's the only one the conversion fails
for we could work around it by skipping it.  But if the left
bracket appears in the format string GCC will fail to translate
it and give another error (not an ICE, but still a hard error):

$ cat a.c && gcc -Wall -fexec-charset=EBCDIC-US a.c
int main(void) { __builtin_printf("hello [ world\n"); }
a.c: In function ‘main’:
a.c:1:52: error: converting to execution character set: Invalid or incomplete multibyte or wide character
    1 | int main(void) { __builtin_printf("hello [ world\n"); }
      |                                                    ^
a.c:1:35: warning: zero-length gnu_printf format string [-Wformat-zero-length]
    1 | int main(void) { __builtin_printf("hello [ world\n"); }
      |                                   ^
It seems to me the EBCDIC-US charset needs to get fixed (i.e.,
Glibc).

Martin



If this is not the problem you encountered, please describe in
excruciating detail what your problem actually was.

zw

p.s. I agree with you that the C "locale" mechanism and the C
standard's concept of "execution character set" are poorly designed
and one is usually better off writing code that avoids depending on
them.  But please understand that it's almost impossible to remove
_anything_ from the C standard, because the main thing C has going for
it anymore is backward compatibility all the way to the 1980s.  We
will not be dropping -fexec-charset as long as it's a feature of the C
standard.
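
A friendlier diagnostic of the kind mentioned above would name the characters
a charset cannot represent instead of failing on the whole string.  The
sketch below shows the idea with Python codecs; missing_chars is a
hypothetical helper and has nothing to do with GCC's charset API.  Note that
Python has no EBCDIC-US codec, and its cp037 table does define '[', so this
does not reproduce the glibc gap itself.

```python
def missing_chars(text, codec):
    """Return the characters of text that codec cannot encode, in order."""
    missing = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing
```

Checking character by character costs more than a single bulk conversion,
but it lets the caller report exactly which characters are unsupported.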





RE: PETITION TO REMOVE -fexec-charset in GCC. That is purely garbage and undefined behavior.

2020-11-25 Thread sotrdg sotrdg via Gcc
Glibc cannot deal with EBCDIC or any other charset besides UTF-8, since GCC
itself does not communicate the execution charset to the C library, and the C
library simply cannot deal with it.

Even if glibc could deal with it, GCC allows objects built with different
exec-charsets to be linked with each other, which is definitely undefined
behavior, since the linker ignores the whole thing and does not know the
differences between exec-charsets.

The entire C standard library is poorly designed, because the exec-charset
can be anything and the forced locale violates the zero-overhead principle.

I hope GCC could add macros, for example __GNUC_EXEC_CHARSET__ and
__GNUC_WIDE_EXEC_CHARSET__, to tell the program the name of the current
exec-charset. I need that to make my program run correctly under different
exec-charsets, since glibc does the wrong thing, which I have to avoid.




From: Martin Sebor
Sent: Wednesday, November 25, 2020 19:47
To: Zack Weinberg; 
gcc@gcc.gnu.org
Cc: euloa...@live.com
Subject: Re: PETITION TO REMOVE -fexec-charset in GCC. That is purely garbage 
and undefined behavior.

On 11/25/20 8:15 AM, Zack Weinberg wrote:
>> printf(“Hello World\n”); is UB under -fexec-charset= EBCDIC. WTF WTF!!!
>
> It's not undefined behavior.  It does, however, appear to trip various
> bugs in GCC.
>
> $ cat test.c
> #include <stdio.h>
> int main(void) { printf("hello world\n"); }
>
> $ gcc-9 --version | head -n1
> gcc-9 (Debian 9.3.0-18) 9.3.0
> $ gcc-9 -fexec-charset=EBCDIC-US test.c
> during GIMPLE pass: printf-return-value
> test.c: In function ‘main’:
> test.c:2: internal compiler error: converting to execution character
> set: Invalid or incomplete multibyte or wide character
>  2 | int main(void) { printf("hello world\n"); }
>
> $ gcc-10 --version | head -n1
> gcc (Debian 10.2.0-18) 10.2.0
>
> $ gcc-10 -fexec-charset=EBCDIC-US -O2 test.c
> during GIMPLE pass: strlen
> test.c: In function ‘main’:
> test.c:2: internal compiler error: converting to execution character
> set: Invalid or incomplete multibyte or wide character
>  2 | int main(void) { printf("hello world\n"); }
>
> But if you manage to avoid all the bugs, it works the way it's supposed to:
>
> $ gcc-10 -fexec-charset=EBCDIC-US -O0 test.c
> $ ./a.out | iconv -f EBCDIC-US -t UTF-8
> hello world
>
> "Internal compiler error" means "there is a bug in the compiler".  It
> is not the same as "undefined behavior," which means something more
> like "there is a bug in your code that the compiler is not obliged to
> diagnose."

I suspect this is due to the same problem as:
   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82700

The EBCDIC-US charset doesn't define all the characters GCC
expects (the bug above says it's missing the left bracket
'[') and the GCC charset APIs don't make it possible to diagnose
this condition in a friendlier way.  (I mentioned this in response
to the duplicate bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97620)

The printf pass that fails with this error may not actually need
the left bracket so if that's the only one the conversion fails
for we could work around it by skipping it.  But if the left
bracket appears in the format string GCC will fail to translate
it and give another error (not an ICE, but still a hard error):

$ cat a.c && gcc -Wall -fexec-charset=EBCDIC-US a.c
int main(void) { __builtin_printf("hello [ world\n"); }
a.c: In function ‘main’:
a.c:1:52: error: converting to execution character set: Invalid or incomplete multibyte or wide character
    1 | int main(void) { __builtin_printf("hello [ world\n"); }
      |                                                    ^
a.c:1:35: warning: zero-length gnu_printf format string [-Wformat-zero-length]
    1 | int main(void) { __builtin_printf("hello [ world\n"); }
      |                                   ^

It seems to me the EBCDIC-US charset needs to get fixed (i.e.,
Glibc).

Martin

>
> If this is not the problem you encountered, please describe in
> excruciating detail what your problem actually was.
>
> zw
>
> p.s. I agree with you that the C "locale" mechanism and the C
> standard's concept of "execution character set" are poorly designed
> and one is usually better off writing code that avoids depending on
> them.  But please understand that it's almost impossible to remove
> _anything_ from the C standard, because the main thing C has going for
> it anymore is backward compatibility all the way to the 1980s.  We
> will not be dropping -fexec-charset as long as it's a feature of the C
> standard.
>