Re: Static Analyzer: correlate state for different region/svalue for reporting

2023-03-16 Thread Pierrick Philippe

On 15/03/2023 17:26, David Malcolm wrote:

On Wed, 2023-03-15 at 16:24 +0100, Pierrick Philippe wrote:

Hi everyone,

Hi Pierrick


I have some questions regarding the analyzer.
I'm currently working on a fully out-of-tree static analyzer plugin.
I started development on commit tagged as /basepoints/gcc-13/, but
recently moved my code to a more recent /trunk/ branch (build
/20230309/).

From my various experiments and readings of the analyzer's log, I have the
intuition that the '/ana::state_machine::state/' class is mostly (if not
only) linked with '/ana::svalue/',

Yes: a program_state (which describes the state at a particular node
along an execution path) contains (among other things) an sm_state_map
per state_machine, and the sm_state_map maps from svalues to
state_machine::states.


i.e. the lvalue of a memory region.
In my analysis case, I would like to also be able to track state for
some rvalue of a memory region, i.e. '/ana::region/' objects.

You're using the terms "lvalue" and "rvalue", but if I'm reading your
email right, you're using them in exactly the opposite way that I
understand them.

I apologize, you are completely right, I swapped their usage...


I think of an "lvalue" as something that can be on the left-hand side
of an assignment: something that can be written to (an ana::region),
whereas an "rvalue" is something that can be on the right-hand side of
an assignment: a pattern of bits (an ana::svalue) that can be written into
an lvalue.

An ana::svalue is a pattern of bits (possibly symbolic, such as "the
constant 42" or "the initial value of param x")

An ana::region is a description of a memory location for reads/writes
(possibly symbolic, such as "local var x within this frame", or "the
block reached by dereferencing the initial value of ptr in the frame
above").
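
To make the distinction concrete, here is a minimal C sketch annotated with
that vocabulary.  The function name read_through_pointer is made up for
illustration, and the comments are one reading of the definitions above, not
output from the analyzer.

```c
/* Each local variable names a region (a place that can be written to);
   each right-hand side evaluates to an svalue (a pattern of bits).  */
int read_through_pointer(void)
{
    int x = 42;   /* region: local "x"; svalue stored in it: the constant 42 */
    int *y = &x;  /* region: local "y"; svalue stored in it: a pointer to region "x" */
    int z = *y;   /* region: local "z"; svalue stored in it: whatever "x" holds, here 42 */
    return z;
}
```

Under this reading, state attached to the constant 42 follows the value into
"z", while "y" holds a different svalue (the pointer to "x").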

Sorry if I've misread things; I'll try to answer the rest of the email
as best as I can...


So, first question: is there any way to associate and track the state of
an rvalue, independently of its lvalue?

To try to clarify the question, here's an example:

'''
int __attribute__((__some_attribute__)) x = 42;
/* STEP-1
From now on, we consider the state of x as being marked by some_attribute.
But in fact, in the log, we can observe that we'll have something like
this in the new '/ana::program_state/':
{0x4b955b0: (int)42: marked_state (‘x’)} */

Yes: sm-state is associated within a program_state with an svalue, in
this case with the constant 42.

There isn't yet a way to instead give sm-state to "x" itself.  I
suppose you could give sm-state to &x (the pointer that points to x is
an instance of region_svalue, which is an svalue), but I haven't tried
this approach myself.


int *y = &x;
/* STEP-2
For analysis purposes, you want to know that from now on, y is pointing
to marked data.
So you set the state of the LHS of the GIMPLE statement (i.e. some ssa_name
instance of y) accordingly, with a state named 'points-to_marked_data'
and setting 'x' as the origin of the state (in the sense of the argument
/origin/ from '/ana::sm_context::on_transition/').
What we now have in the new '/ana::program_state/' is this:
{0x4b9acb0: &x: points-to-marked_data (‘&x’) (origin: 0x4b955b0: (int)42
(‘x’)), 0x4b955b0: (int)42: marked_state (‘x’)} */

Yes: you've set the state on the svalue "&x", not on "y".


int z = *y;
/* STEP-3
Now you have an implicit copy of marked data, and you want to report it.
So you set the state of the LHS of the GIMPLE statement (i.e. some ssa_name
instance of z) as being marked, with 'y' as the origin.
What we now have in the new '/ana::program_state/' is this:
{0x4b9acb0: &x: points-to-marked_data (‘&x’) (origin: 0x4b955b0: (int)42
(‘x’)), 0x4b955b0: (int)42: marked_state (‘x’)} */
'''

Presumably the program_state also shows that you have a binding for the
region "z", containing the svalue 42 (this is represented within the
"store", within the region_model within the program_state).


Indeed, there is a binding for region "z" to the svalue 42 within the 
new program state.



In STEP-2:

We lost the information saying that the rvalue of y (i.e. y) is
pointing to marked data.
Only the information that its lvalue (i.e. &x) is pointing to marked
data remains, which is of course true.

Note that the analyzer by default attempts to purge state that it
doesn't think will be needed anymore, so it may have purged the state
of "y" if y isn't going to be read from anymore.  You could try
-fno-analyzer-state-purge to avoid this purging.


Nothing changes when I run it with -fno-analyzer-state-purge;
it is still the state of &x which is tracked.


In STEP-3:

No information is added regarding z in the new program_state.
In fact, if one has a closer look at the log, we see that the LHS of
the GIMPLE statement (the ssa_name instance of z) is already in the
state of 'marked_data'.
This can be seen in the log at the call to '/sm_context::on_transition/':
  marked_sm: state transition of ‘z_5’: marked_data -> marked_data

All of this step i

Re: [GSoC] gccrs Unicode support

2023-03-16 Thread Raiki Tamura via Gcc
Sorry for resending this email. I forgot to use “Reply All”.

Thank you for your response, Arsen and Jakub.
I did not know C++ also supports Unicode identifiers.
I looked a little into C++ and found C++ accepts the same form of
identifiers as Rust.
So I will do further investigation of libcpp with the hope that it can also
be used in the Rust frontend.

Raiki Tamura

On Thu, Mar 16, 2023 at 0:18 Jakub Jelinek wrote:

> On Wed, Mar 15, 2023 at 11:00:19AM +, Philip Herron via Gcc wrote:
> > Excellent work on getting up to speed on the rust front-end. From my
> > perspective I am interested to see what the wider GCC community thinks
> > about using https://www.gnu.org/software/libunistring/ library within
> GCC
> > instead of rolling our own, this means it will be another dependency on
> GCC.
> >
> > The other option is there is already code in the other front-ends to do
> > this so in the worst case it should be possible to extract something out
> of
> > them and possibly make this a shared piece of functionality which we can
> > mentor you through.
>
> I don't know what exactly Rust FE needs in this area, but e.g. libcpp
> already handles whatever C/C++ need from Unicode support POV and can handle
> it without any extra libraries.
> So, if we could avoid the extra dependency, it would be certainly better,
> unless you really need massive amounts of code from those libraries.
> libcpp already e.g. provides mapping of unicode character names to code
> points, determining which unicode characters can appear at the start or
> in the middle of identifiers, etc.
>
> Jakub
>
>


Re: [GSoC] gccrs Unicode support

2023-03-16 Thread Thomas Schwinge
Hi!

(By the way, this GSoC project is being discussed in GCC/Rust Zulip:
.)

I'm now also putting Mark Wielaard in CC; he once also started discussing
this topic, "thinking of importing a couple of gnulib modules to help
with UTF-8 processing [unless] other gcc frontends handle [these things]
already in a way that might be reusable".  See the thread starting at

"rust frontend and UTF-8/unicode processing/properties".

On 2023-03-15T16:18:18+0100, Jakub Jelinek via Gcc wrote:
> On Wed, Mar 15, 2023 at 11:00:19AM +, Philip Herron via Gcc wrote:
>> Excellent work on getting up to speed on the rust front-end. From my
>> perspective I am interested to see what the wider GCC community thinks
>> about using https://www.gnu.org/software/libunistring/ library within GCC
>> instead of rolling our own, this means it will be another dependency on GCC.
>>
>> The other option is there is already code in the other front-ends to do
>> this so in the worst case it should be possible to extract something out of
>> them and possibly make this a shared piece of functionality which we can
>> mentor you through.
>
> I don't know what exactly Rust FE needs in this area, but e.g. libcpp
> already handles whatever C/C++ need from Unicode support POV and can handle
> it without any extra libraries.
> So, if we could avoid the extra dependency, it would be certainly better,
> unless you really need massive amounts of code from those libraries.
> libcpp already e.g. provides mapping of unicode character names to code
> points, determining which unicode characters can appear at the start or
> in the middle of identifiers, etc.

So that's exactly the answer that I supposed you or someone else would
give.  ;-)

That means, GCC/Rust has some investigation to do: whether what libcpp
contains is (a) sufficient for its needs, and (b) whether that code can
be reused/extracted/refactored in a sensible way, into GCC-level shared
source code file, to be used by several front ends (possibly via libcpp).
(I suppose GCC/Rust shouldn't link in libcpp directly.)


Thanks for the input, all!


Grüße
 Thomas


Re: [GSoC] gccrs Unicode support

2023-03-16 Thread Mark Wielaard
Hi,

On Thu, 2023-03-16 at 10:28 +0100, Thomas Schwinge wrote:
> I'm now also putting Mark Wielaard in CC; he once also started discussing
> this topic, "thinking of importing a couple of gnulib modules to help
> with UTF-8 processing [unless] other gcc frontends handle [these things]
> already in a way that might be reusable".  See the thread starting at
> 
> "rust frontend and UTF-8/unicode processing/properties".

Thanks. BTW. I am not currently working on this.
Note the responses in the above thread by Ian and Jason who pointed out
that some of the requirements of the gccrs frontend might be covered in
the go frontend and libcpp, but not really in a reusable way.

One other thing you might want to coordinate on is NFC normalization
and Confusable Detection for identifiers.
https://unicode.org/reports/tr39/#Confusable_Detection
There has been some work on this by David Malcolm and Marek Polacek
https://developers.redhat.com/articles/2022/01/12/prevent-trojan-source-attacks-gcc-12
But that is on a slightly higher source level (not specific to
identifiers).

You might want to research whether NFC normalization of identifiers is
required to be done by the lexer or parser in Rust and how it interacts
with proc macros.

Cheers,

Mark


Re: [GSoC] gccrs Unicode support

2023-03-16 Thread Jakub Jelinek via Gcc
On Thu, Mar 16, 2023 at 01:58:57PM +0100, Mark Wielaard wrote:
> On Thu, 2023-03-16 at 10:28 +0100, Thomas Schwinge wrote:
> > I'm now also putting Mark Wielaard in CC; he once also started discussing
> > this topic, "thinking of importing a couple of gnulib modules to help
> > with UTF-8 processing [unless] other gcc frontends handle [these things]
> > already in a way that might be reusable".  See the thread starting at
> > 
> > "rust frontend and UTF-8/unicode processing/properties".
> 
> Thanks. BTW. I am not currently working on this.
> Note the responses in the above thread by Ian and Jason who pointed out
> that some of the requirements of the gccrs frontend might be covered in
> the go frontend and libcpp, but not really in a reusable way.

libcpp can be certainly linked into the gccrs FE and specific functions
called from it even if libcpp isn't used as a preprocessor for the language.
Small changes to libcpp are obviously possible as well to make it work.

Jakub



Re: Static Analyzer: correlate state for different region/svalue for reporting

2023-03-16 Thread David Malcolm via Gcc
On Thu, 2023-03-16 at 09:54 +0100, Pierrick Philippe wrote:
> On 15/03/2023 17:26, David Malcolm wrote:
> > On Wed, 2023-03-15 at 16:24 +0100, Pierrick Philippe wrote:

[...snip...]

> > 
> > 
> > An ana::svalue is a pattern of bits (possibly symbolic, such as
> > "the
> > constant 42" or "the initial value of param x")
> > 
> > An ana::region is a description of a memory location for
> > reads/writes
> > (possibly symbolic, such as "local var x within this frame", or
> > "the
> > block reached by dereferencing the initial value of ptr in the
> > frame
> > above").
> > 
> > Sorry if I've misread things; I'll try to answer the rest of the
> > email
> > as best as I can...
> > 
> > > So, first question: is there any way to associate and track the
> > > state
> > > of
> > > a rvalue, independently of its lvalue?
> > > 
> > > To try to clarify the question, here's an example:
> > > 
> > > '''
> > > int __attribute__((__some_attribute__)) x = 42;
> > > /* STEP-1
> > >   From now on, we consider the state of x as being marked by
> > > some_attribute.
> > > But in fact, in the log, we can observe that we'll have something
> > > like
> > > this in the new '/ana::program_state/':
> > > {0x4b955b0: (int)42: marked_state (‘x’)} */
> > Yes: sm-state is associated within a program_state with an svalue,
> > in
> > this case with the constant 42.
> > 
> > There isn't yet a way to instead give sm-state to "x" itself.  I
> > suppose you could give sm-state to &x (the pointer that points to x
> > is
> > an instance of region_svalue, which is an svalue), but I haven't
> > tried
> > this approach myself.
> > 
> > > int *y = &x;
> > > /* STEP-2
> > > For analysis purpose, you want to know that from now on, y is
> > > pointing
> > > to marked data.
> > > So you set state of the LHS of the GIMPLE statement (i.e. some
> > > ssa_name
> > > instance of y) accordingly, with a state named 'points-
> > > to_marked_data'
> > > and setting 'x' as the origin of the state (in the sense of the
> > > argument
> > > /origin/ from '/ana::sm_context::on_transition/'.
> > > What we now have in the new '/ana::program_state/' is this:
> > > {0x4b9acb0: &x: points-to-marked_data (‘&x’) (origin: 0x4b955b0:
> > > (int)42
> > > (‘x’)), 0x4b955b0: (int)42: marked_state (‘x’)} */
> > Yes: you've set the state on the svalue "&x", not on "y".
> > 
> > > int z = *y;
> > > /* STEP-3
> > > Now you have implicit copy of marked data, and you want to report
> > > it.
> > > So you state the LHS of the GIMPLE statement (i.e. some ssa_name
> > > instance of z) as being marked, with 'y' as the origin.
> > > What we now have in the new '/ana::program_state/' is this:
> > > {0x4b9acb0: &x: points-to-marked_data (‘&x’) (origin: 0x4b955b0:
> > > (int)42
> > > (‘x’)), 0x4b955b0: (int)42: marked_state (‘x’)} */
> > > '''
> > Presumably the program_state also shows that you have a binding for
> > the
> > region "z", containing the svalue 42 (this is represented within
> > the
> > "store", within the region_model within the program_state).
> 
> Indeed, there is a binding for region "z" to the svalue 42 within the
> new program state.
> 
> > > In STEP-2:
> > > 
> > > We lost the information saying that the rvalue of y (i.e. y), is
> > > pointing to marked data.
> > > Only remain the information that its lvalue (i.e. &x), is
> > > pointing to
> > > marked data, which is of course true.
> > Note that the analyzer by default attempts to purge state that it
> > doesn't think will be needed anymore, so it may have purged the
> > state
> > of "y" if y isn't going to be read from anymore.  You could try
> > -fno-analyzer-state-purge to avoid this purging.
> 
> Nothing changes when I run it with the -fno-analyzer-state-purge,
> it is still the state of &x which is tracked.
> 
> > > In STEP-3:
> > > 
> > > No information is added regarding z in the new program_state.
> > > In fact, if one have a closer look at the log, we see that the
> > > LHS of
> > > the GIMPLE statement (the ssa_name instance of z), is already in
> > > the
> > > state of 'marked_data'.
> > > Through the log to the call to '/sm_context::on_transition/' this
> > > can
> > > be
> > > seen:
> > >   marked_sm: state transition of ‘z_5’: marked_data ->
> > > marked_data
> > > 
> > > All of this step is somehow leading to confusing diagnostic.
> > > For example, you miss the fact that 'z' is being marked, because
> > > no
> > > event happen as it is somehow aliasing 'x' svalue.
> > > Though it might not be true in case of missed optimization.
> > > 
> > > Of course, if anyone wants to have a look at my code, I can
> > > provide
> > > it
> > > to you (it is not yet publicly available at the moment).
> > I think I'd have to look at the code to be of more help; I confess
> > that
> > I stopped understanding somewhere around step 3, sorry.
> No worries, the idea is that regarding the state_map, z_3 is already 
> considered as marked.
> > Are you able to post a simple example of the code you'd like to
> > analyze,

[Tree-SSA] Question from observation, bogus SSA form?

2023-03-16 Thread Pierrick Philippe

Hi everyone,

I have been experimenting with the analyzer, and I usually dump the SSA tree
to get a view of the analyzed code.
This is how I noticed something that looks wrong, at least with respect to
the definition of SSA form.


I'm using a version of gcc built from a /trunk/ branch (/20230309/).

Here is an example code:

'''
int main(void) {
    int x = 42;
    int * y = &x;
    x = 6;
    return x;
}
'''

And here is the output from -fdump-tree-ssa-vops:

'''
;; Function main (main, funcdef_no=0, decl_uid=2739, cgraph_uid=1, 
symbol_order=0)


int main ()
{
  int * y;
  int x;
  int D.2744;
  int _5;

   :
  # .MEM_2 = VDEF <.MEM_1(D)>
  x = 42;                                           // First assignment to var_decl x

  y_3 = &x;
  # .MEM_4 = VDEF <.MEM_2>
  x = 6;                                            // Second assignment to var_decl x

  # VUSE <.MEM_4>
  _5 = x;
  # .MEM_6 = VDEF <.MEM_4>
  x ={v} {CLOBBER(eol)};

   :
:
  # VUSE <.MEM_6>
  return _5;

}
'''

The thing is, there are two distinct assignments to the same LHS tree in
two different gimple statements, which is by definition not supposed to
happen in SSA form.
Is there any particular reason this happens? Is it because the address
of x is taken and stored?


I should point out that I have not dug into the SSA form transformation
and am a newbie to the gcc source code.

So maybe my question is a bit naive or a known issue.

Thank you for your time,

Pierrick


Re: [Tree-SSA] Question from observation, bogus SSA form?

2023-03-16 Thread Martin Jambor
Hello Pierrick,

On Thu, Mar 16 2023, Pierrick Philippe wrote:
> Hi everyone,
>
> I was working around with the analyzer, but I usually dump the SSA-tree 
> to get a view of the analyzed code.
> This is how I noticed something wrong, at least in the sense of the 
> definition of SSA form.

please note that only some DECLs are put into a SSA form in GCC, these
are sometimes referred to as "gimple registers" and you can query the
predicate is_gimple_reg to figure out whether a DECL is one (putting
aside "virtual operands" which are a special construct of alias
analysis, are in an SSA form but the predicate returns false for them
for some reason).

This means that global variables, volatile variables, aggregates,
variables which are not considered aggregates but are nevertheless
partially modified (think insertion into a vector) or variables which
need to live in memory (most probably because their address was taken)
are not put into an SSA form.  For such variables, SSA form may not be
easily possible.
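
As an illustration of that distinction, here is a minimal C sketch with two
variants of the function from the question.  The function names are
hypothetical, and the comments describe what one would expect to see in an
SSA dump; the exact SSA names depend on the GCC version and options.

```c
/* "x" never has its address taken, so it can be a gimple register:
   each assignment is expected to get its own SSA name (something
   like x_1 = 42; x_2 = 6;).  */
int register_like(void)
{
    int x = 42;
    x = 6;
    return x;
}

/* Taking &x means "x" has to live in memory, at least as long as "y"
   is not optimized away.  Both stores then target the same variable,
   and are ordered through virtual operands (VDEF/VUSE) instead.  */
int memory_resident(void)
{
    int x = 42;
    int *y = &x;
    x = 6;
    (void) y;   /* silence the unused-variable warning */
    return x;
}
```

Both functions return 6; only the intermediate representation of "x"
differs.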

>
> I'm using a version of gcc build from a /trunk/ branch (/20230309/).
>
> Here is an example code:
>
> '''
> int main(void) {
>      int x = 42;
>      int * y = &x;

Here you take address of x which therefore has to live in memory at
least as long as y is not optimized away.

>      x = 6;
>      return x;
> }
> '''
>
> And here is the output from -fdump-tree-ssa-vops:
>
> '''
> ;; Function main (main, funcdef_no=0, decl_uid=2739, cgraph_uid=1, 
> symbol_order=0)
>
> int main ()
> {
>    int * y;
>    int x;
>    int D.2744;
>    int _5;
>
>     :
>    # .MEM_2 = VDEF <.MEM_1(D)>
>    x = 42;                                           // First assignment to var_decl x
>    y_3 = &x;
>    # .MEM_4 = VDEF <.MEM_2>
>    x = 6;                                            // Second assignment to var_decl x
>    # VUSE <.MEM_4>
>    _5 = x;
>    # .MEM_6 = VDEF <.MEM_4>
>    x ={v} {CLOBBER(eol)};
>
>     :
> :
>    # VUSE <.MEM_6>
>    return _5;
>
> }
> '''
>
> The thing is, there is two distinct assignment to the same LHS tree at 
> two different gimple statement, which is by definition not supposed to 
> happened in SSA form.

I think it is now clear that x is not in SSA form because (at this stage
of the compilation) it is still considered to need to live in memory.

> Is there any particular reason this happen? Is that because the address 
> of x is taken and stored?
>
> I have to precise, I did not dig into the SSA form transformation and am 
> a newbie to gcc source code.
> So maybe my question is a bit naive or a known issue.

No worries, we know these important details are not straightforward when
you see them for the first time.

Good luck with your gcc hacking!

Martin


PROBLEM !!! __ OS: ANY LINUX __ COMPILERS: gcc & g++ __ OUTPUT: BAD!!!

2023-03-16 Thread oszibarack korte via Gcc
*An unsolved problem for more than a decade!*
*Dear GNU Compiler Collection development team!*

*There is a problem with the gcc and g++ compilers for Linux operating
systems!*
*Here are 3 pieces of C and 3 pieces of C++ source code.*





*- Please compile them on any LINUX!- Run it!- Compare the output with the
corresponding source code!Summary:   THE OUTPUTS ARE BAD !!!*

*>>> C*

*1.*

// c0.c
// ENVIRONMENT: LINUX COMPILER: gcc OUTPUT: BAD !!!

#include <stdio.h>

void main (void)
{
   printf ("GNU Compiler Collection");

   while (1)
   {
   ;
   }
}


*2.*

// c1.c
// ENVIRONMENT: LINUX COMPILER: gcc OUTPUT: BAD !!!

#include <stdio.h>
#include <stdlib.h>

void main (void)
{
   printf ("Hello!");

   system ("sleep 10");

   system ("clear");

   printf ("Goodbye!");
}


*3.*

// c2.c
// ENVIRONMENT: LINUX COMPILER: gcc OUTPUT: BAD !!!

#include <stdio.h>
#include <stdlib.h>

void main (void)
{
   unsigned int   i;

   printf ("Now count up to 4.000.000.000. ST\nART! Wait...");

   for (i = 0; i < 40; i++)
   {
  ;
   }

   printf ("READY!");

   system ("gcc --version");

   printf ("i = %u", i);

   system ("uname -a");
}


*>>> C++*

*1.*

// cpp0.cpp
// ENVIRONMENT: LINUX COMPILER: g++ OUTPUT: BAD !!!

#include <iostream>

int main (void)
{
   std::cout << "GNU Compiler Collection";

   while (1)
   {
  ;
   }

   return 0;
}


*2.*

// cpp1.cpp
// ENVIRONMENT: LINUX COMPILER: g++ OUTPUT: BAD !!!

#include <iostream>

int main (void)
{
   std::cout << "Hello!";

   system ("sleep 10");

   system ("clear");

   std::cout << "Goodbye!";

   return 0;
}


*3.*

// cpp2.cpp
// ENVIRONMENT: LINUX COMPILER: g++ OUTPUT: BAD !!!

#include <iostream>

int main (void)
{
   unsigned int   i;

   std::cout << "Now count up to 4.000.000.000. ST\nART! Wait...";

   for (i = 0; i < 40; i++)
   {
  ;
   }

   std::cout << "READY!";

   system ("g++ --version");

   std::cout << "i = " << i;

   system ("uname -a");

   return 0;
}



*Thank you,Best regards,*
korte.oszibarack
*March 16th, 2023*


Re: PROBLEM !!! __ OS: ANY LINUX __ COMPILERS: gcc & g++ __ OUTPUT: BAD!!!

2023-03-16 Thread Andrew Pinski via Gcc
On Thu, Mar 16, 2023 at 10:46 AM oszibarack korte via Gcc wrote:
>
> *An unsolved problem for more than a decade!*
> *Dear GNU Compiler Collection development team!*
>
> *There is a problem with the gcc and g++ compilers for Linux operating
> systems!*
> *Here are 3 pieces of C and 3 pieces of C++ source code.*

There is no bug here. stdout is line buffered.
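
As an illustration (not part of the original report), here is a minimal C
sketch of two common ways to make text show up promptly on a terminal: end
the output with a newline, or switch stdout to unbuffered mode before any
output is written.  The helper names say_line and make_stdout_unbuffered are
hypothetical.

```c
#include <stdio.h>

/* Print MSG followed by '\n'.  When stdout is a terminal it is line
   buffered, so the newline causes the text to be written out.
   Returns the number of characters printed, as printf does.  */
int say_line(const char *msg)
{
    return printf("%s\n", msg);
}

/* Alternative: disable buffering on stdout altogether.  This should be
   called before any other operation on the stream.
   Returns 0 on success, nonzero on failure.  */
int make_stdout_unbuffered(void)
{
    return setvbuf(stdout, NULL, _IONBF, 0);
}
```

With either approach, a program such as c0.c would be expected to show its
text before entering the endless loop.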

Thanks,
Andrew



Re: PROBLEM !!! __ OS: ANY LINUX __ COMPILERS: gcc & g++ __ OUTPUT: BAD!!!

2023-03-16 Thread Jonathan Wakely via Gcc
On Thu, 16 Mar 2023 at 17:45, oszibarack korte via Gcc wrote:
>
> *An unsolved problem for more than a decade!*
> *Dear GNU Compiler Collection development team!*
>
> *There is a problem with the gcc and g++ compilers for Linux operating
> systems!*
> *Here are 3 pieces of C and 3 pieces of C++ source code.*
>
>
>
>
>
> *- Please compile them on any LINUX!- Run it!- Compare the output with the
> corresponding source code!Summary:   THE OUTPUTS ARE BAD !!!*

No, your programs are bad. You need to flush the output if you want it
to appear right away. When writing to standard output, a new line will
cause that to happen.
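
A minimal sketch of the explicit flush, assuming the goal is for the text to
be visible before a later system() call or a long loop.  The helper name
print_now is hypothetical.

```c
#include <stdio.h>

/* Write MSG to stdout and flush immediately, so the text appears even
   without a trailing newline.  Returns 0 on success, EOF on error.  */
int print_now(const char *msg)
{
    if (fputs(msg, stdout) == EOF)
        return EOF;
    return fflush(stdout);
}
```

For example, calling print_now("Hello!") before system("sleep 10") should
show the greeting before the pause rather than after it.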


gcc-10-20230316 is now available

2023-03-16 Thread GCC Administrator via Gcc
Snapshot gcc-10-20230316 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/10-20230316/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 10 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch 
releases/gcc-10 revision d640e435f156d8f825bf95c2164053b4a3a7b682

You'll find:

 gcc-10-20230316.tar.xz   Complete GCC

  SHA256=c71b27a6b617827562ade42c9b4636653825cf42b85046c61c1e1ce5eb371abf
  SHA1=34484b715cd98f4d71d087d57479b34d36d2d9ab

Diffs from 10-20230309 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-10
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


[PATCH] C, ObjC: Add -Wunterminated-string-initialization

2023-03-16 Thread Alejandro Colomar via Gcc
Warn about the following:

char  s[3] = "foo";

Initializing a char array with a string literal of the same length as
the size of the array is usually a mistake.  It is rarely the case that
one wants to create a non-terminated character sequence from a string
literal.
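
A minimal sketch of why this matters, assuming plain C (C++ rejects the
exact-fit initialization).  The helper names are hypothetical; the check
simply looks for a NUL byte inside the array.

```c
#include <string.h>

/* Return 1 if the SIZE-byte array BUF contains a NUL byte, else 0.  */
int is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* Valid C, but the terminating NUL does not fit and is dropped.  */
int exact_fit_is_terminated(void)
{
    char s[3] = "foo";
    return is_terminated(s, sizeof s);
}

/* One extra byte leaves room for the terminator.  */
int roomy_is_terminated(void)
{
    char s[4] = "foo";
    return is_terminated(s, sizeof s);
}
```

Passing the first array to strlen() or printf("%s") would read past its end,
which is the kind of mistake the warning is meant to catch.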

In some cases, for writing faster code, one may want to use arrays
instead of pointers, since that removes the need for storing an array of
pointers apart from the strings themselves.

char  *log_levels[]   = { "info", "warning", "err" };
vs.
char  log_levels[][7] = { "info", "warning", "err" };

This forces the programmer to specify a size, which might change if a
new entry is later added.  Having no way to enforce null termination is
very dangerous, however, so it is useful to have a warning for this, so
that the compiler can make sure that the programmer didn't make any
mistakes.  This warning catches the bug above, so that the programmer
will be able to fix it and write:

char  log_levels[][8] = { "info", "warning", "err" };

This warning already existed as part of -Wc++-compat, but this patch
allows enabling it separately.  It is also included in -Wextra, since
it may not always be desired (when unterminated character sequences are
wanted), but it's likely to be desired in most cases.
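
The log_levels case above can be sketched as follows, again assuming plain
C.  The table contents come from the example in this message; the names
levels7, levels8 and the counting helpers are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Width 7: "warning" has 7 characters, so its row has no room for a NUL.  */
static const char levels7[][7] = { "info", "warning", "err" };
/* Width 8: every row, including "warning", keeps its terminator.  */
static const char levels8[][8] = { "info", "warning", "err" };

/* Return 1 if the WIDTH-byte row contains a NUL byte, else 0.  */
static int row_is_terminated(const char *row, size_t width)
{
    return memchr(row, '\0', width) != NULL;
}

/* Count the rows of levels7 that lack a terminator.  */
size_t unterminated_in_levels7(void)
{
    size_t bad = 0;
    for (size_t r = 0; r < sizeof levels7 / sizeof levels7[0]; r++)
        if (!row_is_terminated(levels7[r], sizeof levels7[r]))
            bad++;
    return bad;
}

/* Count the rows of levels8 that lack a terminator.  */
size_t unterminated_in_levels8(void)
{
    size_t bad = 0;
    for (size_t r = 0; r < sizeof levels8 / sizeof levels8[0]; r++)
        if (!row_is_terminated(levels8[r], sizeof levels8[r]))
            bad++;
    return bad;
}
```

With a width of 7 exactly one row ("warning") ends up unterminated; with a
width of 8 none do.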

Link: 
Link: 
Link: 

Acked-by: Doug McIlroy 
Cc: "G. Branden Robinson" 
Cc: Ralph Corderoy 
Cc: Dave Kemper 
Cc: Larry McVoy 
Cc: Andrew Pinski 
Cc: Jonathan Wakely 
Cc: Andrew Clayton 
Signed-off-by: Alejandro Colomar 
---

Hi!

I finally have a working patch for this warning :-)
Tested with the following code:

$ cat str.c 
int main(void)
{
        char a[2] = "foo";
        char b[3] = "bar";
        char c[4] = "baz";
        char d[5] = "qwe";
        char log_levels[][N] = {  // -DN=7
                "info",
                "warning",
                "err"
        };
        return *a + *b + *c + *d + log_levels[0][0];
}

One thing which doesn't make me fully happy about this warning is that
the message is a bit worse than the one in C++.  See:

$ /opt/local/gnu/gcc/wusi/1/bin/gcc str.c \
  -Wall -Wunterminated-string-initialization -DN=8
str.c: In function ‘main’:
str.c:4:21: warning: initializer-string for array of ‘char’ is too long
    4 |         char a[2] = "foo";
      |                     ^
str.c:5:21: warning: initializer-string for array of ‘char’ is too long for C++ [-Wunterminated-string-initialization]
    5 |         char b[3] = "bar";
      |                     ^
$ /opt/local/gnu/gcc/wusi/1/bin/g++ str.c \
  -Wall -Wunterminated-string-initialization -DN=8
str.c: In function ‘int main()’:
str.c:4:21: error: initializer-string for ‘char [2]’ is too long [-fpermissive]
    4 |         char a[2] = "foo";
      |                     ^
str.c:5:21: error: initializer-string for ‘char [3]’ is too long [-fpermissive]
    5 |         char b[3] = "bar";
      |                     ^

In C++ we see the complete type in the error message, which is more
informative than "array of 'char'".  This is especially relevant for
multiline definitions, where the shown line may not contain the type,
but only the string.  However, that was already the case previously with
-Wc++-compat, so a fix for that might be better as a different patch.

$ /opt/local/gnu/gcc/wusi/1/bin/gcc str.c \
  -Wall -Wunterminated-string-initialization -DN=7
str.c: In function ‘main’:
str.c:4:21: warning: initializer-string for array of ‘char’ is too long
    4 |         char a[2] = "foo";
      |                     ^
str.c:5:21: warning: initializer-string for array of ‘char’ is too long for C++ [-Wunterminated-string-initialization]
    5 |         char b[3] = "bar";
      |                     ^
str.c:10:17: warning: initializer-string for array of ‘char’ is too long for C++ [-Wunterminated-string-initialization]
   10 |                 "warning",
      |                 ^
$ /opt/local/gnu/gcc/wusi/1/bin/g++ str.c \
  -Wall -Wunterminated-string-initialization -DN=7
str.c: In function ‘int main()’:
str.c:4:21: error: initializer-string for ‘char [2]’ is too long [-fpermissive]
    4 |         char a[2] = "foo";
      |                     ^
str.c:5:21: error: initializer-string for ‘char [3]’ is too long [-fpermissive]
    5 |         char b[3] = "bar";
  |   

Re: [PATCH] C, ObjC: Add -Wunterminated-string-initialization

2023-03-16 Thread Alejandro Colomar via Gcc


On 3/17/23 02:12, Alejandro Colomar wrote:
> Warn about the following:
> 
> char  s[3] = "foo";
> 
> Initializing a char array with a string literal of the same length as
> the size of the array is usually a mistake.  Rarely is the case where
> one wants to create a non-terminated character sequence from a string
> literal.
> 
> In some cases, for writing faster code, one may want to use arrays
> instead of pointers, since that removes the need for storing an array of
> pointers apart from the strings themselves.
> 
> char  *log_levels[]   = { "info", "warning", "err" };
> vs.
> char  log_levels[][7] = { "info", "warning", "err" };
> 
> This forces the programmer to specify a size, which might change if a
> new entry is later added.  Having no way to enforce null termination is
> very dangerous, however, so it is useful to have a warning for this, so
> that the compiler can make sure that the programmer didn't make any
> mistakes.  This warning catches the bug above, so that the programmer
> will be able to fix it and write:
> 
> char  log_levels[][8] = { "info", "warning", "err" };
> 
> This warning already existed as part of -Wc++-compat, but this patch
> allows enabling it separately.  It is also included in -Wextra, since
> it may not always be desired (when unterminated character sequences are
> wanted), but it's likely to be desired in most cases.
> 
> Link: 
> Link: 
> Link: 
> 
> Acked-by: Doug McIlroy 
> Cc: "G. Branden Robinson" 
> Cc: Ralph Corderoy 
> Cc: Dave Kemper 
> Cc: Larry McVoy 
> Cc: Andrew Pinski 
> Cc: Jonathan Wakely 
> Cc: Andrew Clayton 
> Signed-off-by: Alejandro Colomar 
> ---
> 
> Hi!
> 
> I finally have a working patch for this warning :-)
> Tested with the following code:
> 
>   $ cat str.c 
>   int main(void)
>   {
>   char a[2] = "foo";
>   char b[3] = "bar";
>   char c[4] = "baz";
>   char d[5] = "qwe";
>   char log_levels[][N] = {  // -DN=7
>   "info",
>   "warning",
>   "err"
>   };
>   return *a + *b + *c + *d + log_levels[0][0];
>   }
> 
> One thing which doesn't make me fully happy about this warning is that
> the message is a bit worse than the one in C++.  See:
> 
>   $ /opt/local/gnu/gcc/wusi/1/bin/gcc str.c \
> -Wall -Wunterminated-string-initialization -DN=8
>   str.c: In function ‘main’:
>   str.c:4:21: warning: initializer-string for array of ‘char’ is too long
>   4 | char a[2] = "foo";
> | ^
>   str.c:5:21: warning: initializer-string for array of ‘char’ is too long 
> for C++ [-Wunterminated-string-initialization]

You may notice that these messages still have the "for C++" thingy.
I removed that after testing, but since it's just text I didn't test again.

>   5 | char b[3] = "bar";
> | ^
>   $ /opt/local/gnu/gcc/wusi/1/bin/g++ str.c \
> -Wall -Wunterminated-string-initialization -DN=8
>   str.c: In function ‘int main()’:
>   str.c:4:21: error: initializer-string for ‘char [2]’ is too long 
> [-fpermissive]
>   4 | char a[2] = "foo";
> | ^
>   str.c:5:21: error: initializer-string for ‘char [3]’ is too long 
> [-fpermissive]
>   5 | char b[3] = "bar";
> | ^
> 
> In C++ we see the complete type in the error message, which is more
> informative than "array of 'char'".  This is especially relevant for
> multiline definitions, where the shown line may not contain the type,
> but only the string.  However, that was already the case previously with
> -Wc++-compat, so a fix for that might be better as a different patch.
> 
>   $ /opt/local/gnu/gcc/wusi/1/bin/gcc str.c \
> -Wall -Wunterminated-string-initialization -DN=7
>   str.c: In function ‘main’:
>   str.c:4:21: warning: initializer-string for array of ‘char’ is too long
>   4 | char a[2] = "foo";
> | ^
>   str.c:5:21: warning: initializer-string for array of ‘char’ is too long 
> for C++ [-Wunterminated-string-initialization]
>   5 | char b[3] = "bar";
> | ^
>   str.c:10:17: warning: initializer-string for array of ‘char’ is too 
> long for C++ [-Wunterminated-string-initialization]
>  10 | "warning",
> | ^
>   $ /opt/local/gnu/gcc/wusi/1/bin/g++ str.c \
> -Wall -Wunterminated-string-initialization -DN=7
>   str.c: In function ‘int main()’:
>   str.c:4