Re: gdb 8.x - g++ 7.x compatibility

2018-02-03 Thread Manfred

n4659 17.4 (Type equivalence) p1.3:

Two template-ids refer to the same class, function, or variable if
...
their corresponding non-type template arguments of integral or 
enumeration type have identical values

...

It looks like, for non-type template arguments, template type 
equivalence is based on argument /value/, not /type/ (and value), so IMHO 
gcc is correct where it considers foo<10u> and foo<10> to be the same 
type, i.e. "refer to the same class".
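
As a small illustration of the distinction discussed here, the sketch below contrasts a template whose non-type parameter has a fixed type with one declared `template <auto>`. The names `fixed` and `deduced` are made up for this example and do not appear in the thread. With the fixed-type parameter, 10 and 10u are both converted to unsigned, so both spellings name the same class. With `auto`, the deduced argument type differs, and a C++17 compiler without the GCC 7 bug discussed later in this thread is expected to treat them as distinct classes.

```cpp
#include <type_traits>

// Non-type parameter with a fixed type: the argument is converted to
// unsigned, so fixed<10> and fixed<10u> name the same class.
template <unsigned N>
struct fixed {};

// C++17 placeholder parameter: the type deduced for N takes part in
// the identity of the specialization.
template <auto N>
struct deduced {
    using value_type = decltype(N);
};

static_assert(std::is_same_v<fixed<10>, fixed<10u>>,
              "same value after conversion, same class");
static_assert(!std::is_same_v<deduced<10>, deduced<10u>>,
              "int 10 and unsigned 10 give distinct classes");
static_assert(std::is_same_v<deduced<10u>::value_type, unsigned>,
              "the deduced type is kept");
```

The checks are compile-time only, so building the file with `-std=c++17` is enough to try it.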


FWIW, type_info reports the same class name for both templates, which 
appears to be correct as per the above.


I would think someone from gcc might be more specific on why both 
templates print 4294967286, and what debug info is actually stored by -g 
in this case.



On 2/3/2018 6:18 PM, Roman Popov wrote:

I've just checked g++ 8.0.1 from trunk, and the problem is still there. The
same happens with Clang.

This is indeed a serious issue for me, since my Python scripts for gdb
expect reliable dynamic type identification. However, gdb is
completely powerless here, so I'm forced to stay on an older compiler.
Consider this case (results with g++ 8.0.1):

#include <iostream>
struct base {
 virtual void print() = 0;
};

template< auto IVAL>
struct foo : base {
 decltype(IVAL) x = -IVAL;
 void print() override { std::cout << x << std::endl; };
};

int main()
{
 base * fu = new foo<10u>();
 base * fi = new foo<10>();
 fu->print();
 fi->print();
 return 0; // set breakpoint here
}

Now check dynamic types in GDB:

(gdb) p *fu
warning: RTTI symbol not found for class 'foo<10u>'
$1 = warning: RTTI symbol not found for class 'foo<10u>'
warning: RTTI symbol not found for class 'foo<10u>'
{_vptr.base = 0x400bd0 <vtable for foo<10u>+16>}

(gdb) p *fi
$2 = (foo<10>) {<base> = {_vptr.base = 0x400bb8 <vtable for foo<10>+16>}, x = 4294967286}

Here GDB picks the wrong type!

In RTTI the type names are different, and this is correct.

But in debuginfo both types have the same name:

foo<10> {
   unsigned x;
}
foo<10> {
   int x;
}

So GDB picks the first one, which is wrong.

-Roman








2018-02-03 6:20 GMT-08:00 Paul Smith:


On Fri, 2018-02-02 at 23:54 -0500, Simon Marchi wrote:

Your problem is probably linked to these issues, if you want to follow
them:

gcc: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81932
gdb: https://sourceware.org/bugzilla/show_bug.cgi?id=22013

As Carl said, it's a good idea to try with the latest version of both
tools, but I think the issue will still be present.

GCC changed how it outputs unsigned template parameters in the debug
info (from 2u to just 2), and it doesn't look like it's going to change
it back.  So I suppose we'll have to find a way to make GDB deal with
it.


I also tried a couple of times [1][2][3] to get a discussion started on
the mailing lists for how to resolve this but didn't get any replies,
and I got busy with other things.

We really need to find someone who is knowledgeable on type lookup in
GDB.  That person needs to engage with the experts on the GCC side and
hash out the right answer to this problem.  In my experience, Jonathan
Wakely on the GCC side is extremely responsive; I'm just too much of a
tyro to be able to discuss it with him at the level needed to find a
solution.

I think it's an extremely serious issue, if GDB can't resolve some very
common (IME) types, but so far it hasn't risen to the level of getting
attention from those who have sufficient expertise to solve it.


[1] https://gcc.gnu.org/ml/gcc-help/2017-08/msg00120.html
[2] https://sourceware.org/ml/gdb/2017-08/msg00069.html
[3] https://sourceware.org/ml/gdb/2017-09/msg00042.html



Re: gdb 8.x - g++ 7.x compatibility

2018-02-04 Thread Manfred



On 2/4/2018 6:01 AM, Simon Marchi wrote:

On 2018-02-03 13:35, Manfred wrote:

n4659 17.4 (Type equivalence) p1.3:

Two template-ids refer to the same class, function, or variable if
...
their corresponding non-type template arguments of integral or
enumeration type have identical values
...

It looks like, for non-type template arguments, template type
equivalence is based on argument /value/, not /type/ (and value), so
IMHO gcc is correct where it considers foo<10u> and foo<10> to be the
same type, i.e. "refer to the same class".

FWIW, type_info reports the same class name for both templates, which
appears to be correct as per the above.

I would think someone from gcc might be more specific on why both
templates print 4294967286, and what debug info is actually stored by
-g in this case.


I think that Roman's example clearly shows that they are not equivalent in
all cases.
I was merely reporting the wording of the standard, which would be the 
authority to follow. I may agree that not specifying type identity may 
lead to unexpected results. Personally I would prefer the standard to 
say "identical value and type" here (and it appears from your findings 
below that quality compilers already handle it this way), but this is 
only an opinion.




Building Roman's example with g++ 7.3 results in a single instantiated type.  You
can see that both "new foo<10>()" and "new foo<10u>()" end up calling the same
constructor.  It seems like which type is instantiated depends on which template
parameter (the signed or unsigned one) you use first.  So with this:

  base * fi = new foo<10>();
  base * fu = new foo<10u>();

the output is -10 for both, and with

  base * fu = new foo<10u>();
  base * fi = new foo<10>();

the output is 4294967286 for both.  But it's probably a bogus behavior.

Indeed.

I tested with clang; it instantiates two different types, so you get 4294967286 for the
<10u> case and -10 for the <10> case.  I also just built gcc from master, and it
also instantiates two types, so it seems like that was fixed recently.

So let's see what debug info gcc master generates for these two instances of foo
(clang master generates the equivalent).

 <1><9257>: Abbrev Number: 66 (DW_TAG_structure_type)
    <9258>   DW_AT_name        : (indirect string, offset: 0x8455): foo<10>
    <925c>   DW_AT_byte_size   : 16
    <925d>   DW_AT_decl_file   : 1
    <925e>   DW_AT_decl_line   : 7
    <925f>   DW_AT_decl_column : 8
    <9260>   DW_AT_containing_type: <0x92fd>
    <9264>   DW_AT_sibling     : <0x92f8>
 ...
 <1><93be>: Abbrev Number: 66 (DW_TAG_structure_type)
    <93bf>   DW_AT_name        : (indirect string, offset: 0x8455): foo<10>
    <93c3>   DW_AT_byte_size   : 16
    <93c4>   DW_AT_decl_file   : 1
    <93c5>   DW_AT_decl_line   : 7
    <93c6>   DW_AT_decl_column : 8
    <93c7>   DW_AT_containing_type: <0x92fd>
    <93cb>   DW_AT_sibling     : <0x945f>

If there are two types with the same name, how is gdb expected to differentiate
them?

If we can't rely on the DW_AT_name anymore to differentiate templated types, then
the only alternative I see would be to make GDB ignore the template part of the
DW_AT_name value, and reconstruct it in the format it expects (with the u) from the
DW_TAG_template_value_param DIEs children of DW_TAG_structure_type (there's already
code to do that in dwarf2_compute_name).  Their types correctly point to the signed
int or unsigned int DIE, so we have the necessary information.  However, that would
mean reading many more full DIEs early on, when just building partial symbols, which
would slow down loading the symbols of pretty much any C++ program.
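
To make the reconstruction idea concrete, here is a minimal sketch of rebuilding a "foo<10u>"-style name from per-parameter information. It is not GDB's dwarf2_compute_name; `TemplateValueParam` and `rebuild_name` are hypothetical names invented for the illustration, and the sketch only distinguishes int from unsigned int (a real implementation would also need suffixes for the long and long long variants).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for one DW_TAG_template_value_param child DIE.
struct TemplateValueParam {
    std::uint64_t raw_value;  // bits of the constant value
    bool is_unsigned;         // assumed to be derived from the DIE's type
};

// Rebuild a name such as "foo<10u>" from the base name and its parameters.
std::string rebuild_name(const std::string& base,
                         const std::vector<TemplateValueParam>& params) {
    std::string out = base + "<";
    for (std::size_t i = 0; i < params.size(); ++i) {
        if (i != 0) out += ", ";
        if (params[i].is_unsigned) {
            out += std::to_string(params[i].raw_value) + "u";
        } else {
            // Reinterpret the stored bits as a signed value.
            out += std::to_string(static_cast<std::int64_t>(params[i].raw_value));
        }
    }
    return out + ">";
}

// Usage: the same value yields two distinct names once the type is consulted.
void example_rebuild() {
    assert(rebuild_name("foo", {{10, true}}) == "foo<10u>");
    assert(rebuild_name("foo", {{10, false}}) == "foo<10>");
}
```

The cost mentioned above is not visible in such a sketch: the expensive part is having to read the child DIEs at all while building partial symbols.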

From what I understand from the original change that caused all this [1], removing
the suffixes was meant to make the error messages more readable for the user.
However, since foo<10>::print() and foo<10u>::print() are not the same function,
I think it would actually be more confusing if an error message talked about the
instantiation with the unsigned type, but mentioned "foo<10>::print()".  For example,
if you put a

   static_assert (std::is_signed::value);

in the print method, this is the error message from gcc:

   test.cpp: In instantiation of 'void foo::print() [with auto IVAL = 10]':
   test.cpp:24:1:   required from here
   test.cpp:12:22: error: static assertion failed
  static_assert (std::is_signed::value);
 ^~~

Wouldn't the message make more sense with a u suffix?

Probably so.



Simon

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78165



Re: gdb 8.x - g++ 7.x compatibility

2018-02-07 Thread Manfred



On 02/07/2018 02:44 PM, Simon Marchi wrote:

On 2018-02-07 02:21, Daniel Berlin wrote:
As the person who, eons ago, wrote a bunch of the GDB code for this C++
ABI support, and as someone who helped with DWARF support in both GDB and
GCC, let me try to propose a useful path forward (in the hopes that someone
will say "that's horrible, do it this way instead")

Here are the constraints I believe we are working with.

1. GDB should work with multiple DWARF producers and multiple C++ compilers
implementing the C++ ABI
2. There is no canonical demangled format for the C++ ABI
3. There is no canonical target demangler you can say everyone should use
(and even if there was, you don't want debugging to stop working because
someone chose not to use it)
4. You don't want to slow down GDB if you can avoid it
5. Despite them all implementing the same ABI, it's still possible to
distinguish the producers by the producer/compiler in the dwarf info.

Given all that:

GDB has ABI hooks that tell it what to do for various C++ ABIs. This is how
it knows to call the right demangler for gcc v3's abi vs gcc v2's abi, and
handle various differences between them.

See gdb/cp-abi.h

The, IMHO, obvious thing to do here is: handle the resulting demangler
differences with 1 or more new C++ ABI hooks.
Or, introduce C++ debuginfo producer hooks that the C++ ABI hooks use if
folks want it to be separate.

Once the producer is detected, fill in the hooks with a set of functions
that does the right thing.
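
A rough sketch of what "fill in the hooks once the producer is detected" could look like follows. Everything in it is hypothetical: `ProducerHooks`, `NameStyle`, `hooks_for` and `strip_unsigned_suffix` are illustration-only names, not the interface in gdb/cp-abi.h, and the suffix stripping is deliberately naive (it only handles a lone `u` after a digit).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical hook table selected per debug-info producer.
struct ProducerHooks {
    // Maps a demangled class name to the spelling the producer is
    // assumed to use in DW_AT_name.
    std::function<std::string(const std::string&)> debug_name_for;
};

// Hypothetical classification of producers by how they spell names.
enum class NameStyle { WithSuffix, WithoutSuffix };

// Naive helper: drop a 'u' that follows a digit and precedes ',' or '>'.
std::string strip_unsigned_suffix(const std::string& name) {
    std::string out;
    for (std::size_t i = 0; i < name.size(); ++i) {
        const bool is_suffix =
            name[i] == 'u' && i > 0 && i + 1 < name.size() &&
            std::isdigit(static_cast<unsigned char>(name[i - 1])) &&
            (name[i + 1] == ',' || name[i + 1] == '>');
        if (!is_suffix) out += name[i];
    }
    return out;
}

// Fill in the hooks once, after the producer has been classified.
ProducerHooks hooks_for(NameStyle style) {
    ProducerHooks hooks;
    if (style == NameStyle::WithoutSuffix) {
        hooks.debug_name_for = strip_unsigned_suffix;
    } else {
        hooks.debug_name_for = [](const std::string& n) { return n; };
    }
    return hooks;
}

void example_hooks() {
    assert(hooks_for(NameStyle::WithoutSuffix).debug_name_for("Foo<10u>") == "Foo<10>");
    assert(hooks_for(NameStyle::WithSuffix).debug_name_for("Foo<10u>") == "Foo<10u>");
}
```

Note that a name-translation hook of this kind only helps lookups find some entry; it cannot by itself tell apart two types that the producer has given the same name.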

I imagine this would also clean up a bundle of hacks in various parts of
gdb trying to handle these differences anyway (which is where a lot of the
multiple symbol lookups/etc that are often slow come from.
If we just detected and said "this is gcc 6, it behaves like this", we
wouldn't need to do that)

In case you are worried, you will discover this is how a bunch of stuff is
done and already contains a ball of hacks.

Using hooks would be, IMHO, a significant improvement.


Hi Daniel,

Thanks for chiming in.

This addresses the issue of how to do good software design in GDB to 
support different producers cleanly, but I think we have some issues 
even before that, like how to support g++ 7.3 and up.  I'll try to 
summarize the issue quickly.  It's now possible to end up with two 
templated classes with the same name that differ only by the signedness 
of their non-type template parameter.  One is Foo instantiated with a 
signed 10 and the other is Foo instantiated with an unsigned 10.  Until 
7.3, g++ would generate names like Foo<10> for the former and names like 
Foo<10u> for the latter (in the DW_AT_name attribute of the classes' 
DIEs).  Since 7.3, it produces Foo<10> for both.


When GDB wants to know the run time type of an object, it fetches the 
pointer to its vtable, does a symbol lookup to get the linkage name and 
demangles it, which gives a string like "vtable for Foo<10>" or "vtable 
for Foo<10u>".  It strips the "vtable for " and uses the remainder to do 
a type lookup.  Since g++ 7.3, you can see that doing a type lookup for 
Foo<10> may find the wrong type, and doing a lookup for Foo<10u> won't 
find anything.
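
The lookup sequence described in this paragraph can be mimicked in a few lines. This is only a model, not GDB code: `DebugType`, `class_name_from_vtable` and `lookup` are hypothetical names, and the "debug info" is a plain vector holding the two identically named entries that newer g++ is described as emitting.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of a debug-info type entry: its name and the
// type of its member x.
struct DebugType {
    std::string name;
    std::string member_type;
};

// Strip the "vtable for " prefix from a demangled vtable symbol name.
std::string class_name_from_vtable(const std::string& demangled) {
    const std::string prefix = "vtable for ";
    if (demangled.compare(0, prefix.size(), prefix) == 0)
        return demangled.substr(prefix.size());
    return demangled;
}

// First-match lookup by name, a simple stand-in for a type lookup.
std::optional<DebugType> lookup(const std::vector<DebugType>& types,
                                const std::string& name) {
    for (const auto& t : types)
        if (t.name == name) return t;
    return std::nullopt;
}

void example_lookup() {
    // Both instantiations carry the same name in the debug info.
    const std::vector<DebugType> types = {{"Foo<10>", "unsigned int"},
                                          {"Foo<10>", "int"}};

    // The signed object resolves to the first entry, which is the wrong one.
    const auto fi = lookup(types, class_name_from_vtable("vtable for Foo<10>"));
    assert(fi.has_value() && fi->member_type == "unsigned int");

    // The unsigned object's demangled name matches nothing at all.
    const auto fu = lookup(types, class_name_from_vtable("vtable for Foo<10u>"));
    assert(!fu.has_value());
}
```

Both failure modes from the paragraph show up: one lookup returns a type with the wrong member type, the other returns nothing.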


So the problem here is how to uniquely identify those two classes when 
we are doing this run-time type finding operation (and probably in other 
cases too).


Simon


Hi all,

From the perspective of "type identity", the way I see it, the issue has a 
few parts:


1) How GCC compiles such templates
2) How GCC emits debugging information via -g
3) How such information is interpreted (and merged with the compiled 
code) by GDB


Regarding 1) and 2), IMHO I think that there should be a one-to-one 
relationship between the compiled code output and debug info:


This means that if GCC compiles such templates into two different 
classes[1], it should generate two different type identifiers.
Conversely, if it compiles the templates into the same class, then a 
single identifier should be emitted for the single class compiled.

(This is beside the point of what the standard dictates[2].)

If I understand it right, currently the issue is that gcc emits two 
types with the same debug identifier.


Regarding 3), I think that after 1) and 2) are set up, GDB should be 
able to find the correct type definition (using the most appropriate 
design choice).


Hope this helps,
Manfred


[1] According to the findings of Simon, this appears to be the case with 
clang, older GCC, and current GCC master. Do I understand this right?


[2] About handling both templates instantiation as a single class, I 
think that if GCC wants to emit a single class, then its argument type 
instantiation should be well-defined, i.e. independent of the order of 
declaration - see the findings from Simon earlier in this thread where 
you could get the program output either -10 or 4294967286 depending on 
which declaration would come first.


Re: gdb 8.x - g++ 7.x compatibility

2018-02-07 Thread Manfred



On 2/7/2018 4:15 PM, Jonathan Wakely wrote:

On 7 February 2018 at 15:07, Manfred wrote:



On 02/07/2018 02:44 PM, Simon Marchi wrote:



[...]


This addresses the issue of how to do good software design in GDB to
support different producers cleanly, but I think we have some issues even
before that, like how to support g++ 7.3 and up.  I'll try to summarize the
issue quickly.  It's now possible to end up with two templated classes with
the same name that differ only by the signedness of their non-type template
parameter.  One is Foo instantiated with a signed 10 and the other is Foo
instantiated with an unsigned 10.  Until 7.3, g++ would generate names like
Foo<10> for the former and names like Foo<10u> for the latter (in the DW_AT_name
attribute of the classes' DIEs).  Since 7.3, it produces Foo<10> for both.

When GDB wants to know the run time type of an object, it fetches the
pointer to its vtable, does a symbol lookup to get the linkage name and
demangles it, which gives a string like "vtable for Foo<10>" or "vtable for
Foo<10u>".  It strips the "vtable for " and uses the remainder to do a type
lookup.  Since g++ 7.3, you can see that doing a type lookup for Foo<10> may
find the wrong type, and doing a lookup for Foo<10u> won't find anything.

So the problem here is how to uniquely identify those two classes when we
are doing this run-time type finding operation (and probably in other cases
too).

Simon



Hi all,

From the perspective of "type identity", the way I see it, the issue has a few
parts:

1) How GCC compiles such templates
2) How GCC emits debugging information via -g
3) How such information is interpreted (and merged with the compiled code)
by GDB

Regarding 1) and 2), IMHO I think that there should be a one-to-one
relationship between the compiled code output and debug info:

This means that if GCC compiles such templates into two different
classes[1], it should generate two different type identifiers.


What do you mean by "such templates"? There have been several
different examples in the thread, which should be handled differently.


From Roman 2/3/2018
#include <iostream>
struct base {
virtual void print() = 0;
};

template< auto IVAL>
struct foo : base {
decltype(IVAL) x = -IVAL;
void print() override { std::cout << x << std::endl; };
};

From Simon 2/4/2018
 base * fi = new foo<10>();
 base * fu = new foo<10u>();


You are right that the original thread was started by Roman with:

struct base {  virtual ~base(){}  };

template< int IVAL, unsigned UVAL, unsigned long long ULLVAL>
struct derived : base {
int x = IVAL + + UVAL + ULLVAL;
};




Conversely, if it compiles the templates into the same class, then a single
identifier should be emitted for the single class compiled.
(This is beside the point of what the standard dictates[2].)

If I understand it right, currently the issue is that gcc emits two types
with the same debug identifier.

Regarding 3), I think that after 1) and 2) are set up, GDB should be able to
find the correct type definition (using the most appropriate design choice).

Hope this helps,


Not really :-)

Sorry for that :-)



You're basically just saying "GCC and GDB should do the right thing"
which is a statement of the obvious.

Besides the obvious, the main point was:
"IMHO I think that there should be a one-to-one relationship between the 
compiled code output and debug info"

and:
"If I understand it right, currently the issue is that gcc emits two 
types with the same debug identifier."


Which was an attempt to help by making obvious what I understood was 
going wrong.






[1] According to the findings of Simon, this appears to be the case with
clang, older GCC, and current GCC master. Do I understand this right?


As I said above, it's not clear what you're referring to.

I had in mind foo<10> and foo<10u>

After your remark, I realize I should have left out "older GCC" because 
'auto' does not apply to it - older GCC dealt with the initial example:

template< int IVAL, unsigned UVAL, unsigned long long ULLVAL>




[2] About handling both templates instantiation as a single class, I think
that if GCC wants to emit a single class, then its argument type
instantiation should be well-defined, i.e. independent of the order of
declaration - see the findings from Simon earlier in this thread where you
could get the program output either -10 or 4294967286 depending on which
declaration would come first.


That's just a GCC 7 bug in the handling of auto template parameters,
see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79092
It's not really relevant here, and is already fixed on trunk.


Thanks for pointing this out.
If I understand it correctly, the solution of the bug is that foo<10> 
and foo<10u> result in two differen

Re: Someone broke bootstrap with gfortran, again!

2005-07-24 Thread Manfred Hollstein
On Sat, 23 Jul 2005, 15:53:21 +0200, Daniel Berlin wrote:
> On Fri, 2005-07-22 at 17:53 -0700, Steve Kargl wrote:
> > On Fri, Jul 22, 2005 at 05:44:44PM -0700, Jerry DeLisle wrote:
> > > Steve Kargl wrote:
> > > >Does this look familiar to anyone?
> > > >
> > > I was having troubles doing a build after a cvs update.  I had to delete 
> > > everything in the build directory and rerun configure and then it would 
> > > build ok. Not sure its the same problem you are seeing, but it happened 
> > > today.  I am running on i686-pc-linux-gnu.
> > > 
> > 
> > I always remove the contents in the build directory, run configure,
> > then do a "gmake bootstrap".  The only exception to this process
> > is when I'm making small changes to gfortran files where I know
> > a "gmake bubblestrap" will not run into problems.
> > 
> > I suspect the commit that broke gfortran is 
> > 
> > 2005-07-22  Manfred Hollstein  <[EMAIL PROTECTED]>
> > 
> > * tree-ssa-structalias.c (merge_graph_nodes): Fix uninitialised
> > warnings.
> > (int_add_graph_edge): Likewise.
> > (collapse_nodes): Likewise.
> > (process_unification_queue): Likewise.
> > 
> > I'll know with certainty in an hour or so.
> 
> I should simply note that the warnings are invalid.
> We don't *use* the weights uninitialized (this is easily verifiable).

Correct, but unfortunately -Werror errors out at -O3 due to these
warnings... That's why I suggested the fix.

Cheers.

l8er
manfred


Re: _GLOBAL_ query

2006-03-16 Thread Manfred Hollstein
Hi there,

On Thu, 16 Mar 2006, 11:35:56 +0100, Inder wrote:
> Hi All
> 
> I have a query regarding __GLOBAL_ prefixed symbols.
> Compiling the testcase given below produces
> a symbol '_GLOBAL__I_main', which according to the definition of
> static global initialization should be a global symbol. But
> gcc makes it a local symbol.
> 
> Can anyone explain the reason for this behaviour?

GCC creates a file local function responsible for performing the
execution of any global CTORS. This function is named after the first
function in the file, in your case "main"; try inserting another dummy
function before main and check again. If you look at the other symbols
generated by GCC, you will see that it also creates a globally visible
symbol for main.
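
For readers who want to see this for themselves, a minimal sketch of a file that needs such a compiler-generated initialisation function is shown below. `Counter`, `global_counter` and `read_counter` are made-up names for this example; the original testcase is not included in this message.

```cpp
#include <cassert>

// A type with a user-provided constructor, so an object of it at
// namespace scope generally needs code to run before main().
struct Counter {
    int value;
    Counter() : value(42) {}
};

// The constructor call for this global is what the compiler places in
// its generated, file-local initialisation function.
Counter global_counter;

// By the time ordinary code runs, the global has been constructed.
int read_counter() { return global_counter.value; }

void example_global_ctor() {
    assert(read_counter() == 42);
}
```

Compiling this with something like `g++ -c` and listing the object file's symbols with `nm` would typically show a local text symbol whose name starts with `_GLOBAL__` next to the global symbols for the functions, at least without optimisation. The exact spelling of that name depends on the GCC version, so treat it as something to check rather than a guarantee.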

> Thanks,
> Inder

HTH, cheers.

l8er
manfred


Re: [PATCH] BIT_FIELD_REF_UNSIGNED considered harmful

2008-03-05 Thread Manfred Hollstein
On Wed, 05 Mar 2008, 13:47:36 +0100, Diego Novillo wrote:
> On 03/05/08 07:18, Richard Guenther wrote:
> 
> >Comments?
> 
> Makes sense to me.
> 
> >+  if (INTEGRAL_TYPE_P (TREE_TYPE (t))
> >+  && (TYPE_PRECISION (TREE_TYPE (t))
> >+  != TREE_INT_CST_LOW (TREE_OPERAND (t, 1))))
> >+{
> >+  error ("integral result type precision does not match "
> >+ "field size of BIT_FIELD_REF");
> >+  return t;
> >+}
> >+  if (!INTEGRAL_TYPE_P (TREE_TYPE (t))
> 
> 'else if' here?

In theory, yes, but, why? If the first 'if' part evaluates to true,
control flow jumps out of the function, hence the following 'if' is
totally correct (unless someone #define's return to something else...),
or do the GCC/FSF coding conventions require this to be 'else if'?

> Diego.

Cheers.

l8er
manfred


Re: [RFC] Adjust output for strings in tree-pretty-print.c

2008-05-19 Thread Manfred Hollstein
Hi there,

On Mon, 19 May 2008, 15:59:16 +0200, FX wrote:
> [...]
> Any comments? Is it OK to commit as is?

this may sound like nit-picking, but the length of a string cannot be
negative, so, I'd rather make the new parameter `len' an "unsigned int"
or even size_t.

HTH, cheers.

l8er
manfred


wrong code with -fforce-addr

2007-10-03 Thread Manfred Schwarb
Hi,

I have a rather nasty optimization issue with gfortran (as of yesterday). As I
think it could be an optimization issue and not a gfortran frontend issue, I post
it here; the fortran list was not able to help so far.

I narrowed my problem to one single fortran function. If I compile this function with
"gfortran -O2 -march=pentium4" the output is OK, using
"gfortran -O2 -march=pentium4 -fforce-addr" produces wrong output.

When reducing optimization level to "-O1", the issue vanishes for all tried flag
combinations.
If I comment out a particular non-functional line in this code, the issue goes away:

if (always_true) then
  
else
   this_line_commented_out_makes_it_working_again
   
endif

Further data points:
- output of -fdump-tree-optimized is the same for both the working and the
  broken case (of the original function, not the line commenting test).
- there are differences in assembler output though.
- the code calls C functions.
- I could reproduce this issue on a different, non-pentium4 machine using the
  same flags.
- the flag -march=pentium4 is necessary to trigger the issue.


I'm at a loss where to search for the real cause. Has anybody a hint how to
proceed further?
Should I post the code of this function here on the list (~300 LOC)?

Thanks,
Manfred





Re: wrong code with -fforce-addr

2007-10-03 Thread Manfred Schwarb

> > Hi,
> >
> > I have a rather nasty optimization issue with gfortran (as of yesterday).
> > As I think it could be an optimization issue and not a gfortran frontend
> > issue, I post it here; the fortran list was not able to help so far.
> >
> > I narrowed my problem to one single fortran function. If I compile this
> > function with
> > "gfortran -O2 -march=pentium4" the output is OK, using
> > "gfortran -O2 -march=pentium4 -fforce-addr" produces wrong output.
> >
> > When reducing optimization level to "-O1", the issue vanishes for all
> > tried flag combinations.
> > If I comment out a particular non-functional line in this code, the
> > issue goes away:
> >
> > if (always_true) then
> >   
> > else
> >    this_line_commented_out_makes_it_working_again
> >
> > endif
> >
> > Further data points:
> > - output of -fdump-tree-optimized is the same for both the working and
> >   the broken case (of the original function, not the line commenting test).
> > - there are differences in assembler output though.
> > - the code calls C functions.
> > - I could reproduce this issue on a different, non-pentium4 machine
> >   using the same flags.
> > - the flag -march=pentium4 is necessary to trigger the issue.
> >
> >
> > I'm at a loss where to search for the real cause. Has anybody a hint how
> > to proceed further?
> > Should I post the code of this function here on the list (~300 LOC)?
> 
> You should file a bugreport with bugzilla instead. 
> http://gcc.gnu.org/bugzilla
> 
> Thanks,
> Richard.

It's http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33638

I first hoped to get help to narrow things a bit further, as my
problem is really a bit strange, so I decided to first write an email.
I simply assigned it now to the middle-end, but of course it could 
also be a gfortran issue.

Thanks,
Manfred





Re: wrong code with -fforce-addr

2007-10-03 Thread Manfred Schwarb

> On Wed, Oct 03, 2007 at 12:24:27PM +0200, Manfred Schwarb wrote:
> > I'm in a loss where to search for the real cause. Has anybody a hint
> > how to proceed further?
> 
> Sounds like weird-but-somewhat-deterministic behaviour you can get when
> you do out-of-bounds access on the stack or this kind of problem.
> Did you try valgrind or your os-of-choice equivalent on the code?
> 
>   OG.

I retested with valgrind just now, there was actually one problem
in an unrelated function ("Conditional jump or move depends on 
uninitialised value"). I fixed this now with proper initialization.

After this correction the code passes cleanly with valgrind and efence.

But this did not help, I still do get wrong code with "-fforce-addr",
the behaviour did not change at all.

Thanks,
Manfred





Suggestion for GCC (C & C++) enhancement - static variable initialisation ordering

2006-04-29 Thread Manfred von Willich
Any interested GCC maintainers/contributors:

I have a suggestion for GCC to eliminate a pernicious problem - that of 
automatically initialising static (i.e. long-lived) variables in the correct 
order based on mutual dependencies, apparently not normally addressed by 
compilers.  This is a thorny issue, and is normally discovered the hard way 
by most programmers - not the way to produce robust code.  I would welcome 
any correspondence/queries on the matter.

The suggestion is for a simple modification to GCC to ensure that static 
variables are always initialised (and destroyed) in the correct order 
determined by dependencies between the actual variables concerned - this 
applies to both C and C++.  Someone may already have tackled this issue, 
though I don't see it on the GNU site.  I would be happy to supply example 
code to illustrate the concept to any interested person.

I have a very simple approach to
   (a) ensure that every static variable used in initialising another is 
always initialised beforehand, even when this dependency is hidden from the 
compiler, and
   (b) detect (unfortunately not at compile time) when correct 
initialisation is impossible due to a cyclic dependency.

A more involved mechanism can also
   (c) ensure that no static variable is destroyed before its last use.

The essence of the idea for (a) is:
   - Access each static variable including static class members via 
(inlined) wrapper code in the same way as is typical for function-local 
variables (i.e. initialised once on first access using a flag).  This 
ensures that every static variable gets initialised before use, especially 
where such use is in the initialisation of another static variable.
   - Retain initialisation of static variables before calling main() (via 
the wrapper code).  This ensures that initialisation always occurs before 
main() is called as would be expected, and does so in a single-threaded 
environment, eliminating any need for a mutex.
   - Preferably initialise all function-local variables at this point too - 
initialisation in a multi-threaded environment using a simple flag (as 
opposed to a mutex) is non-reentrant, and can fail sporadically.  This 
modification would have the effect that function-local statics will always 
be initialised before calling main() even if never used.  The alternative is 
to use a mutex, which may be problematic (OS-dependent).
   - I would avoid the approach apparently taken in C# and Java - that of 
initialising all statics in each class as a set. This is not as robust, 
since references between classes may be cyclic without cyclic dependencies 
for initialisation.  Note that this means that statics in a class may be 
initialised after the first instances of the class are constructed, unless 
used by the constructor.
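
What the proposal asks the compiler to generate for (a) can be approximated by hand today with accessor functions around function-local statics. The sketch below shows that manual version under stated assumptions: `base_dir`, `log_path` and `ForceInit` are made-up names, and this is the hand-written idiom, not the proposed compiler change.

```cpp
#include <cassert>
#include <string>

// Wrapper: the object is created on the first call, via a
// function-local static, rather than at an unspecified point.
std::string& base_dir() {
    static std::string value = "/tmp";
    return value;
}

// This static depends on base_dir(); going through the accessor
// forces base_dir's object to be initialised first.
std::string& log_path() {
    static std::string value = base_dir() + "/app.log";
    return value;
}

// Keeps the "initialised before main()" property: a namespace-scope
// object touches the wrappers during start-up.
struct ForceInit {
    ForceInit() { log_path(); }
};
static ForceInit force_init;

void example_first_use() {
    assert(base_dir() == "/tmp");
    assert(log_path() == "/tmp/app.log");
}
```

Since C++11 the language guarantees that the initialisation of a function-local static is thread-safe, which postdates this post; the mutex concern above applies to older compilers and to C.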

The approach taken for (b) is to have an intermediate state for the flag 
indicating that the initialisation of a given variable is in progress but 
not yet complete.  If the initialisation is triggered a second time in this 
state, a cyclic dependency exists.
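
The three-state flag for (b) can be sketched as a small wrapper class. This is an illustration under assumptions, not the proposed compiler mechanism: `Guarded`, `InitState` and the `init_*` functions are hypothetical names, the wrapper is single-threaded, and it requires a default-constructible, assignable `T`.

```cpp
#include <cassert>
#include <stdexcept>

enum class InitState { NotStarted, InProgress, Done };

// Holds a value that is initialised on first access and reports a
// cyclic dependency if it is accessed while still being initialised.
template <typename T>
class Guarded {
public:
    using Init = T (*)();
    explicit Guarded(Init init) : init_(init) {}

    T& get() {
        if (state_ == InitState::InProgress)
            throw std::logic_error("cyclic static initialisation");
        if (state_ == InitState::NotStarted) {
            state_ = InitState::InProgress;  // intermediate state
            value_ = init_();
            state_ = InitState::Done;
        }
        return value_;
    }

private:
    Init init_;
    InitState state_ = InitState::NotStarted;
    T value_{};
};

// An acyclic pair: dependent needs leaf.
int init_leaf();
int init_dependent();
Guarded<int> leaf(init_leaf);
Guarded<int> dependent(init_dependent);
int init_leaf() { return 7; }
int init_dependent() { return leaf.get() + 1; }

// A cyclic pair: each initialiser needs the other.
int init_cyc_a();
int init_cyc_b();
Guarded<int> cyc_a(init_cyc_a);
Guarded<int> cyc_b(init_cyc_b);
int init_cyc_a() { return cyc_b.get() + 1; }
int init_cyc_b() { return cyc_a.get() + 1; }

void example_cycle_detection() {
    assert(dependent.get() == 8);
    bool caught = false;
    try {
        cyc_a.get();
    } catch (const std::logic_error&) {
        caught = true;
    }
    assert(caught);
}
```

As the text says, the cycle is only found at run time, when the second access arrives while the flag is still in its intermediate state.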

The approach for (c) involves triggering initialisation of every object with 
a destructor early enough that the "atexit" call to the destructor occurs 
after that of every object that has methods that access it.  This can be 
dealt with separately, and I will omit the detail for now.

Note that these changes are probably largely ANSI-compliant, and as such may 
not have to be treated as an extension.  When applied to global ("extern") 
variables, additional linker information is needed (a reference to the flag 
and initialisation code).  This will introduce problems when linking code 
generated by different versions of the compiler.

Manfred von Willich 



Re: Suggestion for GCC (C & C++) enhancement - static variable initialisation ordering

2006-05-03 Thread Manfred von Willich
| I'd encourage you to work up a solid proposal for ISO/ANSI and
| propose it there.

Being a newbie, I'd appreciate contact/site details for submissions to the 
ISO/ANSI standardisation forum (do I email [EMAIL PROTECTED]?).

I will be happy to draft and submit a proposal, including a hopefully 
compelling motivation.  I would need to know what form a proposal must take. 
I assume that the draft dated 2 December 1996 is what I should use as a 
basis.