Re: Pascal front-end integration

2005-03-01 Thread Waldek Hebisch
James A. Morrison wrote:
> I've decided I'm going to try to take the time and cleanup and update
> the Pascal frontend for gcc and try it get it integrated into the 
> upstream source.

Nice to hear that you want to work on Pascal. However, did you notice
that gpc _is_ changing? In particular, the latest snapshot is
gpc-20050217.tar.bz2 (it looks like you started from an earlier version).
Since you plan to work on your own branch, it is wise to plan for an
easy merge with frontend changes.

Joseph S. Myers wrote:

> If GPC developers are interested in having GPC integrated in GCC 4.1 and
> are willing to have it play by the same rules as the rest of GCC - note
> that the Ada maintainers made substantial changes to how they contributed
> patches to GCC in order to follow usual GCC practice more closely - then
> of course coordination would be desirable.  

I would like to see GPC integrated in GCC. However, I feel that playing
by the GCC rules I could do substantially less work for GPC than I am
doing now. GCC rules pay off when there is a critical mass of developers.
My impression was that GPC does not have that critical mass -- so it was
better to keep GPC outside of GCC. Jim's contribution can change that.

> If the GPC developers would
> prefer to continue to develop GPC independently of GCC, this need not stop
> integration of some version of GPC in GCC.  I would hope in that case,
> however, there would still be better and closer cooperation between the
> two lines of development than there has been after the g95/gfortran fork
> (for example, that the GPC developers would be willing to make the version
> control repository used for actual development accessible to the public so
> individual patches can be extracted and merged as such).

ATM GPC does not use version control. Frank Heckenbach just periodically
collects flowing patches and his changes into releases. 

-- 
  Waldek Hebisch
[EMAIL PROTECTED] 


Re: Pascal front-end integration

2005-03-02 Thread Waldek Hebisch
> On Wed, 2 Mar 2005, Waldek Hebisch wrote:
> 
> > > If GPC developers are interested in having GPC integrated in GCC 4.1 and
> > > are willing to have it play by the same rules as the rest of GCC - note
> > > that the Ada maintainers made substantial changes to how they contributed
> > > patches to GCC in order to follow usual GCC practice more closely - then
> > > of course coordination would be desirable.  
> > 
> > I would like to see GPC integrated in GCC. However, I feel that playing
> > by the GCC rules I could do substantially less work for GPC than I am
> > doing now. GCC rules pay off when there is a critical mass of developers.
> > My impression was that GPC does not have that critical mass -- so it was
> > better to keep GPC outside of GCC. Jim's contribution can change that.
> 
> Which rules are the problem?
> 

Keeping the whole development in lockstep and stages. 

1) With the current GPC I can work out a new feature and test it using an
old backend. Such a feature can go out relatively quickly in a development
snapshot which is usable by ordinary users (the development snapshot
uses an old, tested backend which shields users from most bugs outside
the front end). At the same time I can work on adapting GPC to a newer
backend. All that when _I_ have time.

Staged development introduces deadlines and delays, which are very
disruptive for me (it seems that other people can handle schedules
much better than I can). I can easily miss the stage-1 window, or even
both the stage-1 and stage-2 windows...

2) GPC developers are supposed to work on the mainline. While many changes
are applied automatically to the whole mainline, some will require
manual intervention. While such intervention is usually trivial,
it is still a significant distraction from other work (when adjusting GPC
to follow the mainline I have found that I spent much less time
doing the work in a single batch than trying to follow smaller
changes).

3) AFAIU dropping support for multiple backends is considered a
pre-condition for inclusion of GPC into GCC. A GPC release would be
part of a GCC release. People trying GPC snapshots would automatically
get a backend snapshot. I am afraid that for Pascal that means
6-8 months of extra delay between including a feature in GPC and the first
bug reports (and consequently more effort for bug fixing).

4) I feel bad not fixing bugs in the release version. But after the merges
in stage 1 the mainline is significantly changed. More important, large
parts which are logically the same are still subject to a lot of
trivial changes. So fixing bugs in the release version means double
work.



-- 
  Waldek Hebisch
[EMAIL PROTECTED] 




Re: Pascal front-end integration

2005-03-04 Thread Waldek Hebisch
> Certainly porting to 4.x will require private tree codes - for example, 
> SET_TYPE is no longer handled in the core code as not being used by any 
> integrated language, so it will need to become a private Pascal tree code 
> and be lowered in the Pascal gimplification code.  There may be other tree 

Silly question: how are we supposed to emit debug info about sets? While
there are also some problems with gdb, in general gdb currently can
handle Pascal sets. Sets are part of the DWARF-2 specification. Even if
gcc provided a hook to emit front-end specific debug info, it would
still be more total work than having direct backend support.

-- 
      Waldek Hebisch
[EMAIL PROTECTED] 


Re: Pascal front-end integration

2005-03-04 Thread Waldek Hebisch
> > 3) AFAIU dropping support for multiple backends is considered a
> > pre-condition for inclusion of GPC into GCC. A GPC release would be
> > part of a GCC release. People trying GPC snapshots would automatically
> > get a backend snapshot. I am afraid that for Pascal that means
> > 6-8 months of extra delay between including a feature in GPC and the first
> > bug reports (and consequently more effort for bug fixing).
> 
> The way to avoid this is for front ends to have internal datastructures 
> decoupled from those of the rest of GCC and to have a small piece of code 
> that does the conversion (like the Ada front end's gigi), then just 
> maintain multiple versions of that small piece of code.  This seems to 
> work well for Ada.

GPC uses backend data structures when possible. I see no reason to
duplicate backend functionality (the Ada front end is written in Ada, so
they _had to_ duplicate a lot of infrastructure). We can hide (and are
doing that now) most differences in macros.

But there are a few tricky questions: if old backends want us to call a
function, is it OK to always call a wrapper (empty for the new backend)?

Or an actual sample (the most nasty case I have found):

case FLOAT_EXPR:
#ifndef GCC_3_4
case FFS_EXPR:
#endif

Here we have a moderately sized switch, and moving the case constant
to the backend-dependent part would move the whole switch, hence the
function containing the switch. Is it acceptable to have something
like:

   case FLOAT_EXPR:
   CASE_FFS_EXPR

with `CASE_FFS_EXPR' expanding to nothing for the new backend?
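
A minimal sketch of how such a macro could look, assuming the backend
version is selected by the same GCC_3_4 define as in the sample above.
The enum is a stand-in for the real tree codes and `handles_code' is a
made-up function name; neither is taken from the GPC or GCC sources.

```c
/* Hypothetical stand-ins for backend tree codes; these are not GCC's
   real definitions.  Define GCC_3_4 to mimic the newer backend, where
   FFS_EXPR no longer exists.  */
enum tree_code
{
  FLOAT_EXPR,
#ifndef GCC_3_4
  FFS_EXPR,
#endif
  PLUS_EXPR
};

/* Backend-dependent part: one macro per conditional case label.  */
#ifndef GCC_3_4
#define CASE_FFS_EXPR case FFS_EXPR:
#else
#define CASE_FFS_EXPR
#endif

/* Backend-independent part: the switch stays in one place.
   `handles_code' is a made-up name used only for illustration.  */
static int
handles_code (enum tree_code code)
{
  switch (code)
    {
    case FLOAT_EXPR:
    CASE_FFS_EXPR
      return 1;
    default:
      return 0;
    }
}
```

With this layout only the macro definition differs between backends,
and the function containing the switch does not have to move.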
 

-- 
  Waldek Hebisch
[EMAIL PROTECTED] 


Re: Questions about trampolines

2005-03-14 Thread Waldek Hebisch
Oyvind Harboe wrote:

> Trampolines are strange things... :-)
> - Lately e.g. the AMD CPU has added support for making it impossible
>   to execute code on the stack. The town isn't big enough for
>   both stack based trampolines and stack code execution protection.
>   What happens now?
> - Do (theoretical?) alternatives to trampolines exist? I.e. something
>   that does not generate code runtime.

The main purpose of trampolines is to provide a function with extra
data private to a given instance (and to do this at runtime). There
are some alternatives: `thick pointers' and double indirection.
Both methods associate extra data with function pointers and require
a different way of calling functions via function pointers. Thick
pointers carry the extra data together with the pointer, making calls
faster but copying slower. Also, thick pointers, being bigger than
machine addresses, are not compatible with usual pointers.
Double indirection slows calls, but the pointer is compatible with
a machine address and cheaper to copy around.
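
The two representations can be modelled in portable C roughly as
follows. This is only a sketch: all type and function names are made up
for illustration, and it does not describe any particular compiler's ABI.

```c
/* All names here are hypothetical; this models the two representations
   in portable C rather than showing any compiler's actual ABI.  */

/* Code that expects its private data as an explicit first argument.  */
typedef int (*code_ptr) (void *env, int arg);

/* Thick pointer: code address and private data travel together, so the
   value is two words wide and is not a plain machine address.  */
struct thick_ptr
{
  code_ptr code;
  void *env;
};

/* Double indirection: the function pointer is one machine word that
   points at a descriptor holding the code address and the data.  */
struct descriptor
{
  code_ptr code;
  void *env;
};
typedef const struct descriptor *thin_ptr;

/* Call through a thick pointer: code and data are already at hand.  */
static int
call_thick (struct thick_ptr f, int arg)
{
  return f.code (f.env, arg);
}

/* Call through a descriptor: both fields are loaded via the pointer.  */
static int
call_thin (thin_ptr f, int arg)
{
  return f->code (f->env, arg);
}

/* Example of instance-private data: an offset added to the argument.  */
static int
add_offset (void *env, int arg)
{
  return *(int *) env + arg;
}
```

Either way the caller has to know that the pointer is not a plain code
address, which is exactly what a trampoline avoids.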

If a language wants to interface with C then the language must
use the same representation of function pointers as C. For standard C
thick pointers are both inefficient and may break a lot of real code.
Double indirection is quite compatible (and AFAIK is used by some
system ABIs). Still, for C double indirection is more expensive
than calling a pointer directly. And for established targets the ABI is
fixed, so there is no choice...

One can always imagine a different implementation. For example, assign
invalid addresses to function pointers and use a signal handler to
sort out what should be called and with what arguments. But I will
not advocate such an implementation.

So given the constraint of compatibility with existing C implementations,
trampolines look like the best choice. Trampolines are in fact a standard
implementation technique in functional languages, especially for
interfacing with C.

But there is no need to generate trampolines on the stack. Namely,
one can generate code in a separate area. In C this causes problems
with garbage collection, which IMHO can be solved, but require alloca-like
tricks. On the other hand, trampolines in a separate area may provide
extra functionality, beyond what nested functions give. For example,
they can be used to "partially apply" a function, giving it some
frozen arguments, and providing the rest at call time.

There is some connection with objects: "partial application" can produce
a pointer to a method compatible with the usual calling convention.


-- 
  Waldek Hebisch
[EMAIL PROTECTED] 


Re: Questions about trampolines

2005-03-14 Thread Waldek Hebisch
Robert Dewar wrote:
> Waldek Hebisch wrote:
> 
> > But there is no need to generate trampolines on the stack. Namely, 
> > one can generate code in a separate area. In C this causes problems
> > with garbage collection, which IMHO can be solved, but requre alloca-like
> > tricks. On the other hand trampolines in separate area may provide
> > extra functionality, beyond what nested functions give. For example
> > they can be used to "partially apply" a function, giving it some
> > frozen arguments, and providing the rest at call time.
> 
> Trampolines do of course have to be handled in a stack like fashion (to
> get recursion right), so you have to be very careful about allocation and
> deallocation in this separate area. And you still have the cache problem,
> so I don't see what it buys.
> 

1) the stack can be execute protected, so simple buffer overflow exploits
   will be detected. On i386 one can use segment register to make stack
   non-executable and still use trampolines

2) On a machine with separate code and data areas trampolines can be in
   (writable) code space. Of course it does not help if all code space
   is non-writable.

3) One limits contention between code and data cache during calls. Of
   course the cost of generating a trampoline may be slightly higher, but
   the cost of a call is likely to be lower. Even for generation, one can
   limit the number of `mprotect' calls from one per trampoline to one per
   page (which is likely to be much smaller).

4) One gets extra functionality (partial application), which may
   significantly reduce the need for pointers to nested functions. Namely,
   trampolines for partial application may be re-used much more easily
   than nested functions (which have to be regenerated when entering
   the scope).


-- 
  Waldek Hebisch
[EMAIL PROTECTED] 


-fstack-check harmful in main thread (on Linux)

2005-05-19 Thread Waldek Hebisch
I think that the current documentation of `-fstack-check' is unclear. The
documentation states that for single-threaded programs `-fstack-check'
is not useful. IMHO for the main thread `-fstack-check' is harmful
and may cause spurious segfaults. Namely, the following silly
program:

#include <stdio.h>

int
main(void)
{
  int dummy;
  /* Print the address of a local to show where the stack starts. */
  printf("%p\n", (void *) &dummy);
  return 0;
}

compiled with `-fstack-check' and run using the following command line:

for A in 1 2 3 4 5 6 7 8 9 0; do
  for B in 1 2 3 4 5 6 7 8 9 0; do
    C=${C}a
    D=$C ./a.out
  done
done

crashes multiple times. AFAICS the stack probe generated by gcc accesses
the stack below the current stack pointer. When the address is not already
allocated, the Linux kernel treats such an access as a segmentation fault
(a page fault for an address above the stack pointer is legal and Linux
just allocates more space to the stack). Since the `exec' system call
copies environment and arguments to the new stack, one can cause a
segmentation fault in _any_ program compiled with `-fstack-check' just by
putting an appropriately sized variable in the environment (sometimes
even by renaming the program).

There are already bug reports about this problem (like PR 10127), but
I have not seen a written explanation.

So I would suggest to add a warning, for example:

Do not use `-fstack-check' for single-threaded programs (or for the main
thread in multi-threaded programs); on some systems (for example Linux)
it may cause a spurious segmentation fault during startup.

Alternatively, `-fstack-check' should be fixed to use a different method
for stack probes (but the goals of the current stack probe look
incompatible with the kernel policy for stack extension).

-- 
  Waldek Hebisch
[EMAIL PROTECTED] 


Ada subtypes and base types

2006-02-27 Thread Waldek Hebisch
Jeffrey A Law wrote:
> My suspicions appear to be correct.  This never triggers except for
> Ada code and it's relatively common in Ada code.  No surprise since
> I don't think any other front-end abuses TYPE_MAX_VALUE in the way
> the Ada front-end does.  This wouldn't be the first time we've had
> to hack up something in the generic optimizers to deal with the
> broken TYPE_MAX_VALUE.

What do you mean by "abuse"?  TYPE_MAX_VALUE means the maximal value
allowed by a given type.  For range types it is clearly the upper
bound of the range.  Of course, the upper bound should be representable,
so TYPE_MAX_VALUE <= (2**TYPE_PRECISION - 1) for unsigned types
and TYPE_MAX_VALUE <= (2**(TYPE_PRECISION - 1) - 1) for signed types.
However, if the language has non-trivial range types you can expect
strict inequality.  Note that if you do not allow strict inequality
above, TYPE_MAX_VALUE would be redundant.

FYI GNU Pascal is using such representation for range types, so for
example:

type t = 0..5;

will give you TYPE_PRECISION equal to 32 (this is an old decision
which tries to increase speed at the cost of space, otherwise 8
would be enough) and TYPE_MAX_VALUE equal to 5.
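
The bounds above can be checked with a few lines of C. This is only an
illustrative sketch: the helper names are made up and are not GCC
macros or functions, and the code assumes 1 <= precision <= 63.

```c
/* Hypothetical helpers, not GCC code: the bounds quoted above for a
   type with `precision' bits (valid for 1 <= precision <= 63).  */

/* 2**precision - 1, the largest unsigned value.  */
static unsigned long long
max_unsigned (unsigned precision)
{
  return (1ULL << precision) - 1;
}

/* 2**(precision - 1) - 1, the largest signed value.  */
static long long
max_signed (unsigned precision)
{
  return (long long) ((1ULL << (precision - 1)) - 1);
}

/* Smallest precision whose unsigned range can hold max_value.  */
static unsigned
min_unsigned_precision (unsigned long long max_value)
{
  unsigned precision = 1;

  while (precision < 63 && max_unsigned (precision) < max_value)
    precision++;
  return precision;
}
```

For the type 0..5 the smallest sufficient precision is 3 (largest
value 7), so with a precision of 8 or 32 the inequality
TYPE_MAX_VALUE <= 2**TYPE_PRECISION - 1 is strict.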

GNU Pascal promotes arguments of operators, so that arithmetic takes
place in "standard" types -- I believe Ada is doing the same.

BTW, setting TYPE_PRECISION to 3 for the type above used to cause
wrong code, so the way above was forced by the backend.

If you think that such behaviour is "abuse" then why have a separate
TYPE_MAX_VALUE? How should range types be represented so that optimizers
will know about allowed ranges (and use them!)? And how about debug
info?

-- 
  Waldek Hebisch
[EMAIL PROTECTED] 


Re: Ada subtypes and base types

2006-03-13 Thread Waldek Hebisch
Jeffrey A Law wrote:
> On Mon, 2006-02-27 at 20:08 +0100, Waldek Hebisch wrote:
> 
> > What do you mean by "abuse"?  TYPE_MAX_VALUE means maximal value
> > allowed by given type.
> As long as you're *absolutely* clear that  a variable with a
> restricted range can never hold a value outside that the
> restricted range in a conforming program, then I'll back off
> the "abuse" label and merely call it pointless :-)
> 
> The scheme you're using "promoting" to a base type before all
> arithmetic creates lots of useless type conversions and means
> that the optimizers can't utilize TYPE_MIN_VALUE/TYPE_MAX_VALUE
> anyway.  ie, you don't gain anything over keeping that knowledge
> in the front-end.
> 

Pascal arithmetic is essentially untyped: operators take integer
arguments and are supposed to give a mathematically correct result
(provided all intermediate results are representable in machine
arithmetic; overflow is treated as user error). OTOH for proper
type checking the front end has to track ranges associated with
variables. So "useless" type conversions are needed due to
the Pascal standard and backend constraints.

I think that it is easy for the back end to make good use of
TYPE_MIN_VALUE/TYPE_MAX_VALUE. Namely, consider the assignment

x := y + z * w;

where variables y, z and w have values in the interval [0,7] and
x has values in [0,1000]. Pascal converts the above to the
following C-like code:

int tmp = (int) y + (int) z * (int) w;
x = (tmp < 0 || tmp > 1000)? (Range_Check_Error (), 0) : tmp;
 
I expect VRP to deduce that tmp will have values in [0..56] and
eliminate the range check. Also, it should be clear that in the
assignment above arithmetic can be done using any convenient
precision.
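
The deduction itself is plain interval arithmetic. The sketch below
is not GCC's VRP implementation, only a small model of the reasoning;
the struct and function names are made up for illustration.

```c
/* Hypothetical sketch, not GCC's VRP: closed integer intervals and the
   two operations needed for y + z * w.  */
struct range
{
  long lo;
  long hi;
};

/* Sum of two intervals: add the matching bounds.  */
static struct range
range_add (struct range a, struct range b)
{
  struct range r;

  r.lo = a.lo + b.lo;
  r.hi = a.hi + b.hi;
  return r;
}

/* Product of two intervals: the extremes are among the four corner
   products.  */
static struct range
range_mul (struct range a, struct range b)
{
  long p[4];
  struct range r;
  int i;

  p[0] = a.lo * b.lo;
  p[1] = a.lo * b.hi;
  p[2] = a.hi * b.lo;
  p[3] = a.hi * b.hi;
  r.lo = r.hi = p[0];
  for (i = 1; i < 4; i++)
    {
      if (p[i] < r.lo)
        r.lo = p[i];
      if (p[i] > r.hi)
        r.hi = p[i];
    }
  return r;
}

/* The range check is redundant when every possible value of the
   expression already lies within the target's bounds.  */
static int
range_check_redundant (struct range value, struct range target)
{
  return value.lo >= target.lo && value.hi <= target.hi;
}
```

With y, z and w in [0,7] the product is in [0,49] and the sum in
[0,56], which lies inside [0,1000], so the check can be dropped.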

In principle the Pascal front end could deduce more precise types (ranges),
but that would create some extra type conversions and a lot
of extra types. Moreover, I assume that VRP can do a better job
at tracking ranges than the Pascal front end.

-- 
  Waldek Hebisch
[EMAIL PROTECTED] 


Re: Ada subtypes and base types

2006-03-17 Thread Waldek Hebisch
Robert Dewar wrote:
> Laurent GUERBY wrote:
> > On Mon, 2006-03-13 at 15:31 -0700, Jeffrey A Law wrote:
> >> On Mon, 2006-02-27 at 20:08 +0100, Waldek Hebisch wrote:
> >>
> >>> What do you mean by "abuse"?  TYPE_MAX_VALUE means maximal value
> >>> allowed by given type.
> >> As long as you're *absolutely* clear that  a variable with a
> >> restricted range can never hold a value outside that the
> >> restricted range in a conforming program, then I'll back off
> >> the "abuse" label and merely call it pointless :-)
> > 
> > Variables in a non erroneous Ada program all have their value between
> > their type bounds from the optimizer perspective (the special 'valid
> > case put aside).
> 
> Not quite right. If you have an uninitialized variable, the value is
> invalid and may be out of bounds, but this is a bounded error situation,
> not an erroneous program. So the possible effects are definitely NOT
> unbounded, and the use of such values cannot turn a program erroneous.
> (that's an Ada 95 change, this used to be erroneous in Ada 83).
> 

What about initializing _all_ variables? If we allow out of range values
in one place then we lose a useful invariant. I see no way that a frontend
could allow out of range values and simultaneously use TYPE_MAX_VALUE for
optimization. Also, out of range values would make TYPE_MAX_VALUE of little
use to the backend.

I see two drawbacks if we initialize all variables:
1) bigger programs and lower runtime efficiency
2) we lose diagnostics about uninitialized variables

I would guess that the cost of extra initializations will be small, because
the backend will remove unused variables and obviously redundant
initializations, so only tricky cases will remain. Also, extra
initialization is likely to be cheaper than the extra tests which are
needed otherwise.

I am more concerned about the loss of diagnostics. That could be resolved if
the frontend could mark generated initializations so that the backend will
generate code to initialize the variable, but also issue diagnostics if such
an initialization can not be removed (yes, I understand that it is
conceptually a small change but it would probably require a largish change
to backend code).

-- 
  Waldek Hebisch
[EMAIL PROTECTED] 


Re: GNU Pascal branch

2006-04-03 Thread Waldek Hebisch
Steven Bosscher wrote:

> The fact is, that the GNU Pascal crew did not want integration with
> gcc the last time this was discussed. GCC, the project, can not just
> suck in every front end out there if the maintainers of that front end
> do not want that.

Not "did not want integration". At least I personally would support
integration very much. But there are practical problems:

1) When gcc releases version n, gcc development works with version n+1.
   At the same time gpc developers typically work with gcc version n-1.
   So, there is substantial work involved to update gpc from gcc version n-1
   to gcc version n+1.
2) Adjusting the gpc development model. In particular, gpc uses a rather
   short feedback loop: new features are released (as alphas) when they are
   ready. This is possible because gpc uses a stable backend, so that users
   are exposed only to front end bugs. With development backends there is a
   danger that normal users will try new front end features only after
   a full gcc release.
3) gcc develops in lockstep, which requires constant attention from
   maintainers. It is unclear if such attention will be available. I
   must say that in the last few years there were frequently weeks in which
   I had no time for gpc work and even some such months.

Also, I have problems with the "all or nothing" attitude to integration.
gpc is a mature front end, and to keep the community alive it has to
regularly deliver bug fixes and enhancements. Realistically, a successful
integration started when the gcc development version is n+1 can deliver a
stable gpc first in version n+2 (the n+1 version almost surely will contain
serious bugs). Which means 2-3 years after starting integration. 2-3 years
without a stable release may disintegrate the gpc community. Also, there is
a risk that integration will just not work (if in-tree gpc turns out
to be too buggy, gcc developers may just skip testing it, resulting in
even more bit-rot).

So, please understand that I do not want to drop work on all-backend
gpc before the success of gpc tied to the current backend is clear. And
since I was assured multiple times that all-backend gpc is unacceptable in
the gcc tree, I have tried to update out-of-tree gpc to support the current
development version of gcc. Since gcc is a moving target, that turned
out to require more effort than I can spend on gpc.

Finally, coming to the original topic: I do not know if Adrian's idea
is a good one. But I think that his intention was to bring gcc
and gpc development closer together, with integration as the ultimate
goal.

-- 
      Waldek Hebisch
[EMAIL PROTECTED] 


Ipa and CONSTRUCTOR in global vars

2006-04-06 Thread Waldek Hebisch
I am updating GNU Pascal to work with gcc-4.x. I have hit the following
problem: `get_base_var' (in ipa-utils.c:224) can not handle a CONSTRUCTOR
and I get an ICE.

AFAICS the CONSTRUCTOR in question comes from the initializer of a global
variable. More precisely, I have a global variable with an initializer.
This initializer is a CONSTRUCTOR. One of its components is an ADDR_EXPR
of another CONSTRUCTOR.

Now, the question is: should `get_base_var' be extended to handle
CONSTRUCTOR nodes or is ADDR_EXPR of a CONSTRUCTOR forbidden from
getting there?

-- 
      Waldek Hebisch
[EMAIL PROTECTED] 


Re: Ipa and CONSTRUCTOR in global vars

2006-04-07 Thread Waldek Hebisch
> > Now, the question is: should `get_base_var' be extended to handle
> > CONSTRUCTOR nodes or is ADDR_EXPR of a CONSTRUCTOR forbidden from
> > getting there?
> 
> The latter.  See PR c++/23171 and PR ada/22533.
> 

Thanks. I have put addressable constructors in temporaries and the
problem went away (the only culprits were Pascal strings).

-- 
      Waldek Hebisch
[EMAIL PROTECTED]