Re: Pascal front-end integration
James A. Morrison wrote:
> I've decided I'm going to try to take the time and cleanup and update
> the Pascal frontend for gcc and try it get it integrated into the
> upstream source.

Nice to hear that you want to work on Pascal. However, did you notice that gpc _is_ changing? In particular, the latest snapshot is gpc-20050217.tar.bz2 (it looks like you started from an earlier version). Since you plan to work on your own branch, it is wise to plan for an easy merge with frontend changes.

Joseph S. Myers wrote:
> If GPC developers are interested in having GPC integrated in GCC 4.1 and
> are willing to have it play by the same rules as the rest of GCC - note
> that the Ada maintainers made substantial changes to how they contributed
> patches to GCC in order to follow usual GCC practice more closely - then
> of course coordination would be desirable.

I would like to see GPC integrated in GCC. However, I feel that playing by the GCC rules I could do substantially less work for GPC than I am doing now. GCC rules pay off when there is a critical mass of developers. My impression was that GPC did not have that critical mass -- so it was better to keep GPC outside of GCC. Jim's contribution can change that.

> If the GPC developers would
> prefer to continue to develop GPC independently of GCC, this need not stop
> integration of some version of GPC in GCC. I would hope in that case,
> however, there would still be better and closer cooperation between the
> two lines of development than there has been after the g95/gfortran fork
> (for example, that the GPC developers would be willing to make the version
> control repository used for actual development accessible to the public so
> individual patches can be extracted and merged as such).

ATM GPC does not use version control. Frank Heckenbach just periodically collects incoming patches and his own changes into releases.

--
Waldek Hebisch
[EMAIL PROTECTED]
Re: Pascal front-end integration
> On Wed, 2 Mar 2005, Waldek Hebisch wrote:
> >
> > > If GPC developers are interested in having GPC integrated in GCC 4.1 and
> > > are willing to have it play by the same rules as the rest of GCC - note
> > > that the Ada maintainers made substantial changes to how they contributed
> > > patches to GCC in order to follow usual GCC practice more closely - then
> > > of course coordination would be desirable.
> >
> > I would like to see GPC integrated in GCC. However, I feel that playing
> > by the GCC rules I could do substantially less work for GPC that I am
> > doing now. GCC rules pay off when there is critical mass of developers.
> > My ipression was that GPC do not have that critical mass -- so it was
> > better to keep GPC outside of GCC. Jim contibution can change that.
>
> Which rules are the problem?

Keeping the whole development in lockstep and stages.

1) With the current GPC I can work out a new feature and test it using the old backend. Such a feature can relatively quickly go out in a development snapshot which is usable by ordinary users (the development snapshot uses the old, tested backend, which shields users from most bugs outside the front end). At the same time I can work on adapting GPC to a newer backend. All that when _I_ have time. Staged development introduces deadlines and delays, which are very disruptive for me (it seems that other people can handle schedules much better than I can). I could easily miss the stage-1 or even both the stage-1 and stage-2 windows...

2) GPC developers are supposed to work on the mainline. While many changes are applied automatically to the whole mainline, some will require manual intervention. While such intervention is usually trivial, it is still a significant distraction from other work (adjusting GPC to follow the mainline, I have found that I spend much less time doing the work in a single batch than trying to follow smaller changes).

3) AFAIU dropping support for multiple backends is considered a pre-condition to inclusion of GPC into GCC. A GPC release would be part of a GCC release. People trying GPC snapshots would automatically get a backend snapshot. I am afraid that for Pascal this means 6-8 months of extra delay between including a feature in GPC and the first bug reports (and consequently more effort for bug fixing).

4) I feel bad not fixing bugs in the release version. But after the merges in stage 1 the mainline is significantly changed. More importantly, large parts which are logically the same are still subject to a lot of trivial changes. So fixing bugs in the release version means double work.

--
Waldek Hebisch
[EMAIL PROTECTED]
Re: Pascal front-end integration
> Certainly porting to 4.x will require private tree codes - for example,
> SET_TYPE is no longer handled in the core code as not being used by any
> integrated language, so it will need to become a private Pascal tree code
> and be lowered in the Pascal gimplification code. There may be other tree

Silly question: how are we supposed to emit debug info about sets? While there are also some problems with gdb, in general gdb currently can handle Pascal sets. Sets are part of the DWARF-2 specification. Even if gcc provided a hook to emit front-end specific debug info, it would still be more total work than having direct backend support.

--
Waldek Hebisch
[EMAIL PROTECTED]
Re: Pascal front-end integration
> > 3) AFAIU dropping support for multiple backends is considered as a
> > pre-condition to inclusion of GPC into GCC. GPC release wold be
> > part of GCC release. People trying GPC snapshots would automatically
> > get backend snapshot. I am affraid that for Pascal that means
> > 6-8 months extra delay between including a feature in GPC and first
> > bug reports (and consequently more effort for bug fixing).
>
> The way to avoid this is for front ends to have internal datastructures
> decoupled from those of the rest of GCC and to have a small piece of code
> that does the conversion (like the Ada front end's gigi), then just
> maintain multiple versions of that small piece of code. This seems to
> work well for Ada.

GPC uses backend data structures when possible. I see no reason to duplicate backend functionality (the Ada front end is written in Ada, so they _had to_ duplicate a lot of infrastructure). We can hide (and are doing that now) most differences in macros. But there are a few tricky questions: if old backends want us to call a function, is it OK to always call a wrapper (empty for the new backend)? Or an actual sample (the nastiest case I have found):

    case FLOAT_EXPR:
    #ifndef GCC_3_4
    case FFS_EXPR:
    #endif

Here we have a moderately sized switch, and moving the case constant to the backend-dependent part would move the whole switch, hence the function containing the switch. Is it acceptable to have something like:

    case FLOAT_EXPR:
    CASE_FFS_EXPR

with `CASE_FFS_EXPR' expanding to nothing for the new backend?

--
Waldek Hebisch
[EMAIL PROTECTED]
Re: Questions about trampolines
Oyvind Harboe wrote:
> Trampolines are strange things... :-)
>
> - Lately e.g. the AMD CPU has added support for making it impossible
> to execute code on the stack. The town isn't big enough for
> both stack based trampolines and stack code execution protection.
> What happens now?
>
> - Do (theoretical?) alternatives to trampolines exist? I.e. something
> that does not generate code runtime.

The main purpose of trampolines is to provide a function with extra data private to a given instance (and to do this at runtime). There are some alternatives: `thick pointers' and double indirection. Both methods associate extra data with function pointers and require a different way of calling functions via function pointers. Thick pointers carry the extra data together with the pointer, making calls faster but copying slower. Also, thick pointers, being bigger than machine addresses, are not compatible with usual pointers. Double indirection slows calls, but the pointer is compatible with a machine address and cheaper to copy around.

If a language wants to interface with C then the language must use the same representation of function pointers as C. For standard C thick pointers are both inefficient and may break a lot of real code. Double indirection is quite compatible (and AFAIK is used by some system ABIs). Still, for C double indirection is more expensive than calling a pointer directly. And for established targets the ABI is fixed, so there is no choice...

One can always imagine a different implementation. For example, assign invalid addresses to function pointers and use a signal handler to sort out what should be called and with what arguments. But I will not advocate such an implementation.

So, given the constraint of compatibility with existing C implementations, trampolines look like the best choice. Trampolines are in fact a standard implementation technique in functional languages, especially for interfacing with C.

But there is no need to generate trampolines on the stack. Namely, one can generate code in a separate area. In C this causes problems with garbage collection, which IMHO can be solved, but requires alloca-like tricks. On the other hand, trampolines in a separate area may provide extra functionality, beyond what nested functions give. For example they can be used to "partially apply" a function, giving it some frozen arguments, and providing the rest at call time. There is some connection with objects: "partial application" can produce a pointer to a method compatible with the usual calling convention.

--
Waldek Hebisch
[EMAIL PROTECTED]
Re: Questions about trampolines
Robert Dewar wrote:
> Waldek Hebisch wrote:
>
> > But there is no need to generate trampolines on the stack. Namely,
> > one can generate code in a separate area. In C this causes problems
> > with garbage collection, which IMHO can be solved, but requre alloca-like
> > tricks. On the other hand trampolines in separate area may provide
> > extra functionality, beyond what nested functions give. For example
> > they can be used to "partially apply" a function, giving it some
> > frozen arguments, and providing the rest at call time.
>
> Trampolines do of course have to be handled in a stack like fashion (to
> get recursion right), so you have to be very careful about allocation and
> deallocation in this separate area. And you still have the cache problem,
> so I don't see what it buys.

1) The stack can be execute-protected, so simple buffer overflow exploits will be detected. On i386 one can use a segment register to make the stack non-executable and still use trampolines.

2) On machines with separate code and data areas, trampolines can be in (writable) code space. Of course it does not help if all code space is non-writable.

3) One limits contention between the code and data caches during calls. Of course the cost of generating a trampoline may be slightly higher, but the cost of a call is likely to be lower. Even for generation, one can limit the number of `mprotect' calls from one per trampoline to one per page (which is likely to be much smaller).

4) One gets extra functionality (partial application), which may significantly reduce the need for pointers to nested functions. Namely, trampolines for partial application may be re-used much more easily than nested functions (which have to be regenerated when entering the scope).

--
Waldek Hebisch
[EMail PROTECTED]
-fstack-check harmful in main thread (on Linux)
I think that the current documentation of `-fstack-check' is unclear. The documentation states that for a single-threaded program `-fstack-check' is not useful. IMHO for the main thread `-fstack-check' is harmful and may cause spurious segfaults. Namely, the following silly program:

    extern int printf(const char * fmt, ...);

    int main(void)
    {
        int dummy;
        printf("0x%lx\n", (unsigned long) &dummy);
        return 0;
    }

compiled with `-fstack-check' and run using the following command line:

    for A in 1 2 3 4 5 6 7 8 9 0; do
      for B in 1 2 3 4 5 6 7 8 9 0; do
        C=${C}a; D=$C ./a.out
      done
    done

crashes multiple times. AFAICS the stack probe generated by gcc accesses the stack below the current stack pointer. If the address is not already allocated, the Linux kernel treats such an access as a segmentation fault (a page fault for an address above the stack pointer is legal, and Linux just allocates more space to the stack). Since the `exec' system call copies environment and arguments to the new stack, one can cause a segmentation fault in _any_ program compiled with `-fstack-check' just by putting an appropriately sized variable in the environment (sometimes even by renaming the program). There are already bug reports about this problem (like PR 10127), but I have not seen a written explanation.

So I would suggest adding a warning, for example:

    Do not use `-fstack-check' for single-threaded programs (or the main
    thread in multi-threaded programs); on some systems (for example
    Linux) it may cause a spurious segmentation fault during startup.

Alternatively, `-fstack-check' should be fixed to use a different method for stack probes (but the goals of the current stack probe look incompatible with the kernel policy for stack extension).

--
Waldek Hebisch
[EMAIL PROTECTED]
Ada subtypes and base types
Jeffrey A Law wrote:
> My suspicions appear to be correct. This never triggers except for
> Ada code and it's relatively common in Ada code. No surprise since
> I don't think any other front-end abuses TYPE_MAX_VALUE in the way
> the Ada front-end does. This wouldn't be the first time we've had
> to hack up something in the generic optimizers to deal with the
> broken TYPE_MAX_VALUE.

What do you mean by "abuse"? TYPE_MAX_VALUE means the maximal value allowed by a given type. For range types it is clearly the upper bound of the range. Of course, the upper bound should be representable, so TYPE_MAX_VALUE <= 2**TYPE_PRECISION - 1 for unsigned types and TYPE_MAX_VALUE <= 2**(TYPE_PRECISION - 1) - 1 for signed types. However, if the language has non-trivial range types you can expect strict inequality. Note that if you do not allow strict inequality above, TYPE_MAX_VALUE would be redundant.

FYI GNU Pascal is using such a representation for range types, so for example:

    type t = 0..5;

will give you TYPE_PRECISION equal to 32 (this is an old decision which tries to increase speed at the cost of space; otherwise 8 would be enough) and TYPE_MAX_VALUE equal to 5. GNU Pascal promotes arguments of operators, so that arithmetic takes place in "standard" types -- I believe Ada is doing the same. BTW, setting TYPE_PRECISION to 3 for the type above used to cause wrong code, so the representation above was forced by the backend.

If you think that such behaviour is "abuse", then why have a separate TYPE_MAX_VALUE at all? How should range types be represented so that optimizers will know about the allowed ranges (and use them!)? And how about debug info?

--
Waldek Hebisch
[EMAIL PROTECTED]
Re: Ada subtypes and base types
Jeffrey A Law wrote:
> On Mon, 2006-02-27 at 20:08 +0100, Waldek Hebisch wrote:
>
> > What do you mean by "abuse"? TYPE_MAX_VALUE means maximal value
> > allowed by given type.
>
> As long as you're *absolutely* clear that a variable with a
> restricted range can never hold a value outside that the
> restricted range in a conforming program, then I'll back off
> the "abuse" label and merely call it pointless :-)
>
> The scheme you're using "promoting" to a base type before all
> arithmetic creates lots of useless type conversions and means
> that the optimizers can't utilize TYPE_MIN_VALUE/TYPE_MAX_VALUE
> anyway. ie, you don't gain anything over keeping that knowledge
> in the front-end.

Pascal arithmetic is essentially untyped: operators take integer arguments and are supposed to give the mathematically correct result (provided all intermediate results are representable in machine arithmetic; overflow is treated as a user error). OTOH for proper type checking the front end has to track the ranges associated with variables. So the "useless" type conversions are needed due to the Pascal standard and backend constraints.

I think that it is easy for the back end to make good use of TYPE_MIN_VALUE/TYPE_MAX_VALUE. Namely, consider the assignment:

    x := y + z * w;

where the variables y, z and w have values in the interval [0,7] and x has values in [0,1000]. Pascal converts the above to the following C-like code:

    int tmp = (int) y + (int) z * (int) w;
    x = (tmp < 0 || tmp > 1000) ? (Range_Check_Error (), 0) : tmp;

I expect VRP to deduce that tmp will have values in [0..56] and eliminate the range check. Also, it should be clear that in the assignment above the arithmetic can be done using any convenient precision.

In principle the Pascal front end could deduce more precise types (ranges), but that would create some extra type conversions and a lot of extra types. Moreover, I assume that VRP can do a better job of tracking ranges than the Pascal front end.

--
Waldek Hebisch
[EMAIL PROTECTED]
Re: Ada subtypes and base types
Robert Dewar wrote:
> Laurent GUERBY wrote:
> > On Mon, 2006-03-13 at 15:31 -0700, Jeffrey A Law wrote:
> >> On Mon, 2006-02-27 at 20:08 +0100, Waldek Hebisch wrote:
> >>
> >>> What do you mean by "abuse"? TYPE_MAX_VALUE means maximal value
> >>> allowed by given type.
> >> As long as you're *absolutely* clear that a variable with a
> >> restricted range can never hold a value outside that the
> >> restricted range in a conforming program, then I'll back off
> >> the "abuse" label and merely call it pointless :-)
> >
> > Variables in a non erroneous Ada program all have their value between
> > their type bounds from the optimizer perspective (the special 'valid
> > case put aside).
>
> Not quite right. If you have an uninitialized variable, the value is
> invalid and may be out of bounds, but this is a bounded error situation,
> not an erroneous program. So the possible effects are definitely NOT
> unbounded, and the use of such values cannot turn a program erroneous.
> (that's an Ada 95 change, this used to be erroneous in Ada 83).

What about initializing _all_ variables? If we allow out-of-range values in one place then we lose a useful invariant. I see no way that a frontend could allow out-of-range values and simultaneously use TYPE_MAX_VALUE for optimization. Also, out-of-range values would make TYPE_MAX_VALUE of little use to the backend.

I see two drawbacks if we initialize all variables:
1) bigger programs and lower runtime efficiency
2) we lose diagnostics about uninitialized variables

I would guess that the cost of the extra initializations will be small, because the backend will remove unused variables and obviously redundant initializations, so only tricky cases will remain. Also, an extra initialization is likely to be cheaper than the extra tests which are needed otherwise. I am more concerned about the loss of diagnostics. That could be resolved if the frontend could mark generated initializations so that the backend will generate code to initialize the variable, but also issue a diagnostic if such an initialization can not be removed (yes, I understand that it is conceptually a small change, but it would probably require a largish change to backend code).

--
Waldek Hebisch
[EMAIL PROTECTED]
Re: GNU Pascal branch
Steven Bosscher wrote:
> The fact is, that the GNU Pascal crew did not want integration with
> gcc the last time this was discussed. GCC, the project, can not just
> suck in every front end out there if the maintainers of that front end
> do not want that.

Not "did not want integration". At least I personally would support integration very much. But there are practical problems:

1) When gcc releases version n, gcc development works on version n+1. At the same time gpc developers typically work with gcc version n-1. So there is substantial work involved in updating gpc from gcc version n-1 to gcc version n+1.

2) Adjusting the gpc development model. In particular, gpc uses a rather short feedback loop: new features are released (as alphas) when they are ready. This is possible because gpc uses a stable backend, so that users are exposed only to front end bugs. With development backends there is a danger that normal users will try new front end features only after a full gcc release.

3) gcc develops in lockstep, which requires constant attention from maintainers. It is unclear if such attention will be available. I must say that in the last few years there were frequently weeks in which I had no time for gpc work, and even some such months.

Also, I have problems with the "all or nothing" attitude to integration. gpc is a mature front end, and to keep its community alive it has to regularly deliver bug fixes and enhancements. Realistically, a successful integration started when the gcc development version is n+1 can deliver a stable gpc first in version n+2 (the n+1 version almost surely will contain serious bugs). This means 2-3 years after starting integration. 2-3 years without a stable release may disintegrate the gpc community. Also, there is a risk that integration will just not work (if the in-tree gpc turns out to be too buggy, gcc developers may just skip testing it, resulting in even more bit-rot).

So, please understand that I do not want to drop work on the all-backend gpc before the success of a gpc tied to the current backend is clear. And since I was assured multiple times that an all-backend gpc is unacceptable in the gcc tree, I have tried to update the out-of-tree gpc to support the current development version of gcc. Since gcc is a moving target, that turned out to require more effort than I can spend on gpc.

Finally, coming to the original topic: I do not know if Adrian's idea is a good one. But I think that his intention was to bring gcc and gpc development closer together, with integration as the ultimate goal.

--
Waldek Hebisch
[EMAIL PROTECTED]
Ipa and CONSTRUCTOR in global vars
I am updating GNU Pascal to work with gcc-4.x. I have hit the following problem: `get_base_var' (in ipa-utils.c:224) can not handle a CONSTRUCTOR and I get an ICE. AFAICS the CONSTRUCTOR in question comes from the initializer of a global variable. More precisely, I have a global variable with an initializer. This initializer is a CONSTRUCTOR. One of its components is an ADDR_EXPR of another CONSTRUCTOR.

Now, the question is: should `get_base_var' be extended to handle CONSTRUCTOR nodes, or is an ADDR_EXPR of a CONSTRUCTOR forbidden from getting there?

--
Waldek Hebisch
[EMAIL PROTECTED]
Re: Ipa and CONSTRUCTOR in global vars
> > Now, the question is: should `get_base_var' be extended to handle
> > CONSTRUCTOR nodes or is ADDR_EXPR of a CONSTRUCTOR forbidden from
> > getting there?
>
> The latter. See PR c++/23171 and PR ada/22533.

Thanks. I have put the addressable constructors in temporaries and the problem went away (the only culprit was Pascal strings).

--
Waldek Hebisch
[EMAIL PROTECTED]