Re: AVR: CC0 to CCmode conversion
> From: Denis Chertykov <[EMAIL PROTECTED]>
>> - possibly something like: ?
>>
>> (define_insn "*addhi3"
>>   [(set (match_operand:HI 0 ...)
>>         (plus:HI (match_operand:HI 1 ...)
>>                  (match_operand:HI 2 ...)))
>>    (set (reg ZCMP_FLAGS)
>>         (compare:HI (plus:HI (match_dup 1) (match_dup 2)) (const_int 0)))
>>    (set (reg CARRY_FLAGS)
>>         (compare:HI (plus:HI (match_dup 1) (match_dup 2)) (const_int 0)))]
>>   ""
>>   "@ add %A0,%A2\;adc %B0,%B2
>>    ..."
>>   [(set_attr "length" "2, ...")])
>
> You have presented a very good example. Are you know any port which
> already used this technique ?
> As I remember - addhi3 is a special insn which used by reload.
> The reload will generate addhi3 and reload will have a problem with
> two modified regs (ZCMP_FLAGS, CARRY_FLAGS) which will be a bad
> surprise for reload. :( As I remember.

Thanks for your patience; now that I understand GCC's spill/reload requirements and limitations a little better, I understand your desire to merge compare-and-branch.

However, as an alternative to merging compare-and-branches to overcome the fact that using a conventional add operation to calculate the effective spill/reload address for FP offsets >63 bytes would corrupt the machine's cc-state that a following conditional skip/branch may depend on, I wonder if it may be worth considering simply saving the status register to a temp register and restoring it after computing the spill/reload address when a large FP offset is required. (Large offsets seem infrequent relative to those within 63 bytes, so this would typically not seem to be required?)

If this were done, then not only could compares be split from branches and all side effects fully disclosed; all compares against 0 resulting from any arbitrary expression calculation could be directly optimized from the start, without relying on a subsequent peephole optimization.

Further, if there were a convenient way to determine whether the now fully exposed cc-status register was "dead" (i.e. having no dependents), then it should be possible to eliminate its preservation when calculating large-FP-offset spill/reload effective addresses, as it would be known that no subsequent conditional skip/branch operations depended on it.

With this same strategy, it may even be desirable to conditionally preserve the cc-status register around all corrupting effective address calculations when the cc-status register is not "dead", as doing so seems potentially more efficient than needing to re-compute an explicit comparison afterward.

(Observing that I'm basically suggesting treating the cc-status register like any other hard register, whose value would need to be saved/restored around any corrupting operation if its value has live dependents; what's preventing GCC's register and value dependency tracking logic from managing its value properly, just as it does for other register-allocated values?)
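For concreteness, the save/restore sequence being proposed might look roughly like this on AVR (a sketch only: the offset handling is simplified, and `__SREG__`/`__tmp_reg__` are the conventional avr-libc names, not anything from this thread):

```
    in   __tmp_reg__,__SREG__    ; save the status register
    subi r28,lo8(-(offset))      ; adjust the Y frame pointer (r29:r28) by a
    sbci r29,hi8(-(offset))      ;   >63-byte offset via subtract-of-negation
    out  __SREG__,__tmp_reg__    ; restore flags for any pending skip/branch
```

The `in`/`out` pair is the only overhead over the plain address computation, which is why eliding it when the flags are dead looks attractive.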
GCC no longer synthesizing v2sf operations from v4sf operations?
Hi!

For

    typedef float v4sf __attribute__((vector_size(16)));
    void foo(v4sf *a, v4sf *b, v4sf *c)
    {
      *a = *b + *c;
    }

we no longer (since 4.0) synthesize v2sf (aka sse) operations for f.i. -march=athlon (not that we were too successful at this in 3.4 - we generated horrible code instead). Instead, for !sse2 architectures we generate standard i387 FP code (with some unnecessary temporaries, but reasonably well). Does this mean the generic manual vectorization with +-* experiment has failed? I.e. are we really supposed to use ?mmintrin.h and friends, or of course rely on auto-vectorization?

Thanks for clarification,
Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
Useless vectorization of small loops
Hi!

On mainline we now use loop versioning and peeling for alignment for the following loop (-march=pentium4):

    void foo3(float * __restrict__ a, float * __restrict__ b,
              float * __restrict__ c)
    {
      int i;
      for (i=0; i<4; ++i)
        a[i] = b[i] + c[i];
    }

which results only in slower and larger code. I also cannot see why we zero the mm registers before loading and why we load them high/low separated:

    .L13:
        xorps   %xmm1, %xmm1
        movlps  (%edx,%esi), %xmm1
        movhps  8(%edx,%esi), %xmm1
        xorps   %xmm0, %xmm0
        movlps  (%edx,%ebx), %xmm0
        movhps  8(%edx,%ebx), %xmm0
        addps   %xmm0, %xmm1
        movaps  %xmm1, (%edx,%eax)
        addl    $1, %ecx
        addl    $16, %edx
        cmpl    %ecx, -16(%ebp)
        ja      .L13

but the point is, there is nothing to win vectorizing the loop in the first place if we do not know the alignment beforehand.

Richard.
Re: Useless vectorization of small loops
On Mon, 21 Mar 2005 13:45:19 +0100 (CET), Richard Guenther <[EMAIL PROTECTED]> wrote:
> Hi!
>
> On mainline we now use loop versioning and peeling for alignment
> for the following loop (-march=pentium4):
>
> void foo3(float * __restrict__ a, float * __restrict__ b,
>           float * __restrict__ c)
> {
>   int i;
>   for (i=0; i<4; ++i)
>     a[i] = b[i] + c[i];
> }
>
> which results only in slower and larger code. I also cannot
> see why we zero the mm registers before loading and why we
> load them high/low separated:
>
> .L13:
>     xorps   %xmm1, %xmm1
>     movlps  (%edx,%esi), %xmm1
>     movhps  8(%edx,%esi), %xmm1
>     xorps   %xmm0, %xmm0
>     movlps  (%edx,%ebx), %xmm0
>     movhps  8(%edx,%ebx), %xmm0
>     addps   %xmm0, %xmm1
>     movaps  %xmm1, (%edx,%eax)
>     addl    $1, %ecx
>     addl    $16, %edx
>     cmpl    %ecx, -16(%ebp)
>     ja      .L13
>
> but the point is, there is nothing to win vectorizing the loop
> in the first place if we do not know alignment before.

Uh, and with -funroll-loops we seem to be lost completely, as we produce peeling/loops for an eight-times-four rolling loop! Where has the information about the loop counter gone?? It looks like vectorization interacts badly with the rest of the loop optimizers.

Ugh.

Richard.
Specifying alignment of pointer targets
Hi!

I'd like to specify (for vectorization) the alignment of the target of a pointer. I.e. I have a vector of floats that I know is suitably aligned and that gets passed to a function like

    typedef afloatp;

    void foo(afloatp __restrict__ a, afloatp __restrict__ b,
             afloatp __restrict__ c)
    {
      int i;
      for (i=0; i<4; ++i)
        a[i] = b[i] + c[i];
    }

now, the obvious

    typedef float __attribute__((aligned(16))) * afloatp;

doesn't have any effect on (*a)'s alignment, and specifying the alignment in the function argument list like

    void foo(float __attribute__((aligned(16))) * __restrict__ a,
             float __attribute__((aligned(16))) * __restrict__ b,
             float __attribute__((aligned(16))) * __restrict__ c)

gets me

    simd.c:12: error: alignment may not be specified for 'a'
    simd.c:13: error: alignment may not be specified for 'b'
    simd.c:14: error: alignment may not be specified for 'c'

which I find confusing. Specifying the alignment of the pointer itself gets me beyond compiling but of course doesn't buy me anything (the results are similar to using the typedef). The only way I was able to convince gcc that the target of a is aligned is using *gasp* an aligned struct like

    struct v4sf { float v[4]; } __attribute__((aligned(16)));

    void foo(struct v4sf * __restrict__ a, struct v4sf * __restrict__ b,
             struct v4sf * __restrict__ c)
    {
      int i;
      for (i=0; i<4; ++i)
        a->v[i] = b->v[i] + c->v[i];
    }

!? Is this really the only way? Is it supposed to be the only way? I remember the thread about alignment specifications for arrays and agree that

    float __attribute__((aligned(16))) x[4];

is ill-formed, but

    float x[4] __attribute__((aligned(16)));

and

    float __attribute__((aligned(16))) *x;

are not?

Thanks for any suggestions,
Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
Re: Specifying alignment of pointer targets
On Mon, 21 Mar 2005, Richard Guenther wrote:
> I'd like to specify (for vectorization) the alignment of the
> target of a pointer. I.e. I have a vector of floats that I
> know is suitable aligned and that get's passed to a function
> like
>
> typedef afloatp;
>
> void foo(afloatp __restrict__ a, afloatp __restrict__ b,
>          afloatp __restrict__ c)
> {
>   int i;
>   for (i=0; i<4; ++i)
>     a[i] = b[i] + c[i];
> }
>
> now, the obvious
>
> typedef float __attribute__((aligned(16))) * afloatp;
>
> doesn't have any effect on (*a)s alignment, and specifying
> the alignment in the function argument list like

In fact,

    #include <stdio.h>

    typedef float __attribute__((aligned(16))) afloat;
    typedef float __attribute__((aligned(16))) * afloatp;
    typedef float afloata[4] __attribute__((aligned(16)));

    void foo2(afloat * __restrict__ a, afloatp __restrict__ b, afloata c)
    {
      printf("%i %i %i %i\n", __alignof__(*a), __alignof__(a[1]),
             __alignof__(a[2]), __alignof__(a[3]));
      printf("%i %i %i %i\n", __alignof__(*b), __alignof__(b[1]),
             __alignof__(b[2]), __alignof__(b[3]));
      printf("%i %i %i %i\n", __alignof__(c[0]), __alignof__(c[1]),
             __alignof__(c[2]), __alignof__(c[3]));
    }

    int main()
    {
      float x;
      foo2(&x, &x, &x);
      return 0;
    }

compiled with -O2 -fno-inline prints

    16 16 16 16
    4 4 4 4
    4 4 4 4

and the first is obviously not what we want, though the element stride seems to still be four in this case. Ideally a solution would give us

    16 4 8 4

though

    16 4 4 4

would be acceptable, too.

Richard.
Re: Useless vectorization of small loops
> Hi!
>
> On mainline we now use loop versioning and peeling for alignment
> for the following loop (-march=pentium4):

we don't yet use loop-versioning in the vectorizer in mainline (we do in autovect). we do apply peeling.

> void foo3(float * __restrict__ a, float * __restrict__ b,
>           float * __restrict__ c)
> {
>   int i;
>   for (i=0; i<4; ++i)
>     a[i] = b[i] + c[i];
> }
>
> which results only in slower and larger code. I also cannot
> see why we zero the mm registers before loading and why we
> load them high/low separated:
>
> .L13:
>     xorps   %xmm1, %xmm1
>     movlps  (%edx,%esi), %xmm1
>     movhps  8(%edx,%esi), %xmm1
>     xorps   %xmm0, %xmm0
>     movlps  (%edx,%ebx), %xmm0
>     movhps  8(%edx,%ebx), %xmm0
>     addps   %xmm0, %xmm1
>     movaps  %xmm1, (%edx,%eax)
>     addl    $1, %ecx
>     addl    $16, %edx
>     cmpl    %ecx, -16(%ebp)
>     ja      .L13
>
> but the point is, there is nothing to win vectorizing the loop
> in the first place if we do not know alignment before.

The vectorizer is currently greedy - it vectorizes as much as it can, with no cost considerations applied yet. Since it is not on by default under any optimization level, and is relatively new and requires as much testing as possible, this seemed like a reasonable approach. Indeed, as we are handling more and more cases (unknown loop bound, misalignment) and introducing more and more overheads, it is starting to be imperative to consider cost and size tradeoffs. (It's also on the vectorizer wish-list - http://gcc.gnu.org/projects/tree-ssa/vectorization.html#vec_todo).

dorit

> Richard.
Re: Useless vectorization of small loops
> On Mon, 21 Mar 2005 13:45:19 +0100 (CET), Richard Guenther
> <[EMAIL PROTECTED]> wrote:
> ...
>
> Uh, and with -funroll-loops we seem to be lost completely, as we
> produce peeling/loops for a eight times four rolling loop! Where is
> the information about the loop counter gone?

the thing is, you don't know at compile time what the alignment of the access you're peeling for is, so the peel-loop has an unknown number of iterations, and consequently the "main" (vectorized) loop has an unknown number of iterations.

dorit

> Ugh.
>
> Richard.
Re: Useless vectorization of small loops
On Mon, 21 Mar 2005, Dorit Naishlos wrote:

> > On Mon, 21 Mar 2005 13:45:19 +0100 (CET), Richard Guenther
> > <[EMAIL PROTECTED]> wrote:
> > ...
> >
> > Uh, and with -funroll-loops we seem to be lost completely, as we
> > produce peeling/loops for a eight times four rolling loop! Where is
> > the information about the loop counter gone??
>
> the thing is you don't know at compile time what is the alignment of the
> access you're peeling for, so the peel-loop has unknown number of
> iterations, and consequently the "main" (vectorized) loop has unknown
> number of iterations.

Ah, ok, I see. I guess there is no way to propagate information on the upper bound for the loop count (which is <= 4 in any case here). Without -funroll-loops we are currently not able to remove the loop exit test, i.e. we keep zeroing the IV at the beginning, adding four and then comparing with four and conditionally branching back... Unrolling removes this, but has bad effects on eventually peeled loops. Of course this is yet another artifact of tree-complete-peeling not being enabled by default / not enablable(?) without generic rtl loop unrolling.

Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
Re: GCC no longer synthesizing v2sf operations from v4sf operations?
Hello!

> typedef float v4sf __attribute__((vector_size(16)));
> void foo(v4sf *a, v4sf *b, v4sf *c)
> {
>   *a = *b + *c;
> }
>
> we no longer (since 4.0) synthesize v2sf (aka sse) operations
> for f.i. -march=athlon (not that we were too successful at this
> in 3.4 - we generated horrible code instead). Instead for !sse2
> architectures we generate standard i387 FP code (with some
> unnecessary temporaries, but reasonably well).

SSE _is_ v4sf. 'gcc -O2 -msse -S -fomit-frame-pointer' produces:

    foo:
        movl    12(%esp), %eax
        movaps  (%eax), %xmm0
        movl    8(%esp), %eax
        addps   (%eax), %xmm0
        movl    4(%esp), %eax
        movaps  %xmm0, (%eax)
        ret

SSE2 is v2df.

Athlon does not handle SSE insns.

Uros.
Re: GCC no longer synthesizing v2sf operations from v4sf operations?
On Mon, 21 Mar 2005, Uros Bizjak wrote:

> Hello!
>
> > typedef float v4sf __attribute__((vector_size(16)));
> > void foo(v4sf *a, v4sf *b, v4sf *c)
> > {
> >   *a = *b + *c;
> > }
> >
> > we no longer (since 4.0) synthesize v2sf (aka sse) operations
> > for f.i. -march=athlon (not that we were too successful at this
> > in 3.4 - we generated horrible code instead). Instead for !sse2
> > architectures we generate standard i387 FP code (with some
> > unnecessary temporaries, but reasonably well).
>
> SSE _is_ v4sf. 'gcc -O2 -msse -S -fomit-frame-pointer' produces:
>
> foo:
>     movl    12(%esp), %eax
>     movaps  (%eax), %xmm0
>     movl    8(%esp), %eax
>     addps   (%eax), %xmm0
>     movl    4(%esp), %eax
>     movaps  %xmm0, (%eax)
>     ret
>
> SSE2 is v2df.
>
> Athlon does not handle SSE insns.

Oh, so we used to expand to 3dnow? I see gcc 3.4 produced:

    foo:
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ebx
        subl    $84, %esp
        movl    12(%ebp), %eax
        movl    16(%ebp), %edx
        [...]
        movq    -64(%ebp), %mm0
        movl    %ebx, -72(%ebp)
        movl    -36(%ebp), %ebx
        movl    %ebx, -68(%ebp)
        pfadd   -72(%ebp), %mm0
        movq    %mm0, -56(%ebp)
        movl    12(%eax), %eax

etc. This doesn't happen anymore with 4.0/4.1.

Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
[autovect] Bootstrap failure on i686
Hi!

Bootstrap of autovect-branch fails on i686 with

    stage1/xgcc -Bstage1/ -B/home/rguenth/ix86/gcc-autovect-210305/i686-pc-linux-gnu/bin/
      -c -O2 -g -fomit-frame-pointer -DIN_GCC -W -Wall -Wwrite-strings
      -Wstrict-prototypes -Wmissing-prototypes -pedantic -Wno-long-long
      -Wno-variadic-macros -Wold-style-definition -Werror -DHAVE_CONFIG_H
      -I. -I. -I/net/alwazn/home/rguenth/src/gcc/gcc-autovect/gcc
      -I/net/alwazn/home/rguenth/src/gcc/gcc-autovect/gcc/.
      -I/net/alwazn/home/rguenth/src/gcc/gcc-autovect/gcc/../include
      -I/net/alwazn/home/rguenth/src/gcc/gcc-autovect/gcc/../libcpp/include
      /net/alwazn/home/rguenth/src/gcc/gcc-autovect/gcc/tree-data-ref.c
      -o tree-data-ref.o

    cc1: warnings being treated as errors
    /net/alwazn/home/rguenth/src/gcc/gcc-autovect/gcc/tree-data-ref.c: In function 'address_analysis':
    /net/alwazn/home/rguenth/src/gcc/gcc-autovect/gcc/tree-data-ref.c:1181: warning: comparison between signed and unsigned

Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
Re: Extra gcc-3.3 java failures when using expect-5.43
> From: Andrew Haley
>
> Kaveh R. Ghazi writes:
> > After I upgraded to expect-5.43, I noticed that I'm getting extra
> > java failures on the 3.3 branch on x86_64-unknown-linux-gnu. Other
> > gcc branches do not have problems.
> >
> > http://gcc.gnu.org/ml/gcc-testresults/2005-03/msg01295.html
> >
> > I'm using an expect-5.43 binary on x86_64 that was compiled on i686
> > if that matters.
> >
> > When I back down to expect-5.42.1, the testsuite results go back to
> > normal. Anyone else seeing this?
>
> Could you post a snippet of the log, please?
>
> Andrew.

There was nothing useful in libjava.log to indicate what the problem is. I reran the testsuite with --verbose and all the errors show up like this:

    spawning command /tmp/kg/33/build/x86_64-unknown-linux-gnu/./libjava/gij ArrayStore
    exp6 file5
    close result is child killed: SIGABRT
    FAIL: ArrayStore execution - gij test

I don't know who/what is sending a SIGABRT. Again, if I back down to expect-5.42.1 everything passes. And it only occurs on the 3.3 branch; other branches and mainline pass fine. So there may be a diff in the testsuite harness. (?)

--Kaveh
--
Kaveh R. Ghazi [EMAIL PROTECTED]
Re: Ada and ARM build assertion failure
On Mar 21, 2005, at 02:54, Nick Burrett wrote:

> This seems to be a reoccurance of PR5677.

I'm sorry, but I can't see any way this is related; could you elaborate?

>     for Aligned_Word'Alignment use
>   -    Integer'Min (2, Standard'Maximum_Alignment);
>   +    Integer'Min (4, Standard'Maximum_Alignment);

This patch is wrong, as it implicitly increases the size of Aligned_Word from 2 to 4 bytes: size is always a multiple of the alignment. However, it is really dubious that you need to change this package, as it is only used for DEC Ada compatibility on VMS systems.

-Geert
Re: GCC no longer synthesizing v2sf operations from v4sf operations?
Richard Guenther wrote:

> Oh, so we used to expand to 3dnow? I see gcc 3.4 produced:
>
> foo:
>     pushl   %ebp
>     movl    %esp, %ebp
>     pushl   %ebx
>     subl    $84, %esp
>     movl    12(%ebp), %eax
>     movl    16(%ebp), %edx
>     [...]
>     movq    -64(%ebp), %mm0
>     movl    %ebx, -72(%ebp)
>     movl    -36(%ebp), %ebx
>     movl    %ebx, -68(%ebp)
>     pfadd   -72(%ebp), %mm0
>     movq    %mm0, -56(%ebp)
>     movl    12(%eax), %eax
>
> etc. This doesn't happen anymore with 4.0/4.1.

IIRC, any generic code that produces MMX or 3DNow! instructions is disabled ATM, because gcc doesn't know how/when to insert the emms/femms instruction. You don't want to mix 3DNow! insns with x87 insns and use the shared 3DNow!/x87 registers without this insn...

Uros.
Re: AVR indirect_jump addresses limited to 16 bits
On Sun, Mar 20, 2005 at 04:29:01PM -0800, Richard Henderson wrote:

> The easiest way is to do this in the linker instead of the compiler.
> See the xstormy16 port and how it handles R_XSTORMY16_FPTR16. This
> has the distinct advantage that you do not commit to the creation of
> an indirect jump until you discover that the target label is outside
> the low 64k.

Looks perfect to me. So we are not the first architecture needing such tricks... AVR would need 3 new relocs, used like this:

    .word pm16(label)

    ldi r30,pm16_lo8(label)
    ldi r31,pm16_hi8(label)

and the linker can do the rest of the magic (add jumps in a section below 64K words if the label is above).

Cc: to Denis, as I may need help actually implementing these changes (you know binutils internals much better than I do).

Thanks,
Marek
Re: Ada and ARM build assertion failure
Geert Bosch wrote:
> On Mar 21, 2005, at 02:54, Nick Burrett wrote:
> > This seems to be a reoccurance of PR5677.
>
> I'm sorry, but I can't see any way this is related, could you elaborate?

Sorry, I completely misread the PR. It is not related.

> >     for Aligned_Word'Alignment use
> >   -    Integer'Min (2, Standard'Maximum_Alignment);
> >   +    Integer'Min (4, Standard'Maximum_Alignment);
>
> This patch is wrong, as it implicitly increases the size of
> Aligned_Word from 2 to 4 bytes: size is always a multiple of the
> alignment.

OK, but if I don't apply the patch, GNAT complains that the alignment should be 4, not 2, and compiling ceases.

> However, it is really dubious you need to change this package, as it
> is only used for DEC Ada compatibility on VMS systems.

OK, but all systems build it, as it is unconditionally defined in Makefile.rtl::GNATRTL_NONTASKING_OBJS. And here it exists in an i686-linux build:

    [EMAIL PROTECTED] rts]$ ls -l s-aux*
    lrwxrwxrwx  1 nick nick    50 Mar 18 12:51 s-auxdec.adb -> /home/nick/riscos-elf/gcc-4.0/gcc/ada/s-auxdec.adb
    lrwxrwxrwx  1 nick nick    50 Mar 18 12:51 s-auxdec.ads -> /home/nick/riscos-elf/gcc-4.0/gcc/ada/s-auxdec.ads
    -r--r--r--  1 nick nick 19835 Mar 18 12:57 s-auxdec.ali
    -rw-rw-r--  1 nick nick 32908 Mar 18 12:57 s-auxdec.o
    [EMAIL PROTECTED] rts]$

Nick.
Re: Ada and ARM build assertion failure
On Mar 21, 2005, at 11:02, Nick Burrett wrote:

> OK, but if I don't apply the patch, GNAT complains that the alignment
> should be 4, not 2 and compiling ceases.

Yes, this is related to PR 17701, as Arno pointed out to me in a private message. Indeed, the patch you used works around this failure and can be used as a kludge. Properly disabling the building of this package would be better, but there isn't a mechanism for that yet. However, this is all entirely unrelated to the failure you're seeing.
Re: AVR: CC0 to CCmode conversion
Richard Henderson <[EMAIL PROTECTED]> writes: > On Sun, Mar 20, 2005 at 01:59:44PM +0300, Denis Chertykov wrote: > > The reload will generate addhi3 and reload will have a problem with > > two modified regs (ZCMP_FLAGS, CARRY_FLAGS) which will be a bad > > surprise for reload. :( As I remember. > > In order to expose the flags register before reload, you *must* Precisely to say while reload_in_progress. > have load, store, reg-reg move, and add operations that do not > modify the flags. They (load, store, add) can modify flags before reload. (while no reload_in_progress) Is this OK ? Denis.
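For illustration, the arrangement Richard Henderson describes might take roughly this shape in a machine description (a rough sketch only, not working AVR code: CC_REGNUM and the reload_completed split are invented here, the constraints are elided as in the thread's own example, and on AVR almost every arithmetic insn really does write SREG, which is exactly the difficulty under discussion):

```
;; An add usable by reload: makes no mention of the flags register at all.
(define_insn "addhi3"
  [(set (match_operand:HI 0 ...)
        (plus:HI (match_operand:HI 1 ...)
                 (match_operand:HI 2 ...)))]
  ""
  "...")

;; A flags-setting twin that can be used once reload is finished,
;; exposing the compare-against-zero result for a following branch.
(define_insn "*addhi3_set_flags"
  [(set (reg:CC CC_REGNUM)
        (compare:CC (plus:HI (match_operand:HI 1 ...)
                             (match_operand:HI 2 ...))
                    (const_int 0)))
   (set (match_operand:HI 0 ...)
        (plus:HI (match_dup 1) (match_dup 2)))]
  "reload_completed"
  "...")
```

The point of the split is that reload only ever sees the first pattern, so it never has to cope with the clobbered flags registers that Denis warns about.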
Re: AVR: CC0 to CCmode conversion
Paul Schlie <[EMAIL PROTECTED]> writes: > > From: Denis Chertykov <[EMAIL PROTECTED]> > >> - possibly something like: ? > >> > >> (define_insn "*addhi3" > >> [(set (match_operand:HI 0 ...) > >>(plus:HI (match_operand:HI 1 ...) > >> (match_operand:HI 2 ...))) > >> (set (reg ZCMP_FLAGS) > >>(compare:HI (plus:HI (match_dup 1) (match_dup 2))) (const_int 0)) > >> (set (reg CARRY_FLAGS) > >>(compare:HI (plus:HI (match_dup 1) (match_dup 2))) (const_int 0))] > >> "" > >> "@ add %A0,%A2\;adc %B0,%B2 > >>..." > >> [(set_attr "length" "2, ...")]) > > > > You have presented a very good example. Are you know any port which > > already used this technique ? > > As I remember - addhi3 is a special insn which used by reload. > > The reload will generate addhi3 and reload will have a problem with > > two modified regs (ZCMP_FLAGS, CARRY_FLAGS) which will be a bad > > surprise for reload. :( As I remember. > > Thanks for your patience, and now that I understand GCC's spill/reload > requirements/limitations a little better; I understand your desire to merge > compare-and-branch. I don't want to merge compare-and-branch because (as Richard said) "explicit compare elimination by creating even larger fused operate-compare-and-branch instructions that could be recognized by combine. I wouldn't actually recommend this though, because branch instructions with output reloads are EXTREMELY DIFFICULT to implement properly.". (IMPOSSIBLE for AVR) I want to have two separate insns compare and branch. > However, as an alternative to merging compare-and-branch's to > overcome the fact that the use of a conventional add operation to > calculate the effective spill/reload address for FP offsets >63 > bytes would corrupt the machine's cc-state that a following > conditional skip/branch may be dependant on; I wonder if it may be > worth considering simply saving the status register to a temp > register and restoring it after computing the spill/reload address > when a large FP offset is required. 
(which seems infrequent relative > to those with <63 byte offsets, so would typically not seem to be > required?) > > If this were done, then not only could compares be split from branches, and > all side-effects fully disclosed; but all compares against 0 resulting from > any arbitrary expression calculation may be initially directly optimized > without relying on a subsequent peephole optimization to accomplish. > > Further, if there were a convenient way to determine if the now fully > exposed cc-status register was "dead" (i.e. having no dependants), then > it should be then possible to eliminate its preservation when calculating > large FP offset spill/reload effective addresses, as it would be known that > no subsequent conditional skip/branch operations were dependant on it. > > With this same strategy, it may even be desirable to then conditionally > preserve the cc-status register abound all corrupting effective address > calculations when cc-status register is not "dead", as it would seem to > be potentially more efficient to do so rather than otherwise needing > to re-compute an explicit comparison afterward? I think that it's a better way. I will test it. > (Observing that I'm basically suggesting treating the cc-status register > like any other hard register, who's value would need to be saved/restored > around any corrupting operation if it's value has live dependants; what's > preventing GCC's register and value dependency tracking logic from being > able to manage its value properly just like it can for other register > allocated values ?) Why not CCmode register ? Denis.
Re: AVR indirect_jump addresses limited to 16 bits
Marek Michalkiewicz <[EMAIL PROTECTED]> writes:

> On Sun, Mar 20, 2005 at 04:29:01PM -0800, Richard Henderson wrote:
>
> > The easiest way is to do this in the linker instead of the compiler.
> > See the xstormy16 port and how it handles R_XSTORMY16_FPTR16. This
> > has the distinct advantage that you do not commit to the creation of
> > an indirect jump until you discover that the target label is outside
> > the low 64k.
>
> Looks perfect to me. So we are not the first architecture needing
> such tricks... AVR would need 3 new relocs, used like this:
>
>     .word pm16(label)
>
>     ldi r30,pm16_lo8(label)
>     ldi r31,pm16_hi8(label)
>
> and the linker can do the rest of the magic (add jumps in a section
> below 64K words if the label is above).
>
> Cc: to Denis, as I may need help actually implementing these changes
> (you know binutils internals much better than I do).

Richard is right. Better to support this in binutils. Right now I'm busy with the CC0 to CCmode conversion. (you must learn binutils ;)

Denis.
Re: AVR indirect_jump addresses limited to 16 bits
> From: Marek Michalkiewicz <[EMAIL PROTECTED]>
>> On Sun, Mar 20, 2005 at 04:29:01PM -0800, Richard Henderson wrote:
>> The easiest way is to do this in the linker instead of the compiler.
>> See the xstormy16 port and how it handles R_XSTORMY16_FPTR16. This
>> has the distinct advantage that you do not commit to the creation of
>> an indirect jump until you discover that the target label is outside
>> the low 64k.
>
> Looks perfect to me. So we are not the first architecture needing
> such tricks... AVR would need 3 new relocs, used like this:
>
> .word pm16(label)
>
> ldi r30,pm16_lo8(label)
> ldi r31,pm16_hi8(label)
>
> and the linker can do the rest of the magic (add jumps in a section
> below 64K words if the label is above).
>
> Cc: to Denis, as I may need help actually implementing these changes
> (you know binutils internals much better than I do).

- yup, and nicer than trying to play games with alignment, etc. And just to double check, using the earlier example:

> int foo(int dest)
> {
>   __label__ l1, l2, l3;
>   void *lb[] = { &&l1, &&l2, &&l3 };
>   int x = 0;
>
>   goto *lb[dest];
>
> l1:
>   x += 1;
> l2:
>   x += 1;
> l3:
>   x += 1;
>   return x;
> }

It would seem that the only time the pm16(label) address would ever be used would be as an initializing constant pointer value assigned to a _label_/function pointer variable, as a CALL/JUMP LABEL instruction would be used to call/jump-to the true entry point directly otherwise. (is that correct?)
Re: Weird behavior in ivopts code
On Fri, 2005-03-18 at 18:25 +0100, Zdenek Dvorak wrote: > Hello, > > > Which appears to walk down the array and try and choose better IV sets. > > Since it walks down the IV array, which is in SSA_NAME_VERSION order. > > Thus two loops which are equivalent in all ways except that they use > > different SSA_NAME_VERSIONs can get different IV sets. > > > > Anyway, the instability of the IV opts code when presented with loops > > that differ only in the version #s in the SSA_NAMEs they use is really > > getting in the way of understanding the performance impact of the > > new jump threading code. I would expect this instability to also make > > it difficult to analyze the IVopts in general. > > there's not much to do about the matter. The choice of ivs in ivopts is > just a heuristics (and this cannot be changed, due to compiler > performance reasons), and as such it is prone to such instabilities. > In fact, both choices of ivs make sense, and they have the same cost > w.r. to the cost function used by ivopts. Sigh. I had a feeling that might be the case. > > Anyway, in this particular case, the patch below should make ivopts > to prefer the choice of preserving the variable 'i' (which is better > from the point of preserving debugging information). Seems reasonable to me given that it preserves debugging information; it also seems to stabilize IV opts significantly for EON (I was seeing +-20% performance swings due to this issue for EON). Consider it pre-approved once you run a bootstrap and regression test. jeff
Re: AVR: CC0 to CCmode conversion
> From: Denis Chertykov <[EMAIL PROTECTED]>
>> Paul Schlie <[EMAIL PROTECTED]> writes:
>> ...
>> (Observing that I'm basically suggesting treating the cc-status register
>> like any other hard register, who's value would need to be saved/restored
>> around any corrupting operation if it's value has live dependants; what's
>> preventing GCC's register and value dependency tracking logic from being
>> able to manage its value properly just like it can for other register
>> allocated values ?)
>
> Why not CCmode register ?

- For what? It would seem that as long as all rtl instruction data-flow dependencies are satisfied, the code will execute the program correctly. (All conditionals are effectively based upon a comparison of a result against 0, and GCC always converts both operands to the same type, so all that's necessary to know is whether that type is signed or unsigned, as even floats compare just like signed integers.)

Therefore it would seem that the only difference between a compare operation and a subtract is that the compare doesn't produce a value result that clobbers one of its operands; otherwise they're identical. So a compare is arguably just an optimization to be used instead of a subtract when the result value isn't needed, or invalid when comparing floats. It would seem?
Runtime-library versioning patch checked in
I have checked in the patch to clean up after GCC's change to version number handling. This should address all reported issues with build, installation, etc. Per Ian's suggestion, I am doing a multilib-ful build with a relative $(srcdir), which may expose more problems, which will be addressed in a follow-up.

It is now safe again to modify configure scripts in the gcc repository; you should not encounter missing .m4 files.

I have not audited the src repository for problematic constructs. The odds are low that any problems exist, since the affected macros and variables all have to do with passing information from the gcc subdirectory to runtime library directories, but just for the record: TL_AC_GCC_VERSION no longer exists, and TL_AC_GXX_INCLUDE_DIR has changed semantics. Also, the top level Makefile no longer passes down certain variables: libsubdir, libstdcxx_incdir, gxx_include_dir, gcc_version, gcc_version_full, and gcc_version_trigger.

I would like to take this opportunity to encourage the libjava, libffi, and libstdc++ maintainers to convert to nonrecursive Makefiles. (In other words, just one Makefile.am/Makefile.in pair at the top level of your subdirectory; perhaps with included fragments in lower-level subdirectories.)

zw
removal of -mflat in GCC 4.0 (sparc)
Esteemed GCC developers:

I am writing to request that the sparc -mflat option be retained in GCC 4.0.

The reason this particular register model is important to me is that I use GCC on the microSPARC-IIep (actually, a SoC variant produced by Infineon called the "copernicus") to build firmware for Sun Ray appliances (ultra-thin clients). These SPARC V8 processors have only two register windows, so the -mflat option is required for performance. (The Sun Ray kernel also doesn't have a handler for register window over/underflow traps.) The register windows are used exclusively for entry into the interrupt handler.

I believe that the uSPARC-IIep is a truly "open-source" processor: the specs for the entire processor are freely downloadable from Sun -- anyone can produce cores with this chip for free. While I'm not a VLSI engineer, I think that everything one needs to go to production on such cores is already freely available from Sun. (So I'm saying that the chip *implementation* is open, not just the instruction set or pin-outs.) Note also that these parts (Copernicus at least) are real, currently shipping products. This is *not* a legacy product, and active new development is continuing on this stuff.

Btw, I think the uSPARC-IIep is also used in certain older JavaStations, which can and do run Linux. I'm not sure how many of them are out there, but Linux seems to be the OS of choice for these boxes. I don't know whether -mflat is used in that environment or not. It certainly seems like it could be, to good effect for performance. (Otherwise nearly every save/restore would involve an interrupt for window overflow and underflow.)

I do realize (and am very grateful) that GCC is a free project, and that the needs of the many may outweigh the needs of the few. (I am also asking my company to make a donation to the FSF in support of GCC, independent of the decision about -mflat. Whether such a donation is made or not depends on management though, so I can't promise anything.) But I also thought it was a real possibility that the GCC team might not be aware of anyone who had a real requirement for the -mflat option.

I certainly appreciate the value of trimming support for defunct architectures and such from a product to reduce code size and complexity. However, in this case such a trim does potentially impact otherwise happy GCC users. I'd be grateful if the team could consider reversing the decision on -mflat. While I can continue to use older versions of GCC, I'd like the ability to update to new compilers later if possible. (It certainly seems like some of the new optimizations in GCC 4.0 might be very useful to us.)

Thank you in advance for your consideration of this request.

-- Garrett
Re: Copyright question: libgcc GPL exceptions
On Mar 19, 2005, at 7:23 AM, Bernd Schmidt wrote: I'm updating the copyrights in the Blackfin port, and I noticed that there appear to be two versions of the wording that allows more-or-less unlimited use of libgcc files. One can be found e.g. in config/arm/crtn.asm: As a special exception, if you link this library with files compiled with GCC to produce an executable, this does not cause the resulting executable to be covered by the GNU General Public License. This exception does not however invalidate any other reasons why the executable file might be covered by the GNU General Public License. the other in config/arm/lib1funcs.asm: In addition to the permissions in the GNU General Public License, the Free Software Foundation gives you unlimited permission to link the compiled version of this file into combinations with other programs, and to distribute those combinations without any restriction coming from the use of this file. (The General Public License restrictions do apply in other respects; for example, they cover modification of the file, and distribution when not linked into a combine executable.) The canonical form can be found in gcc/libgcc2.c: In addition to the permissions in the GNU General Public License, the Free Software Foundation gives you unlimited permission to link the compiled version of this file into combinations with other programs, and to distribute those combinations without any restriction coming from the use of this file. (The General Public License restrictions do apply in other respects; for example, they cover modification of the file, and distribution when not linked into a combine executable.) Is there a particular reason to use one or the other, or are they equivalent? Yes, one is old and should be updated to the canonical form.
Licensing question about libobjc
I notice that libobjc has a different exception than all of the other libraries which have an exception to the GPL. Is there a reason behind this? The difference between the libobjc exception and the one in libgcc/libstdc++ is that the libobjc exception only applies when all sources were compiled with GCC. Thanks, Andrew Pinski
Obsoleting more ports for 4.0.
Hi, First off, Mark, if you think this stuff is too late for 4.0, I'll postpone this to 4.1. Please note that all we have to do is add a few lines to config.gcc as far as printing the "obsolete" message is concerned. Below, I propose to obsolete the following architectures for GCC 4.0 and remove them for 4.1 unless somebody steps up and does *real work*. If you are working on these ports, please send us real patches. If you would like to work on these ports privately, please refrain from telling us that port xxx should be kept. The more ports we have, the more work we would have to do to clean up the infrastructure connecting the middle end and the backends. arc --- No maintainer. PR 8972 hasn't been fixed. GCC miscompiles a function as simple as: int f (int x, int i) { return x << i; } There was some recent interest, like http://gcc.gnu.org/ml/gcc/2004-10/msg00408.html Obsoleting a port on the grounds of a single bug may seem a bit strange. However, PR 8972 implies that nobody is working on the FSF's mainline at least. PR 8973 hasn't been fixed. fr30 The same justification as http://gcc.gnu.org/ml/gcc/2004-06/msg01113.html Nobody showed an interest in keeping this port. i860 A hardware implementation is not currently being manufactured. Jason Eckhardt, the maintainer of i860, has told us that it would be OK for it to go. http://gcc.gnu.org/ml/gcc-patches/2005-03/msg02033.html ip2k PR 20582: libgcc build fails despite some interest, such as http://gcc.gnu.org/ml/gcc/2004-06/msg01128.html ns32k - The same justification as http://gcc.gnu.org/ml/gcc/2004-06/msg01113.html Nobody showed an interest in keeping this port. Kazu Hirata
Re: Licensing question about libobjc
On Mar 21, 2005, at 3:05 PM, Andrew Pinski wrote: I notice that libobjc has a different exception than all of the other libraries which have an exception to the GPL. Is there a reason behind this? The difference between the libobjc exception and the one in libgcc/libstdc++ is that the libobjc exception only applies when all sources were compiled with GCC. I believe if you researched this, you would find that they derive from a common ancestor, and that libobjc just fell behind. Here is rcsdiff -r1.1 -r1.158 libgcc2.c from oldgcc: < /* As a special exception, if you link this library with files > /* As a special exception, if you link this library with other files, > some of which are compiled with GCC, to produce an executable, > this library does not by itself cause the resulting executable > to be covered by the GNU General Public License. which shows some of the history. You can update to the current canonical spelling in libgcc2.c.
RFA; DFP and REAL_TYPE?
So I've been looking at using REAL_TYPE to represent decimal floating point values internally (to implement the C extensions for decimal floating point.) I believe David and yourself had some discussions on this some short time back. Anyway, I've now had a chance to play with this a bit, but I'm not quite sure how well I like the way it's coming out (though the alternative of introducing a new type seems worse, imo). Warning: My thinking is likely clouded by a goal to wire in the decNumber routines to implement the algorithms/encodings for decimal floats (still working through permissions for this to happen though). First, I think we need to avoid going into the GCC REAL internal binary float representation for decimal floats. I'm guessing that going into the binary representation (then performing various arithmetic operations) and then eventually dropping back out to decimal float will end up with exactly the errors that decimal float is trying to avoid in the first place. I'm looking for advice on going forward. I've already hacked up real_value.sig to hold a decimal128 encoded value. This is fugly, and obviously all sorts of things in real.c would break if I started using the various functions for real. But before I put any significant work down the REAL_TYPE path, I thought it best to get guidance. 1) Stick with REAL_TYPE, or is it hopeless and I should create DFLOAT_TYPE? 2) If the recommendation is to stick with REAL_TYPE, is it ok to have some other internal representation? 3) Is there a preferred way to override real_value functions? I'm assuming that even if I use the real_value->sig field to hold the coefficient rather than the ugly hack of holding a decimal128, I'll need to override various functions in real.c to 'do the right thing' for radix 10 reals. I could add a field to real_value to point to a function table that, if present, would be called through. Or simply add various "if (r->b == 10)" checks throughout real.c. Or other. Thoughts/concerns/questions/advice? 
Best Regards, Jon Grimm IBM Linux Technology Center.
A question about java/lang.c:java_get_callee_fndecl.
Hi, I see that the implementation of LANG_HOOKS_GET_CALLEE_FNDECL in Java always returns NULL (at least for the time being). static tree java_get_callee_fndecl (tree call_expr) { tree method, table, element, atable_methods; HOST_WIDE_INT index; /* FIXME: This is disabled because we end up passing calls through the PLT, and we do NOT want to do that. */ return NULL; : : Is anybody planning to fix this? If not, I'm thinking about removing this language hook. The reason is not just cleanup. Rather, it is because I need to change the prototype of get_callee_fndecl and LANG_HOOKS_GET_CALLEE_FNDECL. Currently, fold_ternary has the following call tree. fold_ternary get_callee_fndecl java_get_callee_fndecl If I change fold_ternary to take components of CALL_EXPR, like the address expression of CALL_EXPR and the argument list, instead of CALL_EXPR itself, I would have to change java_get_callee_fndecl to take the first operand of a CALL_EXPR, instead of a CALL_EXPR. It's not that the change is so involved, but it doesn't make much sense to keep something dead up to date. In other words, when I posted the following patch http://gcc.gnu.org/ml/gcc-patches/2005-03/msg02038.html Roger Sayle requested that I keep the call to get_callee_fndecl so that we can "fold" the first operand of a CALL_EXPR to a FUNCTION_DECL. FYI, the above FIXME comes from http://gcc.gnu.org/ml/java-patches/2004-q2/msg00083.html Kazu Hirata
expand_binop misplacing results?
gcc.c-torture/execute/2403-1.c tripped over this on an internal (16 bit) port doing SImode subtract. The comments for expand_binop() explicitly state that you can't rely on the target being set:

If TARGET is nonzero, the value is generated there, if it is convenient to do so.

but we seem to have that expectation later:

/* Main add/subtract of the input operands. */
x = expand_binop (word_mode, binoptab, op0_piece, op1_piece,
		  target_piece, unsignedp, next_methods);

The only place where target_piece is assigned is in the (i > 0) case:

if (i > 0)
  {
    . . .
    emit_move_insn (target_piece, newx);
  }

So it seems to me that in the (i == 0) case we need to see if target_piece happened to receive the result, and if not, assign it. It seems to me that this kind of bug should have been noticed already, so... am I missing something?

2005-03-21  DJ Delorie  <[EMAIL PROTECTED]>

	* optabs.c (expand_binop): Make sure the first subword's result
	gets stored.

Index: optabs.c
===
RCS file: /cvs/gcc/gcc/gcc/optabs.c,v
retrieving revision 1.265
diff -p -U3 -r1.265 optabs.c
--- optabs.c	16 Mar 2005 18:29:23 -	1.265
+++ optabs.c	22 Mar 2005 01:25:35 -
@@ -1534,6 +1534,11 @@ expand_binop (enum machine_mode mode, op
 	    }
 	  emit_move_insn (target_piece, newx);
 	}
+      else
+	{
+	  if (x != target_piece)
+	    emit_move_insn (target_piece, x);
+	}
       carry_in = carry_out;
     }
Re: RFA; DFP and REAL_TYPE?
Jon Grimm wrote: So I've been looking at using REAL_TYPE to represent decimal floating point values internally (to implement the C extensions for decimal floating point.) I believe David and yourself had some discussions on this some short time back. FWIW, I'd rather see you stick with REAL_TYPE. I think that decimal floating point is similar enough to binary floating point to make that worthwhile. What I would hope would work would be modifying real.c to (a) directly support the decimal format for storage, and (b) update the emulation of floating-point operations to work correctly on the decimal format. I definitely agree that translating into the binary format is likely to result in various errors. I don't have an opinion on exactly what method of modifying real.c would be cleanest. -- Mark Mitchell CodeSourcery, LLC [EMAIL PROTECTED] (916) 791-8304
Re: RFA; DFP and REAL_TYPE?
Mark Mitchell wrote: What I would hope would work would be modifying real.c to (a) directly support the decimal format for storage, and (b) update the emulation of floating-point operations to work correctly on the decimal format. I definitely agree that translating into the binary format is likely to result in various errors. I see no reason not to use binary format to store decimal numbers ... I don't have an opinion on exactly what method of modifying real.c would be cleanest.
Re: RFA; DFP and REAL_TYPE?
Robert Dewar wrote: Mark Mitchell wrote: What I would hope would work would be modifying real.c to (a) directly support the decimal format for storage, and (b) update the emulation of floating-point operations to work correctly on the decimal format. I definitely agree that translating into the binary format is likely to result in various errors. I see no reason not to use binary format to store decimal numbers ... I would expect that some decimal floating point values are not precisely representable in the binary format. -- Mark Mitchell CodeSourcery, LLC [EMAIL PROTECTED] (916) 791-8304
Re: RFA; DFP and REAL_TYPE?
Mark Mitchell wrote: I would expect that some decimal floating point values are not precisely representable in the binary format. OK, I agree that decimal floating-point needs its own format. But still you can store the decimal mantissa and decimal exponent in binary format without any problem, and that's probably what you want to do on a machine that does not have native decimal format support. Even on a machine that does have some support for decimal formats or arithmetic, you want to check timing to see if these instructions are actually attractive to use.
Re: RFA; DFP and REAL_TYPE?
Robert Dewar wrote: Mark Mitchell wrote: I would expect that some decimal floating point values are not precisely representable in the binary format. OK, I agree that decimal floating-point needs its own format. But still you can store the decimal mantissa and decimal exponent in binary format without any problem, and that's probably what you want to do on a machine that does not have native decimal format support. I would think that, as elsewhere in real.c, you would probably want to use the same exact bit representation that will be used on the target. This is useful so that you can easily emit assembly literals by simply printing the bytes in hex, for example. Of course, you could do as you suggest (storing the various fields of the decimal number in binary formats), and, yes, on many host machines that would result in more efficient internal computations. But, I'm not confident that the savings you would get out of that would outweigh the appeal of having bit-for-bit consistency between the host and target. In any case, this is rather a detail; the key decision Jon is trying to make is whether or not he has to introduce a new format in real.c, together with new routines to perform operations on that format, to which I think we agree the answer is in the affirmative. -- Mark Mitchell CodeSourcery, LLC [EMAIL PROTECTED] (916) 791-8304
java on darwin8?
Certainly I am doing something wrong, but if not... anyone else seeing this? /Volumes/mrs3/net/gcc-darwin/./gcc/xgcc -B/Volumes/mrs3/net/gcc- darwin/./gcc/ -B/Volumes/mrs3/Packages/gcc-20050128/powerpc-apple- darwin8.0.0/bin/ -B/Volumes/mrs3/Packages/gcc-20050128/powerpc-apple- darwin8.0.0/lib/ -isystem /Volumes/mrs3/Packages/gcc-20050128/powerpc- apple-darwin8.0.0/include -isystem /Volumes/mrs3/Packages/ gcc-20050128/powerpc-apple-darwin8.0.0/sys-include -m64 - DHAVE_CONFIG_H -I. -I../../../../gcc/libjava -I./include -I./gcj - I../../../../gcc/libjava -Iinclude -I../../../../gcc/libjava/include - I../../../../gcc/libjava/../boehm-gc/include -I../boehm-gc/include - I../../../../gcc/libjava/libltdl -I../../../../gcc/libjava/libltdl - I../../../../gcc/libjava/.././libjava/../gcc -I../../../../gcc/ libjava/../zlib -I../../../../gcc/libjava/../libffi/include -I../ libffi/include -Wextra -Wall -O2 -g -O2 -m64 -MT java/lang/sf_fabs.lo -MD -MP -MF java/lang/.deps/sf_fabs.Tpo -c ../../../../gcc/libjava/ java/lang/sf_fabs.c -fno-common -DPIC -o java/lang/.libs/sf_fabs.o In file included from ../../../../gcc/libjava/java/lang/fdlibm.h:23, from ../../../../gcc/libjava/java/lang/sf_fabs.c:20: ../../../../gcc/libjava/java/lang/ieeefp.h:157:2: error: #error Endianess not declared!! In file included from ../../../../gcc/libjava/java/lang/mprec.h:30, from ../../../../gcc/libjava/java/lang/fdlibm.h:25, from ../../../../gcc/libjava/java/lang/sf_fabs.c:20: ../../../../gcc/libjava/java/lang/ieeefp.h:157:2: error: #error Endianess not declared!! In file included from ../../../../gcc/libjava/java/lang/fdlibm.h:25, from ../../../../gcc/libjava/java/lang/sf_fabs.c:20: ../../../../gcc/libjava/java/lang/mprec.h:95: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'one' In file included from ../../../../gcc/libjava/java/lang/sf_fabs.c:20: ../../../../gcc/libjava/java/lang/fdlibm.h:227:3: error: #error Must define endianness make[4]: *** [java/lang/sf_fabs.lo] Error 1
Re: java on darwin8?
On Mar 21, 2005, at 10:10 PM, Mike Stump wrote: Certainly I am doing something wrong, but if not... anyone else seeing this? You want to change the following "#if" in that file, to include __ppc64__: #if defined (__PPC__) || defined (__ppc__) Thanks, Andrew Pinski Who helped with the first porting of libjava in this place.
Re: expand_binop misplacing results?
Hi DJ, On Mon, 21 Mar 2005, DJ Delorie wrote: > 2005-03-21 DJ Delorie <[EMAIL PROTECTED]> > > * optabs.c (expand_binop): Make sure the first subword's result > gets stored. This is OK for mainline, provided that you bootstrap and regression test it somewhere. Thanks. You're quite right that callers can't rely on expand_binop placing the result in target; however, most backends and RTL expansion sequences try hard to honor the request. There does appear to be a bug in the "i == 0" case, which has probably never been an issue, as most targets are either able to place the result of the addition/subtraction in the requested destination or provide their own adddi3/addti3 expanders. Thanks for finding/fixing this. This might be a candidate for backporting to the GCC 4.0 branch if we can find a target/testcase that triggers a problem. Roger --
Re: Obsoleting more ports for 4.0.
Kazu Hirata wrote: Hi, First off, Mark, if you think this stuff is too late for 4.0, I'll postpone this to 4.1. Please note that all we have to do is add a few lines to config.gcc as far as printing the "obsolete" message is concerned. I think that if you get no objections to your message within a week, it's fine to obsolete these for 4.0. You might consider leaving the ports in 4.1 until after 4.0 has been out for a month or so. That will give users a chance to speak up, if they really do want these old ports. -- Mark Mitchell CodeSourcery, LLC [EMAIL PROTECTED] (916) 791-8304
Re: Copyright question: libgcc GPL exceptions
Mike Stump wrote: The canonical form can be found in gcc/libgcc2.c: [...] (The General Public License restrictions do apply in other respects; for example, they cover modification of the file, and distribution when not linked into a combine executable.) (Been wondering about this for a while...) Possibly this is one of those North American dialect things, but to this (non-American) English speaker this canonical form appears to contain a typo. "[...] when not linked into a combined executable", surely? John
ICE in gcc-4.0-20050305 for m68k
I tried building glibc-2.3.4 for m68k-unknown-linux-gnu with gcc-4.0-20050305, and the compiler fell over in iconv/skeleton.c: In file included from iso-2022-cn-ext.c:657: ../iconv/skeleton.c: In function 'gconv': ../iconv/skeleton.c:801: internal compiler error: output_operand: invalid expression as operand ... make[2]: Leaving directory `...build/m68k-unknown-linux-gnu/gcc-4.0-20050305-glibc-2.3.4/glibc-2.3.4/iconvdata' I'll post a proper problem report when I get a chance, this is just a little heads-up. - Dan -- Trying to get a job as a c++ developer? See http://kegel.com/academy/getting-hired.html
Re: Copyright question: libgcc GPL exceptions
John Marshall <[EMAIL PROTECTED]> writes: > Mike Stump wrote: >> The canonical form can be found in gcc/libgcc2.c: >> >> [...] (The General Public License restrictions >> do apply in other respects; for example, they cover modification of >> the file, and distribution when not linked into a combine >> executable.) > > (Been wondering about this for a while...) Possibly this is one of > those North American dialect things, but to this (non-American) > English speaker this canonical form appears to contain a typo. > > "[...] when not linked into a combined executable", surely? I think you are right, but changes to this wording need to be run by the FSF. zw
fyi: gcc_update merged to release branches
I've merged the gcc_update --silent changes, and Andreas' quoting fix, from mainline to the 3.4 and 4.0 branches. zw
gcc with arm -vfp instructions
hi, I'd like to know whether gcc can generate VFP instructions. #include <stdio.h> int main(void) { float a = 88.88f, b = 99.99f, c = 0; c = a + b; printf("%f\n", c); return 0; } I used the following option to compile the above program: arm-elf-gcc -mfp=2 -S new.c but it produces the new.s file without any VFP instructions. How do I generate them using gcc, or do I have to use inline assembly for this operation? If so, will VFP be supported by binutils and the gdb simulator? thanks --