Re: How to deal with unrecognizable RTL code
Hi Jeff,

> OK. Subregs of MEMs are a little special. In general, you don't want to
> see these at all. I wouldn't necessarily expect this to be a
> PROMOTE_MODE problem.
>
> As others have suggested, find the first pass where you see a (subreg
> (mem)) expression. That will be a big help. Extra credit if you can
> correlate the insns in that dump with insns in the previous dump file
> which would show how the pass in question modified the RTL incorrectly.
> Given that it should be relatively easy to start digging into the problem.

Yes, you are right. I am now rewriting my movm pattern, using the ARM movm pattern as a reference, because the current one does not perform well.

> This looks like a different problem. What pass generates insn 17? What
> does insn 17 look like in the prior pass? If r14 is your stack/frame
> pointer, this might point to a problem in how your port interacts with
> register allocation/reload as reload can replace a pseudo with an
> equivalent memory location.

Here is the RTL in *.22.lreg:

(insn:HI 12 13 20 0 (set (reg/v/f:SI 91 [ frame ])
        (reg:SI 5 R5 [ frame ])) 14 {move_regs_si} (nil)
    (expr_list:REG_DEAD (reg:SI 5 R5 [ frame ])
        (nil)))

(insn:HI 20 12 11 0 (set (reg:SI 88 [ D.1221 ])
        (mem/s:SI (plus:SI (reg/v/f:SI 91 [ frame ])
                (const_int 4 [0x4])) [13 .mode+0 S4 A32])) 11 {load_si} (insn_list:REG_DEP_TRUE 12 (nil))
    (nil))

You can see the two insns are right: Reg91 <-- R5, Reg88 <-- (R91 + 4).

But in *.23.greg, the RTL is wrong:

(insn:HI 12 13 545 0 (set (reg:SI 5 R5)
        (reg:SI 5 R5 [ frame ])) 14 {move_regs_si} (nil)
    (nil))

(insn 545 12 20 0 (set (mem/f:SI (plus:SI (reg/f:SI 14 R14)
                (const_int 172 [0xac])) [35 frame+0 S4 A32])
        (reg:SI 5 R5)) 8 {store_si} (nil)
    (nil))

(insn:HI 20 545 11 0 (set (reg:SI 6 R6 [orig:88 D.1221 ] [88])
        (mem/s:SI (plus:SI (mem/f:SI (plus:SI (reg/f:SI 14 R14)
                        (const_int 172 [0xac])) [35 frame+0 S4 A32])
                (const_int 4 [0x4])) [13 .mode+0 S4 A32])) 11 {load_si} (insn_list:REG_DEP_TRUE 12 (nil))
    (nil))

Why is there a useless insn, R5 <-- R5? The insn after it is right: it stores register R5 to a specific memory location. The final insn is weird. We can see that the location (mem/f:SI (plus:SI (reg/f:SI 14 R14) (const_int 172 [0xac]))) holds the value of R5. I don't know why it was placed there or what the displacement is based on. Does GCC merge RTL like this? If it does, on what basis? I mean, if such merging is happening, some of the generated RTL will be incorrect.

Thank you all for your help!

Daniel.Tian

___
Best Regards
Daniel Tian
Mavrix Technology, Inc.
Address: 200 Zhangheng Road, #3501, Building 3, Zhangjiang Hi-tech Park, Shanghai, P.R.China (201204)
Tel: 86(21)51095958 - 8125
Fax: 86(21)50277658
email: daniel.t...@mavrixtech.com.cn
www.mavrixtech.com
Re:How to deal with unrecognizable RTL code
Hi,

I checked the MIPS and ARM ports: when their cc1 binaries are opened in the Insight debug tool, the source list contains the mips.md and arm.md files. That is convenient, because breakpoints can be set in them. The md file of my port doesn't appear in Insight. How can I make it show up?

Best Regards
Daniel Tian
Re: Phase 1 of gcc-in-cxx now complete
On Sat, Jun 27, 2009 at 6:25 PM, Ian Lance Taylor wrote: > Richard Guenther writes: > >> All that above said - do you expect us to carry both vec.h (for VEC in >> GC memory) and std::vector (for VECs in heap memory) (and vec.h >> for the alloca trick ...)? Or do you think we should try to make the GTY >> machinery C++ aware and annotate the standard library (ick...)? > > I expect us to write a GC allocator, and use that with std::vector. > This will require more hooks into the GC code, but I think it is doable. I'm curious about this. I thought c++ wasn't a garbage collected language on purpose. Why does GCC have one? What purpose does it serve? I'm not suggesting otherwise, but just trying to learn more about the way things are done.
Re: Phase 1 of gcc-in-cxx now complete
NightStrike wrote: > On Sat, Jun 27, 2009 at 6:25 PM, Ian Lance Taylor wrote: >> Richard Guenther writes: >> >>> All that above said - do you expect us to carry both vec.h (for VEC in >>> GC memory) and std::vector (for VECs in heap memory) (and vec.h >>> for the alloca trick ...)? Or do you think we should try to make the GTY >>> machinery C++ aware and annotate the standard library (ick...)? >> I expect us to write a GC allocator, and use that with std::vector. >> This will require more hooks into the GC code, but I think it is doable. > > I'm curious about this. I thought c++ wasn't a garbage collected > language on purpose. Why does GCC have one? What purpose does it > serve? I'm not suggesting otherwise, but just trying to learn more > about the way things are done. This is for gcc's internal use, not for the users of gcc. We have to manage memory inside gcc somehow, so we use a garbage collector. Andrew.
GCC feature req: warn when bitops exceed type size (was: conntrack untracked match is broken)
Hi gcc list,

I am forwarding the bug report below here(*) to make you aware of a feature that I consider important to have in a future gcc.

(*) Should I have posted to bugzilla instead? I don't feel like setting up a bugmenot, though.

-- Forwarded message --
Date: Mon, 29 Jun 2009 14:34:10
From: Patrick McHardy
To: Jan Engelhardt
Cc: Netfilter Developer Mailing List, Philip Craig
Subject: Re: conntrack untracked match is broken (kernel patch)

>On Monday 2009-06-22 08:31, Philip Craig wrote:
>>The problem is that state_mask in 'struct xt_conntrack_mtinfo1' is
>>only 8 bit, but XT_CONNTRACK_STATE_UNTRACKED == 256.
>>Unfortunately, gcc doesn't warn about this for '|=', only for '='.

[i.e. uint8_t x = 0; x |= 256; ]

>
>I smell a gcc-missing-feature there.
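The one-line reproducer in the report can be expanded into a small C sketch. This is only an illustration of the truncation, not code from the kernel or from gcc; the names or_into_u8 and check_truncation are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* OR MASK into an 8-bit field.  The OR itself is computed in a wider
   type; converting the result back to uint8_t keeps only bits 0-7.  */
uint8_t or_into_u8(uint8_t field, unsigned int mask)
{
    field |= mask;
    return field;
}

/* Illustrative self-check: bit 8 (value 256) is silently lost,
   while bit 7 (value 128) still fits in the field.  */
void check_truncation(void)
{
    assert(or_into_u8(0, 256) == 0);
    assert(or_into_u8(0, 128) == 128);
}
```

With an 8-bit state_mask, OR-ing in a flag whose value is 256 therefore leaves the field unchanged, which matches the behaviour described in the forwarded report.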
Re: GCC feature req: warn when bitops exceed type size (was: conntrack untracked match is broken)
On Mon, Jun 29, 2009 at 3:10 PM, Jan Engelhardt wrote:
> Hi gcc list,
>
>
> I am forwarding below's bugreport here(*), to implicitly make aware
> of a feature that I deem important to have in a future gcc.
>

-Wconversion should say

t.c:4: warning: conversion to ‘unsigned char’ from ‘int’ may alter its value

>
> (*) should have posted to bugzilla instead? Don't feel like setting
> up a bugmenot tho..
>
> -- Forwarded message --
> Date: Mon, 29 Jun 2009 14:34:10
> From: Patrick McHardy
> To: Jan Engelhardt
> Cc: Netfilter Developer Mailing List,
> Philip Craig
> Subject: Re: conntrack untracked match is broken (kernel patch)
>
>>On Monday 2009-06-22 08:31, Philip Craig wrote:
>>>The problem is that state_mask in 'struct xt_conntrack_mtinfo1' is
>>>only 8 bit, but XT_CONNTRACK_STATE_UNTRACKED == 256.
>>>Unfortunately, gcc doesn't warn about this for '|=', only for '='.
>
> [i.e. uint8_t x = 0; x |= 256; ]
>
>>
>>I smell a gcc-missing-feature there.
>
BIGGEST_ALIGNMENT vs g++ compat test expectation on aix ?
Working on a collect2 related patch resubmission for ppc-aix, I stumbled on regressions in the c++ series for a problem unrelated to my change. Not sure how you'd typically deal with these, so providing the datapoints here.

tmpdir-g++.dg-struct-layout-1/t027 and a couple of other tests failed from constructs like

  struct S2661 {
    union{v16sf b[31];Tal2short c __attribute__((aligned (4)));}a;
    float d;
  };

  struct S2661 s2661;
  struct S2661 a2661[5];

  info.als = __alignof__ (s2661);
  if (&a2661[3] & (info.als - 1))
    FAIL

... the alignment check is up to that of v16sf (64 bytes), and this occasionally fails because csects are "only" aligned in accordance with

  rs6000.h:#define BIGGEST_ALIGNMENT 128

The discrepancy is visible straight from the assembly output for e.g. t027_y (out of the unchanged compiler as well):

        .csect .data[RW],4
                         ^^^
        .align 6
               ^^^
a2661:
        .space 10240

Olivier
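The check the test performs, address & (alignment - 1), can be written as a small standalone C sketch. The helper names misaligned_bits and is_aligned are invented for this illustration and do not come from the testsuite. The sketch assumes the usual reading of the numbers involved: BIGGEST_ALIGNMENT is expressed in bits, so 128 means 16 bytes, which matches a csect alignment of 2^4, while .align 6 asks for 2^6 = 64 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the low address bits that violate ALIGN, where ALIGN is a
   power of two in bytes.  A result of 0 means ADDR is aligned.  */
uintptr_t misaligned_bits(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & (uintptr_t)(align - 1);
}

/* Convenience wrapper for real objects.  */
int is_aligned(const void *p, size_t align)
{
    return misaligned_bits((uintptr_t)p, align) == 0;
}
```

An address such as 144 is fine for a 16-byte requirement but leaves 16 stray bits for a 64-byte one, so an object placed in a section that only guarantees 16-byte alignment passes the 64-byte check only when it happens to land on a suitable address.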
PR 23296: Strange -O3 -finstrument-functions behaviour
A colleague recently came across the interaction between -finstrument-functions and inline functions mentioned here:

    http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23296

As the bug submitter says, the problem is that for something like:

    #include <stdio.h>
    void g (void) { printf("Here\n"); }
    int main (void) { g (); return 0; }

compiled with "-O3 -finstrument-functions", you get two successive calls to __cyg_profile_func_enter _with the same arguments_. That is, both calls say that they are entering main():

    ...
    __cyg_profile_func_enter (&main, );
    ...
    __cyg_profile_func_enter (&main, );
    ...
    __cyg_profile_func_exit (&main, );
    ...
    __cyg_profile_func_exit (&main, );
    ...

Is this really the intended behaviour? Andrew closed the bug as invalid, saying that this is what we expect, but the docs seem to suggest that we ought to do something like:

    ...
    __cyg_profile_func_enter (&main, ...);
    ...
    __cyg_profile_func_enter (&g, ...);
    ...
    __cyg_profile_func_exit (&g, ...);
    ...
    __cyg_profile_func_exit (&main, ...);
    ...

instead. [I'm going off:

    This instrumentation is also done for functions expanded inline in other
    functions. The profiling calls will indicate where, conceptually, the
    inline function is entered and exited. This means that addressable
    versions of such functions must be available. If all your uses of a
    function are expanded inline, this may mean an additional expansion of
    code size. If you use @samp{extern inline} in your C code, an
    addressable version of such functions must be provided. (This is
    normally the case anyways, but if you get lucky and the optimizer always
    expands the functions inline, you might have gotten away without
    providing static copies.)

Although it doesn't explicitly say _why_ we need addressable versions, the context suggests to me that we're supposed to be passing them as the "this_fn" argument to the hooks.]

Richard
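For readers who want to observe the hook arguments themselves, here is a minimal sketch of a pair of hooks that record what they are passed. The hook names and their (this_fn, call_site) parameters are the ones the -finstrument-functions documentation describes; the trace buffer, record and trace_at are hypothetical helpers made up for this example.

```c
#include <assert.h>
#include <stddef.h>

#define TRACE_MAX 64

struct trace_event {
    void *fn;        /* this_fn argument passed to the hook */
    void *call_site; /* call_site argument passed to the hook */
    int   entered;   /* 1 for an enter event, 0 for an exit event */
};

struct trace_event trace_log[TRACE_MAX];
size_t trace_len;

/* Append one event; events beyond TRACE_MAX are dropped.  The helpers
   are excluded from instrumentation so they do not call the hooks.  */
__attribute__((no_instrument_function))
static void record(void *fn, void *call_site, int entered)
{
    if (trace_len < TRACE_MAX) {
        trace_log[trace_len].fn = fn;
        trace_log[trace_len].call_site = call_site;
        trace_log[trace_len].entered = entered;
        trace_len++;
    }
}

/* Return the I-th recorded event; I must be below trace_len.  */
__attribute__((no_instrument_function))
const struct trace_event *trace_at(size_t i)
{
    assert(i < trace_len);
    return &trace_log[i];
}

__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    record(this_fn, call_site, 1);
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    record(this_fn, call_site, 0);
}
```

Linking these into the example program and comparing the recorded fn values for the nested events would show whether the inlined g() is reported as g or as main.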
Re: GCC feature req: warn when bitops exceed type size (was: conntrack untracked match is broken)
2009/6/29 Richard Guenther:
> On Mon, Jun 29, 2009 at 3:10 PM, Jan Engelhardt wrote:
>> Hi gcc list,
>>
>>
>> I am forwarding below's bugreport here(*), to implicitly make aware
>> of a feature that I deem important to have in a future gcc.
>>
>
> -Wconversion should say
>
> t.c:4: warning: conversion to ‘unsigned char’ from ‘int’ may alter its value

From GCC 4.4.0 it says:

bug.c:6: warning: conversion to ‘uint8_t’ from ‘int’ may alter its value

More info about the new Wconversion: http://gcc.gnu.org/wiki/NewWconversion

Cheers,

Manuel.
Re: Phase 1 of gcc-in-cxx now complete
On Mon, Jun 29, 2009 at 5:39 AM, NightStrike wrote: > On Sat, Jun 27, 2009 at 6:25 PM, Ian Lance Taylor wrote: >> Richard Guenther writes: >> >>> All that above said - do you expect us to carry both vec.h (for VEC in >>> GC memory) and std::vector (for VECs in heap memory) (and vec.h >>> for the alloca trick ...)? Or do you think we should try to make the GTY >>> machinery C++ aware and annotate the standard library (ick...)? >> >> I expect us to write a GC allocator, and use that with std::vector. >> This will require more hooks into the GC code, but I think it is doable. > > I'm curious about this. I thought c++ wasn't a garbage collected > language on purpose. Many people are confused about what the 'purpose' is. > Why does GCC have one? What purpose does it > serve? I'm not suggesting otherwise, but just trying to learn more > about the way things are done. > This is usually an ingredient for long, heated, debates :-) -- Gaby
Re: Register Pressure in Instruction Level Parallelism
Michael Kruse wrote:

> Hello GCC developers,
>
> I am going to write a Master's thesis about instruction scheduling based on the technique presented in [1]. This will also include an implementation.
>
> The basic idea is to have an additional pass on the data dependency graph before instruction scheduling and register allocation take place. It estimates the minimal (register sufficiency) and maximal (register saturation) number of registers required by schedules based on that graph.
>
> If the register sufficiency is higher than the physical number of registers, spill code is added to the graph.

For me, that is the most interesting part. Unfortunately, Touati (as I remember) says practically nothing about this in his thesis.

> If the register saturation is higher than the physical number of registers, artificial dependencies are added to the graph, such that the instruction scheduler is forced to generate a schedule with fewer registers.
>
> The aim is that the instruction scheduler can focus on the optimal arrangement of instructions to exploit a maximal amount of instruction-level parallelism. [1] also suggests heuristics for all these problems. According to the author, these are "nearly optimal" in practice. The heuristics for estimating register sufficiency and saturation are both optimistic, meaning that it might still be necessary to add more spill code to the final code. Hence, this technique is just an optimization pass and doesn't replace existing register allocation or instruction scheduling passes.
>
> [1] also has a part about loop scheduling, but my thesis will be about basic blocks only. See [2] for a presentation of this technique.
>
> So, now my questions: How much do you think this could improve compiled code speed? Would the current GCC/YARA benefit from such an optimization pass at all?

I think nobody can answer these questions until it is implemented.

I have also been working on register-pressure-sensitive insn scheduling for the last few months. I started with the simple heuristic described in Muchnick's book: when the current register pressure is high, choose an insn that decreases the pressure, and when the pressure is low, use the usual scheduling heuristic. This did not work at all (it increased code size by about 3-5% for x86). The problem was that even when the current register pressure is low at a given point, issuing an insn at this point could still increase register pressure at later points. So I am now working on evaluating register pressure at later points too (some form of lookahead). This approach is similar to what Touati proposed (but without actually adding additional DD arcs).

If you are going to work on this project, here is some small advice about evaluating register sufficiency. I found that register pressure is already practically minimized before insn scheduling (I suspect that this is mostly done in TER). By the way, it also means that tree balancing (which is actually a much more complicated issue, because it is not a tree but a DAG) is not necessary for the backend as it was done in Preston Briggs's project (and mentioned in Ken Zadeck's proposed pass stack).

> What are the chances that this could get into the main GCC tree if it shows up to be an improvement?

I don't see any problem getting the code into the main GCC tree if you get even a 1-2% improvement. There are some technical questions (like the code fitting GCC coding standards) and a commitment to maintain the code, but these problems can definitely be overcome.

> There has been a short discussion on this mailing list already [3] some years ago (Note if you read this: intLP has been used to compare the heuristic to the optimal result only). To my knowledge, there hasn't been any more on this topic yet (anywhere).
>
> I'd prefer to implement this for GCC, but my advisor wants me to do it for the university's own compiler. Therefore I could also use arguments for why to do it for GCC.

Personally, I'd love to see this work done in GCC. I believe research work in the compiler field should be done in real industrial compilers, because that is the final target of the research. I have seen many times that researchers report big improvements for their algorithms because they are using toy compilers, but when the same algorithms are implemented in GCC they give practically nothing. For me, the criterion for a toy compiler is how good it is on some generally used compiler benchmark such as SPEC (the better the existing compiler's performance, the harder it is to get the same percentage improvement). So if your university compiler's performance is competitive on, for example, SPECFP2000, I think you could use it to measure a real optimization impact. On the other hand, you will definitely get better numbers (and spend less time) using a toy compiler than GCC, and your research would probably look more successful from the academic point of view.

> [1] http://www.prism.uvsq.fr/~touati/thesis.html
> [2] http://tel.archives-ouvert
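The simple pressure-switching heuristic described in this message (the one reported not to work well) can be sketched in a few lines of C. This is a hedged illustration only: struct ready_insn, its fields, pick_insn and the pressure_limit parameter are hypothetical and are not GCC's scheduler data structures or interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one instruction in the ready list.  */
struct ready_insn {
    int priority;        /* usual scheduling priority, higher is better */
    int pressure_delta;  /* registers born minus registers dying */
};

/* Pick an index into READY[0..N-1].  When the current pressure has
   reached the limit, prefer the insn that lowers pressure the most and
   break ties by priority; otherwise use priority alone.  */
size_t pick_insn(const struct ready_insn *ready, size_t n,
                 int cur_pressure, int pressure_limit)
{
    size_t best = 0;
    size_t i;

    assert(n > 0);
    for (i = 1; i < n; i++) {
        if (cur_pressure >= pressure_limit) {
            if (ready[i].pressure_delta < ready[best].pressure_delta
                || (ready[i].pressure_delta == ready[best].pressure_delta
                    && ready[i].priority > ready[best].priority))
                best = i;
        } else if (ready[i].priority > ready[best].priority) {
            best = i;
        }
    }
    return best;
}
```

The sketch only looks at the pressure at the current point, which is exactly the weakness the message points out: an insn chosen while pressure is low can still raise pressure at later points, hence the move toward some form of lookahead.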
Re: Register Pressure in Instruction Level Parallelism
Dave Korn wrote:
> Michael Kruse wrote:
>> So, now my questions: How much do you think could this could improve compiled code speed? Would the current GCC/YARA benefit from such an optimization pass at all? What are the chances that this could get into the main GCC tree if it shows up to be an improvement?
>
> One of the major problems in gcc is the intertangling of instruction selection with register allocation and spill generation. If these could be separated it would almost certainly generate better code and be welcomed with open arms!

Although I mostly agree with this, I am not sure the claim holds universally. As an example, choosing the x86 add insn instead of lea earlier could result in an additional spill and, vice versa, choosing lea could result in the use of 3 different registers and bigger code.

>> I'd prefer to implement this for the gcc, but my advisor wants me to do it for the university's own compiler. Therefore I could also need arguments why to do it for the GCC.
>
> Because destroying reload(*) would be an incalculable public service and your name will be remembered in history as the one who slew the dragon? ;-)
Re: Register Pressure in Instruction Level Parallelism
Dave Korn wrote:
> Albert Cohen wrote:
>> Unfortunately, the state of the art (more recent than the thesis referenced in the original email, see Touati's web page) is limited to basic block and software-pipelining scopes, and limited to scheduling. Compared to the tasks currently managed by reload, it certainly misses a whole bunch of instruction selection and immediate operand/address mode corner case problems (depending on the target). It also misses global scheduling, but extended BB scheduling is not very hard to approximate, as well as superblock scheduling. I'm not at all a knowledgeable person to tell you what to do in the case of GCC, but for sure saturation/sufficiency-based approaches are not sufficient to kill the dragon.
>
> In a brief exchange I had with Michael off-list, we discussed that. I observed that of the things that reload does, constraint-satisfaction/insn-variant-selection is its primary purpose, and spill/reload code generation is something it often does suboptimally (and secondary reloads even worse). If a clever pass running before reload could insert explicit spill/reload code at well-chosen places (bearing in mind class-based register pressure), it could relieve reload of the necessity to generate its own spill code most of the time, and let it just do what it does best.

IRA actually already inserts spill code in the most important places (on loop borders). Besides loop regions, IRA could be extended to other regions (and even to parts of basic blocks, to relieve pressure inside the blocks). I am going to work on this to evaluate how much it could give. Most spill/restore code, at least for x86, is generated by reload because one insn operand must be a register. Some cooperation between IRA and reload on this issue has, I hope, the potential to improve code performance further.

> So, yes, it doesn't kill the dragon, reload would need to be retained and would still need the last-resort ability to do all the stuff it does now - but mostly it wouldn't have to in practice, and a sleeping dragon is still an improvement on a wide-awake one that's stomping round in a furious bad temper burning everything in sight.
>
> Reload would definitely generate better code if it was fed stuff that avoided exercising its weakest points; sounds like a pre-conditioning pass based on your techniques might work really well.
Re: Updating Autoconf and Automake versions used in GCC and src
> "Ralf" == Ralf Wildenhues writes: Ralf> I'd be grateful about feedback as to whether the general plan is Ralf> acceptable for everyone; thanks. Sounds good to me. Please don't forget Classpath. Ralf> I'd prefer to do the transition on the GCC trunk rather than in a Ralf> branch, unless there is opposition to this. Keeping a branch in sync Ralf> with the src tree seems awkward. BTW, should I keep any other trees in Ralf> sync with these two (archer, out-of-tree libffi)? You don't need to bother about the archer repository. We'll handle it via merges. Tom
Re: Updating Autoconf and Automake versions used in GCC and src
Hi Tom, * Tom Tromey wrote on Mon, Jun 29, 2009 at 07:05:26PM CEST: > > "Ralf" == Ralf Wildenhues writes: > > Ralf> I'd be grateful about feedback as to whether the general plan is > Ralf> acceptable for everyone; thanks. > > Sounds good to me. Please don't forget Classpath. If you are talking about gcc/libjava/classpath, then yes, I have that on my radar; or should I consider out-of-tree Classpath, too? Thanks, Ralf
Re: Updating Autoconf and Automake versions used in GCC and src
> "Ralf" == Ralf Wildenhues writes: Ralf> I'd be grateful about feedback as to whether the general plan is Ralf> acceptable for everyone; thanks. Tom> Sounds good to me. Please don't forget Classpath. Ralf> If you are talking about gcc/libjava/classpath, then yes, I have that on Ralf> my radar; or should I consider out-of-tree Classpath, too? If Classpath requires changes to the auto* input files, then yes, the changes have to go upstream. I can handle that, just CC me on the needed patch. Note that Classpath doesn't check in the generated files. So if not input file changes were needed, then there's nothing to bother about. Tom
Re: GCC feature req: warn when bitops exceed type size (was: conntrack untracked match is broken)
On Monday 2009-06-29 16:09, Manuel López-Ibáñez wrote: >2009/6/29 Richard Guenther wrote: >> On Mon, Jun 29, 2009 at 3:10 PM, Jan Engelhardt wrote: >>> Hi gcc list, >>> >>> >>> I am forwarding below's bugreport here(*), to implicitly make aware >>> of a feature that I deem important to have in a future gcc. >>> >> >> -Wconversion should say >> >> t.c:4: warning: conversion to ‘unsigned char’ from ‘int’ may alter its value I added -Wconversion to the Linux kernel's global cflags (KBUILD_CFLAGS) just to see what would happen. As I expected, I get swamped with warnings, like "conversion to int from unsigned int". All legitimate in themselves, I would have preferred an option (or even no option at all, given that the "large integer implicitly truncated to unsigned type" warning is shown without any -W flags) that only flags up truncation problems with literals.
Base register restrictions
I am working on a port to an architecture with some strict rules. The restriction that I am unable to figure out how to enforce is a base register that is allowed in the destination operand, but not in a source operand. For example, "add 4($1), $8, $9" would be allowed, but "add $8, 4($1), $9" would not, because $1 can only be used as a base register for the destination operand.

Is there any way to get that kind of information in GO_IF_LEGITIMATE_ADDRESS, or can you think of some other way to handle this? Does anyone know of a port with something similar that I could look at?

Thanks.
Re: Register Pressure in Instruction Level Parallelism
Vladimir Makarov wrote:
> Michael Kruse wrote:
>> If the register sufficiency is higher than the physical number of registers, spill code is added to the graph.
>
> For me, that is the most interesting part. Unfortunately, Touati (as I remember) says practically nothing about this in his thesis.

In the thesis, a modified Poletto algorithm is presented to add spill code.

>> So, now my questions: How much do you think this could improve compiled code speed? Would the current GCC/YARA benefit from such an optimization pass at all?
>
> I think nobody can answer these questions until it is implemented.

My main reason for asking was that somebody might have said that it was not worth the effort, which would have saved me a lot of work.

> If you are going to work on this project, here is some small advice about evaluating register sufficiency. I found that register pressure is already practically minimized before insn scheduling (I suspect that this is mostly done in TER). By the way, it also means that tree balancing (which is actually a much more complicated issue, because it is not a tree but a DAG) is not necessary for the backend as it was done in Preston Briggs's project (and mentioned in Ken Zadeck's proposed pass stack).

Thank you. I am grateful for any advice.

>> What are the chances that this could get into the main GCC tree if it shows up to be an improvement?
>
> I don't see any problem getting the code into the main GCC tree if you get even a 1-2% improvement. There are some technical questions (like the code fitting GCC coding standards) and a commitment to maintain the code, but these problems can definitely be overcome.

I'd be willing to do this.

>> I'd prefer to implement this for GCC, but my advisor wants me to do it for the university's own compiler. Therefore I could also use arguments for why to do it for GCC.
>
> Personally, I'd love to see this work done in GCC. I believe research work in the compiler field should be done in real industrial compilers, because that is the final target of the research. I have seen many times that researchers report big improvements for their algorithms because they are using toy compilers, but when the same algorithms are implemented in GCC they give practically nothing. For me, the criterion for a toy compiler is how good it is on some generally used compiler benchmark such as SPEC (the better the existing compiler's performance, the harder it is to get the same percentage improvement). So if your university compiler's performance is competitive on, for example, SPECFP2000, I think you could use it to measure a real optimization impact. On the other hand, you will definitely get better numbers (and spend less time) using a toy compiler than GCC, and your research would probably look more successful from the academic point of view.

I don't think it is a toy project. Industry is involved (embedded systems) and it also has multiple back-ends. The problem is that it is (at least partially) proprietary, and I don't know about the other part. In any case, you can't download it from the web.

Regards,
Michael Kruse

--
Tardyzentrismus verboten!
Re: Register Pressure in Instruction Level Parallelism
Michael Kruse wrote:
> Vladimir Makarov wrote:
>> Michael Kruse wrote:
>>> If the register sufficiency is higher than the physical number of registers, spill code is added to the graph.
>>
>> For me, that is the most interesting part. Unfortunately, Touati (as I remember) says practically nothing about this in his thesis.
>
> In the thesis, a modified Poletto algorithm is presented to add spill code.

I've just checked the thesis again. I don't think decreasing register pressure through spilling will work well. First of all, Poletto's linear scan RA is worse than the Chaitin-Briggs approach. Even its major improvement, extended linear scan (ELS), is worse than the Chaitin-Briggs approach. My experience with an ELS implementation in GCC has shown me this, although the original article about ELS states the opposite (the comparison in the article was done in GCC, but against the new-ra project, which was an unsuccessful implementation of Chaitin-Briggs RA, and only on ppc64; I am sure that on x86/x86_64 ELS would behave even worse).

That covers the basic RA spilling in Touati's thesis. The bigger problem is that this way of decreasing register pressure does not take live range splitting into account, which good modern RAs do. From this point of view, the approach to decreasing register pressure in Bob Morgan's book is more promising because it also does live range splitting (by the way, I tried Morgan's approach with the old RA more than 5 years ago and it did not look compelling to me at that time).

>>> I'd prefer to implement this for GCC, but my advisor wants me to do it for the university's own compiler. Therefore I could also use arguments for why to do it for GCC.
>>
>> Personally, I'd love to see this work done in GCC. I believe research work in the compiler field should be done in real industrial compilers, because that is the final target of the research. I have seen many times that researchers report big improvements for their algorithms because they are using toy compilers, but when the same algorithms are implemented in GCC they give practically nothing. For me, the criterion for a toy compiler is how good it is on some generally used compiler benchmark such as SPEC (the better the existing compiler's performance, the harder it is to get the same percentage improvement). So if your university compiler's performance is competitive on, for example, SPECFP2000, I think you could use it to measure a real optimization impact. On the other hand, you will definitely get better numbers (and spend less time) using a toy compiler than GCC, and your research would probably look more successful from the academic point of view.
>
> I don't think it is a toy project. Industry is involved (embedded systems) and it also has multiple back-ends. The problem is that it is (at least partially) proprietary, and I don't know about the other part. In any case, you can't download it from the web.

Could you tell us which compiler this is?
Re: Register Pressure in Instruction Level Parallelism
Vladimir Makarov wrote:
> Michael Kruse wrote:
>> In the thesis, a modified Poletto algorithm is presented to add spill code.
>
> I've just checked the thesis again. I don't think decreasing register pressure through spilling will work well. First of all, Poletto's linear scan RA is worse than the Chaitin-Briggs approach. Even its major improvement, extended linear scan (ELS), is worse than the Chaitin-Briggs approach. My experience with an ELS implementation in GCC has shown me this, although the original article about ELS states the opposite (the comparison in the article was done in GCC, but against the new-ra project, which was an unsuccessful implementation of Chaitin-Briggs RA, and only on ppc64; I am sure that on x86/x86_64 ELS would behave even worse).
>
> That covers the basic RA spilling in Touati's thesis. The bigger problem is that this way of decreasing register pressure does not take live range splitting into account, which good modern RAs do. From this point of view, the approach to decreasing register pressure in Bob Morgan's book is more promising because it also does live range splitting (by the way, I tried Morgan's approach with the old RA more than 5 years ago and it did not look compelling to me at that time).

That this algorithm is used in the thesis does not mean that I have to use that approach. Part of my thesis is also to evaluate different heuristics and compare them to each other. This one would be something I'd try.

> Could you tell us which compiler this is?

Unfortunately not (yet). I have very little information myself. They just keep telling me how great it is. But I will get the source this week.

Regards,
Michael Kruse

--
Tardyzentrismus verboten!
Re: Exploring gcc-in-cxx compiler build requirements
Jerry Quinn writes: > Both 3.1.1 and 3.2.3 fail to bootstrap with the following error: > > make[1]: Entering directory `/home/jlquinn/gcc/dev/build/gcc323/gcc' > gcc -c -DIN_GCC--std=gnu89 -W -Wall -Wwrite-strings > -Wstrict-prototypes -Wmissing-prototypes -Wtraditional -pedantic > -Wno-long-long -DHAVE_CONFIG_H -DGENERATOR_FILE-I. -I. > -I../../../gcc-3.2.3/gcc -I../../../gcc-3.2.3/gcc/. > -I../../../gcc-3.2.3/gcc/config > -I../../../gcc-3.2.3/gcc/../include ../../../gcc-3.2.3/gcc/read-rtl.c -o > read-rtl.o > In file included from ../../../gcc-3.2.3/gcc/read-rtl.c:24: > ../../../gcc-3.2.3/gcc/rtl.h:125: warning: type of bit-field ‘code’ is a > GCC extension > ../../../gcc-3.2.3/gcc/rtl.h:128: warning: type of bit-field ‘mode’ is a > GCC extension > ../../../gcc-3.2.3/gcc/read-rtl.c: In function > ‘fatal_with_file_and_line’: > ../../../gcc-3.2.3/gcc/read-rtl.c:62: warning: traditional C rejects ISO > C style function definitions > ../../../gcc-3.2.3/gcc/read-rtl.c: In function ‘read_rtx’: > ../../../gcc-3.2.3/gcc/read-rtl.c:662: error: lvalue required as > increment operand > make[1]: *** [read-rtl.o] Error 1 > make[1]: Leaving directory `/home/jlquinn/gcc/dev/build/gcc323/gcc' > make: *** [all-gcc] Error 2 It looks like you are trying to build gcc 3.1.1 and 3.2.3 with the current gcc. This is known to fail. It is (unfortunately) the case that if you want to build an old version of gcc, you sometimes have to walk backward--use version N to build version N-1, use version N-1 to build version N-2, etc. You don't have to hit every version, but you can't skip too far. > 3.3.6 old compiler builds. branch bootstrap fails with: Actually, there is no longer any reason to use the gcc-in-cxx branch. What you should do now is build trunk using the configure argument --build-with-cxx. 
> make[3]: Entering directory `/home/jlquinn/gcc/dev/gcc-in-cxx/host-x86_64-unknown-linux-gnu/gcc'
> g++ -c -g -g -DIN_GCC -W -Wall -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -fno-common -DHAVE_CONFIG_H -I. -I. -I../.././gcc -I../.././gcc/. -I../.././gcc/../include -I../.././gcc/../libcpp/include -I../.././gcc/../libdecnumber -I../.././gcc/../libdecnumber/bid -I../libdecnumber ../.././gcc/c-lang.c -o c-lang.o
> In file included from /home/jlquinn/gcc/dev/run/gcc336/include/c++/3.3.6/iosfwd:46,
> from /usr/include/gmp-x86_64.h:24,
> from /usr/include/gmp.h:59,
> from ../../gcc/double-int.h:24,
> from ../../gcc/tree.h:30,
> from ../../gcc/c-lang.c:27:
> /home/jlquinn/gcc/dev/run/gcc336/include/c++/3.3.6/x86_64-unknown-linux-gnu/bits/c++locale.h:61:40: attempt to use poisoned "malloc"

We may need to ensure that, when using C++, gmp.h is included before system.h. Or simply #include in or before system.h.

Ian
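The include-order point above can be illustrated with a small sketch. It assumes the poisoning is done with the preprocessor's `#pragma GCC poison`, and the names `checked_alloc` and `alloc_roundtrip` are hypothetical, not taken from GCC's sources. Headers and wrappers that mention malloc are fine as long as they are seen before the pragma; any later appearance of the identifier is rejected.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>   /* seen before the poison pragma, so its malloc declaration is accepted */

/* Hypothetical wrapper, defined before the pragma so that it may
   still refer to malloc.  */
static void *checked_alloc(size_t n)
{
    void *p = malloc(n);
    if (p == NULL) {
        fputs("out of memory\n", stderr);
        abort();
    }
    return p;
}

/* From here on, any appearance of the identifier malloc is an error,
   including one inside a header that is included later.  */
#pragma GCC poison malloc

/* Allowed: allocation goes through the wrapper defined above.  */
int alloc_roundtrip(int value)
{
    int *p = checked_alloc(sizeof *p);
    *p = value;
    value = *p;
    free(p);
    return value;
}
```

Moving the `#include <stdlib.h>` line below the pragma would be expected to reproduce the same kind of "attempt to use poisoned" diagnostic as in the log.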
Re: How to deal with unrecognizable RTL code
"daniel.tian" writes:
> I check the MIPS and ARM, both those cc1 files opened in Insight debug tool
> contain the mips.md and arm.md file. It is convenient while break point can
> be set in it.
> My port md file doesn't appear in the insight.

You seem to be asking a question about Insight rather than about gcc. I haven't used Insight in a long time, but when I use gdb it knows about the .md file (e.g., "list CPU.md:1" works). So I think this question would be better directed to Insight users.

Ian
Re: PR 23296: Strange -O3 -finstrument-functions behaviour
Richard Sandiford writes:
> Is this really the intended behaviour? Andrew closed the bug as invalid,
> saying that this is what we expect, but the docs seem to suggest that we
> ought to do something like:
>
> ...
> __cyg_profile_func_enter (&main, ...);
> ...
> __cyg_profile_func_enter (&g, ...);
> ...
> __cyg_profile_func_exit (&g, ...);
> ...
> __cyg_profile_func_exit (&main, ...);
> ...

I agree. The docs seem reasonably clear.

Ian
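For readers who have not used -finstrument-functions, here is a minimal sketch of the two hooks the quoted sequence refers to. The hook names and their `(void *this_fn, void *call_site)` signature are the ones GCC documents; the trace buffer and the helpers `record`, `trace_length`, `trace_kind_at` and `trace_fn_at` are hypothetical. The hooks simply log the order of enter/exit events so that the nesting expected by the docs can be checked.

```c
#include <stddef.h>

/* Keep the hooks and helpers themselves out of the instrumentation.  */
#define NO_INSTRUMENT __attribute__((no_instrument_function))

#define TRACE_MAX 16

static char trace_kind[TRACE_MAX];   /* 'E' = enter, 'X' = exit */
static void *trace_fn[TRACE_MAX];    /* function address passed to the hook */
static size_t trace_len;

NO_INSTRUMENT static void record(char kind, void *fn)
{
    if (trace_len < TRACE_MAX) {
        trace_kind[trace_len] = kind;
        trace_fn[trace_len] = fn;
        trace_len++;
    }
}

/* Called on function entry when compiling with -finstrument-functions.  */
NO_INSTRUMENT void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)call_site;
    record('E', this_fn);
}

/* Called on function exit when compiling with -finstrument-functions.  */
NO_INSTRUMENT void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)call_site;
    record('X', this_fn);
}

NO_INSTRUMENT size_t trace_length(void) { return trace_len; }
NO_INSTRUMENT char trace_kind_at(size_t i) { return i < trace_len ? trace_kind[i] : '?'; }
NO_INSTRUMENT void *trace_fn_at(size_t i) { return i < trace_len ? trace_fn[i] : NULL; }
```

With such hooks linked in, the sequence quoted above would show up as the trace E(main), E(g), X(g), X(main), whether or not g has been inlined into main.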
Re: Base register restrictions
Dobes writes:
> I am working on a port to an architecture with some strict rules. The
> restriction that I am unable to figure out how to enforce is a base register
> that is allowed in the destination operand, but not in a source operand.
> For example, this would be allowed "add 4($1), $8, $9", but this would not
> be allowed "add $8, 4($1), $9" because $1 can only be used as a base register
> for the destination operand. Is there any way to get that kind of
> information in GO_IF_LEGITIMATE_ADDRESS or can you think of some other way
> to handle this? Anyone know of a port with something similar that I could
> look at? Thanks.

This kind of restriction would normally be handled via appropriate use of register classes in the define_insn patterns. You will also want to make sure that BASE_REG_CLASS and REGNO_OK_FOR_BASE_P are defined correctly.

Ian
Re: GCC feature req: warn when bitops exceed type size (was: conntrack untracked match is broken)
2009/6/29 Jan Engelhardt:
>
> On Monday 2009-06-29 16:09, Manuel López-Ibáñez wrote:
>> 2009/6/29 Richard Guenther wrote:
>>> On Mon, Jun 29, 2009 at 3:10 PM, Jan Engelhardt wrote:
>>>> Hi gcc list,
>>>>
>>>> I am forwarding below's bugreport here(*), to implicitly make aware of a
>>>> feature that I deem important to have in a future gcc.
>>>
>>> -Wconversion should say
>>>
>>> t.c:4: warning: conversion to ‘unsigned char’ from ‘int’ may alter its value
>
> I added -Wconversion to the Linux kernel's global cflags (KBUILD_CFLAGS)
> just to see what would happen. As I expected, I get swamped with
> warnings, like "conversion to int from unsigned int". All legitimate in
> themselves, I would have preferred an option (or even no option at all,
> given that the "large integer implicitly truncated to unsigned type"
> warning is shown without any -W flags) that only flags up truncation
> problems with literals.

A lot of false warnings were fixed in GCC 4.4 and 4.5. And if only truncation problems with literals were warned about, then your original example wouldn't have been flagged.

Cheers,
Manuel.
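As a hypothetical illustration of the class of bug under discussion (this is not the code from the forwarded report, and `FLAG_UNTRACKED` and the function names are made up), the sketch below shows a flag that does not fit in an 8-bit type. It assumes an 8-bit `unsigned char`. The store silently drops the bit and the test can never succeed, which is the kind of conversion -Wconversion is meant to report.

```c
/* Hypothetical flag value that needs 9 bits.  */
#define FLAG_UNTRACKED 0x100

/* Always returns 0 when unsigned char is 8 bits wide: v is promoted to
   int, but its value is at most 255, so bit 8 is never set.  */
int has_flag_u8(unsigned char v)
{
    return (v & FLAG_UNTRACKED) != 0;
}

/* The OR is computed in int, then converted back to unsigned char,
   which discards bit 8.  This int-to-unsigned-char conversion is the
   sort of thing -Wconversion may warn about.  */
unsigned char store_flag_u8(unsigned char state, int set)
{
    if (set)
        state |= FLAG_UNTRACKED;
    return state;
}

/* With a type wide enough for the flag, the test behaves as intended.  */
int has_flag_u16(unsigned short v)
{
    return (v & FLAG_UNTRACKED) != 0;
}
```

Note that the operand being truncated here is an expression involving a variable, not a bare literal, which is why a literal-only warning would not be enough to catch every case.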
Re: Base register restrictions
Thanks. I thought that if I just tried to disallow this in the operand predicates/constraints in my define_insn patterns, I would end up getting an unrecognizable instruction error. I will try that and see what happens.

Brice

On Jun 29, 2009, at 5:11 PM, Ian Lance Taylor wrote:
> Dobes writes:
>> I am working on a port to an architecture with some strict rules. The
>> restriction that I am unable to figure out how to enforce is a base register
>> that is allowed in the destination operand, but not in a source operand.
>> For example, this would be allowed "add 4($1), $8, $9", but this would not
>> be allowed "add $8, 4($1), $9" because $1 can only be used as a base register
>> for the destination operand. Is there any way to get that kind of
>> information in GO_IF_LEGITIMATE_ADDRESS or can you think of some other way
>> to handle this? Anyone know of a port with something similar that I could
>> look at? Thanks.
>
> This kind of restriction would normally be handled via appropriate use of
> register classes in the define_insn patterns. You will also want to make
> sure that BASE_REG_CLASS and REGNO_OK_FOR_BASE_P are defined correctly.
>
> Ian
Re: How to deal with unrecognizable RTL code
daniel.tian wrote:
> Hi,
> I check the MIPS and ARM, both those cc1 files opened in Insight debug tool
> contain the mips.md and arm.md file. It is convenient while break point can
> be set in it.
> My port md file doesn't appear in the insight.

The mips.md and arm.md files end up in the debug info because they contain C code segments that get copied into the insn-*.c files. If you look at these insn-*.c files, you will see things like (copied from the i386 port)

#line 1310 "../../gcc/gcc/config/i386/i386.md"

which causes debug info to be generated that includes a reference to this file. If you have a very simple port, you might not have any cases like this. Without this debug info, Insight (aka gdbtk) won't be able to show the files. Or something could be broken in your port.

Jim
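The effect of such a `#line` directive can be seen in a tiny standalone sketch. The file name `config/myport/myport.md` and the two function names are hypothetical. After the directive, the compiler attributes the following lines to the named file, and that is also what ends up in the debug info.

```c
/* Everything after the directive is attributed to the named file,
   with the very next line counted as line 100.  */
#line 100 "config/myport/myport.md"
const char *origin_file(void) { return __FILE__; }
int origin_line(void) { return __LINE__; }
```

Here `origin_file()` returns "config/myport/myport.md" and `origin_line()` returns 101, even though no such file needs to exist on disk. A debugger looking for the source will then try to open the .md file at that path.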
Re: Base register restrictions
Dobes wrote:
> I am working on a port to an architecture with some strict rules. The
> restriction that I am unable to figure out how to enforce is a base register
> that is allowed in the destination operand, but not in a source operand.

GCC does not provide any way in the old GO_IF_LEGITIMATE_ADDRESS (or the new TARGET_LEGITIMATE_ADDRESS_P) to distinguish between an address used in a dest and an address used in a source. You probably just have to limit this to addresses that are valid in both places.

You can then add extra constraints to try to recognize other types of memory addresses. So you could use "mR" for destination operands and "mS" for source operands, where R and S are extra constraints that have been defined appropriately.

An example to look at might be the s390 port. The movti pattern for instance uses QRST, where only Q and S can be in the dest, but any of Q, R, S, or T can be in the source. And it doesn't seem to use 'm' much.

If you don't have a set of base registers that are valid both in source and dest addresses, then this probably gets messy.

Of course, it would be nice if gcc handled this better, but fixing this is probably a lot of work.

Jim
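The "mR"/"mS" idea above can be modelled outside of GCC as a toy sketch. Everything here is hypothetical: the register numbers, the `struct mem_addr` type and all function names are made up for illustration, and none of it is GCC's actual constraint machinery. The role-independent check plays the part of the legitimate-address hook (and of "m"), while R and S add the addresses that are valid only in one position.

```c
#include <stdbool.h>

/* Toy address: base register plus displacement.  */
struct mem_addr {
    int base_regno;
    long offset;
};

/* Assumption: $1 is a base register only for destination operands.  */
static bool dest_only_base_p(int regno) { return regno == 1; }

/* Assumption: $2..$15 are base registers in either position.  */
static bool general_base_p(int regno) { return regno >= 2 && regno <= 15; }

/* Role-independent check: only addresses valid in both positions.
   This is what the "m" constraint would accept.  */
bool legitimate_address_p(struct mem_addr a)
{
    return general_base_p(a.base_regno);
}

/* Extra constraint "R": addresses valid only as a destination.  */
bool constraint_R(struct mem_addr a)
{
    return dest_only_base_p(a.base_regno);
}

/* Extra constraint "S": addresses valid only as a source.
   This toy machine has none.  */
bool constraint_S(struct mem_addr a)
{
    (void)a;
    return false;
}

/* A destination operand written as "mR".  */
bool dest_operand_ok(struct mem_addr a)
{
    return legitimate_address_p(a) || constraint_R(a);
}

/* A source operand written as "mS".  */
bool src_operand_ok(struct mem_addr a)
{
    return legitimate_address_p(a) || constraint_S(a);
}
```

In this model 4($1) is accepted as a destination but rejected as a source, matching the "add 4($1), $8, $9" example, while an address based on $8 is accepted in both positions.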
RE: How to deal with unrecognizable RTL code
> This looks like a different problem. What pass generates insn 17? What
> does insn 17 look like in the prior pass? If r14 is your stack/frame
> pointer, this might point to a problem in how your port interacts with
> register allocation/reload as reload can replace a pseudo with an
> equivalent memory location.
>
Exactly. But how do I prevent that (reload replacing a pseudo with an equivalent memory location) from happening?

Daniel Tian