[Bug tree-optimization/88763] New: Better Output for Loop Unswitching
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88763

            Bug ID: 88763
           Summary: Better Output for Loop Unswitching
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: marius.messerschmidt at googlemail dot com
  Target Milestone: ---

I work on a project where we rely heavily on the loop-unswitching feature of GCC (-funswitch-loops). While working with the log generated by -fdump-tree-unswitch, I noticed that its output is very limited. It reports only a few of the cases in which the optimizer could not unswitch a loop, but the source file (gcc/tree-ssa-loop-unswitch.c) contains many more checks that lead to a loop not being unswitched. The "not invariant" case in particular would be really helpful in the logs.

I tried to implement further warnings, but I do not fully understand every case in the file, so I am asking someone with a better understanding of the internals of GCC to either fix the bug or explain a few things to me.

Many thanks in advance :)
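For readers unfamiliar with the transformation, the following is a minimal hand-written sketch of what loop unswitching does to a loop whose condition is invariant. The function names reduce_switched and reduce_unswitched are hypothetical and do not come from the report or from GCC. Whether GCC actually unswitches this particular loop depends on the compiler version and its cost heuristics.

```c
#include <assert.h>
#include <stddef.h>

/* Loop as written: `use_sum` never changes inside the loop,
   yet it is tested on every iteration. */
long reduce_switched(const int *v, size_t n, int use_sum)
{
    assert(v != NULL || n == 0);
    long acc = use_sum ? 0 : 1;
    for (size_t i = 0; i < n; i++) {
        if (use_sum)
            acc += v[i];
        else
            acc *= v[i];
    }
    return acc;
}

/* Hand-unswitched equivalent: the invariant test is hoisted out of
   the loop and the loop body is duplicated, one copy per branch. */
long reduce_unswitched(const int *v, size_t n, int use_sum)
{
    assert(v != NULL || n == 0);
    long acc = use_sum ? 0 : 1;
    if (use_sum) {
        for (size_t i = 0; i < n; i++)
            acc += v[i];
    } else {
        for (size_t i = 0; i < n; i++)
            acc *= v[i];
    }
    return acc;
}
```

Compiling a file containing the first function with the two options named in the report, for example "gcc -O2 -funswitch-loops -fdump-tree-unswitch example.c" (the file name is an assumption), writes the unswitch dump that this report is about.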
[Bug tree-optimization/88763] Better Output for Loop Unswitching
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88763

--- Comment #2 from Marius Messerschmidt ---
Sorry, but I do not fully understand what you mean. Do you suggest using different command-line arguments?

So far I have tried:
-fdump-tree-all
-fdump-tree-unswitch
and
-fopt-info-all-optall

But none of them told me all the things that I would like to know, most importantly the reason why a particular loop was skipped during unswitching (e.g. because the condition is not invariant). Right now -fdump-tree-unswitch already reports a few things, such as too-many-instructions or too-many-branches.
[Bug tree-optimization/88763] Better Output for Loop Unswitching
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88763

--- Comment #7 from Marius Messerschmidt ---
Thanks a lot for working on this!

A simple example would be the following:

-- CODE ---
int calc(int x, int y, int *flag)
{
    if (*flag > 5)          // condition that should be unswitched
        return x + y;
    else
        return x * y;
    *flag += 2; // BAD LINE
}

int main(int argc, char **argv)
{
    int flag = argc;
    int array[250*250];
    for (int i = 0; i < 250; i++)
    {
        for (int j = 0; j < 250; j++)
            array[i*250 + j] = calc(i, j, &flag);
    }
    return array[42 + argc];
}
---

The line marked with "BAD LINE" obviously prevents the unswitching, as the condition is no longer constant during the loop. If you comment out the line, gcc reports ";; unswitched loop", which is great. But if you keep the line, you get no output at all.

The minimal output I would expect is:
";; not unswitching loop: REASON"

so in this case:
";; not unswitching loop: Condition is not invariant"

To further improve the output, it would be great to have some more information about the loop, but I do not know which information is available at this stage. The most helpful additional information would be (this also applies to the success message):

- File
- Function
- Line number of the loop head (or some other way to identify the loop, e.g. loop number XY)
- Line number of the if-statement that should be unswitched out of the loop
- Line number of the issue that caused the loop unswitching to stop, so in the example above the line marked with the comment

So I think the perfect log message would be something like this:

";; unswitching loop: testFile.c:82 (Condition: otherFile.c:502)"
";; not unswitching loop: testFile.c:91: Condition (otherFile.c:541) is not invariant (modified at otherFile.c:32)"

But as I said above, I do not know how much information about the original source file is still available at this stage.
[Bug tree-optimization/88763] Better Output for Loop Unswitching
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88763

--- Comment #8 from Marius Messerschmidt ---
Oh, a minor error on my side: the "BAD LINE" should of course be above the if/return block, otherwise the unswitching would work just fine.
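With that correction applied, the example would look roughly like the sketch below. This is not the content of any attachment to the bug. It also assumes the condition was meant to test the value *flag rather than the pointer, moves the driver loop out of main into a hypothetical helper named fill_grid, and introduces the constant N for the grid size.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 250 };  /* grid size used in the original example */

int calc(int x, int y, int *flag)
{
    assert(flag != NULL);
    *flag += 2;            /* BAD LINE: modifies the tested value on every call */
    if (*flag > 5)
        return x + y;
    else
        return x * y;
}

/* Hypothetical stand-in for the loop nest in main(): because calc()
   changes `flag` on each call, the condition inside calc() is not
   invariant with respect to these loops. */
void fill_grid(int *array, int flag)
{
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++)
            array[i * N + j] = calc(i, j, &flag);
    }
}
```

Deleting the line marked BAD LINE makes the condition invariant again, which is the case in which the reporter sees ";; unswitched loop" in the dump.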
[Bug tree-optimization/88763] Better Output for Loop Unswitching
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88763

--- Comment #9 from Marius Messerschmidt ---
Created attachment 45397
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45397&action=edit
Basic testcase

As there were some more issues in the example I provided, I have added it as an attachment. Now it should work just fine.
[Bug tree-optimization/88763] Better Output for Loop Unswitching
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88763

--- Comment #12 from Marius Messerschmidt ---
I think these messages look really good! I believe they contain everything required to actually work on improving automatic unswitching, thank you very much!

Do you think that there is a chance that this will be included in GCC 9?
[Bug c/89774] New: Add flag to force single precision
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89774

            Bug ID: 89774
           Summary: Add flag to force single precision
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: marius.messerschmidt at googlemail dot com
  Target Milestone: ---

It would be helpful if there were a flag (e.g. -fsingle-precision-literals) that caused gcc to treat floating-point literals (e.g. 0.5 or 0.25 ...) in the source code as single precision (float) rather than double precision (double).

This could help improve the performance of single-precision code, as right now there are many conversion instructions (see objdump) that take up quite a lot of runtime (see perf).
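To illustrate where the conversions come from, here is a small C sketch. The function names are hypothetical. In C an unsuffixed literal such as 0.5 has type double, so mixing it with a float operand makes the whole expression double, while the f suffix keeps the expression in float.

```c
#include <assert.h>

/* 0.5 is a double literal: x is converted to double, the product is
   computed in double, and the result is converted back to float. */
float half_via_double(float x)
{
    return x * 0.5;
}

/* 0.5f is a float literal: the expression has type float throughout,
   so no float<->double conversion is required by the language. */
float half_via_float(float x)
{
    return x * 0.5f;
}

/* Sanity checks on the expression types described above. */
void check_literal_types(void)
{
    float x = 1.0f;
    assert(sizeof(x * 0.5) == sizeof(double));
    assert(sizeof(x * 0.5f) == sizeof(float));
}
```

GCC's -Wdouble-promotion warning can help locate expressions like the first one in existing code.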
[Bug c/89774] Add flag to force single precision
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89774

Marius Messerschmidt changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|WORKSFORME                  |FIXED

--- Comment #2 from Marius Messerschmidt ---
This will cause issues the other way around as well. I think that I did not state clearly what I meant...

Right now you can only use 'all double' or 'all float'. What I am looking for is some kind of performance-oriented solution that will pick the 'best' option for each literal.

Quick example:

void f()
{
    float a = 2.0;      // 2.0 -> single
    float b = 2.0 * a;  // 2.0 -> single
    double d = 3.0;     // 3.0 -> double
    double e = 3.0 * d; // 3.0 -> double
    double z = 2.0 * a + 3.0 * d; // both 2.0 and 3.0 -> double (only cast a)
}

The basic idea is to increase performance by reducing the number of cast instructions. Is something like that already implemented? If not, do you think it would be useful and could be implemented?
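For comparison, the per-literal choice described in this comment can be written by hand today using the f suffix. The sketch below assumes the second line of the example was meant to multiply by a. The name f_suffixed and the return value are additions made only so the result can be checked.

```c
#include <assert.h>

double f_suffixed(void)
{
    float a = 2.0f;                 /* single-precision literal */
    float b = 2.0f * a;             /* stays in float */
    assert(sizeof(2.0f * a) == sizeof(float));

    double d = 3.0;                 /* double literal */
    double e = 3.0 * d;             /* stays in double */
    double z = 2.0 * a + 3.0 * d;   /* only `a` is converted to double */

    return b + e + z;               /* 4 + 9 + 13 */
}
```

The request in this bug amounts to having the compiler pick these suffixes automatically.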
[Bug c/89774] Add flag to force single precision
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89774

Marius Messerschmidt changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|FIXED                       |---

--- Comment #3 from Marius Messerschmidt ---
Reopening issue
[Bug c/89774] Add flag to force single precision
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89774

--- Comment #5 from Marius Messerschmidt ---
I did check the output without -fsingle-precision-constant.

Is this only enabled via -fsingle-precision-constant, or at any optimization level?
[Bug c/89774] Add flag to force single precision
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89774 --- Comment #7 from Marius Messerschmidt --- Looks good, which options did you use?