c-c++-common/analyzer/out-of-bounds-diagram-11.c written type vs toolchain
While on x86_64-pc-linux-gnu the second diagram shows the written type as 'int', as expected, on a 16-bit and a 32-bit newlib-based toolchain it is output as int32_t. The rest of the formatting is also a bit different, probably because of the change in how int32_t is displayed.

What do other people see on toolchains where the regression tests actually have I/O functionality?

Would it make sense to handle this with one multi-line pattern for newlib-based toolchains, ending with
  { dg-end-multiline-output "" { target *-*-elf } } */
and one for glibc-based toolchains, ending in
  { dg-end-multiline-output "" { target !*-*-elf } } */
?

I have no idea what toolchains with different libraries (and hence header files) would see.
Re: c-c++-common/analyzer/out-of-bounds-diagram-11.c written type vs toolchain
On Mon, 2024-07-22 at 11:37 +0100, Joern Wolfgang Rennecke wrote:
> While on x86_64-pc-linux-gnu, the second diagram shows the type written
> as 'int', as expected, on a 16 and 32 bit newlib based toolchain, it is
> being output as int32_t . And all the formatting is also a bit
> different, probably due to the change in how the int32_t is displayed.

Sorry about over-specifying the tests' output.

> What do other people see on toolchains where the regression tests
> actually have I/O functionality?

FWIW on my x86_64-pc-linux-gnu with

  make check-gcc RUNTESTFLAGS="-v -v --target_board=unix\{-m32,-m64\} analyzer.exp=out-of-bounds-diagram-11.c"

I see this output for the 2nd test:

  ┌───┐
  │ write of '(int) 42' │
  └───┘
  │ │
  │ │
  v v
  ┌──┐┌──┐
  │ buffer allocated on stack at (1) ││after valid range │
  └──┘└──┘
  ├┬─┤├┬─┤
  │ │
  ╭┴───╮ ╭─┴╮
  │capacity: '(size * 4) + 3' bytes│ │overflow of 1 byte│
  ╰╯ ╰──╯

Does it help to hack this change into prune.exp:

diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp
index d00d37f015f7..f467d1a97bc6 100644
--- a/gcc/testsuite/lib/prune.exp
+++ b/gcc/testsuite/lib/prune.exp
@@ -109,7 +109,7 @@ proc prune_gcc_output { text } {
     # Many tests that use visibility will still pass on platforms that don't support it.
     regsub -all "(^|\n)\[^\n\]*lto1: warning: visibility attribute not supported in this configuration; ignored\[^\n\]*" $text "" text

-    #send_user "After:$text\n"
+    send_user "After:$text\n"

     return $text
 }

> Would it make sense to handle this with one multi-line pattern for
> newlib based toolchains, ending with
> { dg-end-multiline-output "" { target *-*-elf } } */
> and one for glibc based toolchain, ending in
> { dg-end-multiline-output "" { target !*-*-elf } } */
> ?

Presumably the only difference is in the top-right-hand box of the diagram, whereas my objective for those tests was more about the lower part of the diagram - I wanted to verify how we handle symbolic buffer sizes (e.g. (size * 4) + 3, and other run-time-computed sizes).

It's rather awkward to test the diagrams with DejaGnu, alas.

Might it make sense to split out that file into three separate tests -11a, -11b, and -11c, and be more aggressive about only running the 2nd test on targets that we know generate "int" in the top-right box?

> I have no idea what toolchains with different libraries (and hence
> header files) would see.

Sorry again about this
Dave
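The "overflow of 1 byte" label in the lower part of that diagram follows from simple interval arithmetic: the buffer holds (size * 4) + 3 bytes, and a 4-byte store to buf[size] covers bytes [size * 4, size * 4 + 4), so exactly one byte lands past the end. The C sketch below works through that calculation. The helper names overflow_bytes and diagram_11_overflow are hypothetical and exist only for illustration; they are not part of GCC or its testsuite, and the 4-byte store size is an assumption based on the int32_t write discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of GCC): number of bytes of the write
   [offset, offset + write_size) that fall past a buffer of `capacity`
   bytes.  */
size_t
overflow_bytes (size_t capacity, size_t offset, size_t write_size)
{
  assert (write_size > 0);
  size_t end = offset + write_size;   /* one past the last byte written */
  if (end <= capacity)
    return 0;                         /* the write fits entirely */
  if (offset >= capacity)
    return write_size;                /* the write starts past the end */
  return end - capacity;              /* the write straddles the end */
}

/* The case from the second test: capacity (size * 4) + 3 bytes,
   4-byte store to buf[size], i.e. at byte offset size * 4.  */
size_t
diagram_11_overflow (size_t size)
{
  return overflow_bytes (size * 4 + 3, size * 4, 4);
}
```

The result does not depend on the value of size, which is why the diagram can print a concrete overflow even though the capacity is symbolic.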
Re: c-c++-common/analyzer/out-of-bounds-diagram-11.c written type vs toolchain
On 22/07/2024 16:44, David Malcolm wrote:
> Does it help to hack this change into prune.exp:
>
> diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp
> index d00d37f015f7..f467d1a97bc6 100644
> --- a/gcc/testsuite/lib/prune.exp
> +++ b/gcc/testsuite/lib/prune.exp
> @@ -109,7 +109,7 @@ proc prune_gcc_output { text } {
>      # Many tests that use visibility will still pass on platforms that don't support it.
>      regsub -all "(^|\n)\[^\n\]*lto1: warning: visibility attribute not supported in this configuration; ignored\[^\n\]*" $text "" text
>
> -    #send_user "After:$text\n"
> +    send_user "After:$text\n"
>
>      return $text
>  }

I'm baffled. Isn't that statement there just to debug prune_gcc_output?
I suppose we could prune the whitespace from the diagram, but prune_gcc_output does not know about types. If there's 'int', that could be int32_t, int16_t, int64_t, ptrdiff_t, or whatever. Unless you want to make all integer types be considered equivalent for dejagnu purposes if they appear somewhere between vertical bars.

>> Would it make sense to handle this with one multi-line pattern for
>> newlib based toolchains, ending with
>> { dg-end-multiline-output "" { target *-*-elf } } */
>> and one for glibc based toolchain, ending in
>> { dg-end-multiline-output "" { target !*-*-elf } } */
>> ?
>
> Presumably the only difference is in the top-right hand box of the
> diagram,

Unfortunately, there's also a lot of white space change in the rest of the diagram.
I have attached the patch I'm currently using for your perusal.

> whereas my objective for those tests was more about the lower
> part of the diagram - I wanted to verify how we handle symbolic buffer
> sizes (e.g. (size * 4) + 3, and other run-time-computer sizes).
>
> It's rather awkward to test the diagrams with DejaGnu, alas.
>
> Would it might make sense to split out that file into three separate
> tests -11a, -11b, and -11c, and be more aggressive about only running
> the 2nd test on targets that we know generate "int" in the top-right
> box?

No, each dg-end-multiline-output stanza can already have its own target selector, so there is no point in putting them in separate files.
I guess you could reduce the differences between platforms if you didn't use types as defined by header files directly, as they might be #defines or typedefs or whatever, and instead used your own typedef or struct types.

Index: c-c++-common/analyzer/out-of-bounds-diagram-8.c
===================================================================
--- c-c++-common/analyzer/out-of-bounds-diagram-8.c (revision 6640)
+++ c-c++-common/analyzer/out-of-bounds-diagram-8.c (revision 6642)
@@ -17,6 +17,24 @@

 /* { dg-begin-multiline-output "" }

+ ┌───┐
+ │write of '(int32_t) 42'│
+ └───┘
+ │
+ │
+ v
+ ┌───┐ ┌───┐
+ │buffer allocated on heap at (1)│ │ after valid range │
+ └───┘ └───┘
+ ├───┬───┤├─┬──┤├───┬───┤
+ │ │ │
+╭─┴╮ ╭───┴───╮ ╭─┴─╮
+│capacity: 'size * 4' bytes│ │4 bytes│ │overflow of 4 bytes│
+╰──╯ ╰───╯ ╰───╯
+
+ { dg-end-multiline-output "" { target *-*-elf } } */
+/* { dg-begin-multiline-output "" }
+
 ┌───┐
 │write of '(int) 42'│
 └───┘
@@ -32,4 +50,4 @@
 │capacity: 'size * 4' bytes│ │4 bytes│ │overflow of 4 bytes│
 ╰──╯ ╰───╯ ╰───╯

- { dg-end-multiline-output "" } */
+ { dg-end-multiline-output "" { target !*-*-elf } } */
Index: c-c++-common/analyzer/out-of-bounds-diagram-11.c
===================================================================
--- c-c++-common/analyzer/out-of-bounds-diagram-11.c (revision 6640)
+++ c-c++-common/analyzer/out-of-bounds-diagram-11.c (revision 6642)
@@ -45,8 +45,30 @@
 buf[size] = 42; /* { dg-warning "stack-based buffer overflow" } */
 }

+/* With a newlib toolchain (at least for esirisc), we end up with int32_t
+ being shown as itself. */
 /* { dg-begin-multiline-output "" }

+┌┐
+│write of '(int32_t) 42' │
+
Re: c-c++-common/analyzer/out-of-bounds-diagram-11.c written type vs toolchain
On 22/07/2024 17:13, Joern Wolfgang Rennecke wrote:
> I guess you could reduce the differences between platforms if you didn't
> use types as defined by headerfiles directly, as they might be #defines
> or typedefs or whatever, and instead used your own typedef or struct
> types.

It seems a typedef to int is seen through, even if you chain two of them together.
After preprocessing, newlib has:

  typedef long int __int32_t;
  typedef __int32_t int32_t ;

So the crucial point seems to be to have 'long int', but that is of course not portable for int32_t.

So to get portable code and consistent messages, I suppose we should use a struct:

  typedef struct { int32_t i; } my_int32;
  my_int32 s42 = { 42 };
  my_int32 *buf = (my_int32 *) __builtin_alloca (4 * size + 3); /* { dg-warning "allocated buffer size is not a multiple of the pointee's size" } */
  buf[size] = s42; /* { dg-warning "stack-based buffer overflow" } */

Now suddenly the diagram is made *more* verbose, with the struct keyword added.

  ┌─┐
  │write of ‘struct my_int32’ (4 bytes) │
  └─┘
  │ │
  │ │
  v v
  ┌───┐ ┌┐
  │ buffer allocated on stack at (1)│ │ after valid range│
  └───┘ └┘
  ├───┬───┤ ├───┬┤
  │ │
  ╭┴───╮ ╭─┴╮
  │capacity: ‘(size * 4) + 3’ bytes│ │overflow of 1 byte│
  ╰╯ ╰──╯
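The point about typedefs can be checked on any given toolchain with a small C11 sketch. It reports which basic type int32_t resolves to ("int" on a typical glibc setup, "long int" with the newlib typedefs quoted above) and shows that a struct wrapper is a distinct type either way. The macro BASIC_TYPE_NAME and the two functions are hypothetical names used only for illustration. The sketch says nothing about how the analyzer chooses the name it prints; it only shows the underlying type difference between header implementations.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper macro: name the basic type behind an expression.
   Typedefs are transparent to _Generic, so int32_t selects whichever
   basic type the C library's headers define it as.  */
#define BASIC_TYPE_NAME(x)                     \
  _Generic ((x),                               \
            int: "int",                        \
            long: "long int",                  \
            long long: "long long int",        \
            default: "other")

/* The wrapper suggested in the message above.  */
typedef struct { int32_t i; } my_int32;

/* The wrapper is assumed to add no padding, so buffer sizes in the
   test stay the same as with a plain int32_t.  */
static_assert (sizeof (my_int32) == sizeof (int32_t),
               "wrapper must have the same size as int32_t");

/* Which basic type does this toolchain use for int32_t?  */
const char *
int32_basic_type (void)
{
  return BASIC_TYPE_NAME ((int32_t) 0);
}

/* A struct is never one of the basic integer types, whatever the headers.  */
const char *
wrapper_basic_type (void)
{
  return BASIC_TYPE_NAME (((my_int32) { 42 }));
}
```

Printing the result of int32_basic_type () on each target is a quick way to predict whether a test that writes an int32_t is likely to be affected by the header difference.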
Re: c-c++-common/analyzer/out-of-bounds-diagram-11.c written type vs toolchain
On Mon, 2024-07-22 at 17:13 +0100, Joern Wolfgang Rennecke wrote:
> On 22/07/2024 16:44, David Malcolm wrote:
> > Does it help to hack this change into prune.exp:
> >
> > diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp
> > index d00d37f015f7..f467d1a97bc6 100644
> > --- a/gcc/testsuite/lib/prune.exp
> > +++ b/gcc/testsuite/lib/prune.exp
> > @@ -109,7 +109,7 @@ proc prune_gcc_output { text } {
> >      # Many tests that use visibility will still pass on platforms that don't support it.
> >      regsub -all "(^|\n)\[^\n\]*lto1: warning: visibility attribute not supported in this configuration; ignored\[^\n\]*" $text "" text
> >
> > -    #send_user "After:$text\n"
> > +    send_user "After:$text\n"
> >
> >      return $text
> >  }
>
> I'm baffled. Isn't that statement there just to debug
> prune_gcc_output?
> I suppose we could prune the whitespace from the diagram, but
> prune_gcc_output does not know about types. If there's 'int, that could
> be int32_t, int16_t, int64_t, ptrdiff_t, or whatever. Unless you want
> to make all integer types be considered equivalent for dejagnu purposes
> if they appear somewhere between vertical bars.

I may have misunderstood your email; I got the impression that you were having trouble seeing what gcc was emitting for you in this case. If there's a mismatch, then the output will survive pruning and get dumped there. But it sounds like that's not at all what you were talking about; sorry.

> > > Would it make sense to handle this with one multi-line pattern for
> > > newlib based toolchains, ending with
> > > { dg-end-multiline-output "" { target *-*-elf } } */
> > > and one for glibc based toolchain, ending in
> > > { dg-end-multiline-output "" { target !*-*-elf } } */
> > > ?
> >
> > Presumably the only difference is in the top-right hand box of the
> > diagram,
>
> Unfortunately, there's also a lot of white space change in the rest of
> the diagram.
> I have attached the patch I'm currently using for your perusal.

Thanks. Looks good to me, but...

> > whereas my objective for those tests was more about the lower
> > part of the diagram - I wanted to verify how we handle symbolic buffer
> > sizes (e.g. (size * 4) + 3, and other run-time-computer sizes).
> >
> > It's rather awkward to test the diagrams with DejaGnu, alas.
> >
> > Would it might make sense to split out that file into three separate
> > tests -11a, -11b, and -11c, and be more aggressive about only running
> > the 2nd test on targets that we know generate "int" in the top-right
> > box?
>
> No, each dg-end-multiline-output stanza already can have its separate
> target selector, there is no point in putting them in separate files.
> I guess you could reduce the differences between platforms if you didn't
> use types as defined by headerfiles directly, as they might be #defines
> or typedefs or whatever, and instead used your own typedef or struct
> types.

...that might be a better idea. All I care about for the tests that are failing is the sizes, so maybe using a

  struct foo { uint32_t val; }

is the way to go here.

Dave
Re: c-c++-common/analyzer/out-of-bounds-diagram-11.c written type vs toolchain
On Mon, 2024-07-22 at 17:54 +0100, Joern Wolfgang Rennecke wrote:
> On 22/07/2024 17:13, Joern Wolfgang Rennecke wrote:
> > I guess you could reduce the differences between platforms if you
> > didn't use types as defined by headerfiles directly, as they might be
> > #defines or typedefs or whatever, and instead used your own typedef or
> > struct types.
>
> It seems a typedef to int is seen through, even if you chain two of them
> together.
> After preprocessing, newlib has:
>
> typedef long int __int32_t;
> typedef __int32_t int32_t ;
>
> So the crucial point seems to be to have 'long int', but that is of
> course not portable for int32_t.
>
> So to get portable code and consistent messages, I suppose we should use
> a struct:
>
>   typedef struct { int32_t i; } my_int32;
>   my_int32 s42 = { 42 };
>   my_int32 *buf = (my_int32 *) __builtin_alloca (4 * size + 3); /* { dg-warning "allocated buffer size is not a multiple of the pointee's size" } */
>   buf[size] = s42; /* { dg-warning "stack-based buffer overflow" } */
>
> Now suddenly the diagram is made *more* verbose, with the struct keyword
> added.
>
>   ┌─┐
>   │ write of ‘struct my_int32’ (4 bytes) │
>   └─┘
>   │ │
>   │ │
>   v v
>   ┌───┐ ┌┐
>   │ buffer allocated on stack at (1) │ │ after valid range │
>   └───┘ └┘
>   ├───┬───┤ ├───┬┤
>   │ │
>   ╭┴───╮ ╭─┴╮
>   │capacity: ‘(size * 4) + 3’ bytes│ │overflow of 1 byte│
>   ╰╯ ╰──╯

Sorry, I didn't see this followup before sending my last email.

I like this approach, as I don't care about exactly what the wording in that upper-right box is, just the wording towards the bottom of the diagram.

Thanks
Dave
Planned Fortran unsigned numbers branch
Hi everybody,

now that a proposal for unsigned number inclusion in Fortran has passed the J3 hurdle, https://j3-fortran.org/doc/year/24/24-116.txt , I thought I would put my working hours where my mouth is and try my hand at a testbed implementation for gfortran. I am still grateful to Reinhold that he put this on the DIN list as a suggestion.

I will use the text above as a preliminary spec. Of course, there is a chance that the feature may not actually make it into F2028, or that there will be differences, but that is what experimental work is for.

As for my motivation, I hate having to drop to C because Fortran lacks a feature :-)

Everything will be hidden behind a flag, tentatively called -funsigned, to allow inclusion into the compiler at a later date.

The amount of work will be substantial, but not too difficult - mostly copying and modifying what already works for integers.

Putting the work on a public branch probably works best; I will do so in the next few days. As a name, I will use fortran_unsigned, unless somebody has a better idea.

As to when this will be finished... I don't know; it could already be somewhat usable before being complete. It is also work that can be split into many relatively small parts, just implementing one feature at a time.

Best regards

Thomas