[Bug c++/56715] New: Explicit Reg Vars are being ignored for consts when using g++

2013-03-24 Thread goswin-v-b at web dot de


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56715



             Bug #: 56715
           Summary: Explicit Reg Vars are being ignored for consts when using g++
    Classification: Unclassified
           Product: gcc
           Version: 4.7.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: goswin-...@web.de





Created attachment 29714
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29714
example source that experiences the bug



I'm trying to pass a value to an `asm' operand using a specific register for
ARM with a freestanding compiler. Following the example from the info pages I
have the following code:



void foo() {
    register const int r4 asm("r4") = 0x1000;
    asm volatile("swi #1" : : "r"(r4));
}

void bar() {
    register int r4 asm("r4") = 0x1000;
    asm volatile("swi #1" : : "r"(r4));
}



Both foo() and bar() compile correctly when using gcc. But when using g++ the
foo() function suddenly uses the "r3" register instead of "r4". The bar()
function remains correct.



% arm-none-eabi-g++ -v
Using built-in specs.
COLLECT_GCC=arm-none-eabi-g++
COLLECT_LTO_WRAPPER=/usr/local/cross/libexec/gcc/arm-none-eabi/4.7.2/lto-wrapper
Target: arm-none-eabi
Configured with: ../gcc-4.7.2/configure --target=arm-none-eabi
--prefix=/usr/local/cross --disable-nls --enable-languages=c,c++
--without-headers
Thread model: single
gcc version 4.7.2 (GCC)



% arm-none-eabi-gcc -O2 -save-temps -S bug.c    (good code)
% arm-none-eabi-g++ -O2 -save-temps -S bug.c    (bad code)



--
_Z3foov:
        .fnstart
.LFB0:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        mov     r3, #4096
        @ 3 "bug.c" 1
        swi #1
        @ 0 "" 2
        bx      lr
--
The source explicitly asked for "r4" but g++ uses r3 instead.


[Bug c++/56715] Explicit Reg Vars are being ignored for consts when using g++

2013-03-24 Thread goswin-v-b at web dot de


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56715



--- Comment #2 from Goswin von Brederlow  2013-03-25 00:07:19 UTC ---
(In reply to comment #1)
> const is a bit special in C++, it can be used as part of a const integer
> expression which is what is happening here.

How does that make it right to ignore the register specification? Or how do you
pass the constant to asm in a specific register?

To me it seems wrong to ignore the asm("r4") without even a warning. This does
break asm() statements that expect specific registers to be used.


[Bug c++/56715] Explicit Reg Vars are being ignored for consts when using g++

2013-03-25 Thread goswin-v-b at web dot de


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56715



Goswin von Brederlow  changed:

           What|Removed |Added

         Status|RESOLVED|UNCONFIRMED
     Resolution|INVALID |

--- Comment #5 from Goswin von Brederlow  2013-03-25 11:11:52 UTC ---

If it is invalid, as in not allowed, then I would expect an error. If it is
undefined behaviour then I would expect a warning.

For example:

register const int r4 asm("r4") = 0x1000;
Warning: const expression won't be bound to a specific register.


[Bug target/66960] Add interrupt attribute to x86 backend

2017-07-18 Thread goswin-v-b at web dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66960

--- Comment #20 from Goswin von Brederlow  ---
So it's been a year since my last comment. Is this dead or is someone still
working on it? It would be a nice addition to gcc.

[Bug c/65668] New: gcc does not know how to use __eabi_uldivmod properly

2015-04-03 Thread goswin-v-b at web dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65668

Bug ID: 65668
   Summary: gcc does not know how to use __eabi_uldivmod properly
   Product: gcc
   Version: 4.9.2
   URL: https://gist.github.com/mrvn/0c79b146f74c28da401f
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: goswin-v-b at web dot de
 Build: arm-none-eabi

I have a uint64_t free running counter with a frequency of 1 MHz and I want to
print that as hours, minutes, seconds and fraction:

volatile uint64_t count = 0x62a54bc4; // for example
uint64_t t = count;
uint32_t frac, seconds, minutes, hours;
frac = t % 100;
t /= 100;
seconds = t % 60;
t /= 60;
minutes = t % 60;
t /= 60;
hours = t;

This results in 6 calls to __eabi_uldivmod, one for every modulo and every
division, instead of just 3 calls. With long division being rather expensive
that is a substantial waste of time.
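
The pairing asked for above can be sketched by routing each quotient and remainder through one helper, so every `/` and `%` on the same operands is visibly a single division. This is only an illustration: the names `DivMod`, `divmod_u64`, `Clock` and `split_time` are hypothetical and not from the report, and whether a given compiler version really merges the two operations into one library call is not guaranteed by this code. On ARM EABI the runtime helper that returns quotient and remainder together is understood to be `__aeabi_uldivmod`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: quotient and remainder of one 64-bit division.
struct DivMod {
    std::uint64_t quot;
    std::uint64_t rem;
};

inline DivMod divmod_u64(std::uint64_t n, std::uint64_t d) {
    assert(d != 0);
    // Both operators use the same operands, so a compiler is free to serve
    // them from a single division (one library call on targets without a
    // 64-bit hardware divider).
    return DivMod{n / d, n % d};
}

struct Clock {
    std::uint64_t hours;
    std::uint32_t minutes;
    std::uint32_t seconds;
    std::uint32_t frac;
};

// Split a counter value using the same divisors as the snippet in the report:
// three divisions in total, each yielding a quotient and a remainder.
inline Clock split_time(std::uint64_t t) {
    DivMod a = divmod_u64(t, 100);      // a.rem = fraction
    DivMod b = divmod_u64(a.quot, 60);  // b.rem = seconds
    DivMod c = divmod_u64(b.quot, 60);  // c.rem = minutes, c.quot = hours
    return Clock{c.quot,
                 static_cast<std::uint32_t>(c.rem),
                 static_cast<std::uint32_t>(b.rem),
                 static_cast<std::uint32_t>(a.rem)};
}
```

Calling `split_time(count)` on a snapshot of the volatile counter keeps the read of `count` to a single access, as in the original snippet.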


[Bug web/65699] New: online docs lacks version that it documents

2015-04-08 Thread goswin-v-b at web dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65699

Bug ID: 65699
   Summary: online docs lacks version that it documents
   Product: gcc
   Version: unknown
   URL: https://gcc.gnu.org/onlinedocs/gccint/
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: web
  Assignee: unassigned at gcc dot gnu.org
  Reporter: goswin-v-b at web dot de
CC: goswin-v-b at web dot de

The online docs do not mention what version of the compiler they document. When
something doesn't work as documented this makes it hard to see if that
something is no longer valid in the local version or describes something not
yet present in the local version.


[Bug web/65700] New: Documentation of internals is inconsistent in itself and diverges from reality

2015-04-08 Thread goswin-v-b at web dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65700

Bug ID: 65700
   Summary: Documentation of internals is inconsistent in itself
and diverges from reality
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: web
  Assignee: unassigned at gcc dot gnu.org
  Reporter: goswin-v-b at web dot de

https://gcc.gnu.org/onlinedocs/gccint/Collect2.html says when collect2 is used
it generates a temporary file listing the constructors and destructors and that
the actual calls to the constructors are done from __main().

https://gcc.gnu.org/onlinedocs/gccint/Initialization.html now tells a quite
different story, including the .ctors/.dtors that are actually used on
x86/x86_64. It still mentions __main() in connection with collect2 being used.

On ARM what actually happens is that there is a .init_array section and the
libc startup files are supposed to process that themselves. Despite collect2
being used there is no __main() function that gets called for this.
There is no .init section but still gcc does NOT insert a call to __main() when
compiling main() like the docs say it would.

Further, the .init_array does not hold the constructors in reverse order. It
actually holds an automatic constructor generated by gcc first and then all the
functions manually declared as constructors. Care must be taken by the linker
script to sort them by priority or their order is arbitrary. So in the case of
ARM the constructors need to be called in order, not in reverse order.
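
The forward-order processing described above might look roughly like the following in startup code. This is a minimal sketch, not the code of any particular libc: `run_init_array` and the demo functions are hypothetical names, and in a real bare-metal build the two bounds would come from linker-script symbols (conventionally `__init_array_start` and `__init_array_end`, which is an assumption about the linker script in use). Here they are plain parameters so the sketch runs on any host.

```cpp
#include <cassert>

// An entry in .init_array: a function taking no arguments, returning nothing.
using init_fn = void (*)();

// Call every entry in [begin, end) in forward (ascending address) order.
inline void run_init_array(init_fn const *begin, init_fn const *end) {
    assert(begin <= end);
    for (init_fn const *p = begin; p != end; ++p) {
        (*p)();
    }
}

// Demo: three stand-in "constructors" that record the order they ran in.
static int g_order[3];
static int g_count = 0;
static void ctor_a() { g_order[g_count++] = 1; }
static void ctor_b() { g_order[g_count++] = 2; }
static void ctor_c() { g_order[g_count++] = 3; }

// Stand-in for the contents of the .init_array section.
static init_fn const demo_init_array[] = {ctor_a, ctor_b, ctor_c};

// Runs the demo table and returns how many entries were called.
inline int run_demo() {
    g_count = 0;
    run_init_array(demo_init_array, demo_init_array + 3);
    return g_count;
}
```

Priority ordering is not handled here; as noted above, that is left to the linker script, which has to sort the entries before the startup code sees them.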

Overall I have to say the documentation confuses things more than it actually
helps. I don't know if that is because it hasn't been updated in a long time or
never was complete or internally consistent in the first place. But it sure
could use some love.

If the docs can't be improved please at least add a comment where they are
outdated or when they were last synced against the source so it becomes clear
to the reader where they are lacking.


[Bug web/65699] online docs lacks version that it documents

2015-04-13 Thread goswin-v-b at web dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65699

--- Comment #4 from Goswin von Brederlow  ---
Yes, a simple statement like that was exactly what I had in mind.


[Bug c++/65199] New: Linker failure with -flto

2015-02-24 Thread goswin-v-b at web dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65199

Bug ID: 65199
   Summary: Linker failure with -flto
   Product: gcc
   Version: 4.8.3
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: goswin-v-b at web dot de
  Host: x86_64-linux-gnu
Target: x86_64-linux-gnu
 Build: arm-none-eabi

I'm building a bare-metal kernel for a Raspberry Pi 2 (armv7) in c++. At some
point this failed with "undefined reference to `memcpy'" so I implemented one
as extern "C" void * memcpy(void *dest, const void *src, uint32_t n). But that
gives a different error:

% arm-none-eabi-g++ -O2 -W -Wall -fPIE -flto -march=armv7-a -mfloat-abi=hard
-mfpu=vfpv3-d16 -ffreestanding -nostdlib -std=gnu++11 -fno-exceptions -fno-rtti
-c -o main.o main.cc
% arm-none-eabi-g++ -fPIE -nostdlib -O2 -flto boot.o font.o main.o -lgcc
-Tlink-arm-eabi.ld -o kernel.elf
`memcpy' referenced in section `.text' of /tmp/cc7IkgU6.ltrans0.ltrans.o:
defined in discarded section `.text' of main.o (symbol from plugin)
collect2: error: ld returned 1 exit status

Running the same command to link but without -flto succeeds:

% arm-none-eabi-g++ -fPIE -nostdlib -O2 boot.o font.o main.o -lgcc
-Tlink-arm-eabi.ld -o kernel.elf


[Bug c++/65199] Linker failure with -flto

2015-02-25 Thread goswin-v-b at web dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65199

--- Comment #2 from Goswin von Brederlow  ---
That fixes it. Isn't it a gcc bug though not to detect that itself?


[Bug lto/65252] New: Link time optimization breaks use of filenames in linker scripts

2015-02-28 Thread goswin-v-b at web dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65252

Bug ID: 65252
   Summary: Link time optimization breaks use of filenames in
linker scripts
   Product: gcc
   Version: 4.8.3
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: goswin-v-b at web dot de
CC: goswin-v-b at web dot de
  Host: x86_64-linux-gnu
Target: x86_64-linux-gnu
 Build: arm-none-eabi

I'm building a kernel for a Raspberry Pi 2 with -flto. Most of the code will be
linked to 0x8000. The kernel image will be loaded to 0x8000 and I have set
up LMA and VMA in my linker script accordingly. But I have some bootstrap code
(boot.S and early.cc) that needs to be at the physical address. So I put the
following in my linker script:

ENTRY(_start)
PHYS_TO_VIRT = 0x8000;
SECTIONS
{
    . = 0x8000;
    .early : {
        boot.o(.*)
        early.o(.*)
    }

    /* rest of the code runs in higher half virtual address */
    . = . + PHYS_TO_VIRT;
    .text : AT(ADDR(.text) - PHYS_TO_VIRT) {
...

Using objdump -d I see the boot.o contents show up at 0x8000 exactly as they
should. But all the code from early.o only appears later in the .text section
and at the virtual address. If I drop the -flto then everything works as
expected. It would be nice if -flto could preserve which file each function and
variable comes from so the linker can place them properly.


[Bug lto/65252] Link time optimization breaks use of filenames in linker scripts

2015-02-28 Thread goswin-v-b at web dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65252

--- Comment #2 from Goswin von Brederlow  ---
As long as it's only one C/C++ file that works. But if one has multiple files
then -fno-lto would optimize less. I was thinking of a more general case than
mine.


[Bug lto/65262] New: Link time optimization breaks use __attribute__((section("..."))) in templates

2015-03-01 Thread goswin-v-b at web dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65262

Bug ID: 65262
   Summary: Link time optimization breaks use
__attribute__((section("..."))) in templates
   Product: gcc
   Version: 4.8.3
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: goswin-v-b at web dot de
    CC: goswin-v-b at web dot de

Created attachment 34911
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34911&action=edit
Simple testcase

I'm trying to put a template member function of a class into a different
section. Without -flto this works but with -flto the function reverts to the
.text section.

g++ -O2 -W -Wall -fvisibility=hidden -fno-inline -Tlink.ld -c -o main.o main.cc
g++ -O2 -W -Wall -fvisibility=hidden -fno-inline -Tlink.ld -o main main.o
g++ -O2 -W -Wall -fvisibility=hidden -fno-inline -Tlink.ld -flto -c -o main.lto.o main.cc
g++ -O2 -W -Wall -fvisibility=hidden -fno-inline -Tlink.ld -flto -o main.lto main.lto.o

Without link time optimization:
0200 ld  .text.foo    .text.foo
0210 g F .text.foo  0006  .hidden foo()
0200  wF .text.foo  0006  .hidden int foobar()

With link time optimization:
0820 ld  .text.foo    .text.foo
0100 l F .text  0006  int foobar()
0820 l F .text.foo  0006  foo()


[Bug lto/65262] Link time optimization breaks use __attribute__((section("..."))) in templates

2015-03-02 Thread goswin-v-b at web dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65262

--- Comment #2 from Goswin von Brederlow  ---
The linker script is only there because the default script combines all .text.*
into one hiding the effect. One could use different section names that the
default script does not combine and work without a custom linker script.

LTO is free to privatize template instantiations. But if it doesn't inline the
template then it should preserve the section attribute on it like it does for
normal functions. All optimized clones of a normal function are still in the
same section the original function was in.

I could understand if a template would end up in the section of the function
causing the instantiation (although what if functions from different sections
use the same instance?). But templates simply end up in the .text section no
matter where they were originally or where they get instantiated. I don't know
the internals but it looks to me like something should copy the section
attribute from the template to the privatized function in LTO mode.

You can't set a section on the template, you can't use a file scope in the
linker, you can't even use __attribute__((always_inline)) and the behaviour
differs from without -flto. How is that a WONTFIX?


[Bug target/66960] Add interrupt attribute to x86 backend

2016-06-30 Thread goswin-v-b at web dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66960

Goswin von Brederlow  changed:

   What|Removed |Added

 CC||goswin-v-b at web dot de

--- Comment #11 from Goswin von Brederlow  ---
I think the design is fundamentally lacking in the following points:

1. interrupt handler must be declared with a mandatory pointer argument:

struct interrupt_frame;

__attribute__ ((interrupt))
void
f (struct interrupt_frame *frame)
{
...
}

and user must properly define the structure the pointer pointing to.

First how does one define the struct interrupt_frame properly? What is in
there? Is that just the data the CPU pushes to the stack? If so then gcc should
define the structure somewhere so code can be written cpu independent.

Since the frame pointer is passed as argument I assume the function prolog will
save the first argument register (on amd64) to stack. Is that to be included in
the struct interrupt_frame?

Secondly how does one access the original register contents? Some kernel
designs use a single kernel stack and switch tasks when returning to user
space. That means that one has to copy all the user registers into the thread
structure and reload a new set of user registers from the new thread on exit
from the interrupt handler. The above interface would not allow this.


2. exception handler:

The exception handler is very similar to the interrupt handler with a
different mandatory function signature:

typedef unsigned int uword_t __attribute__ ((mode (__word__)));

struct interrupt_frame;

__attribute__ ((interrupt))
void
f (struct interrupt_frame *frame, uword_t error_code)
{
...
}

and compiler pops the error code off stack before the 'IRET' instruction.

In a kernel there will always be some exception that simply prints a register
dump and stack backtrace. So again how do you access the original register
contents?

Secondly why pass error_code as argument if it is already on the stack and could
be accessed through the frame pointer? The argument register (amd64) would have
to be saved on the stack too causing an extra push/pop. But if it is passed as
argument then why not pop it before the call to keep the stack frame the same
as for interrupts (assuming the saved registers are not included in the frame)?

If it is not popped or saved registers are included in the frame then the
exception stack frame differs from the interrupt frame (extra error_code and
extra register). They should not use the same structure, that's just confusing.

'no_caller_saved_registers' attribute

Use this attribute to indicate that the specified function has no
caller-saved registers.  That is, all registers are callee-saved.

Does that include the argument registers (if the function takes arguments)?
Wouldn't it be more flexible to define a list of registers that the function
will clobber?

[Bug target/66960] Add interrupt attribute to x86 backend

2016-07-04 Thread goswin-v-b at web dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66960

--- Comment #13 from Goswin von Brederlow  ---
(In reply to H.J. Lu from comment #12)
> (In reply to Goswin von Brederlow from comment #11)
> > I think the design is fundamentally lacking in the following points:
> > 
> > 1. interrupt handler must be declared with a mandatory pointer argument:
> > 
> > struct interrupt_frame;
> > 
> > __attribute__ ((interrupt))
> > void
> > f (struct interrupt_frame *frame)
> > {
> > ...
> > }
> > 
> > and user must properly define the structure the pointer pointing to.
> > 
> > First how does one define the struct interrupt_frame properly? What is in
> > there? Is that just the data the CPU pushes to the stack? If so then gcc
> > should define the structure somewhere so code can be written cpu 
> > independent.
> 
> interrupt data is pushed onto stack by CPU:
> 
> struct interrupt_frame
> {
>   uword_t ip;
>   uword_t cs;
>   uword_t flags;
>   uword_t sp;
>   uword_t ss;
> };
> 
> However, void * works if you need to access interrupt data.  Interrupt
> handler should provide its working definition.
> 
> > Since the frame pointer is passed as argument I assume the function prolog
> > will save the first argument register (on amd64) to stack. Is that to be
> > included in the struct interrupt_frame?
> 
> No.  The interrupt frame pointer points to interrupt data on stack
> pushed by CPU.
> 
> > Secondly how does one access the original register contents? Some kernel
> > designs use a single kernel stack and switch tasks when returning to user
> > space. That means that one has to copy all the user registers into the
> > thread structure and reload a new set of user registers from the new thread
> > on exit from the interrupt handler. The above interface would not allow 
> > this.
> 
> The interrupt attribute provides a way to access interrupt data on stack
> pushed by CPU, nothing more and nothing less.

That design seriously limits the usability of this feature.

> > 
> > 2. exception handler:
> > 
> > The exception handler is very similar to the interrupt handler with a
> > different mandatory function signature:
> > 
> > typedef unsigned int uword_t __attribute__ ((mode (__word__)));
> > 
> > struct interrupt_frame;
> > 
> > __attribute__ ((interrupt))
> > void
> > f (struct interrupt_frame *frame, uword_t error_code)
> > {
> > ...
> > }
> > 
> > and compiler pops the error code off stack before the 'IRET' 
> > instruction.
> > 
> > In a kernel there will always be some exception that simply prints a
> > register dump and stack backtrace. So again how do you access the original
> > register contents?
> 
> You need to do that yourself.

Which means __attribute__ ((interrupt)) can't be used for exceptions half the
time.

> > Secondly why pass error_code as argument if is already on the stack and
> > could be accessed through the frame pointer? The argument register (amd64)
> > would have to be saved on the stack too causing an extra push/pop. But if it
> > is passed as argument then why not pop it before the call to keep the stack
> > frame the same as for interrupts (assuming the saved registers are not
> > included in the frame)?
> 
> error_code is a pseudo parameter, which is mapped to error code on stack
> pushed by CPU.  You can write a simple code to see it yourself.

Couldn't the same trick be used for registers? Pass them as pseudo parameters
and they either resolve to the location on the stack where gcc did save them or
become the original register unchanged.

> > If it is not poped or saved registers are included in the frame then the
> > exception stack frame differs from the interrupt frame (extra error_code and
> > extra register). They should not use the same structure, that's just
> > confusing.
> > 
> > 'no_caller_saved_registers' attribute
> > 
> > Use this attribute to indicate that the specified function has no
> > caller-saved registers.  That is, all registers are callee-saved.
> > 
> > Does that include the argument registers (if the function takes arguments)?
> 
> Yes.
> 
> > Wouldn't it be more flexible to define a list of registers that the function
> > will clobber?
> 
> How do programmer know which registers will be clobbered?

The programmer writes the function. He declares what registers will be
clobbered and gcc will add the necessary code to preserve any other registers
it uses inside the function.

[Bug target/66960] Add interrupt attribute to x86 backend

2016-07-05 Thread goswin-v-b at web dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66960

--- Comment #15 from Goswin von Brederlow  ---
(In reply to H.J. Lu from comment #14)
> (In reply to Goswin von Brederlow from comment #13)
> > > > Secondly why pass error_code as argument if is already on the stack and
> > > > could be accessed through the frame pointer? The argument register (amd64)
> > > > would have to be saved on the stack too causing an extra push/pop. But if it
> > > > is passed as argument then why not pop it before the call to keep the stack
> > > > frame the same as for interrupts (assuming the saved registers are not
> > > > included in the frame)?
> > > 
> > > error_code is a pseudo parameter, which is mapped to error code on stack
> > > pushed by CPU.  You can write a simple code to see it yourself.
> > 
> > Couldn't the same trick be used for registers? Pass them as pseudo
> > parameters and they either resolve to the location on the stack where gcc
> > did save them or become the original register unchanged.
> 
> No.  We only do it for data pushed onto stack by CPU.

I was thinking of something like:

__attribute__ ((interrupt("save_regs")))
void
f (struct interrupt_frame *frame, uword_t error_code, struct regs regs)
{
kprintf("user SP = %#016x\n", regs.sp);
}

[Bug target/66960] Add interrupt attribute to x86 backend

2016-07-06 Thread goswin-v-b at web dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66960

--- Comment #17 from Goswin von Brederlow  ---
(In reply to H.J. Lu from comment #16)
> (In reply to Goswin von Brederlow from comment #15)
> 
> > > No.  We only do it for data pushed onto stack by CPU.
> > 
> > I was thinking of something like:
> > 
> > __attribute__ ((interrupt("save_regs")))
> > void
> > f (struct interrupt_frame *frame, uword_t error_code, struct regs regs)
> > {
> > kprintf("user SP = %#016x\n", regs.sp);
> > }
> 
> It is an interesting idea.  But frame and err_code are created by caller,
> which is CPU, not by callee.  You want to not only save all original
> registers of interrupted process, but also make them available to interrupt
> handler.  This won't be supported without significant changes in
> infrastructure.

Is it a significant change?

On a normal function gcc creates a stackframe and pushes callee saved registers
that it later uses onto the stack. I'm suggesting doing much the same with 2
small changes:

1) push all registers unconditionally
2) make the address where the registers got pushed to known to the function

[Bug c/104828] New: Wrong out-of-bounds array access warning on literal pointers

2022-03-07 Thread goswin-v-b at web dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104828

Bug ID: 104828
   Summary: Wrong out-of-bounds array access warning on literal
pointers
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: goswin-v-b at web dot de
  Target Milestone: ---

Trying to access a pointer cast from an integer literal gives an out-of-bounds
warning:

--
#define UART0_BASE  0x3F201000
void putc(char c) {
  volatile unsigned int *UART0_DR = (volatile unsigned int *)(UART0_BASE);
  volatile unsigned int *UART0_FR = (volatile unsigned int *)(UART0_BASE + 0x18);
  while (*UART0_FR & (1 << 5) ) { }
  *UART0_DR = c;
}
--

:5:3: warning: array subscript 0 is outside array bounds of 'volatile
unsigned int [0]' [-Warray-bounds]
5 |   *UART0_DR = c;
  |   ^

The warning goes away if the pointer is global or static.

The warning remains if the pointer is returned from a function with alloc_size
attribute:

--
#include <stddef.h>
#include <stdint.h>

#define UART0_BASE  0x3F201000

volatile uint32_t * make(uintptr_t addr, size_t size = 4)
    __attribute__ ((alloc_size (2)));
volatile uint32_t * make(uintptr_t addr, size_t size) {
(void)size;
return (volatile uint32_t *)addr;
}

void putc(char c) {
  volatile uint32_t *UART0_DR = make(UART0_BASE);
  volatile uint32_t *UART0_FR = make(UART0_BASE + 0x18);
  while (*UART0_FR & (1 << 5) ) { }
  *UART0_DR = c;
}
--

:16:3: warning: array subscript 0 is outside array bounds of 'volatile
uint32_t [0]' [-Warray-bounds]
   16 |   *UART0_DR = c;
  |   ^

The warning goes away if the "make" helper is extern and can't be inlined.

Gcc 11.2 and before do not give this warning.
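
The observation that the warning disappears for a global or static pointer suggests the following variant of the first example. It is a sketch only: the register addresses are the ones from the report, `uart_putc` is a hypothetical name (the report's `putc` clashes with `<stdio.h>` on hosted systems), and nothing here guarantees that a particular GCC version stays silent.

```cpp
#include <cstdint>

// Base address taken from the report.
constexpr std::uintptr_t UART0_BASE = 0x3F201000u;

// File-scope pointers instead of locals inside the function; per the report
// this form does not trigger the -Warray-bounds diagnostic.
static volatile std::uint32_t *const UART0_DR =
    reinterpret_cast<volatile std::uint32_t *>(UART0_BASE);
static volatile std::uint32_t *const UART0_FR =
    reinterpret_cast<volatile std::uint32_t *>(UART0_BASE + 0x18u);

// Only meaningful on hardware where these addresses are mapped.
void uart_putc(char c) {
    while (*UART0_FR & (1u << 5)) { }  // spin while bit 5 of the flag register is set
    *UART0_DR = static_cast<unsigned char>(c);
}
```

On a hosted system the function must not be called, since the addresses are not mapped there; only the pointer values themselves can be inspected.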

[Bug middle-end/99578] [11/12 Regression] gcc-11 -Warray-bounds or -Wstringop-overread warning when accessing a pointer from integer literal

2022-03-07 Thread goswin-v-b at web dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578

--- Comment #29 from Goswin von Brederlow  ---
(In reply to Jakub Jelinek from comment #26)

> That is nonsense.  The amount of code in the wild that relies on (type
> *)CONSTANT
> working is insane, you can't annotate it all.  And it has worked fine for
> decades.  The pointers aren't invalid, they point to valid objects in the
> address space.
> POSIX supports MAP_FIXED for a reason (and in many embedded cases one
> doesn't even have an MMU and I/O or other special areas are mapped directly).

A cast from integer to pointer is implementation defined behavior except for

1) 0 which must cast to NULL / nullptr
2) if the integer value was constructed by casting a pointer to integer of
suitable size

There is no guarantee in the C standard that '(type *)CONSTANT' will actually
point to the hardware address 'CONSTANT'. It's just how gcc happens to do it in
most cases. So no, your code is not fine. It is fragile. It relies on
implementation details of gcc. But let's not argue about that.


Detecting NULL pointer access and offsets to it is a good thing, except where
it isn't. It's unfortunate it also catches other stuff. Under AmigaOS the
pointer to the exec.library (no program can work without that) is stored at
address 4. So there isn't a universal value of "this is big enough not to be
an offset to NULL".

Detecting if an expression involves NULL might be hard. If it starts as
NULL->member then it's easy. What about (&x - &x)+offsetof(X.member) or
(uintptr_t)&x.member - (uintptr_t)&x or similar stuff you easily get with
macros? On the other side (T*)0x45634534 should be easy to detect as not being
NULL+offset. It's a literal. But the grey zone in between the easy cases might
be too big to be useful.

Alternatively an annotation for this would actually go nicely with another bug
I reported: 'add feature to create a pointer to a fixed address as constexpr'
[1].  The annotation would avoid the warning and also make it a pointer literal
that can be used in constexpr (apart from accessing the address). It could
also cause gcc to handle the case where CONSTANT can't just be cast to pointer
and work. Like when using address authentication on ARMv8 CPUs, to name
something modern.

And the size of the object the pointer points to can be taken from its type,
i.e. the pointer is to a single object and never an (infinite) array. If you
want a pointer to an array then cast it to an array of the right size.

--
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104514

[Bug middle-end/99578] [11 Regression] gcc-11 -Warray-bounds or -Wstringop-overread warning when accessing a pointer from integer literal

2022-03-19 Thread goswin-v-b at web dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578

--- Comment #38 from Goswin von Brederlow  ---
(In reply to Jonathan Wakely from comment #34)
> (In reply to Goswin von Brederlow from comment #29)
> > There is no garantee in the C standard that '(type *)CONSTANT' will actually
> > point to the hardware address 'CONSTANT'. It's just how gcc happens to do it
> > in most cases. So no, your code is not fine. It is fragile. It relies on
> > implementation details of gcc. But lets not argue about that.
> 
> Actually, lets. It relies on guaranteed behaviour of GCC:
> https://gcc.gnu.org/onlinedocs/gcc/Arrays-and-pointers-implementation.html
> That's not going to change, and neither is the fact that the Linux kernel
> depends on implementation-defined properties of GCC (where
> "implementation-defined" is used in the formal sense, not "just an
> implementation detail that might change tomorrow").

Thank you for agreeing with me that "It relies on implementation details of
gcc". That's exactly what I said.

[Bug c++/104514] New: add feature to create a pointer to a fixed address as constexpr

2022-02-12 Thread goswin-v-b at web dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104514

Bug ID: 104514
   Summary: add feature to create a pointer to a fixed address as
constexpr
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: goswin-v-b at web dot de
  Target Milestone: ---

In the embedded and microcontroller world memory mapped registers are very
common. They can be declared as external objects and fudged in using linker
scripts, which prevents a lot of optimizations. Or they can be declared as
pointers, in the most reduced form like this:
int *p = (int*)0x12345678;

My problem now is that this isn't a constexpr and can't be used in any
constexpr code:

foo.cc:1:20: error: ‘reinterpret_cast’ from integer to pointer
1 | constexpr int *p = (int*)0x12345678;
  |^~~~

While this is the right thing in general there should be a way to allow this
special case: a way to tell the compiler that an object exists at a fixed
address while the pointer to it remains usable as a constexpr.
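
Until such a feature exists, one common workaround is to keep the address as an integer, which is a valid constant expression, and to convert it to a pointer only at run time. The sketch below illustrates that idea; `FixedAddress`, `offset` and `get` are hypothetical names, and this does not provide the constexpr pointer the report asks for.

```cpp
#include <cstdint>

// Hypothetical wrapper: the address is stored as an integer so the object
// itself can be constexpr.
template <typename T>
struct FixedAddress {
    std::uintptr_t addr;

    // Address arithmetic stays inside constant expressions.
    constexpr FixedAddress offset(std::uintptr_t bytes) const {
        return FixedAddress{addr + bytes};
    }

    // The reinterpret_cast happens here, outside any constant expression.
    T *get() const { return reinterpret_cast<T *>(addr); }
};

// Example values reused from the UART report elsewhere in this thread.
constexpr FixedAddress<volatile std::uint32_t> UART0{0x3F201000u};
constexpr auto UART0_FR = UART0.offset(0x18u);

static_assert(UART0_FR.addr == 0x3F201018u,
              "register offset is computed at compile time");
```

The cost of this design is that every access goes through `get()`, so the pointer itself can never appear in a constant expression; only the address can.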

[Bug c/105521] New: missed optimization in modulo arithmetic

2022-05-07 Thread goswin-v-b at web dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105521

Bug ID: 105521
   Summary: missed optimization in modulo arithmetic
   Product: gcc
   Version: 11.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: goswin-v-b at web dot de
  Target Milestone: ---

I'm trying to compute (a*a)%n for uint64_t types on x86_64 using "gcc -O2 -W
-Wall" like this:

  #include <assert.h>
  #include <stdint.h>

  uint64_t sqrmod(uint64_t a, uint64_t n) {
    assert(a < n);
    unsigned __int128 x = a;
    x *= a;
    return x % n;
  }

I expected to get the following code:

  sqrmod:
    cmpq  %rsi, %rdi
    jnb   .L13          // assert(a < n) failure
    movq  %rdi, %rax
    mul   %rdi
    div   %rsi
    movq  %rdx, %rax
    ret

The compiler does get the "mul" right but instead of the "div" it throws in a
call to "__umodti3". The "__umodti3" function is horribly long code that will
be worlds slower than a simple div.

Note: The "assert(a < n);" should tell the compiler that the "div" instruction
cannot overflow and will not raise a #DE (divide error) exception. Without the
assert the compiler could (conditionally) add "a %= n;" for the same effect.

https://godbolt.org/z/cd57Wd4oo
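
Why a < n is enough: the 128/64 "div" faults only when the quotient does not
fit in 64 bits, which happens exactly when the high 64 bits of the dividend
are not below the divisor. With a < n the product a*a is below n * 2^64, so
the high half is below n. A small sketch of that condition and of the
"a %= n" variant; the function names are made up for illustration, and
unsigned __int128 is a GCC/Clang extension:

```cpp
#include <cassert>
#include <cstdint>

// unsigned __int128 is a compiler extension available on 64-bit GCC/Clang.
using u128 = unsigned __int128;

// True when (a*a) / n has a quotient that fits in 64 bits, i.e. when the
// high 64 bits of the 128-bit product are below n.
inline bool div_cannot_overflow(std::uint64_t a, std::uint64_t n) {
    const u128 x = static_cast<u128>(a) * a;
    return static_cast<std::uint64_t>(x >> 64) < n;
}

// The "a %= n" variant: reducing first guarantees a < n without an assert.
// Precondition: n != 0.
inline std::uint64_t sqrmod_reduced(std::uint64_t a, std::uint64_t n) {
    a %= n;
    const u128 x = static_cast<u128>(a) * a;
    return static_cast<std::uint64_t>(x % n);
}
```

For example, a = 2^32 and n = 1 violates a < n, and div_cannot_overflow
reports false for it because the quotient 2^64 needs 65 bits.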

[Bug target/105521] missed optimization in modulo arithmetic

2022-05-08 Thread goswin-v-b at web dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105521

--- Comment #3 from Goswin von Brederlow  ---
(In reply to Andrew Pinski from comment #1)
> This requires having a, 64bit/32bit (and 128bit/64bit) pattern really. So
> this is both a middle-end issue and a target issue.
> 
> Note there might be another bug asking for the same optimization.
> 
> Also note x86_64 might be the only popular target which has this kind of div
> instruction so this might not get any attention as it is also a small
> peephole where most people don't use 128bit integers either (they are
> non-standard even).

I know m68k had a 64bit/32bit pattern but it is indeed rare.

On x86_64 a (32bit * 32bit = 64bit) % 32bit uses the 128bit/64bit DIV
instruction and two extra truncations to 32bit for the input registers. On many
CPUs that is significantly (factor 3-10) slower than the 64bit/32bit version.

This could potentially affect every / and % operation and preceding *, allowing
for the faster opcodes with fewer bits to be used where the compiler can reason
about the magnitude of the arguments.
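
To make the 32-bit case concrete, here is the kind of source this is about,
together with the range condition under which the narrower 64bit/32bit divide
would be safe. This is only a sketch; mulmod32 and narrow_div_is_safe are
illustrative names, and whether GCC picks the narrow opcode is exactly the
open question in this report:

```cpp
#include <cassert>
#include <cstdint>

// (a * b) % n for 32-bit operands with a 64-bit intermediate product.
// Precondition: n != 0.
inline std::uint32_t mulmod32(std::uint32_t a, std::uint32_t b, std::uint32_t n) {
    const std::uint64_t product = static_cast<std::uint64_t>(a) * b;
    return static_cast<std::uint32_t>(product % n);
}

// The 64bit/32bit divide is safe when the quotient fits in 32 bits, which is
// the case exactly when the high 32 bits of the product are below n
// (guaranteed, for instance, when a < n or b < n).
inline bool narrow_div_is_safe(std::uint32_t a, std::uint32_t b, std::uint32_t n) {
    const std::uint64_t product = static_cast<std::uint64_t>(a) * b;
    return (product >> 32) < n;
}
```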

[Bug libstdc++/105844] New: std::lcm(50000, 49999) is UB but accepted in a constexpr due to cast to unsigned

2022-06-03 Thread goswin-v-b at web dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105844

Bug ID: 105844
   Summary: std::lcm(50000, 49999) is UB but accepted in a
constexpr due to cast to unsigned
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: goswin-v-b at web dot de
  Target Milestone: ---

Running "gcc-12.1 -std=c++20 -O2 -W -Wall" on

#include <numeric>
constinit int t = std::lcm(50000, 49999);

produces

t:
.long   -1795017296

The standard says:

> The behavior is undefined if |m|, |n|, or the least common multiple of |m|
> and |n| is not representable as a value of type std::common_type_t<M, N>.

That is the case here: the lcm overflows, so the behavior is undefined. The
negative number produced is not correct, and compilation should fail.

The problem is the __absu function in <numeric> casting the arguments to an
unsigned type:

  // std::abs is not constexpr, doesn't support unsigned integers,
  // and std::abs(std::numeric_limits<T>::min()) is undefined.
  template<typename _Up, typename _Tp>
    constexpr _Up
    __absu(_Tp __val)
    {
      static_assert(is_unsigned<_Up>::value, "result type must be unsigned");
      static_assert(sizeof(_Up) >= sizeof(_Tp),
                    "result type must be at least as wide as the input type");
      return __val < 0 ? -(_Up)__val : (_Up)__val;
    }

  /// Least common multiple
  template<typename _Mn, typename _Nn>
    constexpr common_type_t<_Mn, _Nn>
    lcm(_Mn __m, _Nn __n) noexcept
    {
      static_assert(is_integral_v<_Mn>, "std::lcm arguments must be integers");
      static_assert(is_integral_v<_Nn>, "std::lcm arguments must be integers");
      static_assert(_Mn(2) == 2, "std::lcm arguments must not be bool");
      static_assert(_Nn(2) == 2, "std::lcm arguments must not be bool");
      using _Up = make_unsigned_t<common_type_t<_Mn, _Nn>>;
      return __detail::__lcm(__detail::__absu<_Up>(__m),
                             __detail::__absu<_Up>(__n));
    }

__lcm is called with unsigned arguments, which do not overflow for the given
numbers, and any unsigned overflow would not be undefined behavior anyway. The
result of __lcm is then converted back to the signed type, which is not UB
either, so the constant evaluator never encounters anything it could reject.
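
The numbers can be replayed by hand. A minimal sketch, assuming a 32-bit int
as on common targets; the function names are illustrative and not part of
libstdc++. The conversion back to signed is modular on GCC (and required to be
since C++20), which is where the negative value comes from:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// lcm(50000, 49999) the way __lcm sees it: in an unsigned type.
// gcd(50000, 49999) is 1, so the lcm is simply the product.
inline std::uint32_t unsigned_lcm_demo() {
    return std::uint32_t{50000} * std::uint32_t{49999};  // 2499950000, no wrap
}

// Conversion back to the signed common type: modular, not UB.
inline std::int32_t converted_result_demo() {
    return static_cast<std::int32_t>(unsigned_lcm_demo());
}

// The check the standard's wording calls for: is the lcm representable?
inline bool representable_in_int32(std::uint32_t v) {
    return v <= static_cast<std::uint32_t>(
                    std::numeric_limits<std::int32_t>::max());
}
```

2499950000 is larger than INT_MAX (2147483647), and 2499950000 - 2^32 is the
-1795017296 seen in the assembly output.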

I suggest the following changes:

  // LCM implementation
  template<typename _Tp>
    constexpr _Tp
    __lcm(_Tp __m, _Tp __n)
    {
      static_assert(__m == 0 || __n == 0
                    || __m / __detail::__gcd(__m, __n)
                       <= std::numeric_limits<_Tp>::max() / __n,
                    "std::lcm not representable in common type");
      return (__m != 0 && __n != 0)
        ? (__m / __detail::__gcd(__m, __n)) * __n
        : 0;
    }


  /// Least common multiple
  template<typename _Mn, typename _Nn>
    constexpr common_type_t<_Mn, _Nn>
    lcm(_Mn __m, _Nn __n) noexcept
    {
      static_assert(is_integral_v<_Mn>, "std::lcm arguments must be integers");
      static_assert(is_integral_v<_Nn>, "std::lcm arguments must be integers");
      static_assert(_Mn(2) == 2, "std::lcm arguments must not be bool");
      static_assert(_Nn(2) == 2, "std::lcm arguments must not be bool");
      using _Cp = common_type_t<_Mn, _Nn>;
      using _Up = make_unsigned_t<common_type_t<_Mn, _Nn>>;
      _Up t = __detail::__lcm(__detail::__absu<_Up>(__m),
                              __detail::__absu<_Up>(__n));
      static_assert(t <= (_Up)std::numeric_limits<_Cp>::max(),
                    "std::lcm not representable in common type");
      return t;
    }

[Bug libstdc++/105844] std::lcm(50000, 49999) is UB but accepted in a constexpr due to cast to unsigned

2022-06-03 Thread goswin-v-b at web dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105844

Goswin von Brederlow  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |goswin-v-b at web dot de

--- Comment #1 from Goswin von Brederlow  ---
Created attachment 53081
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53081&action=edit
Patch for numeric

Patch for the proposed changes

[Bug libstdc++/105844] std::lcm(50000, 49999) is UB but accepted in a constexpr due to cast to unsigned

2022-06-03 Thread goswin-v-b at web dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105844

--- Comment #2 from Goswin von Brederlow  ---
I know the patch doesn't work yet: the static_assert conditions depend on
function parameters, which aren't constant expressions. But hopefully it gives
someone enough of an idea to fix it.

[Bug libstdc++/105844] std::lcm(50000, 49999) is UB but accepted in a constexpr due to cast to unsigned

2022-06-04 Thread goswin-v-b at web dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105844

--- Comment #3 from Goswin von Brederlow  ---
Created attachment 53082
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53082&action=edit
Working patch for detecting UB

Instead of using static_assert, this aborts if the arguments are too large.
That is the best approach I could figure out that works.
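
The abort approach can be sketched outside of libstdc++ as follows. This is
not the attached patch, only an illustration of the idea under the assumption
that calling a non-constexpr function (std::abort) on the overflow path is
what makes constant evaluation fail; all names in namespace sketch are made
up:

```cpp
#include <cassert>
#include <cstdlib>
#include <limits>
#include <type_traits>

namespace sketch {

// Euclid's algorithm on unsigned values.
template <typename U>
constexpr U gcd_unsigned(U a, U b) {
    while (b != 0) {
        const U r = a % b;
        a = b;
        b = r;
    }
    return a;
}

// |v| as a value of the unsigned type U (same idea as __absu).
template <typename U, typename T>
constexpr U abs_unsigned(T v) {
    return v < 0 ? static_cast<U>(U(0) - static_cast<U>(v))
                 : static_cast<U>(v);
}

// lcm that refuses results not representable in the common type.
template <typename M, typename N>
constexpr typename std::common_type<M, N>::type checked_lcm(M m, N n) {
    using C = typename std::common_type<M, N>::type;
    using U = typename std::make_unsigned<C>::type;
    const U um = abs_unsigned<U>(m);
    const U un = abs_unsigned<U>(n);
    if (um == 0 || un == 0)
        return 0;
    const U limit = static_cast<U>(std::numeric_limits<C>::max());
    const U q = um / gcd_unsigned(um, un);
    if (um > limit || un > limit || q > limit / un)
        std::abort();  // not constexpr: reaching it breaks constant evaluation
    return static_cast<C>(q * un);
}

}  // namespace sketch
```

With this sketch, something like "constinit int t = sketch::checked_lcm(50000,
49999);" should be rejected at compile time because the evaluation reaches
std::abort, while at run time the same call terminates the program.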