Problem with induction variables optimization pass
Hello,

I'm currently writing a GCC 4.4.5 backend for an 18-bit architecture. I have a C project with a few thousand lines of code. It compiles without optimizations, but with -O1 and -O2 I run into a problem in the induction variable optimization pass.

The main issue is that a temporary variable is created which is used as a memory reference but is a 36-bit value (SImode). This results in an error at explow.c line 326, because pointers have to be 18 bits wide (HImode). The code at this position is:

    rtx
    convert_memory_address (enum machine_mode to_mode ATTRIBUTE_UNUSED,
                            rtx x)
    {
    #ifndef POINTERS_EXTEND_UNSIGNED
      gcc_assert (GET_MODE (x) == to_mode || GET_MODE (x) == VOIDmode);

The assertion fails because x has mode SI and to_mode is Pmode/HImode.

I used the -fdump-tree-all option, and in the source.c.126t.final_cleanup file I found the following:

    isr00 ()
    {
      unsigned int ivtmp.112;
      unsigned int D.3690;
      long unsigned int D.3683;
      unsigned int ivtmp.105;
      unsigned int ivtmp.104;
      unsigned int prephitmp.91;
      unsigned int length;
      unsigned int D.1415;
      struct usb * usb0.0;

      …
      :
      D.3683 = (long unsigned int) ivtmp.105 + (long unsigned int) ((unsigned int *) (long unsigned int) usb0.0 + 320);
      spi_temporary_data[ivtmp.112] = [bit_and_expr] (unsigned char) (MEM[base: D.3683] >> 8) & 255;
      spi_temporary_data[ivtmp.112 + 1] = [bit_and_expr] (unsigned char) MEM[base: D.3683] & 255

The problem is the D.3683 variable. The casts in the assignment to this variable are pointless, because ivtmp.105 is an unsigned int (18 bits) and usb0.0 is a pointer, which is also only 18 bits wide. I checked the source files, and I think the main problem is the use of sizetype in several places that work with addresses. sizetype is defined as long unsigned int and results in SImode when used.
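The point that the widening casts add nothing can be checked with a small model. The sketch below is not GCC code: it assumes an 18-bit address space, uses a 64-bit integer as a stand-in for the wider SImode value, and all function names are hypothetical. It shows that adding in the wide type and truncating gives the same address as adding directly modulo 2^18.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: addresses are 18 bits wide, as on the target in the post. */
#define ADDR_BITS 18u

/* Keep only the low `bits` bits of `value`. */
uint64_t truncate_to_bits(uint64_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 63);
    return value & ((UINT64_C(1) << bits) - 1u);
}

/* Widen both operands, add, then truncate to the address width. */
uint64_t add_wide_then_truncate(uint64_t base, uint64_t index)
{
    return truncate_to_bits(base + index, ADDR_BITS);
}

/* Truncate both operands first, then add modulo 2^18. */
uint64_t add_narrow(uint64_t base, uint64_t index)
{
    uint64_t b = truncate_to_bits(base, ADDR_BITS);
    uint64_t i = truncate_to_bits(index, ADDR_BITS);
    return truncate_to_bits(b + i, ADDR_BITS);
}
```

Both functions agree for every pair of inputs, because truncation to 18 bits commutes with addition. This only illustrates the arithmetic; it says nothing about why GCC's address code insists on sizetype.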
I found three places where the input tree has mode HImode and the use of sizetype or size_type_node results in a tree node with SImode.

tree-ssa-address.c line 414; here parts->index becomes an SImode tree node:

    static void
    add_to_parts (struct mem_address *parts, tree elt)
    {
      tree type;

      if (!parts->index)
        {
          parts->index = fold_convert (sizetype, elt);
          return;
        }

tree-ssa-address.c line 547; here part becomes an SImode tree node:

    static void
    addr_to_parts (aff_tree *addr, struct mem_address *parts, bool speed)
    {
      ...
      /* Then try to process the remaining elements.  */
      for (i = 0; i < addr->n; i++)
        {
          part = fold_convert (sizetype, addr->elts[i].val);
          ...
        }

tree.c line 8332; here t becomes an SImode tree node:

    signed_or_unsigned_type_for (int unsignedp, tree type)
    {
      tree t = type;
      if (POINTER_TYPE_P (type))
        t = size_type_node;

      if (!INTEGRAL_TYPE_P (t) || TYPE_UNSIGNED (t) == unsignedp)
        return t;

      return lang_hooks.types.type_for_size (TYPE_PRECISION (t), unsignedp);
    }

As a solution, I tried replacing those three uses of sizetype with global_trees[TI_UINTHI_TYPE], so the relevant lines look like this:

    parts->index = fold_convert (global_trees[TI_UINTHI_TYPE] /*sizetype*/, elt);
    part = fold_convert (global_trees[TI_UINTHI_TYPE] /*sizetype*/, addr->elts[i].val);
    t = global_trees[TI_UINTHI_TYPE] /*size_type_node*/;

With those changes the compilation gets further, but crashes on the following gcc_assert in tree.c line 3312, function build2_stat:

    if (code == POINTER_PLUS_EXPR && arg0 && arg1 && tt)
      gcc_assert (POINTER_TYPE_P (tt) && POINTER_TYPE_P (TREE_TYPE (arg0))
                  && INTEGRAL_TYPE_P (TREE_TYPE (arg1))
                  && useless_type_conversion_p (sizetype, TREE_TYPE (arg1)));

If I comment out the "&& useless_type_conversion_p (sizetype, TREE_TYPE (arg1))" part, the compilation works. I know it's bad, but I don't know how to replace sizetype in this case so that the assertion doesn't fire. I'm aware that those fixes would cause errors on other architectures.
But just for my "strange 18-bit" architecture, is this OK? Might there be problems in some cases? Why is sizetype used in these positions anyway? Wouldn't it be better to use the input tree type instead of sizetype?

I hope the description of the problem is understandable.

Regards,
Eric Neumann
Problem with reload generating direct memory accesses although GO_IF_LEGITIMATE_ADDRESS does not permit this
Hello,

I'm currently writing a GCC backend for a microcontroller architecture which can only handle indirect memory accesses. In normal cases everything works fine, but there is a special case where the reload pass (I'm not sure it is this pass) produces a direct memory access at -O2, which causes the postreload pass to abort.

The relevant C code uses a constant memory address:

    #define IO_BASE (0x34000)
    #define UART2OFF (0x08)
    #define UART2BASE ((void *)(IO_BASE+(UART2OFF*2)))

    u = (struct uart *)UART2BASE;

So u has the constant address 0x34010, which is known at compile time. The instruction where the error occurs is the following:

    u->tx_data = tx;

In the source.c.128r.expand dump file this line looks like this:

    (insn 40 39 41 5 ../uart2sim/uart2i_3.c:272
      (set (reg/f:HI 46) (symbol_ref:HI ("tx") [flags 0x2] )) -1 (nil))
    (insn 41 40 42 5 ../uart2sim/uart2i_3.c:272
      (set (reg:HI 28 [ tx.44 ])
           (zero_extend:HI (mem/c/i:QI (reg/f:HI 46) [0 tx+0 S1 A9]))) -1 (nil))
    (insn 42 41 43 5 ../uart2sim/uart2i_3.c:272
      (set (reg:HI 27 [ D.1392 ]) (reg:HI 28 [ tx.44 ])) -1 (nil))
    (insn 43 42 44 5 ../uart2sim/uart2i_3.c:272
      (set (reg/f:HI 47) (const_int -49136 [0x4010])) -1 (nil))
    (insn 44 43 45 5 ../uart2sim/uart2i_3.c:272
      (set (mem/s:HI (plus:HI (reg/f:HI 47) (const_int 4 [0x4])) [2 .tx_data+0 S2 A18])
           (reg:HI 27 [ D.1392 ])) -1 (nil))

So this is as it should be: an indirect store using pseudo register 47 as the base register. The constant 0x34010 appears as a negative number because our Pmode is only 18 bits wide. That shouldn't be a problem as long as the constant gets loaded into a register.

The above insns change a bit during the following passes.
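As a side check on the negative constant in insn 43: the value -49136 is exactly what 0x34010 becomes when its low 18 bits are read as a two's-complement number. The following is a minimal standalone sketch, not backend code. The macros are taken from the post (with a space after UART2OFF so that it is an object-like macro), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Macros from the post. */
#define IO_BASE  (0x34000)
#define UART2OFF (0x08)

/* Interpret the low `bits` bits of `value` as a two's-complement number. */
int64_t sign_extend(uint64_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 63);
    uint64_t mask = (UINT64_C(1) << bits) - 1u;
    uint64_t sign = UINT64_C(1) << (bits - 1);
    value &= mask;
    /* (value XOR sign) - sign is the usual branch-free sign extension. */
    return (int64_t)(value ^ sign) - (int64_t)sign;
}

/* The UART base address as it appears in an 18-bit signed mode. */
int64_t uart2_base_as_pmode_constant(void)
{
    return sign_extend((uint64_t)(IO_BASE + UART2OFF * 2), 18);
}
```

Here `uart2_base_as_pmode_constant()` evaluates to -49136, matching the `const_int` in the dump, since 0x34010 - 0x40000 = -49136.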
In the source.c.168r.asmcons dump file it looks like this:

    (insn 171 36 43 2 ../uart2sim/uart2i_3.c:272
      (set (reg/f:HI 104) (reg/f:HI 85)) 2 {movhi}
      (expr_list:REG_EQUAL (symbol_ref:HI ("tx") [flags 0x2] ) (nil)))
    (insn 43 171 49 2 ../uart2sim/uart2i_3.c:272
      (set (reg/f:HI 105) (reg/f:HI 35)) 2 {movhi}
      (expr_list:REG_DEAD (reg/f:HI 35)
        (expr_list:REG_EQUAL (const_int -49136 [0x4010]) (nil))))
    (insn 41 39 44 4 ../uart2sim/uart2i_3.c:272
      (set (reg:HI 28 [ tx.44 ])
           (zero_extend:HI (mem/c/i:QI (reg/f:HI 85) [0 tx+0 S1 A9]))) 26 {zero_extendqihi2}
      (expr_list:REG_EQUAL
        (zero_extend:HI (mem/c/i:QI (symbol_ref:HI ("tx") [flags 0x2] ) [0 tx+0 S1 A9])) (nil)))
    (insn 44 41 45 4 ../uart2sim/uart2i_3.c:272
      (set (mem/s:HI (plus:HI (reg/f:HI 105) (const_int 4 [0x4])) [2 .tx_data+0 S2 A18])
           (reg:HI 28 [ tx.44 ])) 2 {movhi} (nil))

So the memory instruction is still indirect. But after the ira/reload pass, the following is found in the source.c.172r.ira dump file:

    (insn 41 39 44 4 ../uart2sim/uart2i_3.c:272
      (set (reg:HI 7 r7 [orig:28 tx.44 ] [28])
           (zero_extend:HI (mem/c/i:QI (reg/f:HI 9 r9 [85]) [0 tx+0 S1 A9]))) 26 {zero_extendqihi2}
      (expr_list:REG_EQUAL
        (zero_extend:HI (mem/c/i:QI (symbol_ref:HI ("tx") [flags 0x2] ) [0 tx+0 S1 A9])) (nil)))
    (insn 44 41 45 4 ../uart2sim/uart2i_3.c:272
      (set (mem/s:HI (plus:HI (const_int -49136 [0x4010]) (const_int 4 [0x4])) [2 .tx_data+0 S2 A18])
           (reg:HI 7 r7 [orig:28 tx.44 ] [28])) 2 {movhi} (nil))

So now it is a direct store operation, and the compiler crashes with the following error message:

    ../uart2sim/uart2i_3.c: In Funktion »main«:
    ../uart2sim/uart2i_3.c:307: Fehler: Befehl erfüllt nicht seine Bedingungen:
    (insn 44 41 45 4 ../uart2sim/uart2i_3.c:272
      (set (mem/s:HI (plus:HI (const_int -49136 [0x4010]) (const_int 4 [0x4])) [2 .tx_data+0 S2 A18])
           (reg:HI 7 r7 [orig:28 tx.44 ] [28])) 2 {movhi} (nil))
    ../uart2sim/uart2i_3.c:307: interner Compiler-Fehler: in reload_cse_simplify_operands, bei postreload.c:396

This is German output.
In English it says that the instruction doesn't satisfy its constraints, which are tested at line 396 of postreload.c:

    /* Figure out which alternative currently matches.  */
    if (! constrain_operands (1))
      fatal_insn_not_found (insn);

Evidently this check ends up consulting the GO_IF_LEGITIMATE_ADDRESS macro. If I allow direct addresses there, the postreload pass doesn't crash. The GO_IF_LEGITIMATE_ADDRESS helper function looks like this:

    int valid = 0;

    switch (GET_CODE (x))
      {
      case REG:
        valid = REG_OK_FOR_BASE_P (x);
        break;
      case PLUS:
        {
          rtx base = XEXP (x, 0);
          rtx offset = XEXP (x, 1);
          valid = (REG == GET_CODE (base)
                   && REGNO_OK_FOR_BASE_P (base)
                   && CONST_INT == GET_CODE (offset)
                   && GET_CODE(offset) != SYMBOL_REF
                   && x_const_ok_for_base (m
Delay slot problem
Hello,

I'm having some trouble with delay slots. Our architecture can't handle a branch instruction that uses a register which was changed by the immediately preceding instruction, so a delay slot is needed. The difficult part is that this delay slot is only needed if the same register is used in the branch instruction.

I have no idea how to describe this with the instruction attributes that are usually used for delay slots. Is it somehow possible to extract the register number and store it in a const_int attribute?

Thanks in advance!
Eric Neumann
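The condition described in this post can be written down as a small predicate. The following is only a plain C model of the hazard, not a GCC machine description and not an answer to how it would be expressed with attributes. The `struct insn` layout and the function names are hypothetical, and it assumes that an instruction writes at most one register and a branch reads at most one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_REG (-1)

/* Hypothetical, simplified instruction record for this illustration. */
struct insn {
    bool is_branch;   /* true for a branch that may read a register */
    int  dest_reg;    /* register written, or NO_REG */
    int  branch_reg;  /* register read by the branch, or NO_REG */
};

/* A filler slot is needed only when the branch reads the register that
   the immediately preceding instruction wrote. */
bool needs_delay_slot(const struct insn *prev, const struct insn *cur)
{
    assert(prev != NULL && cur != NULL);
    return cur->is_branch
        && cur->branch_reg != NO_REG
        && prev->dest_reg == cur->branch_reg;
}

/* Count how many filler instructions a straight-line sequence needs. */
size_t count_required_fillers(const struct insn *seq, size_t n)
{
    size_t fillers = 0;
    for (size_t i = 1; i < n; i++)
        if (needs_delay_slot(&seq[i - 1], &seq[i]))
            fillers++;
    return fillers;
}
```

For a sequence such as "write r3; branch on r3; write r4; branch on r5", only the first branch needs a filler, so the count is 1.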
How to force an indirect jump?
Hello gcc gurus,

I have a problem with jumps. Our architecture can only handle 13-bit direct jumps and 18-bit indirect jumps. Sometimes those 13 bits are not enough, and I want to let the user force jumps to be indirect. I have not been able to find a way to do so.

The main problem is that I can't find a way to tell the compiler that I need a register for the indirect jump. I thought the most obvious way would be to "force_reg" the label operand in the expander definition. But this fails because the move patterns don't recognize a code_label as a legal operand. (What operand type is a code_label, and can it be placed in a register via a pattern?)

Another idea was to acquire a scratch register, but this results in an endless loop when I try to compile a simple C file. The code was like this:

    (define_insn "jump"
      [(set (pc) (label_ref (match_operand 0 "" "")))
       (clobber (match_scratch:HI 1 "r"))]
      ...

Building a parallel pattern that sets a register to the code label and sets the pc from this register fails, because gen_jump can't handle a pattern with an additional register operand.

So, basically, I'm out of ideas. If anyone has an idea how I can acquire a register and force the label into it, I would be very grateful.

Thanks in advance,
Eric Neumann
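The choice between the two jump forms comes down to a range check plus the user's override. The sketch below is plain C rather than machine-description code, and it rests on an assumption the post does not state: that the 13-bit direct jump encodes a signed, PC-relative displacement. If the field is an absolute or unsigned target instead, only `fits_signed` would need to change. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum jump_kind { JUMP_DIRECT, JUMP_INDIRECT };

/* True if `value` is representable in a `bits`-wide two's-complement field. */
bool fits_signed(int32_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 31);
    int32_t min = -(INT32_C(1) << (bits - 1));
    int32_t max = (INT32_C(1) << (bits - 1)) - 1;
    return value >= min && value <= max;
}

/* Pick the jump form. `force_indirect` models a user option that always
   selects the register-indirect form. */
enum jump_kind select_jump(int32_t displacement, bool force_indirect)
{
    if (force_indirect || !fits_signed(displacement, 13))
        return JUMP_INDIRECT;
    return JUMP_DIRECT;
}
```

Under that assumption, displacements from -4096 to 4095 can use the direct form, and everything else, or any jump with the override set, needs a register holding the target.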