I believe I've found the bug.

When I noticed the program was failing to access the data, I suspected it
might be due to missing initializations. So, I tried using the alternative
loader for RISC-V compilers from 9legacy (thanks to Richard Miller), and
with a small adjustment ($-4), I managed to get the code running.

*— miller.s —*

#define EBREAK  WORD    $(0x73 | 1<<20)

TEXT start(SB), $-4
        /* set static base */
        MOVW    $setSB(SB), R3

        /* set stack pointer */
        MOVW    $(512*1024-16),R2

        /* clear bss */
        MOVW    $edata(SB), R1
        MOVW    $end(SB), R2
        MOVW    R0, 0(R1)
        ADD     $4, R1
        BLT     R2, R1, -2(PC)

        /* call main */
        JAL     R1, main(SB)

TEXT    abort(SB), $-4
        EBREAK
        RET

However, instead of the expected string "Hello world.....", only a single
character, "v", appeared. While debugging, I discovered that when printstr
saves R1/ra (which holds the return address 0x80000076) on the stack, it
overwrites the start of the string. The stack slot sits at the end of the
program, exactly where the string is stored, so the string now begins with
the bytes of that address: 0x76 (i.e., 'v'), followed by 0x00. That is why
it prints "v" and then stops, as a NUL-terminated string should.

Looking into the loader code, I saw that the stack pointer (SP) should have
been set to a high address (e.g., MOVW $(512*1024-16), R2). However,
according to the gdb disassembly (0x80000014: addi sp, gp, -2032), it ends
up being much lower — right at the end of the program — which causes the
overwrite.

I’ve attached:

   - The assembly generated after linking

*— hqmiller.asm —*
term% il -l -a -H 1 -T0x80000000 -R4 -o hqmiller.bin miller.i helloq.i
 80000000:              (3)     TEXT    start+0(SB),$-4
 80000000: 800011b7 87818193(5) MOV     $setSB+0(SB),R3
 80000008: 00080137 ff010113(8) MOV     $524272,R2
 80000010: 81018093     (11)    MOV     $edata+0(SB),R1
 80000014: 81018113     (12)    MOV     $end+0(SB),R2
 80000018: 0000a023     (13)    MOVW    R0,0(R1)
 8000001c: 0091         (15)    ADD     $4,R1
 8000001e: fe20cde3     (15)    BLT     R2,R1,80000018(BRANCH)
 80000022: 20a9         (18)    JAL     ,R1,main+8000006c(BRANCH)
 80000024:              (20)    TEXT    abort+0(SB),$-4
 80000024: 00100073     (21)    WORD    ,$1048691
 80000028: 8082         (22)    JMP     ,0(R1)
 8000002a: 0001         (0)     ADD     $0,R0,R0
 8000002c:              (74)    TEXT    uartputc+0(SB),R0,$-4
 8000002c: 10000637     (74)    MOV     $268435456,R12
 80000030: 01841593 4185d593(74)        MOVB    R8,R11
 80000038: 01859513 41855513(76)        MOVB    R11,R10
 80000040: c208         (76)    MOVW    R10,0(R12)
 80000042: 8082         (76)    JMP     ,0(R1)
 80000044:              (80)    TEXT    printstr+0(SB),R0,$4
 80000044: 1161         (80)    ADD     $-8,R2
 80000046: c006         (80)    MOVW    R1,0(R2)
 80000048: 84a2         (80)    MOV     R8,R9
 8000004a: 00048583     (82)    MOVB    0(R9),R11
 8000004e: c999         (82)    BEQ     R11,80000064(BRANCH)
 80000050: 00148613     (83)    ADD     $1,R9,R12
 80000054: c632         (83)    MOVW    R12,s+0(FP)
 80000056: 00048403     (83)    MOVB    0(R9),R8
 8000005a: 3fc9         (83)    JAL     ,uartputc+8000002c(BRANCH)
 8000005c: 44b2         (83)    MOVW    s+0(FP),R9
 8000005e: 00048583     (82)    MOVB    0(R9),R11
 80000062: f5fd         (82)    BNE     R11,80000050(BRANCH)
 80000064: 4082         (83)    MOVW    0(R2),R1
 80000066: 0121         (83)    ADD     $8,R2
 80000068: 8082         (83)    JMP     ,0(R1)
 8000006a: 0001         (0)     ADD     $0,R0,R0
 8000006c:              (87)    TEXT    main+0(SB),R0,$4
 8000006c: 1161         (87)    ADD     $-8,R2
 8000006e: c006         (87)    MOVW    R1,0(R2)
 80000070: 80018413     (89)    MOV     $.string<>+0(SB),R8
 80000074: 3fc1         (89)    JAL     ,printstr+80000044(BRANCH)
 80000076: a001         (90)    JMP     ,38(APC)

   - The corresponding gdb disassembly

*— hqmiller.dump —*
Dump of assembler code from 0x80000000 to 0x80000078:
=> 0x80000000:  lui     gp,0x80001
   0x80000004:  addi    gp,gp,-1928 # 0x80000878
   0x80000008:  lui     sp,0x80
   0x8000000c:  addi    sp,sp,-16 # 0x7fff0
   0x80000010:  addi    ra,gp,-2032
   0x80000014:  addi    sp,gp,-2032
   0x80000018:  sw      zero,0(ra)
   0x8000001c:  addi    ra,ra,4
   0x8000001e:  blt     ra,sp,0x80000018
   0x80000022:  jal     0x8000006c
   0x80000024:  ebreak
   0x80000028:  ret
   0x8000002a:  nop
   0x8000002c:  lui     a2,0x10000
   0x80000030:  slli    a1,s0,0x18
   0x80000034:  srai    a1,a1,0x18
   0x80000038:  slli    a0,a1,0x18
   0x8000003c:  srai    a0,a0,0x18
   0x80000040:  sw      a0,0(a2)
   0x80000042:  ret
   0x80000044:  addi    sp,sp,-8
   0x80000046:  sw      ra,0(sp)
   0x80000048:  mv      s1,s0
   0x8000004a:  lb      a1,0(s1)
   0x8000004e:  beqz    a1,0x80000064
   0x80000050:  addi    a2,s1,1
   0x80000054:  sw      a2,12(sp)
   0x80000056:  lb      s0,0(s1)
   0x8000005a:  jal     0x8000002c
   0x8000005c:  lw      s1,12(sp)
   0x8000005e:  lb      a1,0(s1)
   0x80000062:  bnez    a1,0x80000050
   0x80000064:  lw      ra,0(sp)
   0x80000066:  addi    sp,sp,8
   0x80000068:  ret
   0x8000006a:  nop
   0x8000006c:  addi    sp,sp,-8
   0x8000006e:  sw      ra,0(sp)
   0x80000070:  addi    s0,gp,-2048
   0x80000074:  jal     0x80000044
   0x80000076:  j       0x80000076

So, while I’ve located the root of the issue, I still don’t understand how
a value defined in the loader ends up being something different after
compilation. Any ideas or suggestions?

Thanks a lot,
José J.
P.S.: Thanks Ron, I hope to learn and enjoy playing with Plan 9.

On Fri, Aug 29, 2025 at 22:51, ron minnich (<[email protected]>)
wrote:

> Thank you for your interest in Plan 9. I hope you will continue to study
> it. There are many valuable lessons in the code, which was created by the
> group that invented C, and, of course, Unix.
>
> On Fri, Aug 29, 2025 at 9:39 AM José J. Cabezas Castillo <
> [email protected]> wrote:
>
>> Hi,
>>
>> My name is José J. Many years ago, in a college operating systems class,
>> the professor mentioned Plan 9 as a curiosity — an OS where "everything is
>> a file". Recently, I’ve had more time, so I installed 9front from the ISO
>> on a few machines to explore and learn.
>>
>> To get started with RISC-V, I tried creating a simple "Hello World" for
>> TinyEMU. I managed to output some characters using HTIF:
>>
>> ---h.c---
>> #include <u.h>
>>
>> #define         HTIFADDR_BASE                   0x40008000
>>
>> void
>> f(void)
>> {
>>         *((volatile uvlong *) HTIFADDR_BASE) = 0x0101000000000031ull;
>>         *((volatile uvlong *) HTIFADDR_BASE) = 0x0101000000000032ull;
>>         *((volatile uvlong *) HTIFADDR_BASE) = 0x0101000000000033ull;
>>         for(;;);
>> }
>> ---
>> Compiled with:
>> %ic -FVw h.c
>> %il -l -H 1 -T0x80000000 -o h32.bin h.i
>>
>> Then I wanted to improve it to print "Hello World", following this video
>> and GitHub repo:
>>
>>    - https://www.youtube.com/watch?v=HC7b1SVXoKM
>>    - https://github.com/chuckb/riscv-helloworld-c/
>>
>> The example is for Linux, but I tried to adapt it for Plan 9. Using QEMU
>> in Linux, I was able to print characters like before (with just h.c and
>> no functions). However, when I switched to using functions (helloq.c)
>> and a loader (chuck.s), it stopped working.
>>
>> Here is the code:
>>
>> ---helloq.c---
>>
>> #include <u.h>
>>
>> #define UART_BASE               0x10000000
>>
>> void
>> uartputc(char c)
>> {
>>         *((volatile ulong *) UART_BASE) = c;
>> }
>>
>> void
>> printstr(char *s)
>> {
>>         while (*s)
>>                 uartputc(*s++);
>> }
>>
>> void
>> main(void)
>> {
>>         printstr("Hello world\n");
>>         for (;;);
>> }
>> ---
>> ---chuck.s---
>> TEXT start(SB), $0
>>         /* set stack pointer */
>>         MOVW    $0x80020000, R2
>>         /* set frame pointer */
>>         ADD     R0, R2, R8
>>         /* call main */
>>         JAL     R1, main(SB)
>> ---
>> Compiled with:
>> ic -FVSw helloq.c
>> il -l -a -H 1 -T0x80000000 -R4 -o helloq.bin chuck.i helloq.i
>>
>> The assembler generated is:
>>  80000000:              (1)     TEXT    start+0(SB),$4
>>  80000000: 1161         (1)     ADD     $-8,R2
>>  80000002: c006         (1)     MOVW    R1,0(R2)
>>  80000004: 80020137     (3)     MOV     $-2147352576,R2
>>  80000008: 00010433     (6)     ADD     R0,R2,R8
>>  8000000c: 2091         (9)     JAL     ,R1,main+80000050(BRANCH)
>>  8000000e: 0001         (0)     ADD     $0,R0,R0
>>  80000010:              (74)    TEXT    uartputc+0(SB),R0,$-4
>>  80000010: 10000637     (74)    MOV     $268435456,R12
>>  80000014: 01841593 4185d593(74)        MOVB    R8,R11
>>  8000001c: 01859513 41855513(76)        MOVB    R11,R10
>>  80000024: c208         (76)    MOVW    R10,0(R12)
>>  80000026: 8082         (76)    JMP     ,0(R1)
>>  80000028:              (80)    TEXT    printstr+0(SB),R0,$4
>>  80000028: 1161         (80)    ADD     $-8,R2
>>  8000002a: c006         (80)    MOVW    R1,0(R2)
>>  8000002c: 84a2         (80)    MOV     R8,R9
>>  8000002e: 00048583     (82)    MOVB    0(R9),R11
>>  80000032: c999         (82)    BEQ     R11,80000048(BRANCH)
>>  80000034: 00148613     (83)    ADD     $1,R9,R12
>>  80000038: c632         (83)    MOVW    R12,s+0(FP)
>>  8000003a: 00048403     (83)    MOVB    0(R9),R8
>>  8000003e: 3fc9         (83)    JAL     ,uartputc+80000010(BRANCH)
>>  80000040: 44b2         (83)    MOVW    s+0(FP),R9
>>  80000042: 00048583     (82)    MOVB    0(R9),R11
>>  80000046: f5fd         (82)    BNE     R11,80000034(BRANCH)
>>  80000048: 4082         (83)    MOVW    0(R2),R1
>>  8000004a: 0121         (83)    ADD     $8,R2
>>  8000004c: 8082         (83)    JMP     ,0(R1)
>>  8000004e: 0001         (0)     ADD     $0,R0,R0
>>  80000050:              (87)    TEXT    main+0(SB),R0,$4
>>  80000050: 1161         (87)    ADD     $-8,R2
>>  80000052: c006         (87)    MOVW    R1,0(R2)
>>  80000054: 80018413     (89)    MOV     $.string<>+0(SB),R8
>>  80000058: 3fc1         (89)    JAL     ,printstr+80000028(BRANCH)
>>  8000005a: a001         (90)    JMP     ,30(APC)
>> ----
>>
>> However, when debugging with GDB in QEMU, I found that the instruction at
>> address *0x80000002* causes the PC (program counter) to reset, and
>> execution does not continue. I believe these extra instructions are added
>> by the loader automatically, but I don’t know how to prevent this.
>>
>> I also tried using l.s from the 9legacy compiler sources, but had the
>> same result. I’ve been reading through start.s from the RISC-V kernel and
>> looking at the mkfile, suspecting I might need to pass specific options to
>> compile correctly, but there are too many and I don’t fully understand them
>> yet.
>>
>> Can someone explain how to compile without these extra instructions, or
>> why the PC is being reset and how to avoid it?
>>
>> Thanks in advance, and apologies for my English and the length of this
>> email.
>>
>> Best regards,
>> José J.


-- 
José J. Cabezas

------------------------------------------
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/T3f252d4d7c5389ee-Mf1a4a51e93e914a5b9276778
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
