I thought I'd summarise where things are at with the 6.5 kernel.

We've fixed:

* the ARM LTP OOM lockup (kernel patch)
* the locale ARM selftest failure which was OOM due to silly buffer allocations in 6.5 (kernel commandline)
* the ARM jitterentropy errors (kernel patch)
* the cryptodev build failures (recipe updated)

We've also:

* disabled the strace tests that fail with 6.5
* made sure the serial ports and getty counts match
* added ttyrun which wraps serial consoles and avoids hacks
* made the qemurunner logging save all the port logs
* made the qemurunner write the binary data it is sent verbatim
* made sure to use nodelay on qemu's tcpserial

This leaves an annoying serial console problem where the getty login prompt never appears on ttyS1.

What we know:

* More recently (yesterday/today) we've only seen this on x86, but we did see it on ARM in the days before that.
* It affects both sysvinit and systemd images.
* Systemd does print that it started a getty on ttyS0 and ttyS1 when the failure occurs.
* There is a getty running according to "ps" when the failure occurs.
* Only one or three characters are ever received on ttyS1 in the failure case (0x0d and 0x0a chars, i.e. CR and LF).
* It can't be any kind of utf-8 conversion issue since the login prompt isn't visible in the binary log.
* The kernel boot logs do show the serial port created with the same ioport and irq on x86.

Previously we did see some logs with timing issues on the ttyS0 port but the nodelay parameter may have helped with that.

There are debug patches in master-next against qemurunner which try to poke around using ttyS0 and gather more debug information when things fail.

The best failure log we have is now this one:

https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/5874/steps/14/logs/stdio

where I've saved the logs:

https://autobuilder.yocto.io/pub/failed-builds-data/6.5%20kernel/j/qemu_boot_log.20231007084853

and

https://autobuilder.yocto.io/pub/failed-builds-data/6.5%20kernel/j/qemu_boot_log.20231007084853.2

You can see ttyS1 times out after 1000 seconds and the port only has a single byte (in the .2 file). The other log shows ps output with the getty running for ttyS1.

Ideas welcome on where to go from here.

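For anyone wanting to check a saved port log against the symptoms described above (only CR/LF bytes, no login prompt), here is a rough Python sketch. It is not part of qemurunner; the function name `summarise_port_log` and the `login:` prompt string are assumptions made for illustration.

```python
def summarise_port_log(data: bytes, prompt: bytes = b"login:") -> dict:
    """Summarise the raw bytes captured from one serial port.

    Hypothetical helper, not qemurunner code. `prompt` is an assumed
    marker for the getty login prompt.
    """
    return {
        # Total number of bytes the port received.
        "length": len(data),
        # True when the log is non-empty and holds nothing but CR (0x0d) / LF (0x0a).
        "only_cr_lf": len(data) > 0 and all(b in (0x0D, 0x0A) for b in data),
        # True when the assumed prompt marker appears anywhere in the raw bytes.
        "prompt_seen": prompt in data,
    }
```

Reading the log with `open(path, "rb").read()` keeps the bytes verbatim, so no text decoding can hide or alter the prompt.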
I've tweaked master-next to keep reading the ttyS1 port after we poke it from ttyS0 to see if that reveals anything next time it fails (build running).

Cheers,

Richard
