From: Warner Losh <i...@bsdimp.com>
Date: Tuesday, 8 November 2022 04:28
To: Archimedes Gaviola <archimedes.gavi...@gmail.com>
CC: Mark Millard <mark...@yahoo.com>, freebsd-current
<freebsd-current@freebsd.org>
Subject: Re: 14.0-CURRENT failed to reclaim memory error in RPi 3B build
On Mon, Nov 7, 2022 at 7:40 PM Archimedes Gaviola <archimedes.gavi...@gmail.com> wrote:
On Sat, Nov 5, 2022 at 5:28 PM Archimedes Gaviola <archimedes.gavi...@gmail.com> wrote:
On Thu, Nov 3, 2022 at 7:52 AM Mark Millard <mark...@yahoo.com> wrote:
On 2022-Nov-2, at 14:09, Archimedes Gaviola <archimedes.gavi...@gmail.com>
wrote:
> On Mon, Oct 31, 2022 at 1:47 PM Archimedes Gaviola
<archimedes.gavi...@gmail.com> wrote:
>
> . . .
>
> . . .
>
>
> Hi Mark,
>
> Just an update: kernel and world compilation is ongoing on my RPi3B
system (with swap partition) and so far, so good. It has already passed the
tough part that used to break the compilation process here.
> ...
>
> llvm-tblgen -gen-asm-matcher -I /usr/src/contrib/llvm-project/llvm/include
-I /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV -d
RISCVGenAsmMatcher.inc.d -o RISCVGenAsmMatcher.inc
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-asm-writer -I /usr/src/contrib/llvm-project/llvm/include -I
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV -d RISCVGenAsmWriter.inc.d -o
RISCVGenAsmWriter.inc /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-callingconv -I /usr/src/contrib/llvm-project/llvm/include
-I /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV -d
RISCVGenCallingConv.inc.d -o RISCVGenCallingConv.inc
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-compress-inst-emitter -I
/usr/src/contrib/llvm-project/llvm/include -I
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV -d
RISCVGenCompressInstEmitter.inc.d -o RISCVGenCompressInstEmitter.inc
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-dag-isel -I /usr/src/contrib/llvm-project/llvm/include -I
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV -d RISCVGenDAGISel.inc.d -o
RISCVGenDAGISel.inc /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-disassembler -I /usr/src/contrib/llvm-project/llvm/include
-I /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV -d
RISCVGenDisassemblerTables.inc.d -o RISCVGenDisassemblerTables.inc
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-global-isel -I /usr/src/contrib/llvm-project/llvm/include
-I /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV -d
RISCVGenGlobalISel.inc.d -o RISCVGenGlobalISel.inc
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-instr-info -I /usr/src/contrib/llvm-project/llvm/include -I
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV -d RISCVGenInstrInfo.inc.d -o
RISCVGenInstrInfo.inc /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-emitter -I /usr/src/contrib/llvm-project/llvm/include -I
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV -d
RISCVGenMCCodeEmitter.inc.d -o RISCVGenMCCodeEmitter.inc
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-pseudo-lowering -I
/usr/src/contrib/llvm-project/llvm/include -I
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV -d
RISCVGenMCPseudoLowering.inc.d -o RISCVGenMCPseudoLowering.inc
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-register-bank -I /usr/src/contrib/llvm-project/llvm/include
-I /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV -d
RISCVGenRegisterBank.inc.d -o RISCVGenRegisterBank.inc
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-register-info -I /usr/src/contrib/llvm-project/llvm/include
-I /usr/src/contrib/llvm-project/llvm/lib/Target/RISCV -d
RISCVGenRegisterInfo.inc.d -o RISCVGenRegisterInfo.inc
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-searchable-tables -I
/usr/src/contrib/llvm-project/llvm/include -I
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV -d
RISCVGenSearchableTables.inc.d -o RISCVGenSearchableTables.inc
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-subtarget -I /usr/src/contrib/llvm-project/llvm/include -I
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV -d
RISCVGenSubtargetInfo.inc.d -o RISCVGenSubtargetInfo.inc
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
> llvm-tblgen -gen-searchable-tables -I
/usr/src/contrib/llvm-project/llvm/include -I
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV -d
RISCVGenSystemOperands.inc.d -o RISCVGenSystemOperands.inc
/usr/src/contrib/llvm-project/llvm/lib/Target/RISCV/RISCV.td
>
> Any thoughts on why this part is such a challenge when it comes to memory usage? The other architectures do not show such behavior... just curious.>>>
Hi Mark,
Sorry for the late response; I have been fully occupied at work for the past few days due to deliverables. Thanks for your feedback and further input!
I've not done any monitoring of buildworld buildkernel build
activity (RAM use, memory space use, swap partition use over
time) on RPi3B class hardware in a very long time.>>>
It's alright. I just happened to observe that behavior in that particular part, as it requires more memory to compile than the other architectures do. The additional 3.5G swap partition really helps there, which is why I was so happy that the compilation continued and never broke. Your suggestion of a 3.5G swap allocation is very effective.
Even on systems that I have monitored in more recent times,
what I usually monitor tends to be builds with -jN (such as
-j4 for a 4-hardware-thread system). (I once did have an
example of -j3 taking less time than -j4 on a RPi4B.>>>
Wow, this -jN is interesting. Let me explore it as well. I usually build the kernel the old way, but since I now have to build the world too, I need to use the new way.
Basically, the memory subsystem can be saturated without all
the cores being in use. The extra interference made things
take longer.)>>>
Oh I see, so that's the reason.
You had listed that you were using the likes of:
# cd /usr/src ; make KERNCONF=ARM TARGET_ARCH=aarch64 \
buildkernel buildworld installkernel installworld distribution \
DESTDIR=/home/freebsd/rpi3b
I'll note that the standard order of the first 2 is:
buildworld buildkernel
This is because buildworld builds some software that
buildkernel does not build for itself but does use.>>>
Okay, this is noted. Thanks for clarifying and correcting me, I really appreciate it. I'll adopt the proper build sequence for better efficiency.
There is a kernel-toolchain target for avoiding the
need to do a full buildworld just to buildkernel , so:
kernel-toolchain buildkernel
is an expected sequence.>>>
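The two target orderings described above can be sketched as a small dry-run helper. This is only an illustration: the function name build_cmd and its "world"/"kernel" mode names are hypothetical, and the KERNCONF and TARGET_ARCH values are simply the ones quoted earlier in this thread. It prints the make command instead of running it.

```shell
#!/bin/sh
# Print (do not run) the make command for a given job count and mode.
#   world  -> buildworld first, then buildkernel (the standard order)
#   kernel -> kernel-toolchain first, then buildkernel (no full buildworld)
# build_cmd and the mode names are illustrative, not part of the FreeBSD build.
build_cmd() {
    jobs=$1
    mode=$2
    case "$mode" in
        world)  targets="buildworld buildkernel" ;;
        kernel) targets="kernel-toolchain buildkernel" ;;
        *)      echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    echo "make -j$jobs KERNCONF=ARM TARGET_ARCH=aarch64 $targets"
}

build_cmd 2 world
build_cmd 2 kernel
```

The printed line can then be run from /usr/src once it looks right.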
Okay I'll take note of this too.
I do not know how long a from-scratch buildworld
buildkernel without a -jN takes on a RPi3B these
days. If I remember right, for -jN with 1<N, the last
claims I saw about such were somewhere in the
range 36hr..48hr.>>>
There's an ongoing build at the moment; it has already been running for 41 hours. I started another build when I came back home from the office.
But I'm unsure of the specific N
that was in use. Nor do I know the storage media
type(s) involved, for example. I do not remember
any reports for -j1.>>>
I'll try this with RPi 3B. The current build that I have will be my baseline.
Use of the likes of: vm.pageout_oom_seq=120
was essential to such -jN usage on a RPi3B as N
gets larger. Of course, swap partition use for
paging was also essential.>>>
Wow, that's great! I have this vm.pageout_oom_seq=120 configured in my system now based on your previous inputs.
Likewise use of:
vm.swap_enabled=0
vm.swap_idle_enabled=0
can be important to not losing communication
with the RPi3B. Those last 2 are not tunable
but are writable:
# sysctl -aT | grep swap_
# sysctl -aW | grep swap_
vm.swap_enabled: 0
. . .
vm.swap_idle_enabled: 0
. . .
(This means that they have fewer places where
assignments can be made. For example, the
loader can not make the assignments.)
By contrast, vm.pageout_oom_seq is both
writable and tunable:
# sysctl -aW | grep oom
. . .
vm.pageout_oom_seq: 120
. . .
# sysctl -aT | grep oom
. . .
vm.pageout_oom_seq: 120
. . .
(So even the loader can make such assignments.)>>>
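As a rough sketch of what the tunable/writable distinction means in practice: a setting that is tunable can be assigned in /boot/loader.conf, while one that is only writable has to be assigned after the kernel is running, for example from /etc/sysctl.conf or with sysctl(8) by hand. The values are the ones quoted above; the split across the two files is one possible arrangement, not a required one.

```
# /boot/loader.conf  (read by the loader; tunables only)
vm.pageout_oom_seq=120

# /etc/sysctl.conf  (applied during boot, once the kernel is running)
vm.swap_enabled=0
vm.swap_idle_enabled=0

# Or at run time, as root:
#   sysctl vm.swap_enabled=0
#   sysctl vm.swap_idle_enabled=0
```

Since vm.pageout_oom_seq is also writable, it could equally be placed in /etc/sysctl.conf.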
Yes, I have these two sysctl parameters configured in the system. Thanks for the details and further inputs.
I'll note that I've no interest in using arm hardware
to build for other types of hardware. So I normally
have the targeting of support for building for other
architectures disabled when I build on aarch64 or
armv7. (Basically, a less complete clang/clang++
related toolchain ends up being built.)>>>
Ah okay, so you mean that you disable these other architectures by declaring them in /etc/src.conf?
I'll provide an update here once the build is done, including how long it took to finish.
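For illustration only, one way the "less complete toolchain" idea might be expressed in /etc/src.conf is shown below. Mark does not say in this message which knobs he actually uses, so the knob here is an assumption taken from src.conf(5); check that manual page for your source tree before relying on it.

```
# /etc/src.conf
# Assumption: WITHOUT_LLVM_TARGET_ALL is described in src.conf(5) as building
# only the required LLVM target support. Verify it exists in your tree.
WITHOUT_LLVM_TARGET_ALL=
```

If this works as documented, steps such as the RISCV llvm-tblgen runs listed earlier should no longer be part of the build.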
Hi Mark,
With this set of build commands now,
# cd /usr/src; make -j3 KERNCONF=ARM TARGET_ARCH=aarch64 buildworld kernel-toolchain buildkernel installworld installkernel distribution DESTDIR=/home/freebsd/rpi3b
on RPi 3B, I encountered the other OOM error, 'thread waited too long to allocate a page'. This occurred in every build I ran, though the first error, 'failed to reclaim memory', was never experienced again. Below are the error logs.
...
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 256929, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3628, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 255839, size: 40960
pid 46153 (c++), jid 0, uid 0, was killed: a thread waited too long to allocate a page
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 255857, size: 28672
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3634, size: 8192
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 256037, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 255320, size: 8192
This means that paging to the swap partition and/or swap file took too long (> 30 seconds... that's all that indefinite means). It also means that the system can't write dirty pages to backing store in order to free them up for another process...
The typical reason is that the disk / flash is not responsive to writes for some reason. You'll need to find out why... I'd look at trims.
Or... if you can't change the disk... you need to put less memory pressure on it.
Warner
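To compare how badly different builds stall on swap, the kernel messages quoted above can be counted. The helper below is a minimal sketch: the function name summarize_swap_log is hypothetical, and it only matches the two message patterns that appear in this thread.

```shell
#!/bin/sh
# Summarise swap-related trouble in kernel messages read from stdin.
# Counts "indefinite wait buffer" stalls and processes killed by the kernel.
# summarize_swap_log is an illustrative name, not an existing tool.
summarize_swap_log() {
    awk '
        /swap_pager: indefinite wait buffer/ { stalls++ }
        /was killed:/                         { kills++ }
        END { printf "stalls=%d kills=%d\n", stalls, kills }
    '
}
```

Typical use would be something like "dmesg | summarize_swap_log" after a build, so that a -j3 run and a -j2 run can be compared by their counts.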
NB: one way to put less memory pressure on it is to use -j2 or -j1 instead of -j3 in
your make command.
Regards,
Ronald.
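The suggestion to step down from -j3 could be sketched as a retry loop like the one below. This is an assumption-laden illustration, not something anyone in the thread ran: build_with_fallback and the BUILD_CMD override are hypothetical names, the target list is the standard buildworld buildkernel order mentioned earlier, and the function is only defined here, not invoked.

```shell
#!/bin/sh
# Retry the build with progressively fewer parallel jobs.
# BUILD_CMD is a hypothetical override so the loop can be exercised without
# starting a real build; by default it is plain "make".
BUILD_CMD=${BUILD_CMD:-make}

build_with_fallback() {
    for jobs in 3 2 1; do
        echo "trying -j$jobs"
        if $BUILD_CMD -j"$jobs" KERNCONF=ARM TARGET_ARCH=aarch64 buildworld buildkernel; then
            echo "build succeeded with -j$jobs"
            return 0
        fi
    done
    echo "build failed even with -j1" >&2
    return 1
}
```

It would be called from /usr/src, for example "cd /usr/src && build_with_fallback". Each lower -jN trades build time for less memory pressure on the swap device.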