** Description changed:

- On some arm64 systems[*] we are seeing a spew of messages on the
- console:
+ [Impact]
+ We enabled CONFIG_DMA_CMA to fix bug 1803206, but that led to a regression
+ where other systems began spewing on the order of 10K of these messages on boot:

  [ 19.534097] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534109] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534113] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534126] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534130] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534142] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534146] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534157] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534161] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534173] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534177] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12

- This appears to be non-fatal - impacted systems all eventually boot.
- But, at least in the case of the HP m400, it slows down boot enough that
- MAAS' default timeout will expire before completing deployment.
+ In a previous SRU (bug 1828092), we worked around this by just rate-
+ limiting these messages. These are "err" priority messages though, so
+ even a lower number of them is still disconcerting.

- [*] Observed on a HiSilicon D06 w/ SMMU disabled in the BIOS, as well as
- an HP m400 (APM X-Gene) cartridge - although, not on another one that -
- in theory - should be identical.
+ [Fix]
+ 1) Bump up the amount of available CMA on arm64 to 32M (same as upstream defconfig)
+ 2) A patch-set from linux-next that redirects dma-direct contiguous allocations to alloc_pages() for single page allocations (single pages are by definition contiguous), avoiding CMA usage/fragmentation.
+
+ [Test Case]
+ dmesg | grep "cma_alloc: alloc failed"
+ Some system configs will still have some of these errors even after this fix - but this should reduce them significantly. Per-driver optimizations can be used to make further improvements.
+
+ [Regression Risk]
+ Tested on a HiSilicon D06 and HP m400 (Hi1620 & X-Gene arm64).
+ Regression tested on:
+ - Raspberry Pi 3B (see Comment #22)
+ - Power9 system (ppc64el)
+ - z/VM instance (s390x)
+ - Intel Centerton system (amd64)
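Since the [Test Case] notes that some errors may remain after the fix, a count per request size can make before/after comparisons easier than eyeballing grep output. The following is only a rough sketch: the names summarize_cma_failures, CMA_FAIL_RE and PAGE_SIZE_KIB are made up for illustration, and the 4 KiB page size is an assumption that should be adjusted for kernels built with larger pages.

```python
import re
from collections import Counter

# Matches the kernel message quoted in the description, e.g.
# "cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12"
CMA_FAIL_RE = re.compile(
    r"cma_alloc: alloc failed, req-size: (\d+) pages, ret: (-?\d+)"
)

# Assumption: 4 KiB pages. Change this for kernels using a larger page size.
PAGE_SIZE_KIB = 4


def summarize_cma_failures(dmesg_text):
    """Return a Counter mapping requested page count -> number of failures."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = CMA_FAIL_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def format_summary(counts):
    """Render one line per request size, largest request first."""
    lines = []
    for pages in sorted(counts, reverse=True):
        lines.append(
            f"{counts[pages]} failures of {pages} pages "
            f"({pages * PAGE_SIZE_KIB} KiB each)"
        )
    lines.append(f"total: {sum(counts.values())}")
    return "\n".join(lines)


# Small demonstration using two of the messages from the description.
sample = (
    "[ 19.534097] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12\n"
    "[ 19.534109] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12\n"
)
print(format_summary(summarize_cma_failures(sample)))
```

To run this against a live system, the output of dmesg (captured for example with subprocess.run(["dmesg"], capture_output=True, text=True).stdout, which may need root) can be passed to summarize_cma_failures in place of the sample string.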
** Description changed:

  [Impact]
  We enabled CONFIG_DMA_CMA to fix bug 1803206, but that led to a regression
- where other systems began spewing on the order of 10K of these messages on boot:
+ on other arm64 systems that began spewing on the order of 10K of these messages on boot:

  [ 19.534097] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534109] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534113] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534126] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534130] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534142] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534146] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534157] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534161] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534173] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534177] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12

  In a previous SRU (bug 1828092), we worked around this by just rate-
  limiting these messages. These are "err" priority messages though, so
  even a lower number of them is still disconcerting.

  [Fix]
  1) Bump up the amount of available CMA on arm64 to 32M (same as upstream defconfig)
  2) A patch-set from linux-next that redirects dma-direct contiguous allocations to alloc_pages() for single page allocations (single pages are by definition contiguous), avoiding CMA usage/fragmentation.

  [Test Case]
  dmesg | grep "cma_alloc: alloc failed"
  Some system configs will still have some of these errors even after this fix - but this should reduce them significantly. Per-driver optimizations can be used to make further improvements.

  [Regression Risk]
  Tested on a HiSilicon D06 and HP m400 (Hi1620 & X-Gene arm64).
  Regression tested on:
- - Raspberry Pi 3B (see Comment #22)
- - Power9 system (ppc64el)
- - z/VM instance (s390x)
- - Intel Centerton system (amd64)
+ - Raspberry Pi 3B (see Comment #22)
+ - Power9 system (ppc64el)
+ - z/VM instance (s390x)
+ - Intel Centerton system (amd64)

** Description changed:

  [Impact]
  We enabled CONFIG_DMA_CMA to fix bug 1803206, but that led to a regression
- on other arm64 systems that began spewing on the order of 10K of these messages on boot:
+ on other arm64 systems that began spewing these messages on boot - sometimes > 10K of them:

  [ 19.534097] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534109] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534113] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534126] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534130] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534142] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534146] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534157] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534161] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534173] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534177] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12

  In a previous SRU (bug 1828092), we worked around this by just rate-
  limiting these messages. These are "err" priority messages though, so
  even a lower number of them is still disconcerting.

  [Fix]
  1) Bump up the amount of available CMA on arm64 to 32M (same as upstream defconfig)
  2) A patch-set from linux-next that redirects dma-direct contiguous allocations to alloc_pages() for single page allocations (single pages are by definition contiguous), avoiding CMA usage/fragmentation.

  [Test Case]
  dmesg | grep "cma_alloc: alloc failed"
  Some system configs will still have some of these errors even after this fix - but this should reduce them significantly. Per-driver optimizations can be used to make further improvements.

  [Regression Risk]
  Tested on a HiSilicon D06 and HP m400 (Hi1620 & X-Gene arm64).
  Regression tested on:
  - Raspberry Pi 3B (see Comment #22)
  - Power9 system (ppc64el)
  - z/VM instance (s390x)
  - Intel Centerton system (amd64)

** Description changed:

  [Impact]
  We enabled CONFIG_DMA_CMA to fix bug 1803206, but that led to a regression
  on other arm64 systems that began spewing these messages on boot - sometimes > 10K of them:

  [ 19.534097] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534109] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534113] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534126] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534130] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534142] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534146] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534157] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534161] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12
  [ 19.534173] cma: cma_alloc: alloc failed, req-size: 16 pages, ret: -12
  [ 19.534177] cma: cma_alloc: alloc failed, req-size: 64 pages, ret: -12

  In a previous SRU (bug 1828092), we worked around this by just rate-
  limiting these messages. These are "err" priority messages though, so
  even a lower number of them is still disconcerting.

  [Fix]
  1) Bump up the amount of available CMA on arm64 to 32M (same as upstream defconfig)
  2) A patch-set from linux-next that redirects dma-direct contiguous allocations to alloc_pages() for single page allocations (single pages are by definition contiguous), avoiding CMA usage/fragmentation.

  [Test Case]
  dmesg | grep "cma_alloc: alloc failed"
- Some system configs will still have some of these errors even after this fix - but this should reduce them significantly. Per-driver optimizations can be used to make further improvements.
+ Some system configs will still have some of these errors even after this fix - but this should reduce them significantly. Per-driver optimizations can be used to make further improvements, but we should track those in other bugs.

  [Regression Risk]
  Tested on a HiSilicon D06 and HP m400 (Hi1620 & X-Gene arm64).
  Regression tested on:
  - Raspberry Pi 3B (see Comment #22)
  - Power9 system (ppc64el)
  - z/VM instance (s390x)
  - Intel Centerton system (amd64)

--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1823753

Title:
  arm64: cma_alloc errors at boot

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1823753/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs