Hi, Pavel,

On 2025-07-10 at 10:24 +02, Pavel Machek <[email protected]> wrote:

> Hi!
>
> It seems that DMA-BUFs are always uncached on arm64... which is a
> problem.
>
> I'm trying to get useful camera support on Librem 5, and that includes
> recording videos (and taking photos).

Earlier this year I tried to solve a similar issue on rkisp1 (Rockchip
3399) and did some measurements showing that non-coherent buffers plus
cache flushing is a viable approach [1]. Unfortunately, that effort
stalled, but maybe the patch "[PATCH v4 1/2] media: videobuf2: Fix dmabuf
cache sync/flush in dma-contig" will be useful to you.

[1] https://lore.kernel.org/all/[email protected]/

> memcpy() from normal memory is about 2msec/1MB. Unfortunately, for
> DMA-BUFs it is 20msec/1MB, and that basically means I can't easily do
> 760p video recording. Plus, copying full-resolution photo buffer takes
> more than 200msec!
>
> There's possibility to do some processing on GPU, and it's implemented here:
>
> https://gitlab.com/tui/tui/-/tree/master/icam?ref_type=heads
>
> but that hits the same problem in the end -- data is in DMA-BUF,
> uncached, and takes way too long to copy out.
>
> And that's ... wrong. DMA ended seconds ago, complete cache flush
> would be way cheaper than copying single frame out, and I still have
> to deal with uncached frames.
>
> So I have two questions:
>
> 1) Is my analysis correct that, no matter how I get frame from v4l and
> process it on GPU, I'll have to copy it from uncached memory in the
> end?
>
> 2) Does anyone have patches / ideas / roadmap how to solve that? It
> makes GPU unusable for computing, and camera basically unusable for
> video.
>
> Best regards,
> Pavel

--
Best regards,
Mikhail Rudenko
