> -----Original Message-----
> From: Christian König <[email protected]>
> Sent: Friday, May 16, 2025 4:36 PM
> To: wangtao <[email protected]>; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]
> Cc: [email protected]; [email protected];
> [email protected]; [email protected];
> wangbintian(BintianWang) <[email protected]>; yipengxiang
> <[email protected]>; liulu 00013167 <[email protected]>;
> hanfeng 00012985 <[email protected]>
> Subject: Re: [PATCH 2/2] dmabuf/heaps: implement
> DMA_BUF_IOCTL_RW_FILE for system_heap
>
> On 5/16/25 09:40, wangtao wrote:
> >
> >> -----Original Message-----
> >> From: Christian König <[email protected]>
> >> Sent: Thursday, May 15, 2025 10:26 PM
> >> To: wangtao <[email protected]>; [email protected];
> >> [email protected]; [email protected];
> >> [email protected]; [email protected]
> >> Cc: [email protected]; [email protected];
> >> [email protected]; [email protected];
> >> wangbintian(BintianWang) <[email protected]>; yipengxiang
> >> <[email protected]>; liulu 00013167 <[email protected]>;
> >> hanfeng 00012985 <[email protected]>
> >> Subject: Re: [PATCH 2/2] dmabuf/heaps: implement
> >> DMA_BUF_IOCTL_RW_FILE for system_heap
> >>
> >> On 5/15/25 16:03, wangtao wrote:
> >>> [wangtao] My test configuration (CPU 1GHz, 5-test average):
> >>> Allocation: 32x32MB buffer creation
> >>> - dmabuf 53ms vs. udmabuf 694ms (10X slower)
> >>> - Note: shmem shows excessive allocation time
> >>
> >> Yeah, that is something already noted by others as well. But that is
> >> orthogonal.
> >>
> >>> Read 1024MB file:
> >>> - dmabuf direct 326ms vs. udmabuf direct 461ms (40% slower)
> >>> - Note: pin_user_pages_fast consumes the majority of CPU cycles
> >>>
> >>> Key function call timing: see details below.
> >>
> >> Those aren't valid, you are comparing different functionalities here.
> >>
> >> Please try using udmabuf with sendfile() as confirmed to be working
> >> by T.J.
> >
> > [wangtao] Buffered IO with dmabuf file read/write requires one memory
> > copy; Direct IO removes that copy and enables zero-copy. The sendfile
> > system call reduces memory copies from two (read/write) to one, but
> > with udmabuf, sendfile still keeps at least one copy, so zero-copy is
> > not achieved.
>
> Then please work on fixing this.

[wangtao] What needs fixing? Does sendfile achieve zero-copy?
sendfile reduces memory copies (from 2 to 1) for network sockets, but it
still requires one copy and therefore cannot achieve zero-copy.
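For clarity, "direct read" in my numbers means an O_DIRECT read() into a
mapping of the memfd that backs the udmabuf. A minimal sketch of that
path (assuming the standard /dev/udmabuf UAPI; "data.bin" is a
placeholder path and error handling is omitted):

#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/udmabuf.h>

#define SZ (32UL << 20)                         /* one 32MB buffer */

int main(void)
{
        int dev_fd = open("/dev/udmabuf", O_RDWR);
        int memfd = memfd_create("buf", MFD_ALLOW_SEALING);

        ftruncate(memfd, SZ);
        fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK); /* udmabuf requires this seal */

        struct udmabuf_create create = {
                .memfd = memfd, .offset = 0, .size = SZ,
        };
        int buf_fd = ioctl(dev_fd, UDMABUF_CREATE, &create);

        /* A page-aligned mapping satisfies the O_DIRECT alignment rules. */
        void *p = mmap(NULL, SZ, PROT_READ | PROT_WRITE, MAP_SHARED, memfd, 0);
        int file_fd = open("data.bin", O_RDONLY | O_DIRECT);

        read(file_fd, p, SZ);                   /* short reads ignored for brevity */

        close(file_fd);
        munmap(p, SZ);
        close(buf_fd);
        close(memfd);
        close(dev_fd);
        return 0;
}

The O_DIRECT read skips the page-cache copy, but the kernel still has to
pin the memfd pages for the IO, which is where pin_user_pages_fast shows
up in the profile above.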
>
> Regards,
> Christian.
>
> >
> > If udmabuf sendfile uses buffered IO (file page cache), read latency
> > matches dmabuf buffered read, but allocation takes much longer.
> > With Direct IO, sendfile is slower than buffered IO because of the
> > default 16-page pipe size.
> >
> > Test data shows:
> > udmabuf direct read is much faster than udmabuf sendfile.
> > dmabuf direct read outperforms udmabuf direct read by a large margin.
> >
> > Issue: After udmabuf is mapped via map_dma_buf, apps using memfd or
> > udmabuf for Direct IO might cause errors, but there are no safeguards
> > to prevent this.
> >
> > Allocate 32x32MB buffers and read a 1024MB file:
> > Metric                  | alloc (ms) | read (ms) | total (ms)
> > ------------------------|------------|-----------|-----------
> > udmabuf buffer read     |        539 |      2017 |       2555
> > udmabuf direct read     |        522 |       658 |       1179
> > udmabuf buffer sendfile |        505 |      1040 |       1546
> > udmabuf direct sendfile |        510 |      2269 |       2780
> > dmabuf buffer read      |         51 |      1068 |       1118
> > dmabuf direct read      |         52 |       297 |        349
> >
> > udmabuf sendfile test steps (sketched in C at the end of this mail):
> > 1. Open data file (1024MB), get back_fd
> > 2. Create memfd (32MB)          # loop steps 2-6
> > 3. Allocate udmabuf with memfd
> > 4. Call sendfile(memfd, back_fd)
> > 5. Close memfd after sendfile
> > 6. Close udmabuf
> > 7. Close back_fd
> >
> >>
> >> Regards,
> >> Christian.
> >
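To make the sendfile comparison reproducible, here are the quoted test
steps as a minimal C sketch (again assuming the standard /dev/udmabuf
UAPI and a placeholder "data.bin" path; error handling and timing
omitted):

#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/sendfile.h>
#include <linux/udmabuf.h>

#define CHUNK (32UL << 20)                      /* 32MB per memfd */
#define TOTAL (1024UL << 20)                    /* 1024MB data file */

int main(void)
{
        int dev_fd = open("/dev/udmabuf", O_RDWR);
        int back_fd = open("data.bin", O_RDONLY);             /* step 1 */
        off_t off = 0;

        while (off < (off_t)TOTAL) {                          /* loop steps 2-6 */
                int memfd = memfd_create("chunk", MFD_ALLOW_SEALING); /* step 2 */
                ftruncate(memfd, CHUNK);
                fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);

                struct udmabuf_create create = {              /* step 3 */
                        .memfd = memfd, .offset = 0, .size = CHUNK,
                };
                int buf_fd = ioctl(dev_fd, UDMABUF_CREATE, &create);

                sendfile(memfd, back_fd, &off, CHUNK);        /* step 4 */

                close(memfd);                                 /* step 5 */
                close(buf_fd);                                /* step 6 */
        }
        close(back_fd);                                       /* step 7 */
        close(dev_fd);
        return 0;
}

Note that sendfile() here fills the memfd through the page cache, which
is exactly the copy that keeps this path from being zero-copy.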
