Hi,

Moving this thread to [email protected], since this code comes from
there.

Герман Семёнов <[email protected]> writes:

> Hello everyone,
> I'm new to sending patches by email; I'm much more used to modern git
> hosting. The patch to gnulib is simple. Using the pahole tool
> from Red Hat (https://linux.die.net/man/1/pahole), I found that the 'rofile'
> structure takes 72 bytes in memory, which does not fit into a 64-byte CPU
> cache line and so costs extra processor cycles. If you have benchmarks that
> exercise 'rofile', they may show a decent improvement. How do I run the tests?
>
> From 3a5c97e9eda30c451d4bf9afb2c15e2acf282de6 Mon Sep 17 00:00:00 2001
> From: Herman Semenoff <[email protected]>
> Date: Wed, 24 Sep 2025 23:32:27 +0300
> Subject: [PATCH] lib: align struct rofile to 64 bytes (1 cpu cacheline)
>
> References:
>     https://wr.informatik.uni-hamburg.de/_media/teaching/wintersemester_2013_2014/epc-14-haase-svenhendrik-alignmentinc-presentation.pdf
>     https://hpc.rz.rptu.de/Tutorials/AVX/alignment.shtml
>     https://en.wikipedia.org/wiki/Data_structure_alignment
>     https://stackoverflow.com/a/20882083
>     https://zijishi.xyz/post/optimization-technique/learning-to-use-data-alignment/
> ---
>  lib/stackvma.c | 2 +-
>  lib/vma-iter.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/lib/stackvma.c b/lib/stackvma.c
> index 95bb80db7c..72cc1b89b8 100644
> --- a/lib/stackvma.c
> +++ b/lib/stackvma.c
> @@ -142,13 +142,13 @@ struct rofile
>      size_t position;
>      size_t filled;
>      int eof_seen;
> +    char stack_allocated_buffer[STACK_ALLOCATED_BUFFER_SIZE];
>      /* These fields deal with allocation of the buffer.  */
>      char *buffer;
>      char *auxmap;
>      size_t auxmap_length;
>      uintptr_t auxmap_start;
>      uintptr_t auxmap_end;
> -    char stack_allocated_buffer[STACK_ALLOCATED_BUFFER_SIZE];
>    };
>  
>  /* Open a read-only file stream.  */
> diff --git a/lib/vma-iter.c b/lib/vma-iter.c
> index 009835f60c..f6732ffb5a 100644
> --- a/lib/vma-iter.c
> +++ b/lib/vma-iter.c
> @@ -164,13 +164,13 @@ struct rofile
>      size_t position;
>      size_t filled;
>      int eof_seen;
> +    char stack_allocated_buffer[STACK_ALLOCATED_BUFFER_SIZE];
>      /* These fields deal with allocation of the buffer.  */
>      char *buffer;
>      char *auxmap;
>      size_t auxmap_length;
>      unsigned long auxmap_start;
>      unsigned long auxmap_end;
> -    char stack_allocated_buffer[STACK_ALLOCATED_BUFFER_SIZE];
>    };
>  
>  /* Open a read-only file stream.  */

Thanks for the patch, but I agree with what Bruno said on the GitHub
thread [1].

The vma_iter functions can do a lot of parsing if a program has many
shared libraries or if the program maps many files into memory. For
example, my current Emacs process has 2700 lines in /proc/self/maps.
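If you want to see how much there is to parse for a given process, counting the lines of its maps file is enough. Here is a minimal sketch; the helper name count_lines is mine and not part of gnulib, and it only does a plain line count, not the field parsing that vma-iter.c performs:

```c
#include <stdio.h>

/* Count the newline characters in the file at PATH.
   Return -1 if the file cannot be opened.
   Hypothetical helper for illustration; not a gnulib function.  */
long
count_lines (const char *path)
{
  FILE *fp = fopen (path, "r");
  if (fp == NULL)
    return -1;

  long lines = 0;
  int c;
  while ((c = getc (fp)) != EOF)
    if (c == '\n')
      lines++;

  fclose (fp);
  return lines;
}
```

On Linux, count_lines ("/proc/self/maps") gives the number of mappings of the calling process. On systems without /proc it returns -1.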

But I don't expect a program to iterate over its virtual memory areas
often enough for this to become a performance issue.
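For anyone who wants to check the layout without pahole, offsetof and sizeof give the same numbers. The sketch below copies the field order from the stackvma.c hunk before and after the patch. The struct names rofile_before and rofile_after, the function print_layouts, and the buffer size of 1024 are my assumptions for illustration; use the real STACK_ALLOCATED_BUFFER_SIZE from your build when comparing against pahole output.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed value for illustration only.  */
#define STACK_ALLOCATED_BUFFER_SIZE 1024

/* Field order before the patch: the large array comes last.  */
struct rofile_before
  {
    size_t position;
    size_t filled;
    int eof_seen;
    char *buffer;
    char *auxmap;
    size_t auxmap_length;
    uintptr_t auxmap_start;
    uintptr_t auxmap_end;
    char stack_allocated_buffer[STACK_ALLOCATED_BUFFER_SIZE];
  };

/* Field order after the patch: the array moves up behind eof_seen.  */
struct rofile_after
  {
    size_t position;
    size_t filled;
    int eof_seen;
    char stack_allocated_buffer[STACK_ALLOCATED_BUFFER_SIZE];
    char *buffer;
    char *auxmap;
    size_t auxmap_length;
    uintptr_t auxmap_start;
    uintptr_t auxmap_end;
  };

/* Print the total size and two field offsets for each layout.  */
void
print_layouts (void)
{
  printf ("before: sizeof=%zu buffer@%zu stack_allocated_buffer@%zu\n",
          sizeof (struct rofile_before),
          offsetof (struct rofile_before, buffer),
          offsetof (struct rofile_before, stack_allocated_buffer));
  printf ("after:  sizeof=%zu buffer@%zu stack_allocated_buffer@%zu\n",
          sizeof (struct rofile_after),
          offsetof (struct rofile_after, buffer),
          offsetof (struct rofile_after, stack_allocated_buffer));
}
```

Calling print_layouts () shows where each layout places the pointer fields relative to the array, which is the part the reordering changes.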

Collin

[1] https://github.com/coreutils/gnulib/pull/21
