elf_memory again, readonly or readwrite...

2024-08-29 Thread Mark Wielaard
Hi,

So we changed elf_memory so that it treats the in-memory Elf image as
if it were read with ELF_C_READ_MMAP. This helps callers that pass
read-only memory to elf_memory but still need to change some things
about the Elf, like uncompressing some sections (which changes the
section header).

With ELF_C_READ_MMAP libelf will make a copy of the section headers,
so any changes aren't written to the memory image (because that would
crash if the underlying memory is read-only).

But that breaks another use case where elf_memory is used on rw memory
and then the section headers are updated using libelf and it is
expected those in-memory section headers reflect the changes.

The problem of course is the elf_memory function doesn't take a "mode"
argument. So we have to guess. And make one or the other usage
unusable.

I think we should assume the memory is read/write, so that at least
header updates are written back to the memory image. But when the
image is only read, nothing is changed and nothing is written back
to it.

This does mean that when explicitly uncompressing sections you have to
make sure the image is writable (because that updates the shdrs).
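
One way a caller could satisfy the "image is writable" requirement without
touching the file on disk is a private, copy-on-write mapping, the same
flags the test change in the attached patch uses. This is only a minimal
sketch: map_private_rw is a hypothetical helper name, not part of libelf,
and error reporting is reduced to returning NULL.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not a libelf function): map PATH copy-on-write.
   The returned memory is writable, but writes never reach the file.
   Returns NULL on failure; on success *SIZE_OUT holds the mapping size.  */
void *
map_private_rw (const char *path, size_t *size_out)
{
  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return NULL;

  struct stat st;
  if (fstat (fd, &st) != 0 || st.st_size <= 0)
    {
      close (fd);
      return NULL;
    }

  /* MAP_PRIVATE allows PROT_WRITE even though fd is opened read-only.  */
  void *addr = mmap (NULL, (size_t) st.st_size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE, fd, 0);
  close (fd);			/* The mapping stays valid after close.  */
  if (addr == MAP_FAILED)
    return NULL;

  *size_out = (size_t) st.st_size;
  return addr;
}
```

The result could then be passed as elf_memory ((char *) addr, size) and
released with munmap (addr, size) after elf_end.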

Is that bad/unreasonable?

Derek, would it make your use case impossible?

John, what kind of Shdr changes are you expecting to get written back
to the memory image? Are there any issues that cannot be worked around
by always using the Shdr copy given back from (g)elf[32|64]_getshdr?

The patch I am thinking of is attached.

Cheers,

Mark

diff --git a/libelf/elf_memory.c b/libelf/elf_memory.c
index 13d77cb71b39..1df49d732dd9 100644
--- a/libelf/elf_memory.c
+++ b/libelf/elf_memory.c
@@ -46,5 +46,6 @@ elf_memory (char *image, size_t size)
       return NULL;
     }
 
-  return __libelf_read_mmaped_file (-1, image, 0, size, ELF_C_READ_MMAP, NULL);
+  return __libelf_read_mmaped_file (-1, image, 0, size,
+                                    ELF_C_READ_MMAP_PRIVATE, NULL);
 }
diff --git a/tests/elfgetzdata.c b/tests/elfgetzdata.c
index 0af6c223a06b..a50275fea1a7 100644
--- a/tests/elfgetzdata.c
+++ b/tests/elfgetzdata.c
@@ -69,7 +69,8 @@ main (int argc, char *argv[])
       else
         {
           assert (do_mem);
-          // We mmap the memory ourselves, explicitly PROT_READ only
+          // We mmap the memory ourselves, explicitly PROT_READ | PROT_WRITE
+          // elf_memory needs writable memory when using elf_compress.
           struct stat st;
           if (fstat (fd, &st) != 0)
             {
@@ -79,7 +80,8 @@ main (int argc, char *argv[])
               continue;
             }
           map_size = st.st_size;
-          map_address = mmap (NULL, map_size, PROT_READ, MAP_PRIVATE, fd, 0);
+          map_address = mmap (NULL, map_size, PROT_READ | PROT_WRITE,
+                              MAP_PRIVATE, fd, 0);
           if (map_address == MAP_FAILED)
             {
               printf ("%s cannot mmap %s\n", argv[cnt], strerror (errno));


Re: elf_memory again, readonly or readwrite...

2024-08-29 Thread John Mellor-Crummey
Mark,

The draft patch you provided lets my section header updates happen in place. 
This meets my needs.

I am working with Intel Ponte Vecchio GPU binaries, which are relocatables. 
What I am trying to do is to (1) change the sh_addr field in the section 
headers so that the sections no longer overlap (all sections originally have 
sh_addr 0) by relocating each section to its offset, (2) make the 
corresponding changes in the symbol table for addresses of text segments and 
function symbols, and (3) relocate the line map entries to correspond to the 
new positions.
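
As an illustration of step (1) only, here is a minimal sketch that rewrites 
sh_addr directly in a writable buffer using the structures from <elf.h>, 
without going through libelf. It assumes a native-endian ELF64 image with a 
regular (non-extended) section count; rebase_sections_to_offsets is a 
hypothetical name, and steps (2) and (3) are not covered.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a libelf function): for every non-null section
   header in a native-endian ELF64 image, set sh_addr to sh_offset.
   Returns the number of headers rewritten, or -1 if the image is not
   something this sketch understands.  */
int
rebase_sections_to_offsets (unsigned char *image, size_t size)
{
  Elf64_Ehdr ehdr;
  if (image == NULL || size < sizeof ehdr)
    return -1;
  memcpy (&ehdr, image, sizeof ehdr);	/* Avoid unaligned access.  */

  if (memcmp (ehdr.e_ident, ELFMAG, SELFMAG) != 0
      || ehdr.e_ident[EI_CLASS] != ELFCLASS64
      || ehdr.e_shentsize != sizeof (Elf64_Shdr))
    return -1;

  /* The whole section header table must lie inside the buffer.  */
  if (ehdr.e_shoff > size
      || (size - ehdr.e_shoff) / sizeof (Elf64_Shdr) < ehdr.e_shnum)
    return -1;

  int changed = 0;
  for (size_t i = 1; i < ehdr.e_shnum; i++)	/* Index 0 is the null entry.  */
    {
      unsigned char *slot = image + ehdr.e_shoff + i * sizeof (Elf64_Shdr);
      Elf64_Shdr shdr;
      memcpy (&shdr, slot, sizeof shdr);
      if (shdr.sh_type == SHT_NULL)
	continue;
      shdr.sh_addr = shdr.sh_offset;	/* Distinct offsets give distinct addresses.  */
      memcpy (slot, &shdr, sizeof shdr);	/* Write back into the image.  */
      changed++;
    }
  return changed;
}
```

Because the edit lands in the buffer itself, a consumer that later reopens 
the same memory from scratch would see the new sh_addr values.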

The use case I have is a bit strange in that 
- I read an ELF binary into a buffer in memory that I allocate with malloc. 
(I don’t want to change the original copy of the binary, I just want to change 
my copy in memory. I probably could use MAP_PRIVATE for this, but that 
required more thought.)
- I open the in-memory copy of the binary with elf_memory and make the changes 
that I describe above.
- I then pass the in-memory copy to Dyninst to help me process the binary 
(extract information about functions, inlined code, line mappings, etc.)

Dyninst isn’t accepting an Elf * from me. It accepts the memory as a char * 
pointer like elf_memory does. Then it uses elfutils to open it itself. That is 
the reason that I need all of my adjustments to the binary to be committed back 
to the memory segment: Dyninst isn’t inheriting the Elf * from me; it is 
starting from scratch.

I am willing to call elf_update to make updates if that makes it easier to 
address both Derek’s use case and mine. However, some changes to elf_update may 
be necessary to support update of something mapped with ELF_C_READ_MMAP_PRIVATE.
--
John Mellor-Crummey, Professor
Dept of Computer Science, Rice University
email: joh...@rice.edu  phone: 713-348-5179




Re: elf_memory again, readonly or readwrite...

2024-08-29 Thread Derek Bruening
Instead of elf_memory() having to guess, which seems likely to confuse
future users (especially if its behavior is not documented), can we add a
new extended API routine elf_memory_mode() that takes the mode as an
argument, and clearly document that the old legacy routine assumes
read-only (or read-write if you change it)?
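
To make the idea concrete, here is a toy model of an entry point that takes
the mode explicitly. It is not libelf code and none of these names exist in
libelf; image_view, image_view_open and the enum are made up for
illustration. Read-only mode edits a private copy of the header region,
read-write mode edits the image in place.

```c
#include <stdlib.h>
#include <string.h>

/* Toy model only: none of these names exist in libelf.  */
enum image_mode { IMAGE_READONLY, IMAGE_READWRITE };

struct image_view
{
  unsigned char *headers;	/* Where header edits land.  */
  int owns_headers;		/* Nonzero if headers is a private copy.  */
};

/* HDR_OFF and HDR_LEN describe the header region inside IMAGE.
   Returns 0 on success, -1 on bad arguments or allocation failure.  */
int
image_view_open (struct image_view *v, unsigned char *image, size_t size,
		 size_t hdr_off, size_t hdr_len, enum image_mode mode)
{
  if (image == NULL || hdr_off > size || hdr_len > size - hdr_off)
    return -1;

  if (mode == IMAGE_READWRITE)
    {
      /* Caller promised writable memory: edit the image directly.  */
      v->headers = image + hdr_off;
      v->owns_headers = 0;
      return 0;
    }

  /* Read-only: never write to the image, work on a copy instead.  */
  v->headers = malloc (hdr_len ? hdr_len : 1);
  if (v->headers == NULL)
    return -1;
  memcpy (v->headers, image + hdr_off, hdr_len);
  v->owns_headers = 1;
  return 0;
}

void
image_view_close (struct image_view *v)
{
  if (v->owns_headers)
    free (v->headers);
  v->headers = NULL;
  v->owns_headers = 0;
}
```

With the mode passed in, neither use case depends on what the library
guesses: a read-only caller gets the copy, a read-write caller gets
write-through.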
