Hello again!

I've just spent half an hour debugging my code thanks to this bug. :)

Let's look at the docs:


7.6 ‘.asciz "STRING"’...
========================

‘.asciz’ is just like ‘.ascii’, but each string is followed by a zero
byte.  The “z” in ‘.asciz’ stands for “zero”.  Note that multiple string
arguments not separated by commas will be concatenated together and only
one final zero byte will be stored.


So, according to the docs, “.asciz” inserts *no NUL byte* between the strings, only one after the last string. Let's try that with my GAS installation:


$ cat >test.s
.asciz "hello", "world"
$ as test.s
$ xxd a.out
00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
00000010: 0100 3e00 0100 0000 0000 0000 0000 0000  ..>.............
00000020: 0000 0000 0000 0000 b000 0000 0000 0000  ................
00000030: 0000 0000 4000 0000 0000 4000 0600 0500  ....@.....@.....
00000040: 6865 6c6c 6f00 776f 726c 6400 0000 0000  hello.world.....
00000050: 0400 0000 2000 0000 0500 0000 474e 5500  .... .......GNU.
---------------------------------->8 ----[SNIP]--------------------
$ as --version
GNU assembler (Gentoo 2.41 p5) 2.41.0
Copyright (C) 2023 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `x86_64-pc-linux-gnu'.


As you can see at offset 0x45, a NUL byte is placed right after the first string (where "hello" ends), and another one follows "world".
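
To make the difference concrete, here is a minimal C sketch of the two behaviours. It is only a model, not GAS source code: the function name `emit_asciz` and the flag `zero_after_each` are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model, not GAS source: copy `count` strings into `out`
   and return the number of bytes written.  If `zero_after_each` is
   nonzero, a NUL follows every string (what as 2.41 produces above);
   otherwise a single NUL follows the last string (how I read the docs).
   The caller must make sure `out` is large enough.  */
static size_t
emit_asciz (unsigned char *out, const char *const *strings, size_t count,
            int zero_after_each)
{
  size_t n = 0;

  for (size_t i = 0; i < count; i++)
    {
      size_t len = strlen (strings[i]);

      memcpy (out + n, strings[i], len);
      n += len;
      if (zero_after_each)
        out[n++] = 0;   /* NUL after this string */
    }

  if (!zero_after_each)
    out[n++] = 0;       /* one final NUL only */

  return n;
}
```

For the operands "hello" and "world", the first mode gives 12 bytes (68 65 6c 6c 6f 00 77 6f 72 6c 64 00), which matches the dump above. The second mode gives 11 bytes, with no NUL between the two words.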

The attached patch should fix that, but I do not know the GAS source code well, so please review it carefully.

I also provide an alternative patch that adjusts the documentation to match the implementation.

I would appreciate it if you included a reference to me in the commit log. Applying either of the patches with “git am” already takes care of that.

Thanks,
Jiří Wolker
From a4f3f9a244b40e1cb8962b30fba35a8900a37422 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ji=C5=99=C3=AD=20Wolker?= <proje...@jwo.cz>
Date: Thu, 30 May 2024 16:51:07 +0200
Subject: [PATCH] gas: Fix .asciz directive for multiple operands

The documentation stated that this directive inserts a zero byte
only after all strings passed to it as operands.

Before this commit, a zero byte was inserted after each of the strings
passed to this directive.

Also mention this bug in the documentation so users can account for it
when using an older version of as.
---
 gas/doc/as.texi | 3 +++
 gas/read.c      | 6 +++---
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/gas/doc/as.texi b/gas/doc/as.texi
index 90898d7479a..d16e706e415 100644
--- a/gas/doc/as.texi
+++ b/gas/doc/as.texi
@@ -4807,6 +4807,9 @@ a zero byte.  The ``z'' in @samp{.asciz} stands for ``zero''.  Note that
 multiple string arguments not separated by commas will be concatenated
 together and only one final zero byte will be stored.
 
+Note that GNU Assembler version 2.42 and earlier had a bug that inserted a
+zero byte after each string passed to this directive.
+
 @node Attach_to_group
 @section @code{.attach_to_group @var{name}}
 Attaches the current section to the named group.  This is like declaring
diff --git a/gas/read.c b/gas/read.c
index 8026a6cdb65..e898dc9bcbc 100644
--- a/gas/read.c
+++ b/gas/read.c
@@ -5463,9 +5463,6 @@ stringer (int bits_appendzero)
 	  if (*input_line_pointer == '"')
 	    break;
 
-	  if (append_zero)
-	    stringer_append_char (0, bitsize);
-
 #if !defined(NO_LISTING) && defined (OBJ_ELF)
 	  /* In ELF, when gcc is emitting DWARF 1 debugging output, it
 	     will emit .string with a filename in the .debug section
@@ -5505,6 +5502,9 @@ stringer (int bits_appendzero)
       c = *input_line_pointer;
     }
 
+  if (append_zero)
+    stringer_append_char (0, bitsize);
+
   demand_empty_rest_of_line ();
 }
 
-- 
2.44.1

From 39a450c0f03c4803b29370dc427b7f416efc9e1f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ji=C5=99=C3=AD=20Wolker?= <proje...@jwo.cz>
Date: Thu, 30 May 2024 16:54:51 +0200
Subject: [PATCH] gas/doc: Fix doc of .asciz behavior with >1 arg

The documentation stated that this directive inserts a zero byte
only after all strings passed to it as operands.

In the implementation, a zero byte is inserted after each of the strings
passed to this directive.
---
 gas/doc/as.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gas/doc/as.texi b/gas/doc/as.texi
index 90898d7479a..30524e90782 100644
--- a/gas/doc/as.texi
+++ b/gas/doc/as.texi
@@ -4805,7 +4805,7 @@ trailing zero byte) into consecutive addresses.
 @code{.asciz} is just like @code{.ascii}, but each string is followed by
 a zero byte.  The ``z'' in @samp{.asciz} stands for ``zero''.  Note that
 multiple string arguments not separated by commas will be concatenated
-together and only one final zero byte will be stored.
+together and a zero byte is stored after each string.
 
 @node Attach_to_group
 @section @code{.attach_to_group @var{name}}
-- 
2.44.1
