Section 3.5.4, "Source Code Reference", of the ld manual is so inaccurate, and so inconsistent with the vocabulary used in the rest of the manual, that it should be replaced. A detailed critique is below; I suggest the following replacement:

3.5.4 Accessing Symbols defined in Linker Scripts in Source Code
----------------------------------------------------------------

The value of a symbol is its address.  Thus, to access a symbol's
value, declare it an external variable and use its address.

Note that in most cases, symbols defined by linker scripts do *not*
have any associated storage assigned to them, so it is typically an
error to read from or write to such an external variable!

For example, the Unix System V documentation traditionally uses the
following C declarations for the end of the text segment, the end of
the data segment, and the end of the BSS segment, which System V marks
with the symbols ``etext'', ``edata'', and ``end'':

     extern etext;
     extern edata;
     extern end;

Note that these declarations implicitly use a type of ``int''.  One can
choose the type most appropriate to the application, because type
checking is not done during link editing.  E.g., declaring such symbols
as incomplete arrays of const char enables the C compiler to diagnose
writes, reads (without array dereference) and use of the sizeof
operator as errors:

     extern char const end [];

Finally, note that some systems perform a transformation between
variable names as used in a high-level language and symbol names as
seen by the linker.  The transformation is part of the ABI.  E.g., a.out
and COFF(?)-based systems prepend an underscore to variable names to
arrive at the symbol name---this is done to create separate name spaces
for high-level language modules and assembly language modules.  Symbol
names must take this transformation into account: e.g., the above
symbols would be named ``_etext'', ``_edata'', and ``_end'' on such
systems.  In C++, the ``extern "C"'' modifier can be used to suppress
the additional "mangling" of variable names done by that language.

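To illustrate the declaration style proposed above (this is not part of the suggested manual text), here is a minimal sketch. It assumes an ELF system whose linker provides ``etext'', ``edata'', and ``end'' without a leading underscore, as GNU ld's default script does; the helper names bss_span and print_segment_ends are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed to be provided by the linker (GNU ld's default ELF script
   defines all three).  Declared as incomplete arrays of const char, so
   each name yields the symbol's address and writes are rejected.  */
extern char const etext[];   /* first address past the text segment */
extern char const edata[];   /* first address past initialized data */
extern char const end[];     /* first address past the BSS          */

/* Hypothetical helper: bytes between edata and end, i.e. roughly the
   size of the BSS.  The symbols are not parts of one C object, so the
   addresses are subtracted as integers rather than as pointers.  */
size_t bss_span(void)
{
    uintptr_t lo = (uintptr_t)edata;
    uintptr_t hi = (uintptr_t)end;
    assert(lo <= hi);   /* expected for the usual segment layout */
    return (size_t)(hi - lo);
}

/* Hypothetical helper: print the three addresses.  Only the addresses
   are used; nothing is read from or written to the symbols.  */
void print_segment_ends(void)
{
    printf("etext = %p\n", (void const *)etext);
    printf("edata = %p\n", (void const *)edata);
    printf("end   = %p\n", (void const *)end);
}
```

With these declarations, a statement such as "end[0] = 0;" or an expression such as "sizeof end" should be rejected at compile time, which is the point of choosing an incomplete const array type.
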
CRITIQUE OF CURRENT TEXT

File: ld.info,  Node: Source Code Reference,  Prev: PROVIDE_HIDDEN,  Up: Assignments

3.5.4 Source Code Reference
---------------------------

Accessing a linker script defined variable from source code is not
>> "symbol", not "variable"
intuitive.  In particular a linker script symbol is not equivalent to
a variable declaration in a high level language, it is instead a symbol
that does not have a value.
>> ??? It has a value, it just might not have storage
>> associated with it.  This node's parent is titled "Assigning values to
>> Symbols"!

Before going further, it is important to note that compilers often
transform names in the source code into different names when they are
stored in the symbol table.  For example, Fortran compilers commonly
>> It should be mentioned that this mangling is defined by the ABI.
prepend or append an underscore, and C++ performs extensive `name
mangling'.  Therefore there might be a discrepancy between the name of
a variable as it is used in source code and the name of the same
variable as it is defined in a linker script.  For example in C a
linker script variable might be referred to as:

     extern int foo;

But in the linker script it might be defined as:

     _foo = 1000;

In the remaining examples however it is assumed that no name
transformation has taken place.

When a symbol is declared in a high level language such as C, two
things happen.  The first is that the compiler reserves enough space in
the program's memory to hold the _value_ of the symbol.  The second is
>> "data of the variable", not "_value_ of the symbol"
that the compiler creates an entry in the program's symbol table which
>> technically, for gcc, the assembler (not the compiler)
>> the object file's (not the program's) symbol table
holds the symbol's _address_.  ie the symbol table contains the address
of the block of memory holding the symbol's value.  So for example the
following C declaration, at file scope:

     int foo = 1000;

creates a entry called `foo' in the symbol table.
This entry holds the address of an `int' sized block of memory where
the number 1000 is initially stored.

When a program references a symbol the compiler generates code that
first accesses the symbol table to find the address of the symbol's
>> Utter nonsense!
memory block and then code to read the value from that memory block.
So:

     foo = 1;

looks up the symbol `foo' in the symbol table, gets the address
associated with this symbol and then writes the value 1 into that
address.  Whereas:

     int * a = & foo;

looks up the symbol `foo' in the symbol table, gets it address and then
copies this address into the block of memory associated with the
variable `a'.

Linker scripts symbol declarations, by contrast, create an entry in
the symbol table but do not assign any memory to them.  Thus they are
an address without a value.  So for example the linker script
>> Again, this is completely at variance to how the rest of the manual
>> defines the "value" of a symbol, namely as its address for normal symbols
>> or [sic] its value for absolute symbols.
definition:

     foo = 1000;

creates an entry in the symbol table called `foo' which holds the
address of memory location 1000, but nothing special is stored at
address 1000.  This means that you cannot access the _value_ of a
linker script defined symbol - it has no value - all you can do is
access the _address_ of a linker script defined symbol.
>> See above

Hence when you are using a linker script defined symbol in source
code you should always take the address of the symbol, and never
attempt to use its value.
For example suppose you want to copy the contents of a section of
memory called .ROM into a section called .FLASH and the linker script
contains these declarations:

     start_of_ROM   = .ROM;
     end_of_ROM     = .ROM + sizeof (.ROM) - 1;
     start_of_FLASH = .FLASH;
     start_of_FLASH = .FLASH;

Then the C source code to perform the copy would be:

     extern char start_of_ROM, end_of_ROM, start_of_FLASH;
>> A better practice is to declare these variables as char start_of_ROM [], etc.
>> This causes the compiler to complain if these variables are read from or
>> written to, e.g., if the address-of operator & is forgotten, as the author
>> describes below.
>> Furthermore, non-writable sections should be const qualified.

     memcpy (& start_of_FLASH, & start_of_ROM, & end_of_ROM - & start_of_ROM);

Note the use of the `&' operators.  These are correct.

-- 
           Summary: Bogus documentation
           Product: binutils
           Version: 2.19
            Status: NEW
          Severity: normal
          Priority: P2
         Component: ld
        AssignedTo: unassigned at sources dot redhat dot com
        ReportedBy: konrad dot schwarz at siemens dot com
                CC: bug-binutils at gnu dot org

http://sourceware.org/bugzilla/show_bug.cgi?id=10774