# Add DW_LNS_indirect_line - update `line` to absolute value stored indirectly

## Background

In many source languages, features such as templates and generics allow many program-counter addresses, arbitrarily far apart, to correspond to the same source line. An incremental compiler must update the line number program whenever lines within a source file move. Ideally, moving a source line that corresponds to a large number of distinct program-counter addresses would require updating only one line number value in the DWARF information. For this to hold, the region of the line number program corresponding to each such address must obtain the line number of the source construct not directly, but through an indirect reference. This allows one line number value stored in the binary to be shared across arbitrarily many entries in the line number matrix.

This is not currently possible: every modification to the `line` register is a relative offset, and each offset is encoded directly in the instruction (or is implicit, in the case of a special opcode).

## Overview

Introduce two new fields in the line number program header: `indirect_lines_length` (ULEB128) and `indirect_lines` (an opaque block of bytes containing ULEB128 values). The `indirect_lines_length` field gives the length in bytes of the `indirect_lines` field, not the number of elements. Introduce a new standard opcode in the line number program, `DW_LNS_indirect_line`. This opcode takes a single ULEB128 operand, which is a byte offset into the `indirect_lines` block stored in the header. The effect of the instruction is to set the `line` register to the ULEB128 value stored at that byte offset. Note that `indirect_lines` is not itself validated to be a valid sequence of ULEB128 values; decoding occurs only when `DW_LNS_indirect_line` is used. This allows an incremental compiler to pre-allocate a large amount of padding space in `indirect_lines` and fill it in later as needed.
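As a rough illustration of the semantics described above, the following Python sketch decodes a ULEB128 value at a byte offset and stores it in `line`. The names `read_uleb128` and `exec_indirect_line`, the dictionary used for the state machine, and the sample bytes are all hypothetical; this is not taken from any existing consumer.

```python
def read_uleb128(buf: bytes, offset: int) -> tuple[int, int]:
    """Decode one ULEB128 value from buf starting at offset.

    Returns (value, number_of_bytes_consumed).
    """
    result = 0
    shift = 0
    pos = offset
    while True:
        byte = buf[pos]  # IndexError if the value runs off the end of the block
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result, pos - offset
        shift += 7


def exec_indirect_line(state: dict, indirect_lines: bytes, operand: int) -> None:
    """Hypothetical handler for DW_LNS_indirect_line.

    The operand is a byte offset into indirect_lines; the value found there
    replaces `line` outright (it is absolute, not a delta).
    """
    value, _ = read_uleb128(indirect_lines, operand)
    state["line"] = value


# Hypothetical block: the entry at offset 0 is 42 (one byte), bytes 1-3 are
# unused padding that is never decoded, and the entry at offset 4 is 300.
indirect_lines = bytes([0x2A, 0xFF, 0xFF, 0xFF, 0xAC, 0x02])

state = {"line": 1}
exec_indirect_line(state, indirect_lines, 4)
line_a = state["line"]
exec_indirect_line(state, indirect_lines, 0)
line_b = state["line"]
```

Because only the referenced offsets are ever decoded, the padding bytes between entries do not need to form valid ULEB128 values.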

Note that an incremental compiler would not necessarily wish to use variable-length integers to represent this information, since a change of line number could cause a value that previously fit in 1 byte to require 2. However, since the stored values need not be densely packed, an implementation is free to reserve as much space as it needs for each entry. For instance, the downstream Zig compiler (which is the original motivator for this proposal) may choose to reserve 4 or 5 bytes for each line number, as line numbers in Zig source files cannot exceed 1<<32. The use of ULEB128 allows the compiler to make an appropriate decision here instead of codifying such a restriction into the DWARF specification.
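One way a producer could reserve fixed-size slots is to emit padded ULEB128 values, where the continuation bit is set on every byte but the last even when the remaining payload bits are zero. The sketch below assumes a 5-byte slot (5 × 7 = 35 payload bits, whereas 4 bytes carry only 28 bits); the function names and the slot width are illustrative assumptions, not part of the proposal.

```python
def encode_uleb128_padded(value: int, width: int) -> bytes:
    """Encode value as ULEB128 using exactly `width` bytes.

    Every byte except the last carries the continuation bit (0x80), so the
    encoded length does not depend on the magnitude of value.
    """
    if value < 0 or value >= 1 << (7 * width):
        raise ValueError("value does not fit in the reserved width")
    out = bytearray()
    for i in range(width):
        byte = (value >> (7 * i)) & 0x7F
        if i != width - 1:
            byte |= 0x80  # more bytes follow
        out.append(byte)
    return bytes(out)


def decode_uleb128(buf: bytes, offset: int) -> int:
    """Standard ULEB128 decode; padded encodings decode like any other."""
    result = 0
    shift = 0
    while True:
        byte = buf[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result
        shift += 7


SLOT = 5  # assumed slot width in bytes

# Two slots holding line numbers 7 and 100.
block = bytearray(encode_uleb128_padded(7, SLOT) + encode_uleb128_padded(100, SLOT))

# Patch slot 0 in place; neither the block size nor the other offsets change.
block[0:SLOT] = encode_uleb128_padded(70000, SLOT)
```

With this layout, an instruction that references offset 0 or offset 5 stays valid no matter how the stored line numbers change.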

## Proposed Changes

Pages and line numbers are given for the 2024-06-16 working draft of DWARF Version 6, which is the
latest draft at the time of writing.

6.2.4 (pg 163; line 27)

21. indirect_lines_length (ULEB128)
    The length in bytes of the data stored in the `indirect_lines` field.
22. indirect_lines (block containing ULEB128 entries)
    A collection of line numbers, each stored as a ULEB128 integer. These values are referenced by DW_LNS_indirect_line instructions to modify the state of the line number information state machine.

    The data stored in this field is not checked to be a valid sequence of ULEB128 entries. The contained data may include padding bytes or otherwise invalid data. As such, it is expected that bytes of this field be accessed only when a DW_LNS_indirect_line instruction references them.
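
For illustration only, a consumer might read these two fields roughly as in the Python sketch below. The helper names and the sample bytes are hypothetical, and the surrounding header layout is not modeled; the point is that the block is sliced out by byte length and left undecoded.

```python
def read_uleb128(buf: bytes, offset: int) -> tuple[int, int]:
    """Decode one ULEB128 value; returns (value, offset_after_value)."""
    result = 0
    shift = 0
    while True:
        byte = buf[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result, offset
        shift += 7


def parse_indirect_lines(header: bytes, offset: int) -> tuple[bytes, int]:
    """Read indirect_lines_length, then slice out indirect_lines as raw bytes.

    The block is deliberately not decoded or validated here.
    """
    length, pos = read_uleb128(header, offset)
    end = pos + length
    if end > len(header):
        raise ValueError("indirect_lines extends past the end of the header")
    return header[pos:end], end


# Hypothetical bytes: length 6, six raw block bytes, then two unrelated bytes.
tail = bytes([0x06, 0x2A, 0xFF, 0xFF, 0xFF, 0xAC, 0x02, 0x00, 0x01])
block, next_pos = parse_indirect_lines(tail, 0)
```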

6.2.5.2 (pg 170; line 23)

14. DW_LNS_indirect_line
    The DW_LNS_indirect_line opcode takes a single unsigned LEB128 operand. This operand is interpreted as a byte offset into the `indirect_lines` field of the line number program header. An unsigned LEB128 value is read from `indirect_lines` at the given offset, and this value is stored into the state machine's `line` register.

7.22 (pg 246; table 7.25)

 Opcode name          | Value
----------------------+-------
       ...            |  ...
DW_LNS_indirect_line  | 0x0d
