> Am 10.03.2025 um 08:16 schrieb Jakub Jelinek <ja...@redhat.com>:
> 
> Hi!
> 
> The following testcase shows a bug in unwind-dw2-btree.h.
> In short, the header provides lock-free btree data structure (so no parent
> link on nodes, both insertion and deletion are done in top-down walks
> with some locking of just a few nodes at a time so that lookups can notice
> concurrent modifications and retry, non-leaf (inner) nodes contain keys
> which are initially the base address of the left-most leaf entry of the
> following child (or all ones if there is none) minus one, insertion ensures
> balancing of the tree to ensure [d/2, d] entries filled through aggressive
> splitting if it sees a full tree while walking, deletion performs various
> operations like merging neighbour trees, merging into parent or moving some
> nodes from neighbour to the current one).
> What differs from the textbook implementations is mostly that the leaf nodes
> don't include just address as a key, but address range, address + size
> (where we don't insert any ranges with zero size) and the lookups can be
> performed for any address in the [address, address + size) range.  The keys
> on inner nodes are still just address-1, so the child covers all nodes
> where addr <= key unless it is covered already in children to the left.
> The user (static executables or JIT) should always ensure there is no
> overlap in between any of the ranges.
> 
> In the testcase a bunch of insertions are done, always followed by one
> removal, followed by one insertion of a range slightly different from the
> removed one.  E.g. in the first case [&code[0x50], &code[0x59]] range
> is removed and then we insert [&code[0x4c], &code[0x53]] range instead.
> This is valid, it doesn't overlap anything.  But the problem is that some
> non-leaf (inner) one used the &code[0x4f] key (after the 11 insertions
> completely correctly).  On removal, nothing adjusts the keys on the parent
> nodes (it really can't in the top-down only walk, the keys could be many nodes
> above it and unlike insertion, removal only knows the start address, doesn't
> know the removed size and so will discover it only when reaching the leaf
> node which contains it; plus even if it knew the address and size, it still
> doesn't know what the second left-most leaf node will be (i.e. the one after
> removal)).  And on insertion, if nodes aren't split at a level, nothing
> adjusts the inner keys either.  If a range is inserted and is either fully
> bellow key (keys are - 1, so having address + size - 1 being equal to key is
> fine) or fully after key (i.e. address > key), it works just fine, but if
> the key is in a middle of the range like in this case, &code[0x4f] is in the
> middle of the [&code[0x4c], &code[0x53]] range, then insertion works fine
> (we only use size on the leaf nodes), and lookup of the addresses below
> the key work fine too (i.e. [&code[0x4c], &code[0x4f]] will succeed).
> The problem is with lookups after the key (i.e. [&code[0x50, &code[0x53]]),
> the lookup looks for them in different children of the btree and doesn't
> find an entry and returns NULL.
> 
> As users need to ensure non-overlapping entries at any time, the following
> patch fixes it by adjusting keys during insertion where we know not just
> the address but also size; if we find during the top-down walk a key
> which is in the middle of the range being inserted, we simply increase the
> key to be equal to address + size - 1 of the range being inserted.
> There can't be any existing leaf nodes overlapping the range in correct
> programs and the btree rebalancing done on deletion ensures we don't have
> any empty nodes which would also cause problems.
> 
> The patch adjusts the keys in two spots, once for the current node being
> walked (the last hunk in the header, with large comment trying to explain
> it) and once during inner node splitting in a parent node if we'd otherwise
> try to add that key in the middle of the range being inserted into the
> parent node (in that case it would be missed in the last hunk).
> The testcase covers both of those spots, so succeeds with GCC 12 (which
> didn't have btrees) and fails with vanilla GCC trunk and also fails if
> either the
> if (fence < base + size - 1)
>  fence = iter->content.children[slot].separator = base + size - 1;
> or
> if (left_fence >= target && left_fence < target + size - 1)
>  left_fence = target + size - 1;
> hunk is removed (of course, only with the current node sizes, i.e. up to
> 15 children of inner nodes and up to 10 entries in leaf nodes).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk
> and after a while affected rel?

Ok

Thanks,
Richard


> 2025-03-10  Jakub Jelinek  <ja...@redhat.com>
>      Michael Leuchtenburg  <mich...@slashhome.org>
> 
>  PR libgcc/119151
>  * unwind-dw2-btree.h (btree_split_inner): Add size argument.  If
>  left_fence is in the middle of [target,size) range, increase it to
>  target + size - 1.
>  (btree_insert): Adjust btree_split_inner caller.  If fence is smaller
>  than base + size - 1, increase it and separator of the slot to
>  base + size - 1.
> 
>  * gcc.dg/pr119151.c: New test.
> 
> --- libgcc/unwind-dw2-btree.h.jj    2025-01-02 11:47:49.803948854 +0100
> +++ libgcc/unwind-dw2-btree.h    2025-03-08 10:23:28.437681484 +0100
> @@ -474,7 +474,8 @@ btree_handle_root_split (struct btree *t
> // Split an inner node.
> static void
> btree_split_inner (struct btree *t, struct btree_node **inner,
> -           struct btree_node **parent, uintptr_type target)
> +           struct btree_node **parent, uintptr_type target,
> +           uintptr_type size)
> {
> // Check for the root.
> btree_handle_root_split (t, inner, parent);
> @@ -490,6 +491,9 @@ btree_split_inner (struct btree *t, stru
>     = left_inner->content.children[split + index];
> left_inner->entry_count = split;
> uintptr_type left_fence = btree_node_get_fence_key (left_inner);
> +  if (left_fence >= target && left_fence < target + size - 1)
> +    // See the PR119151 comment in btree_insert.
> +    left_fence = target + size - 1;
> btree_node_update_separator_after_split (*parent, right_fence, left_fence,
>                     right_inner);
> if (target <= left_fence)
> @@ -753,13 +757,28 @@ btree_insert (struct btree *t, uintptr_t
>   {
>     // Use eager splits to avoid lock coupling up.
>     if (iter->entry_count == max_fanout_inner)
> -    btree_split_inner (t, &iter, &parent, base);
> +    btree_split_inner (t, &iter, &parent, base, size);
> 
>     unsigned slot = btree_node_find_inner_slot (iter, base);
>     if (parent)
>  btree_node_unlock_exclusive (parent);
>     parent = iter;
>     fence = iter->content.children[slot].separator;
> +      if (fence < base + size - 1)
> +    // The separator was set to the base - 1 of the leftmost leaf child
> +    // at some point but such an entry could have been removed afterwards.
> +    // As both insertion and removal are just walking down the tree with
> +    // only a few current nodes locked at a time, updating the separator
> +    // on removal is not possible, especially because btree_remove does
> +    // not know the size until it reaches leaf node.  We must ensure that
> +    // the separator is not in a middle of some entry though, as
> +    // btree_lookup can look up any address in the entry's range and if
> +    // the separator is in the middle, addresses below it or equal to it
> +    // would be found while addresses above it would result in failed
> +    // lookup.  Update the separator now.  Assumption that users
> +    // ensure no overlapping registered ranges, there should be no
> +    // current entry for any address in the range.  See PR119151.
> +    fence = iter->content.children[slot].separator = base + size - 1;
>     iter = iter->content.children[slot].child;
>     btree_node_lock_exclusive (iter);
>   }
> --- gcc/testsuite/gcc.dg/pr119151.c.jj    2025-03-08 17:35:58.346189088 +0100
> +++ gcc/testsuite/gcc.dg/pr119151.c    2025-03-08 21:00:51.997675908 +0100
> @@ -0,0 +1,151 @@
> +/* PR libgcc/119151 */
> +/* Should be run just on targets which don't have _Unwind_Find_FDE in 
> libc.so.  */
> +/* { dg-do run { target { { x86_64-*-linux* aarch64*-*-linux* 
> powerpc64*-*-linux* riscv*-*-linux* } && lp64 } } } */
> +/* { dg-options "-O2" } */
> +
> +struct object
> +{
> +  void *pc_begin, *tbase, *dbase, *single;
> +  __SIZE_TYPE__ i;
> +  void *fde_end, *next;
> +};
> +struct dwarf_eh_bases
> +{
> +  void *tbase, *dbase, *func;
> +};
> +extern void __register_frame_info (const void *, struct object *);
> +extern void *__deregister_frame_info (const void *);
> +extern const void *_Unwind_Find_FDE (void *, struct dwarf_eh_bases *);
> +#define DW_EH_PE_sdata8    0x0c
> +#define DW_EH_PE_pcrel    0x10
> +#define DW_CFA_def_cfa    0x0c
> +#define DW_CFA_offset    0x80
> +
> +struct __attribute__((aligned (8))) eh_frame_cie {
> +  unsigned len;
> +  unsigned tag;
> +  unsigned char version;
> +  unsigned char augmentation[3];
> +  unsigned char code_align_factor;
> +  unsigned char data_align_factor;
> +  unsigned char ra_column;
> +  unsigned char augmentation_size;
> +  unsigned char encoding;
> +  unsigned char def_cfa;
> +  unsigned char def_cfa_op1, def_cfa_op2;
> +  unsigned char offset;
> +  unsigned char offset_op;
> +};
> +struct __attribute__((aligned (8))) eh_frame_fde {
> +  unsigned len;
> +  unsigned cie_offset;
> +  unsigned long long begin, size;
> +  unsigned char augmentation;
> +};
> +struct eh_frame_cie_fde {
> +  struct eh_frame_cie cie;
> +  struct eh_frame_fde fde;
> +  unsigned int zero;
> +  struct object obj;
> +} eh_frame[256];
> +unsigned ehidx;
> +unsigned char code[0x800] __attribute__((aligned (8)));
> +
> +void *
> +register_range (void *addr, unsigned size)
> +{
> +  /* Fills in empty-ish CIE and FDE with pcrel sdata8 encoding so that
> +     we don't need to worry about lp64 large code models.
> +     We don't actually execute anything in code and only _Unwind_Find_FDE,
> +     don't actually try to unwind anything.  */
> +  eh_frame[ehidx].cie.len
> +    = (unsigned) ((char *) &eh_frame[ehidx].fde
> +          - (char *) &eh_frame[ehidx].cie.tag);
> +  eh_frame[ehidx].cie.tag = 0;
> +  eh_frame[ehidx].cie.version = 3;
> +  __builtin_memcpy (eh_frame[ehidx].cie.augmentation, "zR", 3);
> +  eh_frame[ehidx].cie.code_align_factor = 1;
> +  eh_frame[ehidx].cie.data_align_factor = 0x78; /* sleb128 -8 */
> +  eh_frame[ehidx].cie.ra_column = 0x10;
> +  eh_frame[ehidx].cie.augmentation_size = 1;
> +  eh_frame[ehidx].cie.encoding = DW_EH_PE_pcrel | DW_EH_PE_sdata8;
> +  eh_frame[ehidx].cie.def_cfa = DW_CFA_def_cfa;
> +  eh_frame[ehidx].cie.def_cfa_op1 = 7;
> +  eh_frame[ehidx].cie.def_cfa_op2 = 8;
> +  eh_frame[ehidx].cie.offset = DW_CFA_offset + 0x10;
> +  eh_frame[ehidx].cie.offset_op = 1;
> +  eh_frame[ehidx].fde.len
> +    = (unsigned) ((char *) &eh_frame[ehidx].zero
> +          - (char *) &eh_frame[ehidx].fde.cie_offset);
> +  eh_frame[ehidx].fde.cie_offset
> +    = (unsigned) ((char *) &eh_frame[ehidx].fde.cie_offset
> +          - (char *) &eh_frame[ehidx].cie);
> +  eh_frame[ehidx].fde.begin
> +    = (__INTPTR_TYPE__) ((__UINTPTR_TYPE__) addr
> +             - (__UINTPTR_TYPE__) &eh_frame[ehidx].fde.begin);
> +  eh_frame[ehidx].fde.size = size;
> +  eh_frame[ehidx].fde.augmentation = 0;
> +  eh_frame[ehidx].zero = 0;
> +  __register_frame_info (&eh_frame[ehidx].cie, &eh_frame[ehidx].obj);
> +  ++ehidx;
> +  return &eh_frame[ehidx - 1].cie;
> +}
> +
> +void
> +unregister (void *eh_frame)
> +{
> +  __deregister_frame_info (eh_frame);
> +}
> +
> +int
> +main ()
> +{
> +  for (int i = 0; i < 0x50; i += 0x10)
> +    register_range (&code[i], 10);
> +  void *p = register_range (&code[0x50], 10);
> +  for (int i = 0x60; i < 0xb0; i += 0x10)
> +    register_range (&code[i], 10);
> +  unregister (p);
> +  register_range (&code[0x4c], 8);
> +  struct dwarf_eh_bases bases;
> +  const void *q = _Unwind_Find_FDE (&code[0x4c], &bases);
> +  const void *r = _Unwind_Find_FDE (&code[0x51], &bases);
> +  if (!q || q != r)
> +    __builtin_abort ();
> +  for (int i = 0; i <= 0xa0; i += 0x10)
> +    if (i != 0x50)
> +      {
> +        q = _Unwind_Find_FDE (&code[i], &bases);
> +        r = _Unwind_Find_FDE (&code[i + 9], &bases);
> +        if (!q || q != r)
> +      __builtin_abort ();
> +      }
> +  for (int i = 0xb0; i < 0x240; i += 0x10)
> +    register_range (&code[i], 10);
> +  p = register_range (&code[0x240], 10);
> +  for (int i = 0x250; i < 0x470; i += 0x10)
> +    register_range (&code[i], 10);
> +  void *s = register_range (&code[0x470], 10);
> +  for (int i = 0x480; i < 0x700; i += 0x10)
> +    register_range (&code[i], 10);
> +  unregister (p);
> +  register_range (&code[0x23c], 16);
> +  q = _Unwind_Find_FDE (&code[0x23d], &bases);
> +  r = _Unwind_Find_FDE (&code[0x24b], &bases);
> +  if (!q || q != r)
> +    __builtin_abort ();
> +  unregister (s);
> +  register_range (&code[0x46c], 16);
> +  q = _Unwind_Find_FDE (&code[0x46d], &bases);
> +  r = _Unwind_Find_FDE (&code[0x47b], &bases);
> +  if (!q || q != r)
> +    __builtin_abort ();
> +  for (int i = 0; i < 0x700; i += 0x10)
> +    if (i != 0x50 && i != 0x240 && i != 0x470)
> +      {
> +        q = _Unwind_Find_FDE (&code[i], &bases);
> +        r = _Unwind_Find_FDE (&code[i + 9], &bases);
> +        if (!q || q != r)
> +      __builtin_abort ();
> +      }
> +}
> 
>  Jakub

Reply via email to