On 2/28/19 11:07 AM, Andrii Nakryiko wrote:
> On Thu, Feb 28, 2019 at 10:19 AM Yonghong Song <y...@fb.com> wrote:
>>
>>
>>
>> On 2/27/19 2:46 PM, Andrii Nakryiko wrote:
>>> When checking available canonical candidates for struct/union algorithm
>>> utilizes btf_dedup_is_equiv to determine if candidate is suitable. This
>>> check is not enough when candidate is corresponding FWD for that
>>> struct/union, because according to equivalence logic they are
>>> equivalent. When it so happens that FWD and STRUCT/UNION end in hashing
>>> to the same bucket, it's possible to create remapping loop from FWD to
>>> STRUCT and STRUCT to same FWD, which will cause btf_dedup() to loop
>>> forever.
>>>
>>> This patch fixes the issue by additionally checking that type and
>>> canonical candidate are strictly equal (utilizing btf_equal_struct).
>>
>> It looks like btf_equal_struct() checking equality except
>> member type id's. Maybe calling it btf_almost_equal_struct() or
>> something like that?
>
> Yes, for struct/union we can't compare types directly, that's what
> btf_dedup_is_equiv is doing. I think btf_equal_struct w/ comment
> explaining this particular behavior is good enough. If you insist,
> though, I'd rather go to something like btf_shallow_equal_struct or
> something along those lines.

btf_shallow_equal_struct() will be fine.

>
>>
>>>
>>> Fixes: d5caef5b5655 ("btf: add BTF types deduplication algorithm")
>>> Reported-by: Arnaldo Carvalho de Melo <a...@redhat.com>
>>> Signed-off-by: Andrii Nakryiko <andr...@fb.com>
>>> ---
>>>  tools/lib/bpf/btf.c | 6 +++++-
>>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
>>> index 6bbb710216e6..53db26d158c9 100644
>>> --- a/tools/lib/bpf/btf.c
>>> +++ b/tools/lib/bpf/btf.c
>>> @@ -2255,7 +2255,7 @@ static void btf_dedup_merge_hypot_map(struct btf_dedup *d)
>>>  static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
>>>  {
>>>          struct btf_dedup_node *cand_node;
>>> -        struct btf_type *t;
>>> +        struct btf_type *cand_type, *t;
>>>          /* if we don't find equivalent type, then we are canonical */
>>>          __u32 new_id = type_id;
>>>          __u16 kind;
>>> @@ -2275,6 +2275,10 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
>>>          for_each_dedup_cand(d, h, cand_node) {
>>>                  int eq;
>>>
>>> +                cand_type = d->btf->types[cand_node->type_id];
>>> +                if (!btf_equal_struct(t, cand_type))
>>
>> The comment for this btf_equal_struct is not quite right.
>> /*
>>  * Check structural compatibility of two FUNC_PROTOs, ignoring referenced type
>>  * IDs. This check is performed during type graph equivalence check and
>>  * referenced types equivalence is checked separately.
>>  */
>> static bool btf_equal_struct(struct btf_type *t1, struct btf_type *t2)
>>
>> It should be two "struct/union types".
>
> Yep, good catch, will fix!
>
>>
>>> +                        continue;
>>> +
>>
>> I did not trace the algorithm how infinite loop happens. But the above
>
> Check the test in follow up patch. It has a minimal example that
> triggers this bug. It happens when we have some FWD x, which we
> discover that it should be resolved to some STRUCT x (as a result of
> equivalence check/resolution of some other struct s, that references
> struct x internally).
> But that struct x might not have been
> deduplicated yet, we just record this FWD -> STRUCT mapping so that we
> don't lose this connection. Later, once we get to deduplication of
> struct x, FWD x will be (in case of hash collision) one possible
> candidate to consider for deduplication. At that point,
> btf_dedup_is_equiv will consider them equivalent (but they are not
> equal (!), that's where the bug is), so we'll try to resolve STRUCT x
> -> FWD x, which creates a loop.
>
> In btf_dedup_merge_hypot_map() that is used to record discovered
> "equivalences" during struct/union type graph equivalence check, we
> have explicit check to never resolve STRUCT/UNION into equivalent FWD,
> so such loop shouldn't happen, except I missed the case of having FWD
> as a possible dedup candidate due to hash collision.
>
>> change is certainly a correct one, you want to do deduplication only
>> after everything else (except member types) are equal?
>
> Well, if not for special case of FWD == STRUCT/UNION when
> deduplicating structs, btf_dedup_is_equiv would be enough, because it
> already checks for btf_equal_struct internally, when both types are
> struct/union. It's just the special bit at the beginning of is_equiv
> check that allows FWD and STRUCT/UNION with the same name to be
> declared equivalent, that throws this off.
>
>>
>> If the bug is due to circle in struct->fwd and fwd->struct mappings,
>> maybe a simple check whether such circle exists or not before update
>> the mapping will also work? I am not proposing this fix, but want
>> to understand better the issue.
>
> That's essentially what we use btf_equal_struct for here, really. We
> could equivalently just check BTF_INFO_KIND(t) == BTF_INFO_KIND(cand)
> explicitly, but btf_equal_struct feels a bit more generic and
> obviously correct.

Okay, I see. So the goal is really to prevent processing FWD in the
struct/union dedup candidate list.
It will be good to summarize the above detailed explanation in commit
message. With the above suggested changes,

Acked-by: Yonghong Song <y...@fb.com>

>
>>
>>
>>
>>
>>>                  btf_dedup_clear_hypot_map(d);
>>>                  eq = btf_dedup_is_equiv(d, type_id, cand_node->type_id);
>>>                  if (eq < 0)
>>>
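
For readers following the thread, the FWD -> STRUCT -> FWD remapping loop
discussed above can be illustrated with a toy model in C. This is only a
sketch and not libbpf code: every name in it (toy_type, toy_is_equiv,
toy_shallow_equal, toy_pick_canon, toy_resolve, toy_dedup) is hypothetical,
the "hash bucket" is assumed to contain every other type, and the loose check
compares names only, standing in for the rule that lets a FWD and a
STRUCT/UNION with the same name be declared equivalent.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model only: none of these names exist in libbpf. */
enum toy_kind { TOY_FWD, TOY_STRUCT };

struct toy_type {
	enum toy_kind kind;
	const char *name;
};

#define TOY_NR_TYPES 2

/* Loose check: a FWD and a STRUCT with the same name count as equivalent. */
static bool toy_is_equiv(const struct toy_type *a, const struct toy_type *b)
{
	return strcmp(a->name, b->name) == 0;
}

/* Shallow check: kind and name must both match. */
static bool toy_shallow_equal(const struct toy_type *a,
			      const struct toy_type *b)
{
	return a->kind == b->kind && strcmp(a->name, b->name) == 0;
}

/*
 * Pick a canonical id for types[id]. Every other id is treated as a
 * candidate, as if all types hashed to the same bucket. When strict is
 * set, candidates that fail the shallow check are skipped first.
 */
static int toy_pick_canon(const struct toy_type *types, int id, bool strict)
{
	for (int cand = 0; cand < TOY_NR_TYPES; cand++) {
		if (cand == id)
			continue;
		if (strict && !toy_shallow_equal(&types[id], &types[cand]))
			continue;
		if (toy_is_equiv(&types[id], &types[cand]))
			return cand;
	}
	return id; /* no match: the type is its own canonical */
}

/*
 * Follow map[] until a type maps to itself. Returns -1 if no fixed point
 * is reached within a bounded number of steps, i.e. the map has a cycle.
 */
static int toy_resolve(const int *map, int id)
{
	for (int steps = 0; steps <= TOY_NR_TYPES; steps++) {
		assert(id >= 0 && id < TOY_NR_TYPES);
		if (map[id] == id)
			return id;
		id = map[id];
	}
	return -1;
}

/* Returns the canonical id that FWD x (id 0) resolves to, or -1 on a loop. */
int toy_dedup(bool strict)
{
	const struct toy_type types[TOY_NR_TYPES] = {
		{ TOY_FWD, "x" },	/* id 0 */
		{ TOY_STRUCT, "x" },	/* id 1 */
	};
	int map[TOY_NR_TYPES] = { 0, 1 };

	/* Recorded earlier, while checking some other struct: FWD x -> STRUCT x. */
	map[0] = 1;
	/* Later, STRUCT x itself is deduplicated against its candidates. */
	map[1] = toy_pick_canon(types, 1, strict);

	return toy_resolve(map, 0);
}
```

In this model toy_dedup(false) returns -1, because STRUCT x gets remapped to
FWD x while FWD x already points at STRUCT x. toy_dedup(true) returns 1: the
shallow check rejects the FWD candidate, so STRUCT x stays canonical. The
bounded walk in toy_resolve exists only so the sketch terminates; per the
description above, the real symptom is btf_dedup() looping forever.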