Blaise Boscaccy <bbosca...@linux.microsoft.com> writes: > Alexei Starovoitov <alexei.starovoi...@gmail.com> writes: > >> On Mon, Apr 14, 2025 at 5:32 PM Blaise Boscaccy >> <bbosca...@linux.microsoft.com> wrote: >>> >>> Alexei Starovoitov <alexei.starovoi...@gmail.com> writes: >>> >>> > On Sat, Apr 12, 2025 at 6:58 AM Blaise Boscaccy >>> > <bbosca...@linux.microsoft.com> wrote: >>> >> >>> >> TAlexei Starovoitov <alexei.starovoi...@gmail.com> writes: >>> >> >>> >> > On Fri, Apr 4, 2025 at 2:56 PM Blaise Boscaccy >>> >> > <bbosca...@linux.microsoft.com> wrote: >>> >> >> + >>> >> >> +static int hornet_find_maps(struct bpf_prog *prog, struct >>> >> >> hornet_maps *maps) >>> >> >> +{ >>> >> >> + struct bpf_insn *insn = prog->insnsi; >>> >> >> + int insn_cnt = prog->len; >>> >> >> + int i; >>> >> >> + int err; >>> >> >> + >>> >> >> + for (i = 0; i < insn_cnt; i++, insn++) { >>> >> >> + if (insn[0].code == (BPF_LD | BPF_IMM | BPF_DW)) { >>> >> >> + switch (insn[0].src_reg) { >>> >> >> + case BPF_PSEUDO_MAP_IDX_VALUE: >>> >> >> + case BPF_PSEUDO_MAP_IDX: >>> >> >> + err = add_used_map(maps, insn[0].imm); >>> >> >> + if (err < 0) >>> >> >> + return err; >>> >> >> + break; >>> >> >> + default: >>> >> >> + break; >>> >> >> + } >>> >> >> + } >>> >> >> + } >>> >> > >>> >> > ... >>> >> > >>> >> >> + if (!map->frozen) { >>> >> >> + attr.map_fd = fd; >>> >> >> + err = kern_sys_bpf(BPF_MAP_FREEZE, &attr, >>> >> >> sizeof(attr)); >>> >> > >>> >> > Sorry for the delay. Still swamped after conferences and the merge >>> >> > window. >>> >> > >>> >> >>> >> No worries. >>> >> >>> >> > Above are serious layering violations. >>> >> > LSMs should not be looking that deep into bpf instructions. >>> >> >>> >> These aren't BPF internals; this is data passed in from >>> >> userspace. Inspecting userspace function inputs is definitely within the >>> >> purview of an LSM. >>> >> >>> >> Lskel signature verification doesn't actually need a full disassembly, >>> >> but it does need all the maps used by the lskel. Due to API design >>> >> choices, this unfortunately requires disassembling the program to see >>> >> which array indexes are being used. >>> >> >>> >> > Calling into sys_bpf from LSM is plain nack. >>> >> > >>> >> >>> >> kern_sys_bpf is an EXPORT_SYMBOL, which means that it should be callable >>> >> from a module. >>> > >>> > It's a leftover. >>> > kern_sys_bpf() is not something that arbitrary kernel >>> > modules should call. >>> > It was added to work for cases where kernel modules >>> > carry their own lskels. >>> > That use case is gone, so EXPORT_SYMBOL will be removed. >>> > >>> >>> I'm not following that at all. You recommended using module-based lskels >>> to get around code signing requirements at lsfmmbpf and now you want to >>> nuke that entire feature? And further, skel_internal will no longer be >>> usable from within the kernel and bpf_preload is no longer going to be >>> supported? > > The eBPF dev community has spent what, 4-5 years on this, with little to > no progress. I have little faith that this is going to progress on your > end in a timely manner or at all, and frankly we (and others) needed > this yesterday. Hornet has zero impact on the bpf subsystem, yet you > seem viscerally opposed to us doing this. Why are you trying to stop us > from securing our cloud? > >> >> It was exported to modules to run lskel-s from modules. >> It's bpf internal api, but seeing how you want to abuse it >> the feature has to go. Sadly. >>
In the interest of not harming downstream users that possibly rely on BPF_PRELOAD, it would be completely fine on our end to not call kern_sys_bpf since maps can easily be frozen in userspace by the loader. I'd prefer LSMs to be not mutating things if at all possible as well. If we agreed to not call that function so-as you can keep that feature for everyone, would you be ammenable to a simple patch in skel_internal.h that freezes maps? e.g diff --git a/tools/lib/bpf/skel_internal.h b/tools/lib/bpf/skel_internal.h index 4d5fa079b5d6..51e72dc23062 100644 --- a/tools/lib/bpf/skel_internal.h +++ b/tools/lib/bpf/skel_internal.h @@ -263,6 +263,17 @@ static inline int skel_map_delete_elem(int fd, const void *key) return skel_sys_bpf(BPF_MAP_DELETE_ELEM, &attr, attr_sz); } +static inline int skel_map_freeze(int fd) +{ + const size_t attr_sz = offsetofend(union bpf_attr, map_fd); + union bpf_attr attr; + + memset(&attr, 0, attr_sz); + attr.map_fd = fd; + + return skel_sys_bpf(BPF_MAP_FREEZE, &attr, attr_sz); +} + static inline int skel_map_get_fd_by_id(__u32 id) { const size_t attr_sz = offsetofend(union bpf_attr, flags); @@ -327,6 +338,13 @@ static inline int bpf_load_and_run(struct bpf_load_and_run_opts *opts) goto out; } + err = skel_map_freeze(map_fd); + if (err < 0) { + opts->errstr = "failed to freeze map"; + set_err; + goto out; + } + memset(&attr, 0, prog_load_attr_sz); attr.prog_type = BPF_PROG_TYPE_SYSCALL; attr.insns = (long) opts->insns; >>> >> Lskels without frozen maps are vulnerable to a TOCTOU >>> >> attack from a sufficiently privileged user. Lskels currently pass >>> >> unfrozen maps into the kernel, and there is nothing stopping someone >>> >> from modifying them between BPF_PROG_LOAD and BPF_PROG_RUN. >>> >> >>> >> > The verification of module signatures is a job of the module loading >>> >> > process. >>> >> > The same thing should be done by the bpf system. >>> >> > The signature needs to be passed into sys_bpf syscall >>> >> > as a part of BPF_PROG_LOAD command. >>> >> > It probably should be two new fields in union bpf_attr >>> >> > (signature and length), >>> >> > and the whole thing should be processed as part of the loading >>> >> > with human readable error reported back through the verifier log >>> >> > in case of signature mismatch, etc. >>> >> > >>> >> >>> >> I don't necessarily disagree, but my main concern with this is that >>> >> previous code signing patchsets seem to get gaslit or have the goalposts >>> >> moved until they die or are abandoned. >>> > >>> > Previous attempts to add signing failed because >>> > 1. It's a difficult problem to solve >>> > 2. people only cared about their own narrow use case and not >>> > considering the needs of bpf ecosystem as a whole. >>> > >>> >> Are you saying that at this point, you would be amenable to an in-tree >>> >> set of patches that enforce signature verification of lskels during >>> >> BPF_PROG_LOAD that live in syscall.c, >>> > >>> > that's the only way to do it. >>> > >>> >>> So the notion of forcing people into writing bpf-based gatekeeper programs >>> is being abandoned? e.g. >>> >>> https://lore.kernel.org/bpf/bqxgv2tqk3hp3q3lcdqsw27btmlwqfkhyg6kohsw7lwdgbeol7@nkbxnrhpn7qr/#t >>> https://lore.kernel.org/bpf/61aae2da8c7b0_68de0208dd@john.notmuch/ >> >> Not abandoned. >> bpf-based tuning of load conditions is still necessary. >> The bpf_prog_load command will check the signature only. >> It won't start rejecting progs that don't have a signature. >> For that a one liner bpf-lsm or C-based lsm would be needed >> to address your dont-trust-root use case. >> > > Since this will require an LSM no matter what, there is zero reason for > us not to proceed with Hornet. If or when you actually figure out how to > sign an lskel and upstream updated LSM hooks, I can always rework Hornet > to use that instead. > >>> >>> >> without adding extra non-code >>> >> signing requirements like attachment point verification, completely >>> >> eBPF-based solutions, or rich eBPF-based program run-time policy >>> >> enforcement? >>> > >>> > Those are secondary considerations that should also be discussed. >>> > Not necessarily a blocker. >>> >>> Again, I'm confused here since you recently stated this whole thing >>> was "questionable" without attachment point verification. >> >> Correct. >> For fentry prog type the attachment point is checked during the load, >> but for tracepoints it's not, and anyone who is claiming that >> their system is secure because the tracepoint prog was signed >> is simply clueless in how bpf works. > > No one is making that claim, although I do appreciate the lovely > ad-hominem attack and absolutist standpoint. It's not like we invented > code signing last week. All we are trying to do is make our cloud > ever-so-slightly more secure and share the results with the community. > > The attack vectors I'm looking at are things like CVE-2021-33200. ACLs > for attachment points do nothing to stop that whereas code signing is a > possible countermeasure. This kind of thing is probably a non-issue with > your private cloud, but it's a very real issue with publicly accessible > ones.