On 02/27/2018 01:13 PM, Sandipan Das wrote:
> "Naveen N. Rao" wrote:
>> I'm wondering if we can instead encode the bpf prog id in
>> imm32. That way, we should be able to indicate the BPF
>> function being called into. Daniel, is that something we
>> can consider?
>
> Since each subprog does not get a separate id, we cannot fetch
> the fd and therefore the tag of a subprog. Instead we can use
> the tag of the complete program as shown below.
>
> "Daniel Borkmann" wrote:
>> I think one limitation that would still need to be addressed later
>> with such approach would be regarding the xlated prog dump in bpftool,
>> see 'BPF calls via JIT' in 7105e828c087 ("bpf: allow for correlation
>> of maps and helpers in dump"). Any ideas for this (potentially if we
>> could use off + imm for calls, we'd get to 48 bits, but that seems
>> still not be enough as you say)?
>
> As an alternative, this is what I am thinking of:
>
> Currently, for bpf-to-bpf calls, if bpf_jit_kallsyms is enabled,
> bpftool looks up the name of the corresponding symbol for the
> JIT-ed subprogram and shows it in the xlated dump alongside the
> actual call instruction. However, the lookup is based on the
> target address which is calculated using the imm field of the
> instruction. So, once again, if imm is truncated, we will end
> up with the wrong address. Also, the subprog aux data (which
> has been proposed as a mitigation for this) is not accessible
> from this tool.
>
> We can still access the tag for the complete bpf program and use
> this with the correct offset in an objdump-like notation as an
> alterative for the name of the subprog that is the target of a
> bpf-to-bpf call instruction.
>
> Currently, an xlated dump looks like this:
> 0: (85) call pc+2#bpf_prog_5f76847930402518_F
> 1: (b7) r0 = 1
> 2: (95) exit
> 3: (b7) r0 = 2
> 4: (95) exit
>
> With this patch, it will look like this:
> 0: (85) call pc+2#bpf_prog_8f85936f29a7790a+3
(Note the +2 is the insn->off already.)
> 1: (b7) r0 = 1
> 2: (95) exit
> 3: (b7) r0 = 2
> 4: (95) exit
>
> where 8f85936f29a7790a is the tag of the bpf program and 3 is
> the offset to the start of the subprog from the start of the
> program.
The problem with this approach would be that right now the name is
something like bpf_prog_5f76847930402518_F where the subprog tag is
just a placeholder so in future, this may well adapt to e.g. the actual
function name from the elf file. Note that when kallsyms is enabled
then a name like bpf_prog_5f76847930402518_F will also appear in stack
traces, perf records, etc, so for correlation/debugging it would really
help to have them the same everywhere.
Worst case if there's nothing better, potentially what one could do in
bpf_prog_get_info_by_fd() is to dump an array of full addresses and
have the imm part as the index pointing to one of them, just unfortunate
that it's likely only needed in ppc64.