On 04/26/2017 05:42 AM, Alexei Starovoitov wrote:
On 4/25/17 9:52 AM, David Miller wrote:
10: (15) if r3 == 0xdd86 goto pc+9
R0=imm2,min_value=2,max_value=2 R1=pkt(id=0,off=0,r=14) R2=pkt_end R3=inv
R4=pkt(id=0,off=14,r=14) R5=inv56 R10=fp
Hmmm, endianness looks wrong here. "-target bpf" defaults to the
endianness of whatever cpu that llvm was built for, right?
yeah. so here the code comes from:
} else if (eth->h_proto == _htons(ETH_P_IPV6)) {
and in the beginning of that .c file:
#define _htons __builtin_bswap16
^^^ that's the bug.
It should have been nop for sparc.
We cannot use htons() from system header, since it uses x86 inline asm.
Ahh yes, you're right! Wouldn't it be much easier for the above
case to just include <asm/byteorder.h> and use __constant_htons()?
(In fact, all htons() users in this file are constants.)
R0=imm2,min_value=2,max_value=2 R1=inv56,min_value=6,max_value=6 R2=pkt_end
R3=imm3,min_value=3,max_value=3 R4=pkt(id=1,off=14,r=34)
R5=pkt(id=1,off=34,r=34) R10=fp
36: (07) r4 += 18
37: (2d) if r4 > r2 goto pc+5
R0=imm2,min_value=2,max_value=2 R1=inv56,min_value=6,max_value=6 R2=pkt_end
R3=imm3,min_value=3,max_value=3 R4=pkt(id=1,off=32,r=34)
R5=pkt(id=1,off=34,r=34) R10=fp
38: (b7) r0 = 0
39: (69) r1 = *(u16 *)(r4 +0)
Unknown alignment. Only byte-sized access allowed in packet access.
And this seems to load the urgent pointer as a u16 which is what the verifier
rejects.
Oh I see, this is guarded by CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.
yes. exactly.
Your analysis is correct and offending 16-bit load is this C code:
if (tcp->urg_ptr == 123)
the parsing code before that line deals with variable length ip header:
tcp = (struct tcphdr *)((void *)(iph) + ihl_len);
and at that point verifier cannot track the alignment of the pointer
anymore. It knows 'iph' alignment, but as soon as some variable is added
to it cannot assume good alignment anymore and from there on it
allows >1 byte accesses only on HAVE_EFFICIENT_UNALIGNED_ACCESS
architectures.
That's the 'if' that you found:
'if (reg->id && size != 1) {'
ref->id > 0 means that some variable was added to ptr_to_packet.
That sucks for sparc, of course. I don't have good suggestions for
workaround other than marking tcphdr as packed :(
From code efficiency point of view it still will be faster than
LD_ABS insn.
Plus, ld_abs would also only work on skbs right now, and would
need JIT support for XDP. But it's also cumbersome to use with
f.e. IPv6, etc, so should not be encouraged, imo.
One other option for !HAVE_EFFICIENT_UNALIGNED_ACCESS archs could
be to provide a bpf_skb_load_bytes() helper equivalent for XDP,
where you eventually do the memcpy() inside. We could see to inline
the helper itself to optimize it slightly.
Another idea is to try to track that ihl_len variable is also
somehow multiple of 4, but it will be quite complicated on
the verifier side in its existing architecture and maybe? worth
waiting until the verifier has proper dataflow analysis.