On 12/01/2017 06:32 AM, Jakub Kicinski wrote:
> Hi!
>
> Jiong says:
>
> Currently, the compiler will lower a memcpy function call in an XDP/eBPF
> C program into a sequence of eBPF load/store pairs in some scenarios.
>
> The compiler considers this "inline" optimization beneficial because it
> avoids a function call and also improves code locality.
>
> However, the Netronome NPU is not a traditional load/store architecture,
> so executing a sequence of individual load/store operations is not
> efficient.
>
> This patch set tries to identify the sequences of load/store pairs that
> come from memcpy lowering, then accelerates them through the NPU's
> Command Push Pull (CPP) instruction.
>
> This patch set registers a new optimization pass that runs before the
> actual JIT work. It traverses the eBPF IR and, once it finds a candidate
> sequence, records the memory copy source, destination and length
> information in the first load instruction of the sequence and marks all
> remaining instructions in the sequence as skippable. Later, when JITing
> the first load instruction, optimal instructions are generated using the
> recorded information.
>
> For the safety of this transformation:
>
>   - a jump into the middle of the sequence will cancel the optimization.
>
>   - overlapping memory accesses will cancel the optimization.
>
>   - the load destination register still contains the same value as
>     before the transformation.

Series applied to bpf-next, thanks guys!
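
For anyone skimming the thread, the pass described in the quoted cover letter can be modelled roughly as below. This is only an illustrative sketch in Python, not the code from the series: the `Insn` type, its fields, `mark_memcpy` and the `jump_targets` set are all hypothetical names, overlap is only checked when source and destination share a base register, and a sequence must contain at least two load/store pairs. It does not model the third safety condition (keeping the final value in the load destination register), which would be handled when the head instruction is JITed.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Insn:
    """Toy instruction (hypothetical model, not the real eBPF encoding)."""
    op: str            # "ld", "st", or anything else (acts as a barrier)
    reg: int = 0       # value register: load destination or store source
    base: int = 0      # base address register
    off: int = 0       # byte offset from the base register
    size: int = 0      # access width in bytes
    skip: bool = False  # set when the JIT should emit nothing for this insn
    # (src_base, src_off, dst_base, dst_off, length), recorded on the head load
    copy: Optional[Tuple[int, int, int, int, int]] = None


def is_pair(ld: Insn, st: Insn) -> bool:
    """A load immediately followed by a store of the same register and width."""
    return (ld.op == "ld" and st.op == "st"
            and ld.reg == st.reg and ld.size == st.size)


def ranges_overlap(a_off: int, b_off: int, length: int) -> bool:
    """True if [a_off, a_off+length) and [b_off, b_off+length) intersect."""
    return a_off < b_off + length and b_off < a_off + length


def mark_memcpy(prog: List[Insn], jump_targets: Set[int]) -> None:
    """Annotate memcpy-like load/store runs in place."""
    i = 0
    while i + 1 < len(prog):
        if not is_pair(prog[i], prog[i + 1]):
            i += 1
            continue
        head_ld, head_st = prog[i], prog[i + 1]
        length = head_ld.size
        j = i + 2
        # Extend the run while the next pair continues both address ranges.
        while j + 1 < len(prog) and is_pair(prog[j], prog[j + 1]):
            ld, st = prog[j], prog[j + 1]
            contiguous = (ld.base == head_ld.base
                          and st.base == head_st.base
                          and ld.off == head_ld.off + length
                          and st.off == head_st.off + length)
            if not contiguous:
                break
            length += ld.size
            j += 2
        # Safety checks: no jump into the middle, no overlapping access.
        jumped_into = any(i < t < j for t in jump_targets)
        overlaps = (head_ld.base == head_st.base
                    and ranges_overlap(head_ld.off, head_st.off, length))
        if j - i > 2 and not jumped_into and not overlaps:
            head_ld.copy = (head_ld.base, head_ld.off,
                            head_st.base, head_st.off, length)
            for insn in prog[i + 1:j]:
                insn.skip = True
        i = j


def two_pair_copy(src_base: int, dst_base: int, dst_off: int) -> List[Insn]:
    """Helper for the examples: copy 8 bytes as two 4-byte load/store pairs."""
    return [
        Insn("ld", reg=1, base=src_base, off=0, size=4),
        Insn("st", reg=1, base=dst_base, off=dst_off, size=4),
        Insn("ld", reg=1, base=src_base, off=4, size=4),
        Insn("st", reg=1, base=dst_base, off=dst_off + 4, size=4),
    ]
```

Calling `mark_memcpy(two_pair_copy(6, 7, 0), set())` records the copy on the first load and marks the other three instructions as skippable; passing `{2}` as the jump targets, or using the same base register with overlapping offsets, leaves the program untouched.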