Re: [lng-odp] Runtime inlining

2015-11-10 Thread Zoltan Kiss



On 10/11/15 07:39, Maxim Uvarov wrote:
> JIT like lua might also not work because you need to rewrite OVS to
> support it. I don't think that it will be accepted.
>
> And it looks like it's a problem in OVS, not in ODP. I.e. OVS should allow
> using library functions for the fast path (where inlines are critical).
> I.e. not just call odp_packet_len(), but move the whole OVS function to
> a dynamic library.


I'm not sure I get your point here, but OVS does allow using dynamic
library functions on the fast path. The problem is that it's slow because
of the function call overhead.
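To make the trade-off concrete, here is a minimal sketch in C of the two ways such an accessor can be exposed. All names (sketch_pkt_hdr, sketch_packet_t, sketch_packet_len) are hypothetical stand-ins and not the real ODP definitions; the structure layout is an assumption for illustration only.

```c
#include <stdint.h>

/* Hypothetical packet metadata; a real ODP implementation defines its own. */
struct sketch_pkt_hdr {
	uint32_t frame_len;	/* total packet length in bytes */
};

/* Opaque handle as the application sees it: just a pointer. */
typedef struct sketch_pkt_hdr *sketch_packet_t;

/*
 * ABI-stable variant: in a real build this would live in the shared
 * library, so every use costs a function call.
 */
uint32_t sketch_packet_len(sketch_packet_t pkt)
{
	return pkt->frame_len;
}

/*
 * Inlined variant: lives in a header and compiles down to a single load,
 * but bakes the structure layout into the application binary.
 */
static inline uint32_t sketch_packet_len_inline(sketch_packet_t pkt)
{
	return pkt->frame_len;
}
```

Both variants return the same value; the difference is only where the load is compiled, which is exactly what ties the inlined form to one platform's layout.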




> regards,
> Maxim.
>
> On 10 November 2015 at 02:50, Bill Fischofer wrote:
>
>> Adding Grant Likely to this chain as it relates to the broader
>> subject of portable ABIs that we've been discussing.
>>
>> On Mon, Nov 9, 2015 at 4:48 PM, Jim Wilson wrote:
>>
>>> On Mon, Nov 9, 2015 at 2:39 PM, Bill Fischofer wrote:
>>>> The IO Visor project appears to be doing something like this with
>>>> LLVM and JIT constructs to dynamically insert code into the kernel in a
>>>> platform-independent manner. Perhaps we can leverage that technology?
>>>
>>> GCC has some experimental JIT support, but I think it would be a lot
>>> of work to use it, and I don't know how stable it is.
>>> https://gcc.gnu.org/wiki/JIT
>>> The LLVM support is probably more advanced.
>>>
>>> Jim



___
lng-odp mailing list
lng-...@lists.linaro.org 
https://lists.linaro.org/mailman/listinfo/lng-odp






___
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/linaro-toolchain


Re: [lng-odp] Runtime inlining

2015-11-10 Thread Maxim Uvarov
On 10 November 2015 at 13:41, Zoltan Kiss  wrote:

>
>
> On 10/11/15 07:39, Maxim Uvarov wrote:
>
>> JIT like lua might also not work because you need to rewrite OVS to
>> support it. I don't think that it will be accepted.
>>
>> And it looks like it's a problem in OVS, not in ODP. I.e. OVS should allow
>> using library functions for the fast path (where inlines are critical).
>> I.e. not just call odp_packet_len(), but move the whole OVS function to
>> a dynamic library.
>>
>
> I'm not sure I get your point here, but OVS does allow using dynamic library
> functions on the fast path. The problem is that it's slow because of the
> function call overhead.
>

I'm not familiar with the OVS code, but suppose OVS has something like:

void ovs_get_and_packet_process(void)
{
  /* here you use some inlines: */
  odp_packet_t pkt = odp_recv();
  uint32_t len = odp_packet_len(pkt);

  /* ... etc. */
}

So it's clear that each target arch needs its own variant of the
ovs_get_and_packet_process() function. That function should move from OVS
into a dynamic library.

Maxim.






Re: [lng-odp] Runtime inlining

2015-11-10 Thread Grant Likely
On Tue, Nov 10, 2015 at 11:08 AM, Maxim Uvarov  wrote:
> On 10 November 2015 at 13:41, Zoltan Kiss  wrote:
>> On 10/11/15 07:39, Maxim Uvarov wrote:
>>> And it looks like it's a problem in OVS, not in ODP. I.e. OVS should allow
>>> using library functions for the fast path (where inlines are critical).
>>> I.e. not just call odp_packet_len(), but move the whole OVS function to
>>> a dynamic library.
>>
>> I'm not sure I get your point here, but OVS does allow using dynamic library
>> functions on the fast path. The problem is that it's slow because of the
>> function call overhead.
>
> I'm not familiar with ovs code. But for example ovs has something like:
>
> ovs_get_and_packet_process()
> {
> // here you use some inlines:
>   pkt = odp_recv();
>   len = odp_packet_len(pkt);
>
> ... etc.
>
> }
>
> So it's clear that each target arch needs its own variant of the
> ovs_get_and_packet_process() function. That function should move from OVS
> into a dynamic library.

Which library? A library specific to OVS? Or some common ODP library
that everyone uses? In either case the solution is not scalable. In
the first case it still requires the app vendor to have a separate
build for each and every supported target. The second basically
argues for all fast-path application-specific code to go into a
non-app-specific library. That really won't fly.

I have two answers to this question. One for the short term, and one
for the long.

In the short term we have no choice. If we're going to support
portable application binaries, then we cannot do inlines. ODP simply
isn't set up to support that. Portable binaries will have to take the
hit of doing a function call each and every time. It's not fast, but
it *works*, which at least will set a lowest common denominator. To
mitigate the problem we could encourage application packages to
include a generic version (no inlines, but works everywhere) plus one
or more optimized builds (with inlines), with the correct binary
selected at runtime. Not great, but it is a reasonable answer for the
short term.
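One way to picture the "generic plus optimized builds" packaging is a small launcher that looks up the detected platform in a table and falls back to the generic build. The following is only a sketch in C: the platform names, the binary names and sketch_select_binary() are all hypothetical, and how the platform gets detected is deliberately left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table mapping a platform name to an optimized build. */
struct sketch_build {
	const char *platform;
	const char *binary;
};

static const struct sketch_build sketch_builds[] = {
	{ "dpdk", "ovs-vswitchd.odp-dpdk" },	/* built with inlines */
};

/* Fallback built without inlines; works on every platform. */
static const char sketch_generic_binary[] = "ovs-vswitchd.odp-generic";

/* Return the binary to run for the given platform name. */
const char *sketch_select_binary(const char *platform)
{
	size_t i;

	if (platform == NULL)
		return sketch_generic_binary;

	for (i = 0; i < sizeof(sketch_builds) / sizeof(sketch_builds[0]); i++) {
		if (strcmp(sketch_builds[i].platform, platform) == 0)
			return sketch_builds[i].binary;
	}

	/* Unknown platform: take the function-call hit, but still run. */
	return sketch_generic_binary;
}
```

A launcher would then exec the returned binary. The point of the fallback is that an unknown platform still gets a working, if slower, application.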

For the long term, to get away from per-platform builds, I see two
viable options. Bill suggested the first: use LLVM to optimize at
runtime so that things like inlines get picked up when linked to the
platform library. There is some precedent of other projects already
doing this, so this isn't as far-fetched as it may seem. The second is
to do what we already do in the kernel for ftrace: instrument the
function calls and patch them at runtime with optimized inlines. Not
pretty, probably fragile, but we do have the knowledge from the kernel
of how to do it. All said, I would prefer an LLVM-based solution, but
investigation is needed to figure out how to make it work.

g.



Re: [lng-odp] Runtime inlining

2015-11-10 Thread Zoltan Kiss



On 10/11/15 11:08, Maxim Uvarov wrote:
> On 10 November 2015 at 13:41, Zoltan Kiss wrote:
>> On 10/11/15 07:39, Maxim Uvarov wrote:
>>> JIT like lua might also not work because you need to rewrite OVS to
>>> support it. I don't think that it will be accepted.
>>>
>>> And it looks like it's a problem in OVS, not in ODP. I.e. OVS should allow
>>> using library functions for the fast path (where inlines are critical).
>>> I.e. not just call odp_packet_len(), but move the whole OVS function to
>>> a dynamic library.
>>
>> I'm not sure I get your point here, but OVS does allow using dynamic
>> library functions on the fast path. The problem is that it's slow
>> because of the function call overhead.
>
> I'm not familiar with ovs code. But for example ovs has something like:
>
> ovs_get_and_packet_process()
> {
> // here you use some inlines:
>    pkt = odp_recv();
>    len = odp_packet_len(pkt);
>
> ... etc.
>
> }
>
> So it's clear that each target arch needs its own variant of the
> ovs_get_and_packet_process() function. That function should move from OVS
> into a dynamic library.


I see. That would mitigate some of the problems, but unfortunately the
use of these accessor functions can't be narrowed down to a particular
piece of fast-path code. The packet length is quite a good example: you
need it very often during processing, in different parts of the code.






Re: [lng-odp] Runtime inlining

2015-11-10 Thread Adhemerval Zanella


On 10-11-2015 10:04, Grant Likely wrote:
> On Tue, Nov 10, 2015 at 11:08 AM, Maxim Uvarov wrote:
>> On 10 November 2015 at 13:41, Zoltan Kiss wrote:
>>> On 10/11/15 07:39, Maxim Uvarov wrote:
>>>> And it looks like it's a problem in OVS, not in ODP. I.e. OVS should allow
>>>> using library functions for the fast path (where inlines are critical).
>>>> I.e. not just call odp_packet_len(), but move the whole OVS function to
>>>> a dynamic library.
>>>
>>> I'm not sure I get your point here, but OVS does allow using dynamic library
>>> functions on the fast path. The problem is that it's slow because of the
>>> function call overhead.
>>
>> I'm not familiar with ovs code. But for example ovs has something like:
>>
>> ovs_get_and_packet_process()
>> {
>> // here you use some inlines:
>>   pkt = odp_recv();
>>   len = odp_packet_len(pkt);
>>
>> ... etc.
>>
>> }
>>
>> So it's clear that each target arch needs its own variant of the
>> ovs_get_and_packet_process() function. That function should move from OVS
>> into a dynamic library.
> 
> Which library? A library specific to OVS? Or some common ODP library
> that everyone uses? In either case the solution is not scalable. In
> the first case it still requires the app vendor to have a separate
> build for each and every supported target. The second basically
> argues for all fast-path application-specific code to go into a
> non-app-specific library. That really won't fly.
> 
> I have two answers to this question. One for the short term, and one
> for the long.
> 
> In the short term we have no choice. If we're going to support
> portable application binaries, then we cannot do inlines. ODP simply
> isn't set up to support that. Portable binaries will have to take the
> hit of doing a function call each and every time. It's not fast, but
> it *works*, which at least will set a lowest common denominator. To
> mitigate the problem we could encourage application packages to
> include a generic version (no-inlines, but works everywhere) plus one
> or more optimized builds (with inlines) and the correct binary is
> selected at runtime. Not great, but it is a reasonable answer for the
> short term.
> 
> For the long term, to get away from per-platform builds, I see two
> viable options. Bill suggested the first: use LLVM to optimize at
> runtime so that things like inlines get picked up when linked to the
> platform library. There is some precedent of other projects already
> doing this, so this isn't as far-fetched as it may seem. The second is
> to do what we already do in the kernel for ftrace: instrument the
> function calls and runtime patch them with optimized inlines. Not
> pretty, probably fragile, but we do have the knowledge from the kernel
> of how to do it. All said, I would prefer an LLVM based solution, but
> investigation is needed to figure out how to make it work.

The LLVM JIT approach will require a lot of engineering work on the ODP
side. Currently LLVM provides two JIT engines: MCJIT and ORC (which is new
in LLVM 3.7).

MCJIT works on 'modules': a program can either pass a C or IR file, or
use the API to create a module with multiple functions. The JIT engine will
then build an ELF module and load it into the process address space (VMA).
It is essentially an AOT JIT.

ORC stands for 'On Request Compilation'. It differs from MCJIT in that it
aims at lazy compilation using indirection hooks: a function won't be
JITted until it is called. [1]

In either case you won't get inline speed if you decide to JIT just the
inline calls; they will still be indirect calls to the JITted functions.
Neither engine supports patchpoints, which is what the kernel uses to
dynamically patch specific instructions in the code.

If you actually want to change the code dynamically, you can try the
DynamoRIO [2] project, which aims to provide an API to do so. However,
it is aimed at instrumentation, so I am not sure how well it suits
performance-sensitive projects.

Instead of focusing on dynamic code generation for such inlines, I would
suggest working on more general functions that are actually called
through either the PLT or indirections, and creating a dispatch that
selects the implementation at runtime.

You can follow the GCC strategy for indirect calls
(__builtin_cpu_supports(), which openssl emulates as well) or, since
it is a library, use IFUNC on the PLT calls (like GLIBC does for
memory and math operations). With current GCC you can build different
versions of the same function and add an IFUNC dispatch to select the
best one at runtime.

[1] http://article.gmane.org/gmane.comp.compilers.llvm.devel/80639
[2] http://www.dynamorio.org/
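
As a rough illustration of the runtime-dispatch idea above, here is a sketch in portable C that uses a plain function pointer rather than IFUNC. Everything in it is hypothetical: the sketch_* names, the choice of "sum the packet lengths of a burst" as the coarse-grained routine, and the have_fast_path flag, which stands in for a real feature probe such as __builtin_cpu_supports() on x86.

```c
#include <stddef.h>
#include <stdint.h>

/* Baseline build of the routine: works everywhere. */
static uint64_t sketch_sum_len_generic(const uint32_t *len, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += len[i];
	return total;
}

/* Stand-in for a platform-tuned build of the same routine. */
static uint64_t sketch_sum_len_unrolled(const uint32_t *len, size_t n)
{
	uint64_t total = 0;
	size_t i = 0;

	for (; i + 4 <= n; i += 4)
		total += (uint64_t)len[i] + len[i + 1] + len[i + 2] + len[i + 3];
	for (; i < n; i++)
		total += len[i];
	return total;
}

typedef uint64_t (*sketch_sum_len_fn)(const uint32_t *, size_t);

/* Resolved once at startup; defaults to the generic build. */
static sketch_sum_len_fn sketch_sum_len = sketch_sum_len_generic;

/* Pick the implementation; the flag replaces a real feature check. */
void sketch_dispatch_init(int have_fast_path)
{
	sketch_sum_len = have_fast_path ? sketch_sum_len_unrolled
					: sketch_sum_len_generic;
}

/* One indirect call per burst instead of one call per packet. */
uint64_t sketch_burst_bytes(const uint32_t *len, size_t n)
{
	return sketch_sum_len(len, n);
}
```

The design point is that the indirection is paid once per burst rather than once per accessor call. With IFUNC the same selection would be made by the dynamic loader when the symbol is resolved, instead of by an explicit init function.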

> 
> g.
> 

Re: [lng-odp] Runtime inlining

2015-11-10 Thread Ola Liljedahl
On 6 November 2015 at 15:48, Zoltan Kiss  wrote:

> Hi,
>
> We have a packaging/linking/optimization problem at LNG, I hope you guys
> can give us some advice on that. (Cc'ing the ODP list in case someone wants
> to add something.)
> We have OpenDataPlane (ODP), an API stretching between userspace
> applications and hardware SDKs. It's defined in the form of C headers, and
> we already have several implementations to face SDKs (or whatever is
> actually controlling the hardware), e.g. linux-generic, a DPDK one, etc.
> And we have applications, like Open vSwitch (OVS), which is now able to
> work with any ODP platform implementation that implements this API.
> When it comes to packaging, the ideal scenario would be to create one
> package for the application, e.g. openvswitch.deb, and one for each
> platform, e.g. odp-generic.deb, odp-dpdk.deb. The latter would contain the
> implementations in the form of a libodp.so file, so the application can
> dynamically load the installed platform's library at runtime, with
> all the benefits of dynamic linking.
>
We also need binary compatibility between different ODP implementations.
Binary compatibility that goes beyond an ABI.

I would be happy if, for a start, we could prove that we actually have source
code compatibility, e.g. compile the exact same app against different
ODP implementations and run each build on its respective platform with the
expected behaviour (including performance).

> The trouble is that we have several accessor functions in the API which are
> very short and __very__ frequently used. The best example is "uint32_t
> odp_packet_len(odp_packet_t pkt)", which returns the length of the packet.
> odp_packet_t is an opaque type defined by the implementation, often a
> pointer to the packet's actual metadata, so the function call boils down
> to a simple load from that metadata pointer (+offset). Having it wrapped
> in a function call brings a significant performance decrease: when
> forwarding 64 byte packets at 10 Gbps, I got 13.2 Mpps with function calls.
> When I inlined that function I got 13.8 Mpps, that's a ~5%
> difference. And there are a lot of other frequently used short accessor
> functions with the same problem.
> But obviously if I inline these functions I break the ABI, and I need to
> compile the application for each platform (and create packages like
> openvswitch-odp-dpdk.deb, containing the platform statically linked). I've
> tried to look around on Google and in the gcc manual, but I couldn't find a
> good solution for this kind of problem.
> I've checked link time optimization (-flto), but it only helps with static
> linking. Is there any way to keep the ODP application and platform
> implementation binaries in separate files while having the performance
> benefit of inlining?
>
> Regards,
>
> Zoltan
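
For scale, the quoted figures can be turned into a per-packet time budget. At 13.2 Mpps each packet gets roughly 75.8 ns and at 13.8 Mpps roughly 72.5 ns, so inlining that one accessor saves about 3.3 ns per packet, a throughput gain of about 4.5% (which the message rounds to ~5%). The two helpers below are hypothetical names used only to spell out that arithmetic in C.

```c
/* Time budget per packet in nanoseconds at a given rate in Mpps. */
double sketch_ns_per_packet(double mpps)
{
	/* 1e9 ns per second divided by (mpps * 1e6) packets per second */
	return 1000.0 / mpps;
}

/* Relative throughput gain of `after` over `before`, in percent. */
double sketch_gain_percent(double before, double after)
{
	return (after - before) / before * 100.0;
}
```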


Re: [lng-odp] Runtime inlining

2015-11-10 Thread Zoltan Kiss



On 10/11/15 12:04, Grant Likely wrote:



> Which library? A library specific to OVS? Or some common ODP library
> that everyone uses? In either case the solution is not scalable. In
> the first case it still requires the app vendor to have a separate
> build for each and every supported target. The second basically
> argues for all fast-path application-specific code to go into a
> non-app-specific library. That really won't fly.
>
> I have two answers to this question. One for the short term, and one
> for the long.
>
> In the short term we have no choice. If we're going to support
> portable application binaries, then we cannot do inlines. ODP simply
> isn't set up to support that. Portable binaries will have to take the
> hit of doing a function call each and every time. It's not fast, but
> it *works*, which at least will set a lowest common denominator. To
> mitigate the problem we could encourage application packages to
> include a generic version (no inlines, but works everywhere) plus one
> or more optimized builds (with inlines), with the correct binary
> selected at runtime. Not great, but it is a reasonable answer for the
> short term.


I would argue for producing platform-specific packages in the short term
as well, at least for ODP-OVS. As ODP-OVS is not upstream, we need to
produce an openvswitch-odp package anyway (which would be set to conflict
with the normal openvswitch package). My idea is to create
openvswitch-odp-[platform] packages, though I don't know if you can set
a wildcard conflict rule during packaging to make sure only one of them
is installed at a time.




> For the long term, to get away from per-platform builds, I see two
> viable options. Bill suggested the first: use LLVM to optimize at
> runtime so that things like inlines get picked up when linked to the
> platform library. There is some precedent of other projects already
> doing this, so this isn't as far-fetched as it may seem.

But wouldn't it tie us down with LLVM?

> The second is
> to do what we already do in the kernel for ftrace: instrument the
> function calls and patch them at runtime with optimized inlines. Not
> pretty, probably fragile, but we do have the knowledge from the kernel

Yes, I was also thinking about the ftrace way, but I'm not familiar enough
with ld.so to judge how hard it would be.

> of how to do it. All said, I would prefer an LLVM-based solution, but
> investigation is needed to figure out how to make it work.
>
> g.



Re: [lng-odp] Runtime inlining

2015-11-10 Thread Grant Likely
On Tue, Nov 10, 2015 at 3:04 PM, Zoltan Kiss  wrote:
>
>
> On 10/11/15 12:04, Grant Likely wrote:
>> Which library? A library specific to OVS? Or some common ODP library
>> that everyone uses? In either case the solution is not scalable. In
>> the first case it still requires the app vendor to have a separate
>> build for each and every supported target. The second basically
>> argues for all fast-path application-specific code to go into a
>> non-app-specific library. That really won't fly.
>>
>> I have two answers to this question. One for the short term, and one
>> for the long.
>>
>> In the short term we have no choice. If we're going to support
>> portable application binaries, then we cannot do inlines. ODP simply
>> isn't set up to support that. Portable binaries will have to take the
>> hit of doing a function call each and every time. It's not fast, but
>> it *works*, which at least will set a lowest common denominator. To
>> mitigate the problem we could encourage application packages to
>> include a generic version (no-inlines, but works everywhere) plus one
>> or more optimized builds (with inlines) and the correct binary is
>> selected at runtime. Not great, but it is a reasonable answer for the
>> short term.
>
>
> I would argue for the short term to produce platform specific packages as
> well, at least for ODP-OVS. As ODP-OVS is not upstream, we need to produce
> an openvswitch-odp package anyway (which would set to conflict with the
> normal openvswitch package). My idea is to create openvswitch-odp-[platform]
> packages, though I don't know if you can set a wildcard conflict rule during
> packaging to make sure only one of them are installed at a time.
>
>>
>> For the long term, to get away from per-platform builds, I see two
>> viable options. Bill suggested the first: use LLVM to optimize at
>> runtime so that things like inlines get picked up when linked to the
>> platform library. There is some precedent of other projects already
>> doing this, so this isn't as far-fetched as it may seem.
>
>
> But wouldn't it tie us down with LLVM?

Does that worry you? LLVM is a mature, open source project with lots
of momentum behind it. There are worse things we could do than align
with LLVM when it brings capability that we cannot get anywhere else.

g.


Re: [lng-odp] Runtime inlining

2015-11-10 Thread Zoltan Kiss



On 10/11/15 15:08, Grant Likely wrote:
> On Tue, Nov 10, 2015 at 3:04 PM, Zoltan Kiss wrote:
>> On 10/11/15 12:04, Grant Likely wrote:





>>> Which library? A library specific to OVS? Or some common ODP library
>>> that everyone uses? In either case the solution is not scalable. In
>>> the first case it still requires the app vendor to have a separate
>>> build for each and every supported target. The second basically
>>> argues for all fast-path application-specific code to go into a
>>> non-app-specific library. That really won't fly.
>>>
>>> I have two answers to this question. One for the short term, and one
>>> for the long.
>>>
>>> In the short term we have no choice. If we're going to support
>>> portable application binaries, then we cannot do inlines. ODP simply
>>> isn't set up to support that. Portable binaries will have to take the
>>> hit of doing a function call each and every time. It's not fast, but
>>> it *works*, which at least will set a lowest common denominator. To
>>> mitigate the problem we could encourage application packages to
>>> include a generic version (no inlines, but works everywhere) plus one
>>> or more optimized builds (with inlines), with the correct binary
>>> selected at runtime. Not great, but it is a reasonable answer for the
>>> short term.
>>
>> I would argue for producing platform-specific packages in the short term
>> as well, at least for ODP-OVS. As ODP-OVS is not upstream, we need to
>> produce an openvswitch-odp package anyway (which would be set to conflict
>> with the normal openvswitch package). My idea is to create
>> openvswitch-odp-[platform] packages, though I don't know if you can set
>> a wildcard conflict rule during packaging to make sure only one of them
>> is installed at a time.
>>
>>> For the long term, to get away from per-platform builds, I see two
>>> viable options. Bill suggested the first: use LLVM to optimize at
>>> runtime so that things like inlines get picked up when linked to the
>>> platform library. There is some precedent of other projects already
>>> doing this, so this isn't as far-fetched as it may seem.
>>
>> But wouldn't it tie us down with LLVM?
>
> Does that worry you?


My only concern is that we would then require our applications to use LLVM
if they want performance. I don't know the impact of that.



> LLVM is a mature, open source project with lots
> of momentum behind it. There are worse things we could do than align
> with LLVM when it brings capability that we cannot get anywhere else.
>
> g.




Re: [lng-odp] Runtime inlining

2015-11-10 Thread Pinski, Andrew


> On Nov 10, 2015, at 7:28 AM, Zoltan Kiss  wrote:
> 
> 
> 
>> On 10/11/15 15:08, Grant Likely wrote:
>>> On Tue, Nov 10, 2015 at 3:04 PM, Zoltan Kiss  wrote:
>>> 
>>> 
>>>> On 10/11/15 12:04, Grant Likely wrote:
>>>>
 
 
>>>> Which library? A library specific to OVS? Or some common ODP library
>>>> that everyone uses? In either case the solution is not scalable. In
>>>> the first case it still requires the app vendor to have a separate
>>>> build for each and every supported target. The second basically
>>>> argues for all fast-path application-specific code to go into a
>>>> non-app-specific library. That really won't fly.
>>>>
>>>> I have two answers to this question. One for the short term, and one
>>>> for the long.
>>>>
>>>> In the short term we have no choice. If we're going to support
>>>> portable application binaries, then we cannot do inlines. ODP simply
>>>> isn't set up to support that. Portable binaries will have to take the
>>>> hit of doing a function call each and every time. It's not fast, but
>>>> it *works*, which at least will set a lowest common denominator. To
>>>> mitigate the problem we could encourage application packages to
>>>> include a generic version (no inlines, but works everywhere) plus one
>>>> or more optimized builds (with inlines), with the correct binary
>>>> selected at runtime. Not great, but it is a reasonable answer for the
>>>> short term.
>>> 
>>> 
>>> I would argue for the short term to produce platform specific packages as
>>> well, at least for ODP-OVS. As ODP-OVS is not upstream, we need to produce
>>> an openvswitch-odp package anyway (which would set to conflict with the
>>> normal openvswitch package). My idea is to create openvswitch-odp-[platform]
>>> packages, though I don't know if you can set a wildcard conflict rule during
>>> packaging to make sure only one of them are installed at a time.
>>> 
 
>>>> For the long term, to get away from per-platform builds, I see two
>>>> viable options. Bill suggested the first: use LLVM to optimize at
>>>> runtime so that things like inlines get picked up when linked to the
>>>> platform library. There is some precedent of other projects already
>>>> doing this, so this isn't as far-fetched as it may seem.
>>> 
>>> 
>>> But wouldn't it tie us down with LLVM?
>> 
>> Does that worry you?
> 
> Only that then we require our applications to use LLVM if they want 
> performance. I don't know the impact of that.

Or they recompile the programs to get the speed. I am sorry, but this is not a
new problem. Most embedded folks are used to this. What an ODP vendor
could do is provide optimized versions of the programs they think are
important.

Thanks,
Andrew


> 
>> LLVM is a mature project, open source, and lots
>> of momentum behind it. There are worse things we can do than align
>> with LLVM when it brings capability that we cannot get anywhere else.
>> 
>> g.