Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-10 Thread Honnappa Nagarahalli
> > Hi Jerin, > > Following the guide to use the PMU counters(KO inserted and DPDK > recompiled), the numbers increased 10+ folds(bigger numbers here mean > more precise?), is this valid and expected? This is correct, big numbers mean, more precise/granular results. > No significant difference w

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-09 Thread Gavin Hu (Arm Technology China)
Hi Jerin, Following the guide to use the PMU counters(KO inserted and DPDK recompiled), the numbers increased 10+ folds(bigger numbers here mean more precise?), is this valid and expected? No significant difference was seen. gavin@net-arm-thunderx2:~/community/dpdk$ sudo ./test/test/test -l

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-09 Thread Olivier Matz
dahl > > CC: "dev@dpdk.org" , Honnappa Nagarahalli > > , "Ananyev, Konstantin" > > , "Gavin Hu (Arm Technology China)" > > , Steve Capper , nd > , > > "sta...@dpdk.org" > > Subject: Re: [dpdk-de

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Honnappa Nagarahalli
> > > On 07/10/2018, 06:03, "Jerin Jacob" > wrote: > > > > > > In arm64 case, it will have ATOMIC_RELAXED followed by asm > > volatile > ("":::"memory") of rte_pause(). > > > I would n't have any issue, if the generated code code is same or > better than the exiting case.

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Ola Liljedahl
On 08/10/2018, 16:46, "Ola Liljedahl" wrote: On 08/10/2018, 16:44, "Bruce Richardson" wrote: On Mon, Oct 08, 2018 at 09:22:05AM +, Ola Liljedahl wrote: > "* multi-producer safe lock-free ring buffer enqueue" > The comment is also wrong.

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Ola Liljedahl
On 08/10/2018, 16:44, "Bruce Richardson" wrote: On Mon, Oct 08, 2018 at 09:22:05AM +, Ola Liljedahl wrote: > "* multi-producer safe lock-free ring buffer enqueue" > The comment is also wrong. This design is not lock-free, how could it be when there is spinning > (wai

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Bruce Richardson
On Mon, Oct 08, 2018 at 09:22:05AM +, Ola Liljedahl wrote: > "* multi-producer safe lock-free ring buffer enqueue" > The comment is also wrong. This design is not lock-free, how could it be when > there is spinning > (waiting) for other threads in the code? If a thread must wait for other >

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Ola Liljedahl
uot; > , "Gavin Hu (Arm Technology China)" > , Steve Capper , nd , > "sta...@dpdk.org" > Subject: Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load > User-Agent: Mutt/1.10.1 (2018-07-13) > > External Email >

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Jerin Jacob
; "sta...@dpdk.org" > Subject: Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load > User-Agent: Mutt/1.10.1 (2018-07-13) > > External Email > > -Original Message- > > Date: Mon, 8 Oct 2018 11:59:16 + > > From: Ola Liljedahl

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Jerin Jacob
-Original Message- > Date: Mon, 8 Oct 2018 11:59:16 + > From: Ola Liljedahl > To: Jerin Jacob > CC: "dev@dpdk.org" , Honnappa Nagarahalli > , "Ananyev, Konstantin" > , "Gavin Hu (Arm Technology China)" > , Steve Capper , nd , > "sta...@dpdk.org" > Subject: Re: [PATCH v3 1/3] ring

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Ola Liljedahl
On 08/10/2018, 13:50, "Jerin Jacob" wrote: I don't know how that creates more undefined behavior. So replied in the context of your reply that, according to your view even Linux is running with undefined behavior. As I explained, Linux does not use C11 atomics (nor GCC __

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Jerin Jacob
-Original Message- > Date: Mon, 8 Oct 2018 11:21:42 + > From: Ola Liljedahl > To: Jerin Jacob > CC: "dev@dpdk.org" , Honnappa Nagarahalli > , "Ananyev, Konstantin" > , "Gavin Hu (Arm Technology China)" > , Steve Capper , nd , > "sta...@dpdk.org" > Subject: Re: [PATCH v3 1/3] ring

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Ola Liljedahl
On 08/10/2018, 12:47, "Jerin Jacob" wrote: -Original Message- > Date: Mon, 8 Oct 2018 10:25:45 + > From: Ola Liljedahl > To: Jerin Jacob > CC: "dev@dpdk.org" , Honnappa Nagarahalli > , "Ananyev, Konstantin" > , "Gavin Hu (Arm Technology China)" >

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Jerin Jacob
-Original Message- > Date: Mon, 8 Oct 2018 10:33:43 + > From: "Gavin Hu (Arm Technology China)" > To: Ola Liljedahl , Jerin Jacob > > CC: "dev@dpdk.org" , Honnappa Nagarahalli > , "Ananyev, Konstantin" > , Steve Capper , nd > , "sta...@dpdk.org" > Subject: RE: [PATCH v3 1/3] ring

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Jerin Jacob
-Original Message- > Date: Mon, 8 Oct 2018 10:25:45 + > From: Ola Liljedahl > To: Jerin Jacob > CC: "dev@dpdk.org" , Honnappa Nagarahalli > , "Ananyev, Konstantin" > , "Gavin Hu (Arm Technology China)" > , Steve Capper , nd , > "sta...@dpdk.org" > Subject: Re: [PATCH v3 1/3] ring

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Gavin Hu (Arm Technology China)
This is ThunderX2. > -Original Message- > From: Ola Liljedahl > Sent: Monday, October 8, 2018 6:39 PM > To: Gavin Hu (Arm Technology China) ; Jerin Jacob > > Cc: dev@dpdk.org; Honnappa Nagarahalli > ; Ananyev, Konstantin > ; Steve Capper ; > nd ; sta...@dpdk.org > Subject: Re: [PATCH v3 1

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Ola Liljedahl
On 08/10/2018, 12:33, "Gavin Hu (Arm Technology China)" wrote: I did benchmarking w/o and w/ the patch, it did not show any noticeable differences in terms of latency. Which platform is this? Here is the full log( 3 runs w/o the patch and 2 runs w/ the patch). sudo ./test/tes

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Gavin Hu (Arm Technology China)
I did benchmarking w/o and w/ the patch, it did not show any noticeable differences in terms of latency. Here is the full log( 3 runs w/o the patch and 2 runs w/ the patch). sudo ./test/test/test -l 16-19,44-47,72-75,100-103 -n 4 --socket-mem=1024 -- -i RTE>>ring_perf_autotest (#1 run of test w

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Ola Liljedahl
On 08/10/2018, 12:00, "Jerin Jacob" wrote: -Original Message- > Date: Mon, 8 Oct 2018 09:22:05 + > From: Ola Liljedahl > To: Jerin Jacob > CC: "dev@dpdk.org" , Honnappa Nagarahalli > , "Ananyev, Konstantin" > , "Gavin Hu (Arm Technology China)" >

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Ola Liljedahl
Or maybe performance gets worse but not because of that one additional instruction/cycle in ring buffer enqueue and dequeue but because function or loop alignment changed for one or more functions. When the benchmarking noise (possibly several % due to changes in code alignment) is bigger than

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Jerin Jacob
-Original Message- > Date: Mon, 8 Oct 2018 09:22:05 + > From: Ola Liljedahl > To: Jerin Jacob > CC: "dev@dpdk.org" , Honnappa Nagarahalli > , "Ananyev, Konstantin" > , "Gavin Hu (Arm Technology China)" > , Steve Capper , nd , > "sta...@dpdk.org" > Subject: Re: [PATCH v3 1/3] ring

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Ola Liljedahl
On 08/10/2018, 08:06, "Jerin Jacob" wrote: -Original Message- > Date: Sun, 7 Oct 2018 20:44:54 + > From: Ola Liljedahl > To: Jerin Jacob > CC: "dev@dpdk.org" , Honnappa Nagarahalli > , "Ananyev, Konstantin" > , "Gavin Hu (Arm Technology China)" > ,

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-07 Thread Jerin Jacob
-Original Message- > Date: Sun, 7 Oct 2018 20:44:54 + > From: Ola Liljedahl > To: Jerin Jacob > CC: "dev@dpdk.org" , Honnappa Nagarahalli > , "Ananyev, Konstantin" > , "Gavin Hu (Arm Technology China)" > , Steve Capper , nd , > "sta...@dpdk.org" > Subject: Re: [PATCH v3 1/3] ring

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-07 Thread Honnappa Nagarahalli
> > > > I doubt it is possible to benchmark with such a precision so to see the > > potential difference of one ADD instruction. > > Just changes in function alignment can affect performance by percents. > And > > the natural variation when not using a 100% deterministic system

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-07 Thread Ola Liljedahl
On 07/10/2018, 06:03, "Jerin Jacob" wrote: In arm64 case, it will have ATOMIC_RELAXED followed by asm volatile ("":::"memory") of rte_pause(). I would n't have any issue, if the generated code code is same or better than the exiting case. but it not the case, Right? The existing case

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-07 Thread Ola Liljedahl
On 07/10/2018, 06:03, "Jerin Jacob" wrote: How about fixing rte_pause() then? Meaning issuing power saving instructions on missing archs. Rte_pause() implemented as NOP or YIELD on ARM will likely not save any power. You should use WFE for that. I use this portable pattern:  //W

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-06 Thread Jerin Jacob
-Original Message- > Date: Sat, 6 Oct 2018 19:44:35 + > From: Ola Liljedahl > To: Jerin Jacob , "dev@dpdk.org" > > CC: Honnappa Nagarahalli , "Ananyev, > Konstantin" , "Gavin Hu (Arm Technology > China)" , Steve Capper , nd > , "sta...@dpdk.org" > Subject: Re: [PATCH v3 1/3] ring

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-06 Thread Ola Liljedahl
Some blogs posts about undefined behaviour in C/C++: https://blog.regehr.org/archives/213 http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html -- Ola

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-06 Thread Ola Liljedahl
On 06/10/2018, 09:42, "Jerin Jacob" wrote: -Original Message- > Date: Fri, 5 Oct 2018 20:34:15 + > From: Ola Liljedahl > To: Honnappa Nagarahalli , Jerin Jacob > > CC: "Ananyev, Konstantin" , "Gavin Hu (Arm > Technology China)" , "dev@dpdk.org" ,

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-06 Thread Jerin Jacob
-Original Message- > Date: Fri, 5 Oct 2018 20:34:15 + > From: Ola Liljedahl > To: Honnappa Nagarahalli , Jerin Jacob > > CC: "Ananyev, Konstantin" , "Gavin Hu (Arm > Technology China)" , "dev@dpdk.org" , > Steve Capper , nd , "sta...@dpdk.org" > > Subject: Re: [PATCH v3 1/3] ring

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-05 Thread Ola Liljedahl
On 05/10/2018, 22:29, "Honnappa Nagarahalli" wrote: > > I doubt it is possible to benchmark with such a precision so to see the > potential difference of one ADD instruction. > Just changes in function alignment can affect performance by percents. And > the natural variat

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-05 Thread Honnappa Nagarahalli
> > I doubt it is possible to benchmark with such a precision so to see the > potential difference of one ADD instruction. > Just changes in function alignment can affect performance by percents. And > the natural variation when not using a 100% deterministic system is going to > be a lot larger t

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-05 Thread Ola Liljedahl
I doubt it is possible to benchmark with such a precision so to see the potential difference of one ADD instruction. Just changes in function alignment can affect performance by percents. And the natural variation when not using a 100% deterministic system is going to be a lot larger than one cy

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-05 Thread Honnappa Nagarahalli
Hi Jerin, Thank you for generating the disassembly, that is really helpful. I agree with you that we have the option of moving parts 2 and 3 forward. I will let Gavin take a decision. I suggest that we run benchmarks on this patch alone and in combination with other patches in the serie

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-05 Thread Ola Liljedahl
So you don't want to write the proper C11 code because the compiler generates one extra instruction that way? You don't even know if that one extra instruction has any measurable impact on performance. E.g. it could be issued the cycle before together with other instructions. We can complain to

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-05 Thread Jerin Jacob
-Original Message- > Date: Fri, 5 Oct 2018 15:11:44 + > From: Honnappa Nagarahalli > To: "Ananyev, Konstantin" , Ola Liljedahl > , "Gavin Hu (Arm Technology China)" > , Jerin Jacob > CC: "dev@dpdk.org" , Steve Capper , nd > , "sta...@dpdk.org" > Subject: RE: [PATCH v3 1/3] ring: r

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-05 Thread Honnappa Nagarahalli
> > > > > > Hi Jerin, > > > > > > Thanks for your review, inline comments from our internal > discussions. > > > > > > BR. Gavin > > > > > > > -Original Message- > > > > From: Jerin Jacob > > > > Sent: Saturday

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-05 Thread Ola Liljedahl
On 05/10/2018, 15:45, "Ananyev, Konstantin" wrote: We all know that 32bit load/store on cpu we support - are atomic. Well, not necessarily true for unaligned loads and stores. But the "problem" here is that we are not directly generating 32-bit load and store instructions (that would re

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-05 Thread Ananyev, Konstantin
wrote: > > > > > -Original Message- > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Gavin Hu (Arm > Technology China) > > Sent: Friday, October 5, 2018 1:47 AM > > To: Jerin Jacob > > Cc: dev

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-05 Thread Ola Liljedahl
> Cc: dev@dpdk.org; Honnappa Nagarahalli ; Steve Capper ; Ola Liljedahl > ; nd ; sta...@dpdk.org > Subject: Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load > > Hi Jerin, > > Thanks for your review, inli

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-05 Thread Ola Liljedahl
c: dev@dpdk.org; Honnappa Nagarahalli ; Steve Capper ; Ola Liljedahl > ; nd ; sta...@dpdk.org > Subject: Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load > > Hi Jerin, > > Thanks for your review, inline comments from our i

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-05 Thread Ananyev, Konstantin
.@dpdk.org > Subject: Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load > > Hi Jerin, > > Thanks for your review, inline comments from our internal discussions. > > BR. Gavin > > > -Original Message- > > From: Jerin Jacob > > Sent: Sat

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-04 Thread Gavin Hu (Arm Technology China)
Hi Jerin, Thanks for your review, inline comments from our internal discussions. BR. Gavin > -Original Message- > From: Jerin Jacob > Sent: Saturday, September 29, 2018 6:49 PM > To: Gavin Hu (Arm Technology China) > Cc: dev@dpdk.org; Honnappa Nagarahalli > ; Steve Capper > ; Ola Lilje

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-09-29 Thread Jerin Jacob
-Original Message- > Date: Mon, 17 Sep 2018 16:17:22 +0800 > From: Gavin Hu > To: dev@dpdk.org > CC: gavin...@arm.com, honnappa.nagaraha...@arm.com, steve.cap...@arm.com, > ola.liljed...@arm.com, jerin.ja...@caviumnetworks.com, n...@arm.com, > sta...@dpdk.org > Subject: [PATCH v3 1/3] ri

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-09-27 Thread Justin He
> -Original Message- > From: Gavin Hu (Arm Technology China) > Sent: 2018年9月26日 17:29 > To: Gavin Hu (Arm Technology China) ; dev@dpdk.org > Cc: Honnappa Nagarahalli ; Steve Capper > ; Ola Liljedahl ; > jerin.ja...@caviumnetworks.com; nd ; sta...@dpdk.org; Justin > He > Subject: RE: [PA

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-09-26 Thread Gavin Hu (Arm Technology China)
+Justin He > -Original Message- > From: Gavin Hu > Sent: Monday, September 17, 2018 4:17 PM > To: dev@dpdk.org > Cc: Gavin Hu (Arm Technology China) ; Honnappa > Nagarahalli ; Steve Capper > ; Ola Liljedahl ; > jerin.ja...@caviumnetworks.com; nd ; sta...@dpdk.org > Subject: [PATCH v3 1/3]

[dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-09-17 Thread Gavin Hu
In update_tail, read ht->tail using __atomic_load.Although the compiler currently seems to be doing the right thing even without _atomic_load, we don't want to give the compiler freedom to optimise what should be an atomic load, it should not be arbitarily moved around. Fixes: 39368ebfc6 ("ring: i