On Thu, Oct 13, 2016 at 11:21 AM, Jakob Viketoft < jakob.viket...@aacmicrotec.com> wrote:
> > *From:* Joel Sherrill [j...@rtems.org] > *Sent:* Thursday, October 13, 2016 17:38 > *To:* Jakob Viketoft > *Cc:* devel@rtems.org > *Subject:* Re: Time spent in ticks... > > >I don't have an or1k handy so ran on a sparc/erc32 simulator/ > >It is is a SPARC v7 at 15 Mhz. > > >These times are in microseconds and based on the tmtests. > >Specifically tm08and tm27. > > >(1) rtems_clock_tick: only case - 52 > >(2) rtems interrupt: entry overhead returns to interrupted task - 12 > >(3) rtems interrupt: exit overhead returns to interrupted task - 4 > >(4) rtems interrupt: entry overhead returns to nested interrupt - 11 > >(5) rtems interrupt: exit overhead returns to nested interrupt - 3 > > The above was from the master with SMP enabled. I repeated it with SMP disabled and it had no impact. Since the timing change is post 4.11, I decided to try 4.11 with SMP disabled: rtems_clock_tick: only case - 42 rtems interrupt: entry overhead returns to interrupted task - 11 rtems interrupt: exit overhead returns to interrupted task - 4 rtems interrupt: entry overhead returns to nested interrupt - 11 rtems interrupt: exit overhead returns to nested interrupt - 3 So 42 + 12 + 4 = 58 microseconds, 58 * 15 = 870 cycles So the overhead has gone up some but as Pavel says it is quite likely some mathematical operation on 64 bit types is slow on your CPU. HINT: If you can write a benchmark for 64-bit operations, it would be a good comparison between CPUs and might highlight where the software implementation needs improvement. > >The clock tick test has 100 tasks but it looks like they are blocked on a > semaphore > >without timeout. > > >Your times look WAY too high. Maybe the interrupt is stuck on and > >not being cleared. > > >On the erc32, a nominal "nothing to do clock tick" would be 1+2+3 from > >above or 52+12+4 = 68 microseconds. 68 * 15 = 1020 machine cycles. > >So at a higher clock rate, it should be even less time. > > >My gut feeling is that I think something is wrong with the ISR handler > >and it is stuck. But the performance is definitely way too high. > > >--joel > > (Sorry if the format got somewhat I garbled, anything but top-posting have > to be done manually...) > > I re-tested my case using an -O3 optimization (we have been using -O0 > during development for debugging purposes) and I got a good performance > boost, but I'm still nowhere near your numbers. I can vouch for that the > interrupt (exception really) isn't stuck, but that the code unfortunately > takes a long time to compute. I have a subsecond counter (1/16 of a second) > which I'm sampling at various places in the code, storing its numbers to a > buffer in memory so as to interfere with the program as little as possible. > > With -O3, a tick handling still takes ~320 us to perform, but the weight > has now shifted. tc_windup takes ~214 us and the rest is obviously > _Watchdog_Tick(). When fragmenting the tc_windup function to find the worst > speed bumps the biggest contribution (~122 us) seem to be coming from scale > factor recalculation. Since it's 64 bits, it's turned into a software > function which can be quite time-consuming apparently. > > Even though _Watchdog_Tick() "only" takes ~100 us now, it still sound much > higher than your total tick with a slower system (we're running at 50 MHz). > > Is there anything we can do to improve these numbers? Is Clock_isr > intended to be run uninterrupted as it is now? Can't see that much of the > BSP patch code has anything to do with the speed of what I'm looking at > right now... > > /Jakob > > > > *Jakob Viketoft *Senior Engineer in RTL and embedded software > > ÅAC Microtec AB > Dag Hammarskjölds väg 48 > SE-751 83 Uppsala, Sweden > > T: +46 702 80 95 97 > http://www.aacmicrotec.com >
_______________________________________________ devel mailing list devel@rtems.org http://lists.rtems.org/mailman/listinfo/devel