On 20/11/14 12:36, Daniel Hellstrom wrote:
On 11/20/2014 12:08 PM, Sebastian Huber wrote:
On 20/11/14 10:22, Daniel Hellstrom wrote:

I will fix this. I missed it since it I never enabled RTEMS_PROFILING.

I did a test suite run on NGMP with profiling enabled and your patches with a local fix. So overall they don't make things worse and the median of all maximum thread dispatch disabled times drops a bit (Std = Git master, Opt = with your patches).
How should I interpret this, is it all RTEMS test-suite you have run performing the profiling?

Most tests (448) print an XML profiling report, e.g. something like this

  <ProfilingReport name="CLOCK TICK">
    <PerCPUProfilingReport processorIndex="0">
<MaxThreadDispatchDisabledTime unit="ns">49200</MaxThreadDispatchDisabledTime> <MeanThreadDispatchDisabledTime unit="ns">3484</MeanThreadDispatchDisabledTime> <TotalThreadDispatchDisabledTime unit="ns">4366444</TotalThreadDispatchDisabledTime>
<ThreadDispatchDisabledCount>1253</ThreadDispatchDisabledCount>
      <MaxInterruptDelay unit="ns">7555</MaxInterruptDelay>
      <MaxInterruptTime unit="ns">32822</MaxInterruptTime>
      <MeanInterruptTime unit="ns">16377</MeanInterruptTime>
      <TotalInterruptTime unit="ns">57485400</TotalInterruptTime>
      <InterruptCount>3510</InterruptCount>
    </PerCPUProfilingReport>
    <SMPLockProfilingReport name="SMP lock stats">
      <MaxAcquireTime unit="ns">1666</MaxAcquireTime>
      <MaxSectionTime unit="ns">3022</MaxSectionTime>
      <MeanAcquireTime unit="ns">1411</MeanAcquireTime>
      <MeanSectionTime unit="ns">2393</MeanSectionTime>
      <TotalAcquireTime unit="ns">14111</TotalAcquireTime>
      <TotalSectionTime unit="ns">23933</TotalSectionTime>
      <UsageCount>10</UsageCount>
      <ContentionCount initialQueueLength="0">10</ContentionCount>
      <ContentionCount initialQueueLength="1">0</ContentionCount>
      <ContentionCount initialQueueLength="2">0</ContentionCount>
      <ContentionCount initialQueueLength="3">0</ContentionCount>
    </SMPLockProfilingReport>
    <SMPLockProfilingReport name="Giant">
      <MaxAcquireTime unit="ns">1488</MaxAcquireTime>
      <MaxSectionTime unit="ns">49711</MaxSectionTime>
      <MeanAcquireTime unit="ns">800</MeanAcquireTime>
      <MeanSectionTime unit="ns">9627</MeanSectionTime>
      <TotalAcquireTime unit="ns">3409800</TotalAcquireTime>
      <TotalSectionTime unit="ns">41013688</TotalSectionTime>
      <UsageCount>4260</UsageCount>
      <ContentionCount initialQueueLength="0">4260</ContentionCount>
      <ContentionCount initialQueueLength="1">0</ContentionCount>
      <ContentionCount initialQueueLength="2">0</ContentionCount>
      <ContentionCount initialQueueLength="3">0</ContentionCount>
    </SMPLockProfilingReport>
    <SMPLockProfilingReport name="chains">
      <MaxAcquireTime unit="ns">1466</MaxAcquireTime>
      <MaxSectionTime unit="ns">2466</MaxSectionTime>
      <MeanAcquireTime unit="ns">929</MeanAcquireTime>
      <MeanSectionTime unit="ns">2017</MeanSectionTime>
      <TotalAcquireTime unit="ns">12088</TotalAcquireTime>
      <TotalSectionTime unit="ns">26222</TotalSectionTime>
      <UsageCount>13</UsageCount>
      <ContentionCount initialQueueLength="0">13</ContentionCount>
      <ContentionCount initialQueueLength="1">0</ContentionCount>
      <ContentionCount initialQueueLength="2">0</ContentionCount>
      <ContentionCount initialQueueLength="3">0</ContentionCount>
    </SMPLockProfilingReport>
    <SMPLockProfilingReport name="per-CPU">
      <MaxAcquireTime unit="ns">1888</MaxAcquireTime>
      <MaxSectionTime unit="ns">3000</MaxSectionTime>
      <MeanAcquireTime unit="ns">880</MeanAcquireTime>
      <MeanSectionTime unit="ns">1378</MeanSectionTime>
      <TotalAcquireTime unit="ns">610288</TotalAcquireTime>
      <TotalSectionTime unit="ns">955266</TotalSectionTime>
      <UsageCount>693</UsageCount>
      <ContentionCount initialQueueLength="0">693</ContentionCount>
      <ContentionCount initialQueueLength="1">0</ContentionCount>
      <ContentionCount initialQueueLength="2">0</ContentionCount>
      <ContentionCount initialQueueLength="3">0</ContentionCount>
    </SMPLockProfilingReport>
    <SMPLockProfilingReport name="TOD">
      <MaxAcquireTime unit="ns">1666</MaxAcquireTime>
      <MaxSectionTime unit="ns">4400</MaxSectionTime>
      <MeanAcquireTime unit="ns">803</MeanAcquireTime>
      <MeanSectionTime unit="ns">1376</MeanSectionTime>
      <TotalAcquireTime unit="ns">2858733</TotalAcquireTime>
      <TotalSectionTime unit="ns">4894755</TotalSectionTime>
      <UsageCount>3556</UsageCount>
      <ContentionCount initialQueueLength="0">3556</ContentionCount>
      <ContentionCount initialQueueLength="1">0</ContentionCount>
      <ContentionCount initialQueueLength="2">0</ContentionCount>
      <ContentionCount initialQueueLength="3">0</ContentionCount>
    </SMPLockProfilingReport>
    <SMPLockProfilingReport name="mount table entry">
      <MaxAcquireTime unit="ns">1800</MaxAcquireTime>
      <MaxSectionTime unit="ns">4088</MaxSectionTime>
      <MeanAcquireTime unit="ns">1217</MeanAcquireTime>
      <MeanSectionTime unit="ns">2001</MeanSectionTime>
      <TotalAcquireTime unit="ns">56022</TotalAcquireTime>
      <TotalSectionTime unit="ns">92088</TotalSectionTime>
      <UsageCount>46</UsageCount>
      <ContentionCount initialQueueLength="0">46</ContentionCount>
      <ContentionCount initialQueueLength="1">0</ContentionCount>
      <ContentionCount initialQueueLength="2">0</ContentionCount>
      <ContentionCount initialQueueLength="3">0</ContentionCount>
    </SMPLockProfilingReport>
    <SMPLockProfilingReport name="LEON3 IrqCtrl">
      <MaxAcquireTime unit="ns">1377</MaxAcquireTime>
      <MaxSectionTime unit="ns">3022</MaxSectionTime>
      <MeanAcquireTime unit="ns">1344</MeanAcquireTime>
      <MeanSectionTime unit="ns">2877</MeanSectionTime>
      <TotalAcquireTime unit="ns">2688</TotalAcquireTime>
      <TotalSectionTime unit="ns">5755</TotalSectionTime>
      <UsageCount>2</UsageCount>
      <ContentionCount initialQueueLength="0">2</ContentionCount>
      <ContentionCount initialQueueLength="1">0</ContentionCount>
      <ContentionCount initialQueueLength="2">0</ContentionCount>
      <ContentionCount initialQueueLength="3">0</ContentionCount>
    </SMPLockProfilingReport>
    <SMPLockProfilingReport name="thread zombies">
      <MaxAcquireTime unit="ns">1866</MaxAcquireTime>
      <MaxSectionTime unit="ns">3466</MaxSectionTime>
      <MeanAcquireTime unit="ns">1395</MeanAcquireTime>
      <MeanSectionTime unit="ns">2435</MeanSectionTime>
      <TotalAcquireTime unit="ns">6977</TotalAcquireTime>
      <TotalSectionTime unit="ns">12177</TotalSectionTime>
      <UsageCount>5</UsageCount>
      <ContentionCount initialQueueLength="0">5</ContentionCount>
      <ContentionCount initialQueueLength="1">0</ContentionCount>
      <ContentionCount initialQueueLength="2">0</ContentionCount>
      <ContentionCount initialQueueLength="3">0</ContentionCount>
    </SMPLockProfilingReport>
    <SMPLockProfilingReport name="per-CPU state">
      <MaxAcquireTime unit="ns">1644</MaxAcquireTime>
      <MaxSectionTime unit="ns">4866</MaxSectionTime>
      <MeanAcquireTime unit="ns">1222</MeanAcquireTime>
      <MeanSectionTime unit="ns">2944</MeanSectionTime>
      <TotalAcquireTime unit="ns">4888</TotalAcquireTime>
      <TotalSectionTime unit="ns">11777</TotalSectionTime>
      <UsageCount>4</UsageCount>
      <ContentionCount initialQueueLength="0">4</ContentionCount>
      <ContentionCount initialQueueLength="1">0</ContentionCount>
      <ContentionCount initialQueueLength="2">0</ContentionCount>
      <ContentionCount initialQueueLength="3">0</ContentionCount>
    </SMPLockProfilingReport>
  </ProfilingReport>

The image is a boxplot of all the MaxThreadDispatchDisabledTime samples.


Turning on/off interrupts should be faster, but the code calling the on/off routines should also be faster. I used GRMON to extract instruction traces for window overflow/underflow handling to benchmark.

It is a nice improvement.

--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax     : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP     : Public key available on request.

Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.

_______________________________________________
devel mailing list
devel@rtems.org
http://lists.rtems.org/mailman/listinfo/devel

Reply via email to