Re: Proposal for hardware configuration dependent performance limits

Sebastian Huber Thu, 19 Nov 2020 00:26:55 -0800

Hello Chris,

On 17/11/2020 22:43, Chris Johns wrote:


On 17/11/20 6:14 pm, Sebastian Huber wrote:

On 16/11/2020 23:42, Chris Johns wrote:

On 16/11/20 5:40 pm, Sebastian Huber wrote:

On 16/11/2020 00:33, Chris Johns wrote:

In the proposal, limits are specified like this:


limits:
       sparc/gr712rc:
         DirtyCache:
           max-upper-bound: 0.000005
           mean-upper-bound: 0.000005
         FullCache:
           max-upper-bound: 0.000005
           mean-upper-bound: 0.000005
         HotCache:
           max-upper-bound: 0.000005
           mean-upper-bound: 0.000005
         Load/1:
           max-upper-bound: 0.00001
           mean-upper-bound: 0.00001
         Load/2:
           max-upper-bound: 0.00001
           mean-upper-bound: 0.00001
         Load/3:
           max-upper-bound: 0.00001
           mean-upper-bound: 0.00001
         Load/4:
           max-upper-bound: 0.00001
           mean-upper-bound: 0.00001

This neglects that the limits are subject to a board configuration. One
approach to cover this is the addition of a new BSP provided function:

const char *rtems_get_hardware_performance_hash();

The BSP feeds all performance related data into a hash function and

"data" here means configuration?

Yes, hardware configuration.

Why not make these values part of the BSP configuration? The defaults for the
BSP can have a set of suitable values. Different boards have different
configurations to match and a separate kernel build.

This doesn't work on BSPs which support configuration via a hardware
enumeration, boot loader settings, or device trees. Also changes in the BSP
options have no influence on the BSP name. Not only does the BSP configuration
influence performance; the CPU options play a role too, for example RTEMS_SMP. In order to
compare performance values over time we have to obtain the values under the same
conditions.

Maybe I am not understanding the context.

A BSP, whichever one, has a set of options that configure it. An example is the
xilinx_zynq_zc702 and the `ZYNQ_RAM_LENGTH = 0x40000000`. If I have 2 Zynq
circuits, one with 256M and one with 1G, I need to build and maintain 2 RTEMS
builds and, from a purist's point of view, I need to maintain 2 builds of the exact
same application.

I asked about the fixed memory and your answer was to use the BSP options, the
size is fixed in the linker command files via the BSP option. That is what I
have done.

I would expect there exists a set of values for the xilinx_zynq_zc702 with no
SMP and with SMP as this BSP supports SMP. Those values would match all the
other settings for the BSP such as ZYNQ_CLOCK_CPU_1X, BSP_ARM_A9MPCORE_PERIPHCLK
etc. If my clock is different (and they are) I would need to supply a suitable
set of performance values if I wanted to pass those tests.

I am not questioning the need for the values or the tests. I am suggesting the
values form part of the BSP settings so a user can adjust them to suit their
specific set up in the same way they adjust other BSP settings. I do not think
we should attempt to hold or manage endless sets of possible values and I do
not see the need for complex encapsulation methods such as base64 hashes. The
systems we interact with are too complex and the list is endless.

I think it will be highly BSP-specific what parameters are relevant to the
performance limits. This is why I suggested to add a function which can be
implemented by each BSP.

const char *rtems_get_hardware_performance_something();

It should return a string which changes if a performance relevant parameter
changed. If it is only SMP/no-SMP, ZYNQ_CLOCK_CPU_1X, and
BSP_ARM_A9MPCORE_PERIPHCLK, then fine, just return "SMP/800MHz/400MHz" or 
whatever.
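
A minimal sketch of such a BSP-provided function, assuming the three parameters named above are the only performance-relevant ones. The `EXAMPLE_*` macros, both function names, and the "UP" marker for a non-SMP build are hypothetical and only stand in for whatever a real BSP would use:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the BSP and CPU options named in the thread. */
#define EXAMPLE_SMP_ENABLED true
#define EXAMPLE_CPU_CLOCK_HZ 800000000UL
#define EXAMPLE_PERIPH_CLOCK_HZ 400000000UL

/* Format a key from the performance-relevant parameters into buf. */
static const char *example_format_performance_key(
  char *buf,
  size_t size,
  bool smp,
  unsigned long cpu_hz,
  unsigned long periph_hz
)
{
  snprintf(
    buf,
    size,
    "%s/%luMHz/%luMHz",
    smp ? "SMP" : "UP", /* "UP" is an assumed marker for non-SMP builds */
    cpu_hz / 1000000UL,
    periph_hz / 1000000UL
  );
  return buf;
}

/* Hypothetical BSP hook: the string changes whenever a parameter changes. */
const char *example_get_hardware_performance_key(void)
{
  static char key[48];

  return example_format_performance_key(
    key,
    sizeof(key),
    EXAMPLE_SMP_ENABLED,
    EXAMPLE_CPU_CLOCK_HZ,
    EXAMPLE_PERIPH_CLOCK_HZ
  );
}
```

With the values assumed here the function returns "SMP/800MHz/400MHz". A BSP configured via a device tree would read the clocks at run time instead of using compile-time constants.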

I suggest you avoid heading down a path of specific strings, i.e. avoid something
meaningful a human can read. Also performance characteristics are a part of a
wider configuration topic. Maybe considering that would solve the performance
specific parts as well.

A label for a build of RTEMS is a good idea (see below) that could serve the
human readable part. I would consider computing a hash for the config.ini file,
i.e. the build, and embedding it. If you wanted to capture the state of the RTEMS
source built, you could optionally compute a hash for the entire source tree and
embed that as well. You can then have calls such as:

const char* rtems_config_build_hash(void);
const char* rtems_config_source_hash(void);

  [ the last one could return "NOT-AVAILABLE" if not enabled ]

The key point is defining markers, with defaults if optional, then wrapping your
configuration management system round them. Strings with a meaning such as
"SMP/800MHz/400MHz" are fragile because cosmetic changes break dependent
configuration management systems. A hash implies nothing specific, that task is
left to your CM systems.
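
To illustrate the "hash implies nothing specific" point, here is a rough sketch of how a build step could turn the bytes of a config.ini file into an opaque marker. The thread does not name a hash function; 64-bit FNV-1a is used here only because it is short, and the function names are hypothetical:

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* 64-bit FNV-1a over a byte buffer, e.g. the contents of config.ini. */
uint64_t example_fnv1a64(const void *data, size_t len)
{
  const unsigned char *p = data;
  uint64_t hash = UINT64_C(0xcbf29ce484222325); /* FNV offset basis */

  for (size_t i = 0; i < len; ++i) {
    hash ^= p[i];
    hash *= UINT64_C(0x100000001b3); /* FNV prime */
  }

  return hash;
}

/* Render the hash as 16 lowercase hex digits plus a terminating NUL. */
void example_hash_to_hex(uint64_t hash, char out[17])
{
  snprintf(out, 17, "%016" PRIx64, hash);
}
```

Any change to the hashed bytes, cosmetic or not, yields a different marker, so what goes into the hash has to be chosen with care if runs are to stay comparable.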

For a BSP specific case of runtime values what about:

const char* rtems_config_bsp_hash(void);

with a default returning "DEFAULT". A BSP could override a weak function to
provide a hash computed in a specific way.
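
A sketch of that default, assuming a GCC-compatible toolchain for the weak attribute. `rtems_config_bsp_hash` is the name proposed above, not an existing RTEMS API:

```c
/*
 * Weak default for the proposed function. A BSP that has run-time
 * configuration to report would provide a strong definition with the
 * same name in one of its own source files and the linker would pick
 * that one instead.
 */
__attribute__((weak)) const char *rtems_config_bsp_hash(void)
{
  return "DEFAULT";
}
```

Applications and tests call the function the same way whether or not the BSP overrides it.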

When I said a build label I was considering ...

[arm/beagleboneblack]
RTEMS_BUILD_LABEL = "...---..."

with a function 'rtems_config_build_label' to fetch it. The default could be
"RTEMS" if not set in config.ini. This would be useful when tracking deployed
builds of RTEMS. Consider this as labelling the config.ini file in a human
readable way that suits my CM processes.
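
One possible shape for this, assuming the build system passes the config.ini value to the compiler as a macro definition (that mechanism is an assumption, not something settled in this thread):

```c
/* Fall back to "RTEMS" when the build system did not set a label. */
#ifndef RTEMS_BUILD_LABEL
#define RTEMS_BUILD_LABEL "RTEMS"
#endif

/* Proposed accessor for the human readable build label. */
const char *rtems_config_build_label(void)
{
  return RTEMS_BUILD_LABEL;
}
```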

Thanks for broadening the perspective. Maybe just focusing on the
performance limits was a bit too specific. However, if we put things
into a hash which only weakly influence the performance characteristics,
then comparable performance test runs will be hard over time.


Can environment variables affect a build of RTEMS? If so you either need to
include them somehow or have waf ignore them.

I don't know waf well enough. If some environment variables are set
during ./waf configure, a warning is printed. I don't know if
environment variables are used during ./waf build.

My point is that we need a key reported by the BSP and then some performance
limits which can be found by arch/bsp/key to check if there are performance
regressions.
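
As a rough illustration of the lookup, the sketch below finds the limits for one measurement variant by the full arch/bsp/key string. The struct, the table, and the function name are hypothetical. The key suffix is a placeholder and the bounds are the sample values from the proposal:

```c
#include <stddef.h>
#include <string.h>

typedef struct {
  const char *key;         /* arch/bsp/key as reported for the test run */
  const char *variant;     /* measurement variant, e.g. "DirtyCache" */
  double max_upper_bound;  /* seconds */
  double mean_upper_bound; /* seconds */
} example_limit;

/* Sample entries; a real tool would load these from the specification. */
static const example_limit example_limits[] = {
  { "sparc/gr712rc/something-in-addition", "DirtyCache", 0.000005, 0.000005 },
  { "sparc/gr712rc/something-in-addition", "Load/1", 0.00001, 0.00001 }
};

/* Return the limits for key and variant, or NULL if none are recorded. */
const example_limit *example_find_limit(const char *key, const char *variant)
{
  size_t count = sizeof(example_limits) / sizeof(example_limits[0]);

  for (size_t i = 0; i < count; ++i) {
    if (
      strcmp(example_limits[i].key, key) == 0 &&
      strcmp(example_limits[i].variant, variant) == 0
    ) {
      return &example_limits[i];
    }
  }

  return NULL;
}
```

A NULL result means no limits were recorded for this configuration, which a checker should report as "no baseline" rather than as a pass.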

I am missing the place where the performance limits are held. Do the tests
report timing values and the checks against the limits happen on a host?

Yes, this is what I proposed. An alternative would be to generate tables
with performance limits and excessive C preprocessor conditionals and
let the tests check the limits. Another option is to let the build
system generate the tables. This would require that the performance
limits are a part of the build specification.


The proposed work flow would be something like this:

1. You select a board to use for long term performance tests.

2. You define a set of configurations you want to test.

3. You do an initial run of the test suite for each configuration. The
RTEMS Tester provides you with a machine readable output (test data) of
the test run with the raw test output per test executable and some meta
information (TODO).

4. A tool reads the test data and the RTEMS specification and updates
the specification with the performance limits obtained from the test run
(maybe with some simple transformation, for example increase maximum by
10% and round to microseconds).


5. You review the performance limits and then commit them.

6. Later you run the tests with a new RTEMS commit, get the performance
values, compare them against the limits stored in the specification, and
generate a report.
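
Steps 4 and 6 could be sketched as follows. The function names are hypothetical, times are kept in integer nanoseconds, and rounding up to the next full microsecond is an assumption (the step above only says "round"):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Step 4: derive a limit from a measured maximum by adding 10% and
 * rounding up to the next full microsecond.  All values in nanoseconds.
 */
uint64_t example_derive_limit_ns(uint64_t measured_max_ns)
{
  uint64_t with_margin = (measured_max_ns * 11 + 9) / 10; /* ceil(x * 1.1) */

  return (with_margin + 999) / 1000 * 1000;
}

/* Step 6: a later measurement above the stored limit is a regression. */
bool example_is_regression(uint64_t measured_ns, uint64_t limit_ns)
{
  return measured_ns > limit_ns;
}
```

For example, a measured maximum of 4200 ns becomes 4620 ns with the margin and is stored as a limit of 5000 ns, which is 0.000005 in the notation of the specification items.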


In the specification items the limits are stored like this:

limits:
      sparc/gr712rc:
        DirtyCache:
          max-upper-bound: 0.000005
          mean-upper-bound: 0.000005

So each BSP has a separate block of lines. This avoids trouble with merge 
conflicts.

As discussed above, using arch/bsp as a key is not enough. We need to include 
other things, so it should be really:

limits:
      sparc/gr712rc/something-in-addition:
        DirtyCache:
          max-upper-bound: 0.000005
          mean-upper-bound: 0.000005

--
embedded brains GmbH
Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
Phone: +49-89-18 94 741 - 16
Fax:   +49-89-18 94 741 - 08
PGP: Public key available on request.

embedded brains GmbH
Registration court: Amtsgericht München
Registration number: HRB 157899
Managing directors authorized to represent the company: Peter Rasmussen, Thomas Dörfler
Our privacy policy is available here:
https://embedded-brains.de/datenschutzerklaerung/

_______________________________________________
devel mailing list
devel@rtems.org
http://lists.rtems.org/mailman/listinfo/devel
