On Tue, May 6, 2025 at 1:35 AM <soum...@nvidia.com> wrote:
>
> From: Soumya AR <soum...@nvidia.com>
>
> Hi,
>
> This RFC and subsequent patch series introduce support for printing and
> parsing aarch64 tuning parameters in the form of JSON.
>
> It is important to note that this mechanism is specifically intended for power
> users to experiment with tuning parameters. This proposal does not suggest the
> use of JSON tuning files in production. Additionally, the JSON format should
> not be considered stable and may change as GCC evolves.
>
> [1] Introduction
>
> Currently, the aarch64 backend in GCC (15) stores the tuning parameters of
> all CPUs under gcc/config/aarch64/tuning_models/. Since these parameters are
> hardcoded for each CPU, this RFC proposes a technique to support the
> adjustment of these parameters at runtime. This allows easier experimentation
> with more aggressive parameters to find optimal numbers.
>
> The tuning data is fed to the compiler in JSON format, which was primarily
> chosen for the following reasons:
>
> * JSON can represent hierarchical data. This is useful for incorporating the
> nested nature of the tuning structures.
> * JSON supports integers, strings, booleans, and arrays.
> * GCC already has support for parsing and printing JSON, removing the need for
> writing APIs to read and write the JSON files.
>
> Thus, if we take the following example of some tuning parameters:
>
> static struct cpu_addrcost_table generic_armv9_a_addrcost_table =
> {
>     {
>       1, /* hi  */
>       0, /* si  */
>       0, /* di  */
>       1, /* ti  */
>     },
>   0, /* pre_modify  */
>   0, /* post_modify  */
>   2, /* post_modify_ld3_st3  */
>   2, /* post_modify_ld4_st4  */
> };
>
> static cpu_prefetch_tune generic_armv9a_prefetch_tune =
> {
>   0,                    /* num_slots  */
>   -1,                   /* l1_cache_size  */
>   64,                   /* l1_cache_line_size  */
>   -1,                   /* l2_cache_size  */
>   true,                 /* prefetch_dynamic_strides */
> };
>
> static struct tune_params neoversev3_tunings =
> {
>   &generic_armv9_a_addrcost_table,
>   10, /* issue_rate  */
>   AARCH64_FUSE_NEOVERSE_BASE, /* fusible_ops  */
>   "32:16",      /* function_align.  */
>   &generic_armv9a_prefetch_tune,
>   AARCH64_LDP_STP_POLICY_ALWAYS,   /* ldp_policy_model.  */
> };
>
> We can represent them in JSON as:
>
> {
>   "tune_params": {
>     "addr_cost": {
>       "addr_scale_costs": { "hi": 1, "si": 0, "di": 0, "ti": 1 },
>       "pre_modify": 0,
>       "post_modify": 0,
>       "post_modify_ld3_st3": 2,
>       "post_modify_ld4_st4": 2
>     },
>     "issue_rate": 10,
>     "fusible_ops": 1584,
>     "function_align": "32:16",
>     "prefetch": {
>       "num_slots": 0,
>       "l1_cache_size": -1,
>       "l1_cache_line_size": 64,
>       "l2_cache_size": -1,
>       "prefetch_dynamic_strides": true
>     },
>     "ldp_policy_model": "AARCH64_LDP_STP_POLICY_ALWAYS"
>   }
> }
>
> ---
>
> [2] Methodology
>
> Before the internal tuning parameters are overridden with user-provided ones,
> we must ensure the validity of the provided data.
>
> This is done using a "base" JSON schema, which contains information about the
> tune_params data structure used by the aarch64 backend.
>
> Example:
>
> {
>   "tune_params": {
>     "addr_cost": {
>       "addr_scale_costs": {
>         "hi": "int",
>         "si": "int",
>         "di": "int",
>         "ti": "int"
>       },
>       "pre_modify": "int",
>       "post_modify": "int",
>       "post_modify_ld3_st3": "int",
>       "post_modify_ld4_st4": "int"
>     },
>     "issue_rate": "int",
>     "fusible_ops": "uint",
>     "function_align": "string",
>     "prefetch": {
>       "num_slots": "int",
>       "l1_cache_size": "int",
>       "l1_cache_line_size": "int",
>       "l2_cache_size": "int",
>       "prefetch_dynamic_strides": "boolean"
>     },
>     "ldp_policy_model": "string"
>   }
> }
>
> Using this schema, we can:
>         * Verify that the correct datatypes have been used.
>         * Verify that the user-provided "key" or tuning parameter exists.
>         * Allow the user to specify only the required fields (in nested
>         fashion), eliminating the need to list every single parameter if they
>         only wish to experiment with some.
>
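
A minimal sketch of the three checks listed above, not the code in the patch series. It assumes the nested keys have been flattened into dotted paths, and it uses hypothetical stand-ins (`Value`, `Schema`, `Tunings`, `validate`) rather than GCC's json.h classes:

```cpp
#include <map>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for a parsed JSON leaf value.
using Value = std::variant<long long, bool, std::string>;

// Nested keys are flattened to dotted paths to keep the sketch short.
using Schema = std::map<std::string, std::string>; // path -> type name
using Tunings = std::map<std::string, Value>;      // path -> user value

// Check VALUE against one of the type names used in the schema example.
static bool
matches_type (const Value &value, const std::string &type)
{
  if (type == "int")
    return std::holds_alternative<long long> (value);
  if (type == "uint")
    return (std::holds_alternative<long long> (value)
            && std::get<long long> (value) >= 0);
  if (type == "boolean")
    return std::holds_alternative<bool> (value);
  if (type == "string")
    return std::holds_alternative<std::string> (value);
  return false; // unknown type name in the schema
}

// Return one message per problem; an empty result means USER is valid.
// Keys that USER does not mention are left alone, so partial files pass.
std::vector<std::string>
validate (const Schema &schema, const Tunings &user)
{
  std::vector<std::string> errors;
  for (const auto &[path, value] : user)
    {
      auto it = schema.find (path);
      if (it == schema.end ())
        errors.push_back ("unknown key: " + path);
      else if (!matches_type (value, it->second))
        errors.push_back ("expected " + it->second + " for " + path);
    }
  return errors;
}
```

Because only the keys present in the user input are visited, a partial file is accepted as long as every key it names exists in the schema with the right type.
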
> The schema is currently stored as a raw JSON string in
> config/aarch64/aarch64-json-schema.h.
>
> 1: Parsing User Input and Overriding aarch64_tune_params
>
> Once validated, the data can be extracted and stored into aarch64_tune_params,
> overriding the default tunings.
>
> Thus, if -muser-provided-CPU=<json_file> is specified, we can call the
> following function in aarch64.cc to override the default tuning parameters:
>
> void
> aarch64_load_tuning_params_from_json (const char *data_filename,
>                                       struct tune_params *tune);
>
> 2: Dumping Back the Tuning Data (in JSON)
>
> If needed, the user can choose to print back the tuning data used during
> runtime. This is helpful for debugging and getting access to a "starter"
> tuning file template, which can then be modified and re-fed to the compiler.
>
> Thus, if -muser-provided-CPU=<json_file> is specified, we can call the
> following function in aarch64.cc after the final tuning structure has been
> populated:
>
> void
> aarch64_print_tune_params (const tune_params &params, const char *filename);
>
> ---
>
> [3] Testing
>
> To test out the functionality for this change, we have to ensure the following
> things are happening correctly:
>
> 1. The JSON tunings printer is able to print back the correct values,
> especially when it comes to trickier datatypes like enums.
> 2. The error handling works as expected, especially in the case of incorrect
> JSON syntax, incorrect datatypes, and incorrect tuning data structure.
> 3. During GCC invocation, the values from JSON are correctly loaded in
> aarch64_tune_params.
>
> To test these, we make use of a combination of regression tests (in
> gcc.target/aarch64/aarch64-json-tunings/) as well as self-tests to check the
> contents of aarch64_tune_params during the GCC build.
>
> ---
>
> [4] Limitations:
>
> Lack of comments in JSON:
>         * JSON does not have the ability to store comments, which leads to the
>         loss of useful information that is provided in the form of comments in
>         the header files. A workaround is to have a dummy "comment" key and
>         ignore it when parsing. (e.g., "comment": "parameter description")
>
> No enum support in JSON:
>         * The current workaround for this is to use strings instead of enums,
>         but we lose the ability to pass enums as values, as well as to do
>         bitwise operations on the enums, something used quite frequently for
>         some parameters.
>
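
A minimal sketch of the string workaround, assuming a single name table shared by the parser and the printer. The enum here is a local stand-in, not GCC's definition; only the ALWAYS name comes from the example above, and the NEVER entry and the function names are assumptions made for illustration:

```cpp
#include <optional>
#include <string>

// Local stand-in enum; not the definition used by the aarch64 backend.
enum class ldp_stp_policy
{
  always,
  never
};

struct policy_name
{
  const char *name;
  ldp_stp_policy value;
};

// One table drives both directions, so the two cannot drift apart.
static const policy_name policy_names[] = {
  {"AARCH64_LDP_STP_POLICY_ALWAYS", ldp_stp_policy::always},
  {"AARCH64_LDP_STP_POLICY_NEVER", ldp_stp_policy::never}, // assumed name
};

// Parser side: map the JSON string to the enum, or nothing if unknown.
std::optional<ldp_stp_policy>
parse_policy (const std::string &text)
{
  for (const policy_name &entry : policy_names)
    if (text == entry.name)
      return entry.value;
  return std::nullopt;
}

// Printer side: map the enum back to the string written to the JSON file.
const char *
policy_to_string (ldp_stp_policy value)
{
  for (const policy_name &entry : policy_names)
    if (entry.value == value)
      return entry.name;
  return "";
}
```

This handles one enumerator per string only; it does not address the bitwise combinations mentioned above.
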
> No type distinction in JSON:
>         * JSON uses the "number" type, which allows signed and unsigned
>         integers as well as floats but provides no distinction between them.
>
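
A minimal sketch of how the missing distinction could be recovered on the parsing side, assuming the JSON number arrives as a double. The helper names are hypothetical and not part of the patch series:

```cpp
#include <cmath>
#include <limits>
#include <optional>

// True if N is finite and has no fractional part.
static bool
is_whole_number (double n)
{
  return std::isfinite (n) && std::trunc (n) == n;
}

// Accept N for an "int" field only if it is a whole number in int range.
std::optional<int>
number_to_int (double n)
{
  if (!is_whole_number (n)
      || n < std::numeric_limits<int>::min ()
      || n > std::numeric_limits<int>::max ())
    return std::nullopt;
  return static_cast<int> (n);
}

// Accept N for a "uint" field only if it is a non-negative whole number
// that fits in unsigned int.
std::optional<unsigned int>
number_to_uint (double n)
{
  if (!is_whole_number (n)
      || n < 0
      || n > std::numeric_limits<unsigned int>::max ())
    return std::nullopt;
  return static_cast<unsigned int> (n);
}
```

With these checks a value such as 1.5 is rejected for either field type, and -1 is accepted for an "int" field but rejected for a "uint" one.
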
> Storing the JSON schema:
>         * The JSON schema is currently stored as a raw JSON string in
>         aarch64-json-schema.h. This is helpful in exposing the file to the
>         testing framework, but is not the cleanest solution.
>
>         * Theoretically, the schema could be stored in the installation
>         directory, but this interferes with the idea of having self-tests for
>         the JSON parser.
>
> Maintaining the printer/parser routines and JSON schema:
>         * Any change in the aarch64 tuning format will result in the need for
>         manual changes to be made to the routines for the JSON tunings
>         printer, parser, and schema.
>
> ---
>
> [5] Follow-Up Ideas:
>
> JSON to C++ File Conversion:
>         * Once the user has a JSON file with tuning values they are satisfied
>         with, they have to manually translate the file back to C++ header
>         files using the correct structure formats. This can be automated using
>         a script that reads the JSON data and generates the appropriate header
>         file.
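
A minimal sketch of what such a generator could emit for the prefetch structure shown earlier, assuming the JSON values have already been read into a plain struct. `prefetch_values` and `emit_prefetch_tune` are hypothetical names, and JSON reading is left out:

```cpp
#include <sstream>
#include <string>

// Hypothetical mirror of the five prefetch fields from the example.
struct prefetch_values
{
  int num_slots;
  int l1_cache_size;
  int l1_cache_line_size;
  int l2_cache_size;
  bool prefetch_dynamic_strides;
};

// Render VALUES as a cpu_prefetch_tune initializer named NAME, in the
// same field order and comment style as the tuning_models headers.
std::string
emit_prefetch_tune (const std::string &name, const prefetch_values &values)
{
  std::ostringstream out;
  out << "static cpu_prefetch_tune " << name << " =\n"
      << "{\n"
      << "  " << values.num_slots << ", /* num_slots  */\n"
      << "  " << values.l1_cache_size << ", /* l1_cache_size  */\n"
      << "  " << values.l1_cache_line_size << ", /* l1_cache_line_size  */\n"
      << "  " << values.l2_cache_size << ", /* l2_cache_size  */\n"
      << "  " << (values.prefetch_dynamic_strides ? "true" : "false")
      << ", /* prefetch_dynamic_strides */\n"
      << "};\n";
  return out.str ();
}
```

Calling it with the name generic_armv9a_prefetch_tune and the values 0, -1, 64, -1, true reproduces the fields of the example structure, without the column alignment.
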


One suggestion is to document this, at least in the internals manual, so
folks don't need to refer back to the sources and can read a decent
description of how to use it. I don't think it should be documented in
the user manual, though, as I don't think users, even power users, should
depend on it staying the same across versions; this is similar to how
`--param` options are treated (or rather should be).
My only worry about having this ability is that folks will mine it to
find the best settings for their program and their specific core at the
time, and somehow those become the standard from then on (an
example of this is
https://www.eecis.udel.edu/~xli/publications/park2007dynamic.pdf).

Thanks,
Andrew Pinski

>
> Soumya AR (5):
>   aarch64 + arm: Remove const keyword from tune_params members and
>     nested members
>   aarch64: Enable dumping of AArch64 CPU tuning parameters to JSON
>   json: Add get_map() method to JSON object class
>   aarch64: Enable parsing of user-provided AArch64 CPU tuning parameters
>   aarch64: Regression tests for parsing of user-provided AArch64 CPU
>     tuning parameters
>
>  gcc/config.gcc                                |   2 +-
>  gcc/config/aarch64/aarch64-cost-tables.h      |  18 +-
>  gcc/config/aarch64/aarch64-json-schema.h      | 261 ++++++
>  .../aarch64/aarch64-json-tunings-parser.cc    | 837 ++++++++++++++++++
>  .../aarch64/aarch64-json-tunings-parser.h     |  29 +
>  .../aarch64/aarch64-json-tunings-printer.cc   | 517 +++++++++++
>  .../aarch64/aarch64-json-tunings-printer.h    |  28 +
>  gcc/config/aarch64/aarch64-protos.h           | 182 ++--
>  gcc/config/aarch64/aarch64.cc                 |  45 +-
>  gcc/config/aarch64/aarch64.opt                |   8 +
>  gcc/config/aarch64/t-aarch64                  |  19 +
>  gcc/config/aarch64/tuning_models/a64fx.h      |  14 +-
>  gcc/config/aarch64/tuning_models/ampere1.h    |   8 +-
>  gcc/config/aarch64/tuning_models/ampere1a.h   |   2 +-
>  gcc/config/aarch64/tuning_models/ampere1b.h   |   8 +-
>  gcc/config/aarch64/tuning_models/cortexa35.h  |   2 +-
>  gcc/config/aarch64/tuning_models/cortexa53.h  |   4 +-
>  gcc/config/aarch64/tuning_models/cortexa57.h  |   8 +-
>  gcc/config/aarch64/tuning_models/cortexa72.h  |   2 +-
>  gcc/config/aarch64/tuning_models/cortexa73.h  |   2 +-
>  gcc/config/aarch64/tuning_models/cortexx925.h |  18 +-
>  gcc/config/aarch64/tuning_models/emag.h       |   2 +-
>  gcc/config/aarch64/tuning_models/exynosm1.h   |  14 +-
>  .../aarch64/tuning_models/fujitsu_monaka.h    |   2 +-
>  gcc/config/aarch64/tuning_models/generic.h    |  18 +-
>  .../aarch64/tuning_models/generic_armv8_a.h   |  18 +-
>  .../aarch64/tuning_models/generic_armv9_a.h   |  22 +-
>  .../aarch64/tuning_models/neoverse512tvb.h    |  10 +-
>  gcc/config/aarch64/tuning_models/neoversen1.h |   2 +-
>  gcc/config/aarch64/tuning_models/neoversen2.h |  18 +-
>  gcc/config/aarch64/tuning_models/neoversen3.h |  18 +-
>  gcc/config/aarch64/tuning_models/neoversev1.h |  20 +-
>  gcc/config/aarch64/tuning_models/neoversev2.h |  18 +-
>  gcc/config/aarch64/tuning_models/neoversev3.h |  18 +-
>  .../aarch64/tuning_models/neoversev3ae.h      |  18 +-
>  gcc/config/aarch64/tuning_models/qdf24xx.h    |  12 +-
>  gcc/config/aarch64/tuning_models/saphira.h    |   2 +-
>  gcc/config/aarch64/tuning_models/thunderx.h   |  10 +-
>  .../aarch64/tuning_models/thunderx2t99.h      |  12 +-
>  .../aarch64/tuning_models/thunderx3t110.h     |  12 +-
>  .../aarch64/tuning_models/thunderxt88.h       |   4 +-
>  gcc/config/aarch64/tuning_models/tsv110.h     |  12 +-
>  gcc/config/aarch64/tuning_models/xgene1.h     |  14 +-
>  gcc/config/arm/aarch-common-protos.h          | 128 +--
>  gcc/config/arm/aarch-cost-tables.h            |  12 +-
>  gcc/config/arm/arm-protos.h                   |   2 +-
>  gcc/config/arm/arm.cc                         |  20 +-
>  gcc/json.h                                    |  21 +-
>  gcc/selftest-run-tests.cc                     |   1 +
>  gcc/selftest.h                                |   1 +
>  .../aarch64-json-tunings.exp                  |  35 +
>  .../aarch64/aarch64-json-tunings/boolean-1.c  |   6 +
>  .../aarch64-json-tunings/boolean-1.json       |   9 +
>  .../aarch64/aarch64-json-tunings/boolean-2.c  |   7 +
>  .../aarch64-json-tunings/boolean-2.json       |   9 +
>  .../aarch64-json-tunings/empty-brackets.c     |   6 +
>  .../aarch64-json-tunings/empty-brackets.json  |   1 +
>  .../aarch64/aarch64-json-tunings/empty.c      |   6 +
>  .../aarch64/aarch64-json-tunings/empty.json   |   0
>  .../aarch64/aarch64-json-tunings/enum-1.c     |   8 +
>  .../aarch64/aarch64-json-tunings/enum-1.json  |   7 +
>  .../aarch64/aarch64-json-tunings/enum-2.c     |   7 +
>  .../aarch64/aarch64-json-tunings/enum-2.json  |   7 +
>  .../aarch64/aarch64-json-tunings/integer-1.c  |   7 +
>  .../aarch64-json-tunings/integer-1.json       |   6 +
>  .../aarch64/aarch64-json-tunings/integer-2.c  |   7 +
>  .../aarch64-json-tunings/integer-2.json       |   6 +
>  .../aarch64/aarch64-json-tunings/integer-3.c  |   7 +
>  .../aarch64-json-tunings/integer-3.json       |   5 +
>  .../aarch64/aarch64-json-tunings/integer-4.c  |   6 +
>  .../aarch64-json-tunings/integer-4.json       |   5 +
>  .../aarch64/aarch64-json-tunings/string-1.c   |   8 +
>  .../aarch64-json-tunings/string-1.json        |   7 +
>  .../aarch64/aarch64-json-tunings/string-2.c   |   7 +
>  .../aarch64-json-tunings/string-2.json        |   5 +
>  .../aarch64-json-tunings/unidentified-key.c   |   6 +
>  .../unidentified-key.json                     |   5 +
>  77 files changed, 2289 insertions(+), 381 deletions(-)
>  create mode 100644 gcc/config/aarch64/aarch64-json-schema.h
>  create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser.cc
>  create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser.h
>  create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer.cc
>  create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer.h
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/aarch64-json-tunings.exp
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-1.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-1.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-2.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty-brackets.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty-brackets.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-1.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-1.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-2.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-1.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-1.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-2.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-3.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-3.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-4.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-4.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-1.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-1.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-2.json
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/unidentified-key.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/unidentified-key.json
>
> --
> 2.44.0
>
