https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105959
--- Comment #16 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by David Malcolm <dmalc...@gcc.gnu.org>:

https://gcc.gnu.org/g:6b2740946d26ffde7e1318f24bae00443ece387d

commit r13-6815-g6b2740946d26ffde7e1318f24bae00443ece387d
Author: David Malcolm <dmalc...@redhat.com>
Date:   Wed Mar 22 16:48:27 2023 -0400

    testsuite: always use UTF-8 in scan-sarif-file[-not] [PR105959]

    c-c++-common/diagnostic-format-sarif-file-4.c is a test case for
    quoting non-ASCII source code in a SARIF diagnostic log.

    The SARIF standard mandates that .sarif files are UTF-8 encoded.

    PR testsuite/105959 notes that the test case fails when the system
    encoding is not UTF-8, such as when the "make" invocation is prefixed
    with LC_ALL=C, whereas it works in a UTF-8 locale.

    The root cause is that dg-scan opens the file for reading using the
    "system" encoding; I believe it is falling back to treating all files
    as effectively ISO 8859-1 in a non-UTF-8 locale.

    This patch fixes things by adding a mechanism to dg-scan to allow
    callers to (optionally) specify an encoding to use when reading the
    file, and updating scan-sarif-file (and the -not variant) to always
    use UTF-8 when calling dg-scan, fixing the test case with LC_ALL=C.

    gcc/testsuite/ChangeLog:
            PR testsuite/105959
            * gcc.dg-selftests/dg-final.exp
            (dg_final_directive_check_num_args): Update expected maximum
            number of args for the various directives using dg-scan.
            * lib/scanasm.exp (append_encoding_arg): New procedure.
            (dg-scan): Add optional 3rd argument: the encoding to use when
            reading from the file.
            * lib/scansarif.exp (scan-sarif-file): Treat the file as UTF-8
            encoded when reading it.
            (scan-sarif-file-not): Likewise.

    Signed-off-by: David Malcolm <dmalc...@redhat.com>
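
The failure mode described above can be reproduced outside the testsuite. The following is only an illustrative Python sketch, not the Tcl code in lib/scanasm.exp: it assumes a made-up one-line file standing in for SARIF output (the name example.sarif and its contents are hypothetical), writes it as UTF-8, and then reads it back once as UTF-8 and once as ISO 8859-1, which is the fallback the commit message suspects in a non-UTF-8 locale.

```python
import os
import tempfile

# Hypothetical stand-in for SARIF output that quotes non-ASCII source code.
# \u03c0 is GREEK SMALL LETTER PI; its UTF-8 encoding is the two bytes CF 80.
text = '{"text": "int \u03c0 = 3;"}'


def read_with(path, encoding):
    """Read the whole file, decoding its bytes with the given encoding."""
    with open(path, encoding=encoding) as f:
        return f.read()


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "example.sarif")  # hypothetical file name
    # SARIF files are UTF-8 on disk, so write the bytes that way.
    with open(path, "wb") as f:
        f.write(text.encode("utf-8"))

    as_utf8 = read_with(path, "utf-8")         # explicit encoding, as in the fix
    as_latin1 = read_with(path, "iso-8859-1")  # suspected locale fallback

# A scan for the non-ASCII character only succeeds with the right decoding.
pattern = "\u03c0"
found_utf8 = pattern in as_utf8
found_latin1 = pattern in as_latin1
```

Decoded as ISO 8859-1, each byte of the UTF-8 sequence becomes a separate character, so a pattern containing the original character can no longer match. Passing the encoding explicitly makes the result independent of the locale, which is the approach the patch takes for scan-sarif-file and scan-sarif-file-not.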