-Wformat and u8""

2022-05-09 Thread Ulrich Drepper via Gcc
I have a C++20+ code base which forces the program to run using a UTF-8
locale and then uses u8"" strings internally.  This causes warnings with
-Wformat.

#include <stdio.h>

int main()
{
  printf((const char*) u8"test %d\n", 1);
  return 0;
}

Compile with
   g++ -std=gnu++20 -c -O -Wall t.cc

and you'll see:
t.cc: In function ‘int main()’:
t.cc:5:24: warning: format string is not an array of type ‘char’ [-Wformat=]
5 |   printf((const char*) u8"test %d\n", 1);
  |                        ^

I would say it is not gcc's business to question my use of u8"" given that
I use a cast and the u8"" string can be parsed by the -Wformat handling.

Before filing a report I'd like to take the temperature and see whether
people agree with this.

Thanks.


Re: -Wformat and u8""

2022-05-09 Thread Florian Weimer via Gcc
* Ulrich Drepper via Gcc:

> t.cc: In function ‘int main()’:
> t.cc:5:24: warning: format string is not an array of type ‘char’ [-Wformat=]
> 5 |   printf((const char*) u8"test %d\n", 1);
>   |                        ^

This is not an aliasing violation because of the exception for char,
right?  So the warning does not even highlight theoretical undefined
behavior.

On the other hand, that cast is still quite ugly.  All string-related
functions in the C library currently need it.  It might obscure real
type errors.  Isn't this a problem with char8_t?

Thanks,
Florian



Re: GSoC Fortran - Do Concurrent

2022-05-09 Thread Thomas Schwinge
Hi Bryan!

Thanks for reaching out, and welcome to GCC!

On 2022-05-03T13:34:13-0500, Bryan Carroll via Gcc wrote:
> I know I'm too late for GSoC, but if Tobias Burnus or someone wants to
> mentor me, I'm willing to work on the Fortran - Do Concurrent project, as a
> volunteer, if it's not already taken. I didn't see the GSoC until a day
> before it was due.

So did another aspirant, Wil, who within very short notice threw together
a GSoC project application for that very task.  ;-P

I'm putting Wil in CC -- open discussion, and all that.

> A little about myself: I'm a research associate at the Center for Analysis
> and Prediction of Storms at the National Weather Center on the University of
> Oklahoma campus. Mostly I do hardware and software support on our servers
> and some occasional software development. I received my M.Sc. with a major
> of Applied Mathematics and Computer Science from the University of Central
> Oklahoma in the summer of 2020. I have experience with C, C++, Python, MPI,
> some OpenMP, and some Fortran. I took a compiler course a few years ago; so
> I have some experience with compilers. I've been wanting to get into GCC
> development.

Yes, that's certainly suitable background and motivation to start
contributing to GCC!

> If the Fortran - Do Concurrent is already taken, I'm also interested in some
> of the other projects. Let me know if you want to proceed.

We do the GSoC project ranking for GCC, but in the end it's Google who
decide how many GSoC slots we get, etc.  So, at this point, we don't know
yet whether Wil's "Fortran DO CONCURRENT" gets accepted as a GSoC project
this year, or whether it possibly might not work out for other reasons.
Assuming that Wil's GSoC project does come to fruition, we'd have to look
for a different task for you: the GSoC rules assume that participants
individually work on their own project.  Thus, you either have to wait a bit
longer, or find a different project that you're interested in?


Grüße
 Thomas


Re: -Wformat and u8""

2022-05-09 Thread Ulrich Drepper via Gcc
On Mon, May 9, 2022 at 11:26 AM Florian Weimer wrote:

> On the other hand, that cast is still quite ugly.


Yes, there aren't yet any I/O functions defined for char8_t and therefore
that's the best we can do right now.  I have all kinds of ugly macros to
hide these casts.


> All string-related
> functions in the C library currently need it.


Yes, but the cast isn't the issue.  Or more correctly: gcc disregarding the
cast for -Wformat is.

Anyway, I'm not concerned about the non-I/O functions.  This is all C++
code after all and there are functions for all the rest.


> Isn't this a problem with char8_t?
>

Well, yes, the problem is that gcc seems to just see the u8"" type
(char8_t) even though I tell it with the cast to regard it as a const
char.  Again, I ensure that the encoding matches and putting UTF-8 in char
strings is actually incorrect (in theory).


Re: -Wformat and u8""

2022-05-09 Thread Andreas Schwab
On May 09 2022, Florian Weimer via Gcc wrote:

> * Ulrich Drepper via Gcc:
>
>> t.cc: In function ‘int main()’:
>> t.cc:5:24: warning: format string is not an array of type ‘char’ [-Wformat=]
>> 5 |   printf((const char*) u8"test %d\n", 1);
>>   |                        ^
>
> This is not an aliasing violation because of the exception for char,
> right?  So the warning does not even highlight theoretical undefined
> behavior.
>
> On the other hand, that cast is still quite ugly.  All string-related
> functions in the C library currently need it.  It might obscure real
> type errors.  Isn't this a problem with char8_t?

In C++20, u8 literals have a distinct type, which is an incompatible
change from C++17.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."