libcpp allows non-ASCII characters to be entered directly in source files (.f90 etc.); the encoding used can be set with the option:

  -finput-charset=UTF-8

Cf. also -fexec-charset and -fwide-exec-charset, and
http://gcc.gnu.org/onlinedocs/gcc/Preprocessor-Options.html

If one uses

  gfortran -cpp -finput-charset=UTF-8 wide.f90

one currently gets the warning:

  f951: warning: command line option "-finput-charset=UTF-8" is valid for C/C++/ObjC/ObjC++ but not for Fortran [enabled by default]

The files scanner.c etc. already support reading wide characters; thus, in principle, only a few changes should be required.

Caveat: Many people still use kind=1 strings with non-ASCII characters; one should make sure that this continues to work. Stuffing the characters in unchanged, as is currently done, is one option. For Latin-1 (ISO 8859-1) characters one can also simply strip off the high bytes and write only the low byte. Using UTF-8 also works, though len() will then report too many characters.

-- 
           Summary: Support UTF-8 (and other encodings) in the source file (.f90) for CHARACTER(kind=4)
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: burnus at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45179
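
The kind=1 caveat in the description can be illustrated with a small stand-alone C sketch. It is not gfortran code: the helper names utf8_codepoints and ucs4_to_latin1 are hypothetical and appear nowhere in scanner.c or libcpp. The first helper shows why a byte-based length overcounts when a kind=1 string holds UTF-8; the second shows the "strip off the high bytes" idea, which is lossless only for code points up to U+00FF.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: count the code points in a NUL-terminated UTF-8
   string by skipping continuation bytes (bit pattern 10xxxxxx).
   Assumes the input is well-formed UTF-8. */
static size_t utf8_codepoints(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    return n;
}

/* Hypothetical helper: narrow UCS-4 characters to one byte each.
   This keeps only the low byte, so it is refused for anything above
   U+00FF (outside ISO 8859-1).
   Returns 0 on success, -1 if some character does not fit. */
static int ucs4_to_latin1(const uint32_t *in, size_t len, unsigned char *out)
{
    for (size_t i = 0; i < len; ++i) {
        if (in[i] > 0xFF)
            return -1;
        out[i] = (unsigned char)in[i];
    }
    return 0;
}
```

For the UTF-8 bytes of "café" (written as "caf\xC3\xA9"), strlen() returns 5 while utf8_codepoints() returns 4; this is the same discrepancy that len() would show for a kind=1 string filled with UTF-8.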