https://gcc.gnu.org/g:1844a4aa6615c2252303e70d41bdb18e7c5664c6

commit r15-4375-g1844a4aa6615c2252303e70d41bdb18e7c5664c6
Author: Jakub Jelinek <ja...@redhat.com>
Date:   Wed Oct 16 10:09:49 2024 +0200

    libcpp, c, middle-end: Optimize initializers using #embed in C
    
    This patch actually optimizes #embed, so far in C.
    
    For a simple testcase (with a 494447200-byte cc1plus):
    cat embed-11.c
    unsigned char a[] = {
      #embed "cc1plus"
    };
    time ./xgcc -B ./ -S -std=c23 -O2 embed-11.c
    
    real    0m13.647s
    user    0m7.157s
    sys     0m2.597s
    time ./xgcc -B ./ -c -std=c23 -O2 embed-11.c
    
    real    0m28.649s
    user    0m26.653s
    sys     0m1.958s
    
    and when configured against binutils with .base64 support
    time ./xgcc -B ./ -S -std=c23 -O2 embed-11.c
    
    real    0m4.283s
    user    0m2.288s
    sys     0m0.859s
    time ./xgcc -B ./ -c -std=c23 -O2 embed-11.c
    
    real    0m6.888s
    user    0m5.876s
    sys     0m1.002s
    
    (all times with --enable-checking=yes,rtl,extra compiler).
    
    Even just
    ./cc1plus -E -o embed-11.i embed-11.c
    (which doesn't have this optimization yet and so preprocesses it into a
    1.3GB preprocessed file) needed almost 25GB of RAM at compile time (but
    preprocessed fine).
    Compiling that embed-11.i with -std=c23 -O0 using unpatched gcc, I gave up
    after 400 seconds, by which point it had already eaten 45GB of RAM and
    hadn't written a single byte into embed-11.s yet.
    
    The patch introduces a new CPP_EMBED token which contains a raw memory image
    virtually representing a sequence of int literals.
    To simplify parsing, the preprocessor guarantees that CPP_EMBED is only
    emitted if there are 4+ literals in the sequence (it actually does that for
    64+ right now) and emits CPP_NUMBER CPP_COMMA CPP_EMBED CPP_COMMA
    CPP_NUMBER tokens (with more CPP_EMBED tokens separated by CPP_COMMA if it
    is longer than 2GB, as STRING_CSTs in GCC and also the new RAW_DATA_CST etc.
    are limited to INT_MAX elements).  The main reason is that the preprocessor
    doesn't really know in which context the #embed directive appears; there
    could be e.g.
    { 25 *
      #embed "whatever"
    * 2 - 15 }
    or similar and dealing with this special case deep in the expression parsing
    is undesirable.
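    
    The token layout described above can be modelled roughly as follows.  This
    is an illustrative sketch only, not libcpp code: struct tok,
    plan_embed_tokens and its max_chunk parameter are hypothetical names made
    up for this example.  The threshold of 64 is the one mentioned above, and
    in GCC the chunk limit would correspond to INT_MAX.

```c
#include <stddef.h>

/* Hypothetical token kinds, named after the libcpp tokens in the text.  */
enum tok_kind { TOK_NUMBER, TOK_COMMA, TOK_EMBED };

/* One planned token: for TOK_NUMBER and TOK_EMBED, OFF/LEN describe the
   byte range of the embedded file that the token stands for.  */
struct tok { enum tok_kind kind; size_t off; size_t len; };

/* Assumption taken from the commit text: CPP_EMBED is used for 64+ bytes.  */
#define EMBED_THRESHOLD 64

/* Plan the tokens for an embed of LEN bytes, with at most MAX_CHUNK bytes
   per TOK_EMBED.  Writes into OUT (capacity CAP) and returns the number of
   tokens, or 0 if LEN or MAX_CHUNK is 0 or CAP is too small.  */
size_t plan_embed_tokens(size_t len, size_t max_chunk,
                         struct tok *out, size_t cap)
{
    size_t n = 0;
    if (len == 0 || max_chunk == 0)
        return 0;
    if (len < EMBED_THRESHOLD) {
        /* Short embeds stay a plain NUMBER , NUMBER , ... sequence.  */
        if (cap < 2 * len - 1)
            return 0;
        for (size_t i = 0; i < len; i++) {
            if (i)
                out[n++] = (struct tok){ TOK_COMMA, 0, 0 };
            out[n++] = (struct tok){ TOK_NUMBER, i, 1 };
        }
        return n;
    }
    /* Long embeds: first byte, raw chunk(s), last byte.  */
    size_t chunks = (len - 2 + max_chunk - 1) / max_chunk;
    if (cap < 3 + 2 * chunks)
        return 0;
    out[n++] = (struct tok){ TOK_NUMBER, 0, 1 };
    for (size_t off = 1; off < len - 1; ) {
        size_t take = len - 1 - off;
        if (take > max_chunk)
            take = max_chunk;
        out[n++] = (struct tok){ TOK_COMMA, 0, 0 };
        out[n++] = (struct tok){ TOK_EMBED, off, take };
        off += take;
    }
    out[n++] = (struct tok){ TOK_COMMA, 0, 0 };
    out[n++] = (struct tok){ TOK_NUMBER, len - 1, 1 };
    return n;
}
```

    In this model the parser only ever sees TOK_EMBED directly after a comma
    and directly before a comma, which is the property the C FE relies on.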
    With the CPP_NUMBERs around it, I believe the only places in the C FE which
    need to handle the CPP_EMBED token are initializer parsing (the only one
    which adds actual optimizations for it), comma expressions (I believe
    nothing really cares whether it is 25,13,95 or
    25,13,0,1,2,3,4,5,6,7,8,9,10,13,95 etc., so besides the 2 outer CPP_NUMBERs
    the parsing just adds one INTEGER_CST to the comma expression; I doubt users
    want to be spammed with millions of -Wunused warnings per #embed),
    whatever uses c_parser_expr_list (function calls, attribute arguments,
    OpenMP sizes clause argument, OpenACC tile clause argument) and whatever
    uses c_parser_get_builtin_args (mainly for __builtin_shufflevector).
    Please correct me if I'm wrong.
    
    The patch introduces a RAW_DATA_CST tree code, which can then be used inside
    of array CONSTRUCTOR elt values.  In some sense RAW_DATA_CST is similar to
    STRING_CST, but right now STRING_CST is used only if the whole array
    initializer is that constant, while RAW_DATA_CST at index idx (which should
    always be an INTEGER_CST index; another advantage of the surrounding
    CPP_NUMBERs is that
    [30 ... 250] =
      #embed "whatever"
    really does what it would do with an integer sequence there) stands for
    [idx] = RAW_DATA_POINTER (val)[0],
    [idx+1] = RAW_DATA_POINTER (val)[1],
    ...
    [idx+RAW_DATA_LENGTH (val)-1] = RAW_DATA_POINTER (val)[RAW_DATA_LENGTH (val)-1].
    Another important thing is that, unlike STRING_CST which has the data
    embedded in it, RAW_DATA_CST doesn't own the data.  Instead it has a
    RAW_DATA_OWNER which owns the data; that can be a STRING_CST (e.g. used
    for PCH, or for LTO after reading LTO in) or another RAW_DATA_CST (with
    NULL RAW_DATA_OWNER, standing for data owned by libcpp buffers).  The
    advantage is that it can be cheaply peeled off, or split into multiple
    smaller pieces, e.g. if one uses a designated initializer to store
    something into the middle of a 10GB #embed array; in no case do we need to
    actually copy data around for that.
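    
    The non-owning view idea can be sketched in plain C as below.  This is a
    simplified model under stated assumptions, not the GCC tree API: struct
    raw_view, raw_view_at and raw_view_split are hypothetical names standing
    in for RAW_DATA_POINTER/RAW_DATA_LENGTH and for the splitting done when a
    designated initializer overwrites one element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a RAW_DATA_CST: a pointer into a buffer owned
   by somebody else, plus a length.  No bytes are owned or copied.  */
struct raw_view {
    const unsigned char *ptr;
    size_t len;
};

/* Value of array element IDX when view V starts at array index BASE,
   i.e. [BASE + k] = V.ptr[k] for 0 <= k < V.len.  */
unsigned char raw_view_at(struct raw_view v, size_t base, size_t idx)
{
    assert(idx >= base && idx - base < v.len);
    return v.ptr[idx - base];
}

/* Split V around the single element at offset POS, e.g. because a
   designated initializer stores something else there.  LEFT covers
   [0, POS) and RIGHT covers (POS, V.len); either may end up empty.
   Only pointers and lengths change, the data stays where it is.  */
void raw_view_split(struct raw_view v, size_t pos,
                    struct raw_view *left, struct raw_view *right)
{
    assert(pos < v.len);
    left->ptr = v.ptr;
    left->len = pos;
    right->ptr = v.ptr + pos + 1;
    right->len = v.len - pos - 1;
}
```

    Splitting costs the same whether the view covers ten bytes or 10GB, which
    is the point of keeping ownership separate from the view.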
    Right now RAW_DATA_CST is only used in initializers of integral arrays where
    the integer type has (host) CHAR_BIT precision, so usually char/signed
    char/unsigned char (for C++ later maybe std::byte); in theory we could, say,
    allocate a buffer 4 times as big for conversions to an int array, taking
    endianness and storage order reversal etc. into account, but I'm not sure
    if that is something that will actually be needed in the wild.
    An optimization inside of c-common.cc attempts to undo that CPP_NUMBER
    CPP_EMBED CPP_NUMBER division in case one uses #embed the usual way,
    doesn't use the boundary literals in weird ways, and the values there match
    the surrounding bytes in the owner buffer.
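    
    A much reduced model of that merging step is sketched below.  It is not
    the braced_list_to_string code from this patch: struct raw_view and
    raw_view_absorb are hypothetical names, only one literal on each side is
    considered, and bytes are compared as unsigned values, whereas the real
    code walks several neighbouring elements and handles signed element types.

```c
#include <stddef.h>

/* Hypothetical view into an owner buffer: bytes [start, start + len) of
   OWNER, which is OWNER_LEN bytes long.  */
struct raw_view {
    const unsigned char *owner;
    size_t owner_len;
    size_t start;
    size_t len;
};

/* BEFORE and AFTER are the integer literals adjacent to the view in the
   initializer, or -1 if there is none.  If a literal equals the owner
   byte just outside the view, grow the view over it instead of keeping
   a separate element.  Returns the number of literals absorbed (0..2).  */
int raw_view_absorb(struct raw_view *v, int before, int after)
{
    int absorbed = 0;
    if (before >= 0 && v->start > 0
        && v->owner[v->start - 1] == before) {
        v->start--;
        v->len++;
        absorbed++;
    }
    if (after >= 0 && v->start + v->len < v->owner_len
        && v->owner[v->start + v->len] == after) {
        v->len++;
        absorbed++;
    }
    return absorbed;
}
```

    With an owner buffer holding "This is a #embed text" and a view covering
    the middle 19 bytes, passing 'T' and 't' grows the view to the whole
    buffer; a literal that doesn't match the owner byte is left alone.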
    
    For LTO, in order to avoid copying perhaps gigabytes long data around,
    the hacks in the streamer out/in cause the data owned by libcpp to be
    streamed right into the stream and streamed back as a STRING_CST which
    owns the data.
    
    2024-10-16  Jakub Jelinek  <ja...@redhat.com>
    
    libcpp/
            * include/cpplib.h (TTYPE_TABLE): Add CPP_EMBED token type.
            * files.cc (finish_embed): For limit >= 64 and C preprocessing
            instead of emitting CPP_NUMBER CPP_COMMA separated sequence for the
            whole embed emit it just for the first and last byte and in between
            emit a CPP_EMBED token or tokens if too large.
    gcc/
            * treestruct.def (TS_RAW_DATA_CST): New.
            * tree.def (RAW_DATA_CST): New tree code.
            * tree-core.h (struct tree_raw_data): New type.
            (union tree_node): Add raw_data_cst member.
            * tree.h (RAW_DATA_LENGTH, RAW_DATA_POINTER, RAW_DATA_OWNER): Define.
            (gt_ggc_mx, gt_pch_nx): Declare overloads for tree_raw_data *.
            * tree.cc (tree_node_structure_for_code): Handle RAW_DATA_CST.
            (initialize_tree_contains_struct): Handle TS_RAW_DATA_CST.
            (tree_code_size): Handle RAW_DATA_CST.
            (initializer_zerop): Likewise.
            (gt_ggc_mx, gt_pch_nx): Define overloads for tree_raw_data *.
            * gimplify.cc (gimplify_init_ctor_eval): Handle RAW_DATA_CST.
            * fold-const.cc (operand_compare::operand_equal_p): Handle
            RAW_DATA_CST.  Formatting fix.
            (operand_compare::hash_operand): Handle RAW_DATA_CST.
            (native_encode_initializer): Likewise.
            (get_array_ctor_element_at_index): Likewise.
            (fold): Likewise.
            * gimple-fold.cc (fold_array_ctor_reference): Likewise.  Formatting
            fix.
            * varasm.cc (const_hash_1): Handle RAW_DATA_CST.
            (initializer_constant_valid_p_1): Likewise.
            (array_size_for_constructor): Likewise.
            (output_constructor_regular_field): Likewise.
            * expr.cc (categorize_ctor_elements_1): Likewise.
            (expand_expr_real_1) <case ARRAY_REF>: Punt for RAW_DATA_CST.
            * tree-streamer.cc (streamer_check_handled_ts_structures): Mark
            TS_RAW_DATA_CST as handled.
            * tree-streamer-in.cc (streamer_alloc_tree): Handle RAW_DATA_CST.
            (lto_input_ts_raw_data_cst_tree_pointers): New function.
            (streamer_read_tree_body): Call it for RAW_DATA_CST.
            * tree-streamer-out.cc (write_ts_raw_data_cst_tree_pointers): New
            function.
            (streamer_write_tree_body): Call it for RAW_DATA_CST.
            (streamer_write_tree_header): Handle RAW_DATA_CST.
            * lto-streamer-out.cc (DFS::DFS_write_tree_body): Handle RAW_DATA_CST.
            * tree-pretty-print.cc (dump_generic_node): Likewise.
    gcc/c-family/
            * c-ppoutput.cc (token_streamer::stream): Add special code to spell
            CPP_EMBED token.
            * c-lex.cc (c_lex_with_flags): Handle CPP_EMBED.  Formatting fix.
            * c-common.cc (c_parse_error): Handle CPP_EMBED.
            (braced_list_to_string): Optimize RAW_DATA_CST surrounded by
            INTEGER_CSTs which match some bytes before or after RAW_DATA_CST in
            its owner.
    gcc/c/
            * c-parser.cc (c_parser_braced_init): Handle CPP_EMBED.
            (c_parser_get_builtin_args): Likewise.
            (c_parser_expression): Likewise.
            (c_parser_expr_list): Likewise.
            * c-typeck.cc (digest_init): Handle RAW_DATA_CST.  Formatting fix.
            (init_node_successor): New function.
            (add_pending_init): Handle RAW_DATA_CST.
            (set_nonincremental_init): Formatting fix.
            (output_init_element): Handle RAW_DATA_CST.  Formatting fixes.
            (maybe_split_raw_data): New function.
            (process_init_element): Use maybe_split_raw_data.  Handle
            RAW_DATA_CST.
    gcc/testsuite/
            * c-c++-common/cpp/embed-20.c: New test.
            * c-c++-common/cpp/embed-21.c: New test.
            * c-c++-common/cpp/embed-28.c: New test.
            * gcc.dg/cpp/embed-8.c: New test.
            * gcc.dg/cpp/embed-9.c: New test.
            * gcc.dg/cpp/embed-10.c: New test.
            * gcc.dg/cpp/embed-11.c: New test.
            * gcc.dg/cpp/embed-12.c: New test.
            * gcc.dg/cpp/embed-13.c: New test.
            * gcc.dg/cpp/embed-14.c: New test.
            * gcc.dg/cpp/embed-15.c: New test.
            * gcc.dg/cpp/embed-16.c: New test.
            * gcc.dg/pch/embed-1.c: New test.
            * gcc.dg/pch/embed-1.hs: New test.
            * gcc.dg/lto/embed-1_0.c: New test.
            * gcc.dg/lto/embed-1_1.c: New test.

Diff:
---
 gcc/c-family/c-common.cc                  | 171 +++++++++++++-
 gcc/c-family/c-lex.cc                     |  44 +++-
 gcc/c-family/c-ppoutput.cc                |  54 +++++
 gcc/c/c-parser.cc                         |  89 ++++++-
 gcc/c/c-typeck.cc                         | 371 +++++++++++++++++++++++++++---
 gcc/expr.cc                               |  17 +-
 gcc/fold-const.cc                         |  81 ++++++-
 gcc/gimple-fold.cc                        |  42 +++-
 gcc/gimplify.cc                           |  43 ++++
 gcc/lto-streamer-out.cc                   |   3 +
 gcc/testsuite/c-c++-common/cpp/embed-20.c |  24 ++
 gcc/testsuite/c-c++-common/cpp/embed-21.c |  56 +++++
 gcc/testsuite/c-c++-common/cpp/embed-28.c |  38 +++
 gcc/testsuite/gcc.dg/cpp/embed-10.c       |   7 +
 gcc/testsuite/gcc.dg/cpp/embed-11.c       |  11 +
 gcc/testsuite/gcc.dg/cpp/embed-12.c       |  31 +++
 gcc/testsuite/gcc.dg/cpp/embed-13.c       |  39 ++++
 gcc/testsuite/gcc.dg/cpp/embed-14.c       |  24 ++
 gcc/testsuite/gcc.dg/cpp/embed-15.c       |  18 ++
 gcc/testsuite/gcc.dg/cpp/embed-16.c       |  11 +
 gcc/testsuite/gcc.dg/cpp/embed-8.c        |  87 +++++++
 gcc/testsuite/gcc.dg/cpp/embed-9.c        |  42 ++++
 gcc/testsuite/gcc.dg/lto/embed-1_0.c      |  19 ++
 gcc/testsuite/gcc.dg/lto/embed-1_1.c      |   5 +
 gcc/testsuite/gcc.dg/pch/embed-1.c        |  17 ++
 gcc/testsuite/gcc.dg/pch/embed-1.hs       |   6 +
 gcc/tree-core.h                           |   8 +
 gcc/tree-pretty-print.cc                  |  22 ++
 gcc/tree-streamer-in.cc                   |  32 +++
 gcc/tree-streamer-out.cc                  |  45 ++++
 gcc/tree-streamer.cc                      |   1 +
 gcc/tree.cc                               |  56 +++++
 gcc/tree.def                              |   8 +
 gcc/tree.h                                |  11 +
 gcc/treestruct.def                        |   1 +
 gcc/varasm.cc                             |  15 ++
 libcpp/files.cc                           |  47 +++-
 libcpp/include/cpplib.h                   |   2 +
 38 files changed, 1533 insertions(+), 65 deletions(-)

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 8ad9b998e7b3..7494a2dac0ae 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -6767,6 +6767,8 @@ c_parse_error (const char *gmsgid, enum cpp_ttype token_type,
     message = catenate_messages (gmsgid, " before end of line");
   else if (token_type == CPP_DECLTYPE)
     message = catenate_messages (gmsgid, " before %<decltype%>");
+  else if (token_type == CPP_EMBED)
+    message = catenate_messages (gmsgid, " before %<#embed%>");
   else if (token_type < N_TTYPES)
     {
       message = catenate_messages (gmsgid, " before %qs token");
@@ -9775,7 +9777,9 @@ maybe_add_include_fixit (rich_location *richloc, const char *header,
 
 /* Attempt to convert a braced array initializer list CTOR for array
    TYPE into a STRING_CST for convenience and efficiency.  Return
-   the converted string on success or the original ctor on failure.  */
+   the converted string on success or the original ctor on failure.
+   Also, for non-convertable CTORs which contain RAW_DATA_CST values
+   among the elts try to extend the range of RAW_DATA_CSTs.  */
 
 static tree
 braced_list_to_string (tree type, tree ctor, bool member)
@@ -9819,26 +9823,155 @@ braced_list_to_string (tree type, tree ctor, bool member)
   auto_vec<char> str;
   str.reserve (nelts + 1);
 
-  unsigned HOST_WIDE_INT i;
+  unsigned HOST_WIDE_INT i, j = HOST_WIDE_INT_M1U;
   tree index, value;
+  bool check_raw_data = false;
 
   FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), i, index, value)
     {
+      if (check_raw_data)
+       {
+         /* The preprocessor always surrounds CPP_EMBED tokens in between
+            CPP_NUMBER and CPP_COMMA tokens.  Try to undo that here now that
+            the whole initializer is parsed.  E.g. if we have
+            [0] = 'T', [1] = "his is a #embed tex", [20] = 't'
+            where the middle value is RAW_DATA_CST and in its owner this is
+            surrounded by 'T' and 't' characters, we can create from it just
+            [0] = "This is a #embed text"
+            Similarly if a RAW_DATA_CST needs to be split into two parts
+            because of designated init store but the stored value is actually
+            the same as in the RAW_DATA_OWNER's memory we can merge multiple
+            RAW_DATA_CSTs.  */
+         if (TREE_CODE (value) == RAW_DATA_CST
+             && index
+             && tree_fits_uhwi_p (index))
+           {
+             tree owner = RAW_DATA_OWNER (value);
+             unsigned int start, end, k;
+             if (TREE_CODE (owner) == STRING_CST)
+               {
+                 start
+                   = RAW_DATA_POINTER (value) - TREE_STRING_POINTER (owner);
+                 end = TREE_STRING_LENGTH (owner) - RAW_DATA_LENGTH (value);
+               }
+             else
+               {
+                 gcc_checking_assert (TREE_CODE (owner) == RAW_DATA_CST);
+                 start
+                   = RAW_DATA_POINTER (value) - RAW_DATA_POINTER (owner);
+                 end = RAW_DATA_LENGTH (owner) - RAW_DATA_LENGTH (value);
+               }
+             end -= start;
+             unsigned HOST_WIDE_INT l = j == HOST_WIDE_INT_M1U ? i : j;
+             for (k = 0; k < start && k < l; ++k)
+               {
+                 constructor_elt *elt = CONSTRUCTOR_ELT (ctor, l - k - 1);
+                 if (elt->index == NULL_TREE
+                     || !tree_fits_uhwi_p (elt->index)
+                     || !tree_fits_shwi_p (elt->value)
+                     || wi::to_widest (index) != (wi::to_widest (elt->index)
+                                                  + (k + 1)))
+                   break;
+                 if (TYPE_UNSIGNED (TREE_TYPE (value)))
+                   {
+                     if (tree_to_shwi (elt->value)
+                         != *((const unsigned char *)
+                              RAW_DATA_POINTER (value) - k - 1))
+                       break;
+                   }
+                 else if (tree_to_shwi (elt->value)
+                          != *((const signed char *)
+                               RAW_DATA_POINTER (value) - k - 1))
+                   break;
+               }
+             start = k;
+             l = 0;
+             for (k = 0; k < end && k + 1 < CONSTRUCTOR_NELTS (ctor) - i; ++k)
+               {
+                 constructor_elt *elt = CONSTRUCTOR_ELT (ctor, i + k + 1);
+                 if (elt->index == NULL_TREE
+                     || !tree_fits_uhwi_p (elt->index)
+                     || (wi::to_widest (elt->index)
+                         != (wi::to_widest (index)
+                             + (RAW_DATA_LENGTH (value) + l))))
+                   break;
+                 if (TREE_CODE (elt->value) == RAW_DATA_CST
+                     && RAW_DATA_OWNER (elt->value) == RAW_DATA_OWNER (value)
+                     && (RAW_DATA_POINTER (elt->value)
+                         == RAW_DATA_POINTER (value) + l))
+                   {
+                     l += RAW_DATA_LENGTH (elt->value);
+                     end -= RAW_DATA_LENGTH (elt->value) - 1;
+                     continue;
+                   }
+                 if (!tree_fits_shwi_p (elt->value))
+                   break;
+                 if (TYPE_UNSIGNED (TREE_TYPE (value)))
+                   {
+                     if (tree_to_shwi (elt->value)
+                         != *((const unsigned char *)
+                              RAW_DATA_POINTER (value)
+                              + RAW_DATA_LENGTH (value) + k))
+                       break;
+                   }
+                 else if (tree_to_shwi (elt->value)
+                          != *((const signed char *)
+                               RAW_DATA_POINTER (value)
+                               + RAW_DATA_LENGTH (value) + k))
+                   break;
+                 ++l;
+               }
+             end = k;
+             if (start != 0 || end != 0)
+               {
+                 if (j == HOST_WIDE_INT_M1U)
+                   j = i - start;
+                 else
+                   j -= start;
+                 RAW_DATA_POINTER (value) -= start;
+                 RAW_DATA_LENGTH (value) += start + end;
+                 i += end;
+                 if (start == 0)
+                   CONSTRUCTOR_ELT (ctor, j)->index = index;
+                 CONSTRUCTOR_ELT (ctor, j)->value = value;
+                 ++j;
+                 continue;
+               }
+           }
+         if (j != HOST_WIDE_INT_M1U)
+           {
+             CONSTRUCTOR_ELT (ctor, j)->index = index;
+             CONSTRUCTOR_ELT (ctor, j)->value = value;
+             ++j;
+           }
+         continue;
+       }
+
       unsigned HOST_WIDE_INT idx = i;
       if (index)
        {
          if (!tree_fits_uhwi_p (index))
-           return ctor;
+           {
+             check_raw_data = true;
+             continue;
+           }
          idx = tree_to_uhwi (index);
        }
 
       /* auto_vec is limited to UINT_MAX elements.  */
       if (idx > UINT_MAX)
-       return ctor;
+       {
+         check_raw_data = true;
+         continue;
+       }
 
-     /* Avoid non-constant initializers.  */
-     if (!tree_fits_shwi_p (value))
-       return ctor;
+      /* Avoid non-constant initializers.  */
+      if (!tree_fits_shwi_p (value))
+       {
+         check_raw_data = true;
+         --i;
+         continue;
+       }
 
       /* Skip over embedded nuls except the last one (initializer
         elements are in ascending order of indices).  */
@@ -9846,14 +9979,20 @@ braced_list_to_string (tree type, tree ctor, bool member)
       if (!val && i + 1 < nelts)
        continue;
 
-      if (idx < str.length())
-       return ctor;
+      if (idx < str.length ())
+       {
+         check_raw_data = true;
+         continue;
+       }
 
       /* Bail if the CTOR has a block of more than 256 embedded nuls
         due to implicitly initialized elements.  */
       unsigned nchars = (idx - str.length ()) + 1;
       if (nchars > 256)
-       return ctor;
+       {
+         check_raw_data = true;
+         continue;
+       }
 
       if (nchars > 1)
        {
@@ -9862,11 +10001,21 @@ braced_list_to_string (tree type, tree ctor, bool member)
        }
 
       if (idx >= maxelts)
-       return ctor;
+       {
+         check_raw_data = true;
+         continue;
+       }
 
       str.safe_insert (idx, val);
     }
 
+  if (check_raw_data)
+    {
+      if (j != HOST_WIDE_INT_M1U)
+       CONSTRUCTOR_ELTS (ctor)->truncate (j);
+      return ctor;
+    }
+
   /* Append a nul string termination.  */
   if (maxelts != HOST_WIDE_INT_M1U && str.length () < maxelts)
     str.safe_push (0);
diff --git a/gcc/c-family/c-lex.cc b/gcc/c-family/c-lex.cc
index fb88a19f31bc..fa46455b7230 100644
--- a/gcc/c-family/c-lex.cc
+++ b/gcc/c-family/c-lex.cc
@@ -783,6 +783,48 @@ c_lex_with_flags (tree *value, location_t *loc, unsigned char *cpp_flags,
       *value = build_string (tok->val.str.len, (const char *)tok->val.str.text);
       break;
 
+    case CPP_EMBED:
+      *value = make_node (RAW_DATA_CST);
+      TREE_TYPE (*value) = integer_type_node;
+      RAW_DATA_LENGTH (*value) = tok->val.str.len;
+      if (pch_file)
+       {
+         /* When writing PCH headers, copy the data over, such that
+            the owner is a STRING_CST.  */
+         int off = 0;
+         if (tok->val.str.len <= INT_MAX - 2)
+           /* See below.  */
+           off = 1;
+         tree owner = build_string (tok->val.str.len + 2 * off,
+                                    (const char *) tok->val.str.text - off);
+         TREE_TYPE (owner) = build_array_type_nelts (unsigned_char_type_node,
+                                                     tok->val.str.len);
+         RAW_DATA_OWNER (*value) = owner;
+         RAW_DATA_POINTER (*value) = TREE_STRING_POINTER (owner) + off;
+       }
+      else
+       {
+         /* Otherwise add another dummy RAW_DATA_CST as owner which
+            indicates the data is owned by libcpp.  */
+         RAW_DATA_POINTER (*value) = (const char *) tok->val.str.text;
+         tree owner = make_node (RAW_DATA_CST);
+         TREE_TYPE (owner) = integer_type_node;
+         RAW_DATA_LENGTH (owner) = tok->val.str.len;
+         RAW_DATA_POINTER (owner) = (const char *) tok->val.str.text;
+         if (tok->val.str.len <= INT_MAX - 2)
+           {
+             /* The preprocessor surrounds at least smaller CPP_EMBEDs
+                in between CPP_NUMBER CPP_COMMA before and
+                CPP_COMMA CPP_NUMBER after, so the actual libcpp buffer
+                holds those 2 extra bytes around it.  Don't do that if
+                CPP_EMBED is at the maximum ~ 2GB size.  */
+             RAW_DATA_LENGTH (owner) += 2;
+             RAW_DATA_POINTER (owner)--;
+           }
+         RAW_DATA_OWNER (*value) = owner;
+       }
+      break;
+
       /* This token should not be visible outside cpplib.  */
     case CPP_MACRO_ARG:
       gcc_unreachable ();
@@ -802,7 +844,7 @@ c_lex_with_flags (tree *value, location_t *loc, unsigned char *cpp_flags,
          add_flags |= PREV_FALLTHROUGH;
          goto retry_after_at;
        }
-       goto retry;
+      goto retry;
 
     default:
       *value = NULL_TREE;
diff --git a/gcc/c-family/c-ppoutput.cc b/gcc/c-family/c-ppoutput.cc
index e2c38cbd9ebb..722f9190f14b 100644
--- a/gcc/c-family/c-ppoutput.cc
+++ b/gcc/c-family/c-ppoutput.cc
@@ -309,6 +309,60 @@ token_streamer::stream (cpp_reader *pfile, const cpp_token *token,
        maybe_print_line (UNKNOWN_LOCATION);
       in_pragma = false;
     }
+  else if (token->type == CPP_EMBED)
+    {
+      char buf[76 + 6];
+      maybe_print_line (token->src_loc);
+      gcc_checking_assert (token->val.str.len != 0);
+      fputs ("#embed \".\" __gnu__::__base64__(", print.outf);
+      if (token->val.str.len > 30)
+       {
+         fputs (" \\\n", print.outf);
+         print.src_line++;
+       }
+      buf[0] = '"';
+      memcpy (buf + 1 + 76, "\" \\\n", 5);
+      unsigned int j = 1;
+      static const char base64_enc[] =
+       "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
+      for (unsigned i = 0; ; i += 3)
+       {
+         unsigned char a = token->val.str.text[i];
+         unsigned char b = 0, c = 0;
+         unsigned int n = token->val.str.len - i;
+         if (n > 1)
+           b = token->val.str.text[i + 1];
+         if (n > 2)
+           c = token->val.str.text[i + 2];
+         unsigned long v = ((((unsigned long) a) << 16)
+                            | (((unsigned long) b) << 8)
+                            | c);
+         buf[j++] = base64_enc[(v >> 18) & 63];
+         buf[j++] = base64_enc[(v >> 12) & 63];
+         buf[j++] = base64_enc[(v >> 6) & 63];
+         buf[j++] = base64_enc[v & 63];
+         if (j == 76 + 1 || n <= 3)
+           {
+             if (n < 3)
+               {
+                 buf[j - 1] = '=';
+                 if (n == 1)
+                   buf[j - 2] = '=';
+               }
+             if (n <= 3)
+               memcpy (buf + j, "\")", 3);
+             else
+               print.src_line++;
+             fputs (buf, print.outf);
+             j = 1;
+             if (n <= 3)
+               break;
+           }
+       }
+      print.printed = true;
+      maybe_print_line (token->src_loc);
+      return;
+    }
   else
     {
       if (cpp_get_options (parse_in)->debug)
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 0834f8caf6ae..9eaa91413b6c 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -6221,6 +6221,25 @@ c_parser_braced_init (c_parser *parser, tree type, bool nested_p,
            {
              last_init_list_comma = c_parser_peek_token (parser)->location;
              c_parser_consume_token (parser);
+             /* CPP_EMBED should be always in between two CPP_COMMA
+                tokens.  */
+             while (c_parser_next_token_is (parser, CPP_EMBED))
+               {
+                 c_token *embed = c_parser_peek_token (parser);
+                 c_parser_consume_token (parser);
+                 c_expr embed_val;
+                 embed_val.value = embed->value;
+                 embed_val.original_code = RAW_DATA_CST;
+                 embed_val.original_type = integer_type_node;
+                 set_c_expr_source_range (&embed_val, embed->get_range ());
+                 embed_val.m_decimal = 0;
+                 process_init_element (embed->location, embed_val, false,
+                                       &braced_init_obstack);
+                 gcc_checking_assert (c_parser_next_token_is (parser,
+                                                              CPP_COMMA));
+                 last_init_list_comma = c_parser_peek_token (parser)->location;
+                 c_parser_consume_token (parser);
+               }
            }
          else
            break;
@@ -10428,6 +10447,25 @@ c_parser_get_builtin_args (c_parser *parser, const char *bname,
   while (c_parser_next_token_is (parser, CPP_COMMA))
     {
       c_parser_consume_token (parser);
+      if (c_parser_next_token_is (parser, CPP_EMBED))
+       {
+         c_token *embed = c_parser_peek_token (parser);
+         tree value = embed->value;
+         expr.original_code = INTEGER_CST;
+         expr.original_type = integer_type_node;
+         expr.value = NULL_TREE;
+         set_c_expr_source_range (&expr, embed->get_range ());
+         expr.m_decimal = 0;
+         for (unsigned int i = 0; i < (unsigned) RAW_DATA_LENGTH (value); i++)
+           {
+             expr.value = build_int_cst (integer_type_node,
+                                         ((const unsigned char *)
+                                          RAW_DATA_POINTER (value))[i]);
+             vec_safe_push (cexpr_list, expr);
+           }
+         c_parser_consume_token (parser);
+         continue;
+       }
       expr = c_parser_expr_no_commas (parser, NULL);
       vec_safe_push (cexpr_list, expr);
     }
@@ -13129,8 +13167,27 @@ c_parser_expression (c_parser *parser)
        }
       if (DECL_P (lhsval) || handled_component_p (lhsval))
        mark_exp_read (lhsval);
-      next = c_parser_expr_no_commas (parser, NULL);
-      next = convert_lvalue_to_rvalue (expr_loc, next, true, false);
+      if (c_parser_next_token_is (parser, CPP_EMBED))
+       {
+         /* Users aren't interested in milions of -Wunused-value
+            warnings when using #embed inside of a comma expression,
+            and one CPP_NUMBER plus CPP_COMMA before it and one
+            CPP_COMMA plus CPP_NUMBER after it is guaranteed by
+            the preprocessor.  Thus, parse the whole CPP_EMBED just
+            as a single INTEGER_CST, the last byte in it.  */
+         c_token *embed = c_parser_peek_token (parser);
+         tree val = embed->value;
+         unsigned last = RAW_DATA_LENGTH (val) - 1;
+         next.value = build_int_cst (TREE_TYPE (val),
+                                     ((const unsigned char *)
+                                      RAW_DATA_POINTER (val))[last]);
+         c_parser_consume_token (parser);
+       }
+      else
+       {
+         next = c_parser_expr_no_commas (parser, NULL);
+         next = convert_lvalue_to_rvalue (expr_loc, next, true, false);
+       }
       expr.value = build_compound_expr (loc, expr.value, next.value);
       expr.original_code = COMPOUND_EXPR;
       expr.original_type = next.original_type;
@@ -13235,6 +13292,34 @@ c_parser_expr_list (c_parser *parser, bool convert_p, bool fold_p,
   while (c_parser_next_token_is (parser, CPP_COMMA))
     {
       c_parser_consume_token (parser);
+      if (c_parser_next_token_is (parser, CPP_EMBED))
+       {
+         c_token *embed = c_parser_peek_token (parser);
+         tree value = embed->value;
+         expr.original_code = INTEGER_CST;
+         expr.original_type = integer_type_node;
+         expr.value = NULL_TREE;
+         set_c_expr_source_range (&expr, embed->get_range ());
+         expr.m_decimal = 0;
+         for (unsigned int i = 0; i < (unsigned) RAW_DATA_LENGTH (value); i++)
+           {
+             if (literal_zero_mask
+                 && idx + 1 < HOST_BITS_PER_INT
+                 && RAW_DATA_POINTER (value)[i] == 0)
+               *literal_zero_mask |= 1U << (idx + 1);
+             expr.value = build_int_cst (integer_type_node,
+                                         ((const unsigned char *)
+                                          RAW_DATA_POINTER (value))[i]);
+             vec_safe_push (ret, expr.value);
+             if (orig_types)
+               vec_safe_push (orig_types, expr.original_type);
+             if (locations)
+               locations->safe_push (expr.get_location ());
+             ++idx;
+           }
+         c_parser_consume_token (parser);
+         continue;
+       }
       if (literal_zero_mask)
        c_parser_check_literal_zero (parser, literal_zero_mask, idx + 1);
       expr = c_parser_expr_no_commas (parser, NULL);
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 2c4560fb6d3c..36d0b23a3d72 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -8797,12 +8797,13 @@ digest_init (location_t init_loc, tree type, tree init, tree origtype,
   if (!maybe_const)
     arith_const_expr = false;
   else if (!INTEGRAL_TYPE_P (TREE_TYPE (inside_init))
-      && TREE_CODE (TREE_TYPE (inside_init)) != REAL_TYPE
-      && TREE_CODE (TREE_TYPE (inside_init)) != COMPLEX_TYPE)
+          && TREE_CODE (TREE_TYPE (inside_init)) != REAL_TYPE
+          && TREE_CODE (TREE_TYPE (inside_init)) != COMPLEX_TYPE)
     arith_const_expr = false;
   else if (TREE_CODE (inside_init) != INTEGER_CST
-      && TREE_CODE (inside_init) != REAL_CST
-      && TREE_CODE (inside_init) != COMPLEX_CST)
+          && TREE_CODE (inside_init) != REAL_CST
+          && TREE_CODE (inside_init) != COMPLEX_CST
+          && TREE_CODE (inside_init) != RAW_DATA_CST)
     arith_const_expr = false;
   else if (TREE_OVERFLOW (inside_init))
     arith_const_expr = false;
@@ -9063,6 +9064,22 @@ digest_init (location_t init_loc, tree type, tree init, tree origtype,
                                               ? ic_init_const
                                               : ic_init), null_pointer_constant,
                                              NULL_TREE, NULL_TREE, 0);
+      if (TREE_CODE (inside_init) == RAW_DATA_CST
+         && c_inhibit_evaluation_warnings == 0
+         && warn_conversion
+         && !TYPE_UNSIGNED (type)
+         && TYPE_PRECISION (type) == CHAR_BIT)
+       for (unsigned int i = 0;
+            i < (unsigned) RAW_DATA_LENGTH (inside_init); ++i)
+         if (((const signed char *) RAW_DATA_POINTER (inside_init))[i] < 0)
+           warning_at (init_loc, OPT_Wconversion,
+                       "conversion from %qT to %qT changes value from "
+                       "%qd to %qd",
+                       integer_type_node, type,
+                       ((const unsigned char *)
+                        RAW_DATA_POINTER (inside_init))[i],
+                       ((const signed char *)
+                        RAW_DATA_POINTER (inside_init))[i]);
       return inside_init;
     }
 
@@ -10174,6 +10191,28 @@ set_init_label (location_t loc, tree fieldname, location_t fieldname_loc,
     while (field != NULL_TREE);
 }
 
+/* Helper function for add_pending_init.  Find inorder successor of P
+   in AVL tree.  */
+static struct init_node *
+init_node_successor (struct init_node *p)
+{
+  struct init_node *r;
+  if (p->right)
+    {
+      r = p->right;
+      while (r->left)
+       r = r->left;
+      return r;
+    }
+  r = p->parent;
+  while (r && p == r->right)
+    {
+      p = r;
+      r = r->parent;
+    }
+  return r;
+}
+
 /* Add a new initializer to the tree of pending initializers.  PURPOSE
    identifies the initializer, either array index or field in a structure.
    VALUE is the value of that index or field.  If ORIGTYPE is not
@@ -10201,9 +10240,179 @@ add_pending_init (location_t loc, tree purpose, tree value, tree origtype,
          if (tree_int_cst_lt (purpose, p->purpose))
            q = &p->left;
          else if (tree_int_cst_lt (p->purpose, purpose))
-           q = &p->right;
+           {
+             if (TREE_CODE (p->value) != RAW_DATA_CST
+                 || (p->right
+                     && tree_int_cst_le (p->right->purpose, purpose)))
+               q = &p->right;
+             else
+               {
+                 widest_int pp = wi::to_widest (p->purpose);
+                 widest_int pw = wi::to_widest (purpose);
+                 if (pp + RAW_DATA_LENGTH (p->value) <= pw)
+                   q = &p->right;
+                 else
+                   {
+                     /* Override which should split the old RAW_DATA_CST
+                        into 2 or 3 pieces.  */
+                     if (!implicit && warn_override_init)
+                       warning_init (loc, OPT_Woverride_init,
+                                     "initialized field overwritten");
+                     unsigned HOST_WIDE_INT start = (pw - pp).to_uhwi ();
+                     unsigned HOST_WIDE_INT len = 1;
+                     if (TREE_CODE (value) == RAW_DATA_CST)
+                       len = RAW_DATA_LENGTH (value);
+                     unsigned HOST_WIDE_INT end = 0;
+                     unsigned plen = RAW_DATA_LENGTH (p->value);
+                     gcc_checking_assert (start < plen && start);
+                     if (plen - start > len)
+                       end = plen - start - len;
+                     tree v = p->value;
+                     tree origtype = p->origtype;
+                     if (start == 1)
+                       p->value = build_int_cst (TREE_TYPE (v),
+                                                 *(const unsigned char *)
+                                                 RAW_DATA_POINTER (v));
+                     else
+                       {
+                         p->value = v;
+                         if (end > 1)
+                           v = copy_node (v);
+                         RAW_DATA_LENGTH (p->value) = start;
+                       }
+                     if (end)
+                       {
+                         tree epurpose
+                           = size_binop (PLUS_EXPR, purpose,
+                                         bitsize_int (len));
+                         if (end > 1)
+                           {
+                             RAW_DATA_LENGTH (v) -= plen - end;
+                             RAW_DATA_POINTER (v) += plen - end;
+                           }
+                         else
+                           v = build_int_cst (TREE_TYPE (v),
+                                              ((const unsigned char *)
+                                               RAW_DATA_POINTER (v))[plen
+                                                                     - end]);
+                         add_pending_init (loc, epurpose, v, origtype,
+                                           implicit, braced_init_obstack);
+                       }
+                     q = &constructor_pending_elts;
+                     continue;
+                   }
+               }
+           }
          else
            {
+             if (TREE_CODE (p->value) == RAW_DATA_CST
+                 && (RAW_DATA_LENGTH (p->value)
+                     > (TREE_CODE (value) == RAW_DATA_CST
+                        ? RAW_DATA_LENGTH (value) : 1)))
+               {
+                 /* Override which should split the old RAW_DATA_CST
+                    into 2 pieces.  */
+                 if (!implicit && warn_override_init)
+                   warning_init (loc, OPT_Woverride_init,
+                                 "initialized field overwritten");
+                 unsigned HOST_WIDE_INT len = 1;
+                 if (TREE_CODE (value) == RAW_DATA_CST)
+                   len = RAW_DATA_LENGTH (value);
+                 if ((unsigned) RAW_DATA_LENGTH (p->value) > len + 1)
+                   {
+                     RAW_DATA_LENGTH (p->value) -= len;
+                     RAW_DATA_POINTER (p->value) += len;
+                   }
+                 else
+                   {
+                     unsigned int l = RAW_DATA_LENGTH (p->value) - 1;
+                     p->value
+                       = build_int_cst (TREE_TYPE (p->value),
+                                        ((const unsigned char *)
+                                         RAW_DATA_POINTER (p->value))[l]);
+                   }
+                 p->purpose = size_binop (PLUS_EXPR, p->purpose,
+                                          bitsize_int (len));
+                 continue;
+               }
+             if (TREE_CODE (value) == RAW_DATA_CST)
+               {
+               handle_raw_data:
+                 /* RAW_DATA_CST value might overlap various further
+                    prior initval entries.  Find out how many.  */
+                 unsigned cnt = 0;
+                 widest_int w
+                   = wi::to_widest (purpose) + RAW_DATA_LENGTH (value);
+                 struct init_node *r = p, *last = NULL;
+                 bool override_init = warn_override_init;
+                 while ((r = init_node_successor (r))
+                        && wi::to_widest (r->purpose) < w)
+                   {
+                     ++cnt;
+                     if (TREE_SIDE_EFFECTS (r->value))
+                       warning_init (loc, OPT_Woverride_init_side_effects,
+                                     "initialized field with side-effects "
+                                     "overwritten");
+                     else if (override_init)
+                       {
+                         warning_init (loc, OPT_Woverride_init,
+                                       "initialized field overwritten");
+                         override_init = false;
+                       }
+                     last = r;
+                   }
+                 if (cnt)
+                   {
+                     if (TREE_CODE (last->value) == RAW_DATA_CST
+                         && (wi::to_widest (last->purpose)
+                             + RAW_DATA_LENGTH (last->value) > w))
+                       {
+                         /* The last overlapping prior initval overlaps
+                            only partially.  Shrink it and decrease cnt.  */
+                         unsigned int l = (wi::to_widest (last->purpose)
+                                           + RAW_DATA_LENGTH (last->value)
+                                           - w).to_uhwi ();
+                         --cnt;
+                         RAW_DATA_LENGTH (last->value) -= l;
+                         RAW_DATA_POINTER (last->value) += l;
+                         if (RAW_DATA_LENGTH (last->value) == 1)
+                           {
+                             const unsigned char *s
+                               = ((const unsigned char *)
+                                  RAW_DATA_POINTER (last->value));
+                             last->value
+                               = build_int_cst (TREE_TYPE (last->value), *s);
+                           }
+                         last->purpose
+                           = size_binop (PLUS_EXPR, last->purpose,
+                                         bitsize_int (l));
+                       }
+                     /* Instead of deleting cnt nodes from the AVL tree
+                        and rebalancing, peel off last cnt bytes from the
+                        RAW_DATA_CST.  Overriding thousands of previously
+                        initialized array elements with #embed needs to work,
+                        but doesn't need to be super efficient.  */
+                     gcc_checking_assert ((unsigned) RAW_DATA_LENGTH (value)
+                                          > cnt);
+                     RAW_DATA_LENGTH (value) -= cnt;
+                     const unsigned char *s
+                       = ((const unsigned char *) RAW_DATA_POINTER (value)
+                          + RAW_DATA_LENGTH (value));
+                     unsigned int o = RAW_DATA_LENGTH (value);
+                     for (r = p; cnt--; ++o, ++s)
+                       {
+                         r = init_node_successor (r);
+                         r->purpose = size_binop (PLUS_EXPR, purpose,
+                                                  bitsize_int (o));
+                         r->value = build_int_cst (TREE_TYPE (value), *s);
+                         r->origtype = origtype;
+                       }
+                     if (RAW_DATA_LENGTH (value) == 1)
+                       value = build_int_cst (TREE_TYPE (value),
+                                              *((const unsigned char *)
+                                                RAW_DATA_POINTER (value)));
+                   }
+               }
              if (!implicit)
                {
                  if (TREE_SIDE_EFFECTS (p->value))
@@ -10219,6 +10428,23 @@ add_pending_init (location_t loc, tree purpose, tree value, tree origtype,
              return;
            }
        }
+      if (TREE_CODE (value) == RAW_DATA_CST && p)
+       {
+         struct init_node *r;
+         if (q == &p->left)
+           r = p;
+         else
+           r = init_node_successor (p);
+         if (r && wi::to_widest (r->purpose) < (wi::to_widest (purpose)
+                                                + RAW_DATA_LENGTH (value)))
+           {
+             /* Overlap with at least one prior initval in the range but
+                not at the start.  */
+             p = r;
+             p->purpose = purpose;
+             goto handle_raw_data;
+           }
+       }
     }
   else
     {
@@ -10447,8 +10673,8 @@ set_nonincremental_init (struct obstack * braced_init_obstack)
     {
       if (TYPE_DOMAIN (constructor_type))
        constructor_unfilled_index
-           = convert (bitsizetype,
-                      TYPE_MIN_VALUE (TYPE_DOMAIN (constructor_type)));
+         = convert (bitsizetype,
+                    TYPE_MIN_VALUE (TYPE_DOMAIN (constructor_type)));
       else
        constructor_unfilled_index = bitsize_zero_node;
     }
@@ -10662,12 +10888,13 @@ output_init_element (location_t loc, tree value, tree origtype,
   if (!maybe_const)
     arith_const_expr = false;
   else if (!INTEGRAL_TYPE_P (TREE_TYPE (value))
-      && TREE_CODE (TREE_TYPE (value)) != REAL_TYPE
-      && TREE_CODE (TREE_TYPE (value)) != COMPLEX_TYPE)
+          && TREE_CODE (TREE_TYPE (value)) != REAL_TYPE
+          && TREE_CODE (TREE_TYPE (value)) != COMPLEX_TYPE)
     arith_const_expr = false;
   else if (TREE_CODE (value) != INTEGER_CST
-      && TREE_CODE (value) != REAL_CST
-      && TREE_CODE (value) != COMPLEX_CST)
+          && TREE_CODE (value) != REAL_CST
+          && TREE_CODE (value) != COMPLEX_CST
+          && TREE_CODE (value) != RAW_DATA_CST)
     arith_const_expr = false;
   else if (TREE_OVERFLOW (value))
     arith_const_expr = false;
@@ -10772,10 +10999,16 @@ output_init_element (location_t loc, tree value, tree origtype,
      put it on constructor_pending_elts.  */
   if (TREE_CODE (constructor_type) == ARRAY_TYPE
       && (!constructor_incremental
-         || !tree_int_cst_equal (field, constructor_unfilled_index)))
+         || !tree_int_cst_equal (field, constructor_unfilled_index)
+         || (TREE_CODE (value) == RAW_DATA_CST
+             && constructor_pending_elts
+             && pending)))
     {
       if (constructor_incremental
-         && tree_int_cst_lt (field, constructor_unfilled_index))
+         && (tree_int_cst_lt (field, constructor_unfilled_index)
+             || (TREE_CODE (value) == RAW_DATA_CST
+                 && constructor_pending_elts
+                 && pending)))
        set_nonincremental_init (braced_init_obstack);
 
       add_pending_init (loc, field, value, origtype, implicit,
@@ -10834,9 +11067,14 @@ output_init_element (location_t loc, tree value, tree origtype,
 
   /* Advance the variable that indicates sequential elements output.  */
   if (TREE_CODE (constructor_type) == ARRAY_TYPE)
-    constructor_unfilled_index
-      = size_binop_loc (input_location, PLUS_EXPR, constructor_unfilled_index,
-                       bitsize_one_node);
+    {
+      tree inc = bitsize_one_node;
+      if (value && TREE_CODE (value) == RAW_DATA_CST)
+       inc = bitsize_int (RAW_DATA_LENGTH (value));
+      constructor_unfilled_index
+       = size_binop_loc (input_location, PLUS_EXPR,
+                         constructor_unfilled_index, inc);
+    }
   else if (TREE_CODE (constructor_type) == RECORD_TYPE)
     {
       constructor_unfilled_fields
@@ -10845,8 +11083,8 @@ output_init_element (location_t loc, tree value, tree origtype,
       /* Skip any nameless bit fields.  */
       while (constructor_unfilled_fields != NULL_TREE
             && DECL_UNNAMED_BIT_FIELD (constructor_unfilled_fields))
-       constructor_unfilled_fields =
-         DECL_CHAIN (constructor_unfilled_fields);
+       constructor_unfilled_fields
+         = DECL_CHAIN (constructor_unfilled_fields);
     }
   else if (TREE_CODE (constructor_type) == UNION_TYPE)
     constructor_unfilled_fields = NULL_TREE;
@@ -11092,6 +11330,23 @@ initialize_elementwise_p (tree type, tree value)
   return false;
 }
 
+/* Helper function for process_init_element.  Split first element of
+   RAW_DATA_CST and save the rest to *RAW_DATA.  */
+
+static inline tree
+maybe_split_raw_data (tree value, tree *raw_data)
+{
+  if (value == NULL_TREE || TREE_CODE (value) != RAW_DATA_CST)
+    return value;
+  *raw_data = value;
+  value = build_int_cst (integer_type_node,
+                        *(const unsigned char *)
+                        RAW_DATA_POINTER (*raw_data));
+  ++RAW_DATA_POINTER (*raw_data);
+  --RAW_DATA_LENGTH (*raw_data);
+  return value;
+}
+
 /* Add one non-braced element to the current constructor level.
    This adjusts the current position within the constructor's type.
    This may also start or terminate implicit levels
@@ -11114,7 +11369,9 @@ process_init_element (location_t loc, struct c_expr value, bool implicit,
     = (orig_value != NULL_TREE && TREE_CODE (orig_value) == STRING_CST);
   bool strict_string = value.original_code == STRING_CST;
   bool was_designated = designator_depth != 0;
+  tree raw_data = NULL_TREE;
 
+retry:
   designator_depth = 0;
   designator_erroneous = 0;
 
@@ -11282,6 +11539,7 @@ process_init_element (location_t loc, struct c_expr value, bool implicit,
              continue;
            }
 
+         value.value = maybe_split_raw_data (value.value, &raw_data);
          if (value.value)
            {
              push_member_name (constructor_fields);
@@ -11370,6 +11628,7 @@ process_init_element (location_t loc, struct c_expr value, bool implicit,
              continue;
            }
 
+         value.value = maybe_split_raw_data (value.value, &raw_data);
          if (value.value)
            {
              push_member_name (constructor_fields);
@@ -11418,26 +11677,66 @@ process_init_element (location_t loc, struct c_expr value, bool implicit,
              break;
            }
 
-         /* Now output the actual element.  */
-         if (value.value)
+         if (value.value
+             && TREE_CODE (value.value) == RAW_DATA_CST
+             && RAW_DATA_LENGTH (value.value) > 1
+             && (TREE_CODE (elttype) == INTEGER_TYPE
+                 || TREE_CODE (elttype) == BITINT_TYPE)
+             && TYPE_PRECISION (elttype) == CHAR_BIT
+             && (constructor_max_index == NULL_TREE
+                 || tree_int_cst_lt (constructor_index,
+                                     constructor_max_index)))
            {
+             unsigned int len = RAW_DATA_LENGTH (value.value);
+             if (constructor_max_index)
+               {
+                 widest_int w = wi::to_widest (constructor_max_index);
+                 w -= wi::to_widest (constructor_index);
+                 w += 1;
+                 if (w < len)
+                   len = w.to_uhwi ();
+               }
+             if (len < (unsigned) RAW_DATA_LENGTH (value.value))
+               {
+                 raw_data = copy_node (value.value);
+                 RAW_DATA_LENGTH (raw_data) -= len;
+                 RAW_DATA_POINTER (raw_data) += len;
+                 RAW_DATA_LENGTH (value.value) = len;
+               }
+             TREE_TYPE (value.value) = elttype;
              push_array_bounds (tree_to_uhwi (constructor_index));
              output_init_element (loc, value.value, value.original_type,
-                                  strict_string, elttype,
-                                  constructor_index, true, implicit,
-                                  braced_init_obstack);
+                                  false, elttype, constructor_index, true,
+                                  implicit, braced_init_obstack);
              RESTORE_SPELLING_DEPTH (constructor_depth);
+             constructor_index
+               = size_binop_loc (input_location, PLUS_EXPR,
+                                 constructor_index, bitsize_int (len));
            }
+         else
+           {
+             value.value = maybe_split_raw_data (value.value, &raw_data);
+             /* Now output the actual element.  */
+             if (value.value)
+               {
+                 push_array_bounds (tree_to_uhwi (constructor_index));
+                 output_init_element (loc, value.value, value.original_type,
+                                      strict_string, elttype,
+                                      constructor_index, true, implicit,
+                                      braced_init_obstack);
+                 RESTORE_SPELLING_DEPTH (constructor_depth);
+               }
 
-         constructor_index
-           = size_binop_loc (input_location, PLUS_EXPR,
-                             constructor_index, bitsize_one_node);
+             constructor_index
+               = size_binop_loc (input_location, PLUS_EXPR,
+                                 constructor_index, bitsize_one_node);
 
-         if (!value.value)
-           /* If we are doing the bookkeeping for an element that was
-              directly output as a constructor, we must update
-              constructor_unfilled_index.  */
-           constructor_unfilled_index = constructor_index;
+             if (!value.value)
+               /* If we are doing the bookkeeping for an element that was
+                  directly output as a constructor, we must update
+                  constructor_unfilled_index.  */
+               constructor_unfilled_index = constructor_index;
+           }
        }
       else if (gnu_vector_type_p (constructor_type))
        {
@@ -11452,6 +11751,7 @@ process_init_element (location_t loc, struct c_expr value, bool implicit,
              break;
            }
 
+         value.value = maybe_split_raw_data (value.value, &raw_data);
          /* Now output the actual element.  */
          if (value.value)
            {
@@ -11485,6 +11785,7 @@ process_init_element (location_t loc, struct c_expr value, bool implicit,
        }
       else
        {
+         value.value = maybe_split_raw_data (value.value, &raw_data);
          if (value.value)
            output_init_element (loc, value.value, value.original_type,
                                 strict_string, constructor_type,
@@ -11556,6 +11857,14 @@ process_init_element (location_t loc, struct c_expr value, bool implicit,
     }
 
   constructor_range_stack = 0;
+
+  if (raw_data && RAW_DATA_LENGTH (raw_data))
+    {
+      gcc_assert (!string_flag && !was_designated);
+      value.value = raw_data;
+      raw_data = NULL_TREE;
+      goto retry;
+    }
 }
 
 /* Build a complete asm-statement, whose components are a CV_QUALIFIER
diff --git a/gcc/expr.cc b/gcc/expr.cc
index da486cf85fdd..ed64ccea7663 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -7165,6 +7165,13 @@ categorize_ctor_elements_1 (const_tree ctor, HOST_WIDE_INT *p_nz_elts,
          init_elts += mult * TREE_STRING_LENGTH (value);
          break;
 
+       case RAW_DATA_CST:
+         nz_elts += mult * RAW_DATA_LENGTH (value);
+         unique_nz_elts += RAW_DATA_LENGTH (value);
+         init_elts += mult * RAW_DATA_LENGTH (value);
+         num_fields += mult * (RAW_DATA_LENGTH (value) - 1);
+         break;
+
        case COMPLEX_CST:
          if (!initializer_zerop (TREE_REALPART (value)))
            {
@@ -11825,7 +11832,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
                                      field, value)
              if (tree_int_cst_equal (field, index))
                {
-                 if (!TREE_SIDE_EFFECTS (value))
+                 if (!TREE_SIDE_EFFECTS (value)
+                     && TREE_CODE (value) != RAW_DATA_CST)
                    return expand_expr (fold (value), target, tmode, modifier);
                  break;
                }
@@ -11867,7 +11875,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
                                          field, value)
                  if (tree_int_cst_equal (field, index))
                    {
-                     if (TREE_SIDE_EFFECTS (value))
+                     if (TREE_SIDE_EFFECTS (value)
+                         || TREE_CODE (value) == RAW_DATA_CST)
                        break;
 
                      if (TREE_CODE (value) == CONSTRUCTOR)
@@ -11884,8 +11893,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
                            break;
                        }
 
-                     return
-                       expand_expr (fold (value), target, tmode, modifier);
+                     return expand_expr (fold (value), target, tmode,
+                                         modifier);
                    }
              }
            else if (TREE_CODE (init) == STRING_CST)
diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 6f73f648b70c..be04f28e0eb3 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -3360,8 +3360,14 @@ operand_compare::operand_equal_p (const_tree arg0, const_tree arg1,
       case STRING_CST:
        return (TREE_STRING_LENGTH (arg0) == TREE_STRING_LENGTH (arg1)
                && ! memcmp (TREE_STRING_POINTER (arg0),
-                             TREE_STRING_POINTER (arg1),
-                             TREE_STRING_LENGTH (arg0)));
+                            TREE_STRING_POINTER (arg1),
+                            TREE_STRING_LENGTH (arg0)));
+
+      case RAW_DATA_CST:
+       return (RAW_DATA_LENGTH (arg0) == RAW_DATA_LENGTH (arg1)
+               && ! memcmp (RAW_DATA_POINTER (arg0),
+                            RAW_DATA_POINTER (arg1),
+                            RAW_DATA_LENGTH (arg0)));
 
       case ADDR_EXPR:
        gcc_checking_assert (!(flags & OEP_ADDRESS_OF));
@@ -3960,6 +3966,10 @@ operand_compare::hash_operand (const_tree t, inchash::hash &hstate,
       hstate.add ((const void *) TREE_STRING_POINTER (t),
                  TREE_STRING_LENGTH (t));
       return;
+    case RAW_DATA_CST:
+      hstate.add ((const void *) RAW_DATA_POINTER (t),
+                 RAW_DATA_LENGTH (t));
+      return;
     case COMPLEX_CST:
       hash_operand (TREE_REALPART (t), hstate, flags);
       hash_operand (TREE_IMAGPART (t), hstate, flags);
@@ -8463,6 +8473,48 @@ native_encode_initializer (tree init, unsigned char *ptr, int len,
                }
 
              curpos = pos;
+             if (val && TREE_CODE (val) == RAW_DATA_CST)
+               {
+                 if (count)
+                   return 0;
+                 if (off == -1
+                     || (curpos >= off
+                         && (curpos + RAW_DATA_LENGTH (val)
+                             <= (HOST_WIDE_INT) off + len)))
+                   {
+                     if (ptr)
+                       memcpy (ptr + (curpos - o), RAW_DATA_POINTER (val),
+                               RAW_DATA_LENGTH (val));
+                     if (mask)
+                       memset (mask + curpos, 0, RAW_DATA_LENGTH (val));
+                   }
+                 else if (curpos + RAW_DATA_LENGTH (val) > off
+                          && curpos < (HOST_WIDE_INT) off + len)
+                   {
+                     /* Partial overlap.  */
+                     unsigned char *p = NULL;
+                     int no = 0;
+                     int l;
+                     gcc_assert (mask == NULL);
+                     if (curpos >= off)
+                       {
+                         if (ptr)
+                           p = ptr + curpos - off;
+                         l = MIN ((HOST_WIDE_INT) off + len - curpos,
+                                  RAW_DATA_LENGTH (val));
+                       }
+                     else
+                       {
+                         p = ptr;
+                         no = off - curpos;
+                         l = len;
+                       }
+                     if (p)
+                       memcpy (p, RAW_DATA_POINTER (val) + no, l);
+                   }
+                 curpos += RAW_DATA_LENGTH (val);
+                 val = NULL_TREE;
+               }
              if (val)
                do
                  {
@@ -13840,6 +13892,9 @@ get_array_ctor_element_at_index (tree ctor, offset_int access_index,
       else
        first_p = false;
 
+      if (TREE_CODE (cval) == RAW_DATA_CST)
+       max_index += RAW_DATA_LENGTH (cval) - 1;
+
       /* Do we have match?  */
       if (wi::cmp (access_index, index, index_sgn) >= 0)
        {
@@ -13939,10 +13994,26 @@ fold (tree expr)
            && TREE_CODE (op0) == CONSTRUCTOR
            && ! type_contains_placeholder_p (TREE_TYPE (op0)))
          {
-           tree val = get_array_ctor_element_at_index (op0,
-                                                       wi::to_offset (op1));
+           unsigned int idx;
+           tree val
+             = get_array_ctor_element_at_index (op0, wi::to_offset (op1),
+                                                &idx);
            if (val)
-             return val;
+             {
+               if (TREE_CODE (val) != RAW_DATA_CST)
+                 return val;
+               if (CONSTRUCTOR_ELT (op0, idx)->index == NULL_TREE
+                   || (TREE_CODE (CONSTRUCTOR_ELT (op0, idx)->index)
+                       != INTEGER_CST))
+                 return t;
+               offset_int o
+                 = (wi::to_offset (op1)
+                    - wi::to_offset (CONSTRUCTOR_ELT (op0, idx)->index));
+               gcc_checking_assert (o < RAW_DATA_LENGTH (val));
+               return build_int_cst (TREE_TYPE (val),
+                                     ((const unsigned char *)
+                                      RAW_DATA_POINTER (val))[o.to_uhwi ()]);
+             }
          }
 
        return t;
diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 9a84483f9bff..d4ec74759902 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -8233,7 +8233,7 @@ fold_array_ctor_reference (tree type, tree ctor,
       unsigned ctor_idx;
       tree val = get_array_ctor_element_at_index (ctor, access_index,
                                                  &ctor_idx);
-      if (!val && ctor_idx >= CONSTRUCTOR_NELTS  (ctor))
+      if (!val && ctor_idx >= CONSTRUCTOR_NELTS (ctor))
        return build_zero_cst (type);
 
       /* native-encode adjacent ctor elements.  */
@@ -8260,10 +8260,27 @@ fold_array_ctor_reference (tree type, tree ctor,
        {
          if (bufoff + elt_sz > sizeof (buf))
            elt_sz = sizeof (buf) - bufoff;
-         int len = native_encode_expr (val, buf + bufoff, elt_sz,
+         int len;
+         if (TREE_CODE (val) == RAW_DATA_CST)
+           {
+             gcc_assert (inner_offset == 0);
+             if (!elt->index || TREE_CODE (elt->index) != INTEGER_CST)
+               return NULL_TREE;
+             inner_offset = (access_index
+                             - wi::to_offset (elt->index)).to_uhwi ();
+             len = MIN (sizeof (buf) - bufoff,
+                        (unsigned) (RAW_DATA_LENGTH (val) - inner_offset));
+             memcpy (buf + bufoff, RAW_DATA_POINTER (val) + inner_offset,
+                     len);
+             access_index += len - 1;
+           }
+         else
+           {
+             len = native_encode_expr (val, buf + bufoff, elt_sz,
                                        inner_offset / BITS_PER_UNIT);
-         if (len != (int) elt_sz - inner_offset / BITS_PER_UNIT)
-           return NULL_TREE;
+             if (len != (int) elt_sz - inner_offset / BITS_PER_UNIT)
+               return NULL_TREE;
+           }
          inner_offset = 0;
          bufoff += len;
 
@@ -8305,8 +8322,23 @@ fold_array_ctor_reference (tree type, tree ctor,
       return native_interpret_expr (type, buf, size / BITS_PER_UNIT);
     }
 
-  if (tree val = get_array_ctor_element_at_index (ctor, access_index))
+  unsigned ctor_idx;
+  if (tree val = get_array_ctor_element_at_index (ctor, access_index,
+                                                 &ctor_idx))
     {
+      if (TREE_CODE (val) == RAW_DATA_CST)
+       {
+         if (size != BITS_PER_UNIT || elt_sz != 1 || inner_offset != 0)
+           return NULL_TREE;
+         constructor_elt *elt = CONSTRUCTOR_ELT (ctor, ctor_idx);
+         if (elt->index == NULL_TREE || TREE_CODE (elt->index) != INTEGER_CST)
+           return NULL_TREE;
+         *suboff += access_index.to_uhwi () * BITS_PER_UNIT;
+         unsigned o = (access_index - wi::to_offset (elt->index)).to_uhwi ();
+         return build_int_cst (TREE_TYPE (val),
+                               ((const unsigned char *)
+                                RAW_DATA_POINTER (val))[o]);
+       }
       if (!size && TREE_CODE (val) != CONSTRUCTOR)
        {
          /* For the final reference to the entire accessed element
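Note on the two fold-const.cc hunks above: both locate a byte inside a RAW_DATA_CST by subtracting the element's index from the accessed array index, and the buffer-filling path also clamps the copy to the room left in the buffer. The following is only a standalone sketch of that arithmetic, not GCC code; `struct raw_range`, `copy_from_raw_range` and `raw_range_byte_at` are hypothetical names, and plain `size_t` stands in for the offset_int values used in the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a RAW_DATA_CST constructor element: LENGTH
   bytes at DATA that initialize array indexes START_INDEX onwards.  */
struct raw_range
{
  size_t start_index;
  const unsigned char *data;
  size_t length;
};

/* Copy bytes of R, starting at array index ACCESS_INDEX, to BUF + BUFOFF.
   The copy is clamped both to the room left in BUF and to the bytes left
   in the range.  Returns the number of bytes copied, or 0 if ACCESS_INDEX
   is outside R or BUF is already full.  */
size_t
copy_from_raw_range (const struct raw_range *r, size_t access_index,
                     unsigned char *buf, size_t bufsz, size_t bufoff)
{
  if (access_index < r->start_index || bufoff >= bufsz)
    return 0;
  size_t inner_offset = access_index - r->start_index;
  if (inner_offset >= r->length)
    return 0;
  size_t room = bufsz - bufoff;
  size_t avail = r->length - inner_offset;
  size_t len = room < avail ? room : avail;
  memcpy (buf + bufoff, r->data + inner_offset, len);
  return len;
}

/* Single-byte read, corresponding to the element-sized access case.
   Returns the byte value, or -1 if ACCESS_INDEX is outside R.  */
int
raw_range_byte_at (const struct raw_range *r, size_t access_index)
{
  if (access_index < r->start_index
      || access_index - r->start_index >= r->length)
    return -1;
  return r->data[access_index - r->start_index];
}
```

In the sketch an out-of-range access is reported to the caller; the patch instead relies on get_array_ctor_element_at_index having already selected the right element.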
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 3f602469d571..9284fffe137f 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -5381,6 +5381,49 @@ gimplify_init_ctor_eval (tree object, vec<constructor_elt, va_gc> *elts,
          && TREE_CODE (TREE_TYPE (value)) != VECTOR_TYPE)
        gimplify_init_ctor_eval (cref, CONSTRUCTOR_ELTS (value),
                                 pre_p, cleared);
+      else if (TREE_CODE (value) == RAW_DATA_CST)
+       {
+         if (RAW_DATA_LENGTH (value) <= 32)
+           {
+             for (unsigned int i = 0; i < (unsigned) RAW_DATA_LENGTH (value);
+                  ++i)
+               if (!cleared || RAW_DATA_POINTER (value)[i])
+                 {
+                   if (i)
+                     {
+                       tree p
+                         = fold_build2 (PLUS_EXPR, TREE_TYPE (purpose),
+                                        purpose,
+                                        build_int_cst (TREE_TYPE (purpose),
+                                                       i));
+                       cref = build4 (ARRAY_REF, array_elt_type,
+                                      unshare_expr (object), p, NULL_TREE,
+                                      NULL_TREE);
+                     }
+                   tree init
+                     = build2 (INIT_EXPR, TREE_TYPE (cref), cref,
+                               build_int_cst (TREE_TYPE (value),
+                                              ((const unsigned char *)
+                                               RAW_DATA_POINTER (value))[i]));
+                   gimplify_and_add (init, pre_p);
+                   ggc_free (init);
+                 }
+           }
+         else
+           {
+             tree rtype = build_array_type_nelts (TREE_TYPE (value),
+                                                  RAW_DATA_LENGTH (value));
+             tree rctor = build_constructor_single (rtype, bitsize_zero_node,
+                                                    value);
+             tree addr = build_fold_addr_expr (cref);
+             cref = build2 (MEM_REF, rtype, addr,
+                            build_int_cst (ptr_type_node, 0));
+             rctor = tree_output_constant_def (rctor);
+             tree init = build2 (INIT_EXPR, rtype, cref, rctor);
+             gimplify_and_add (init, pre_p);
+             ggc_free (init);
+           }
+       }
       else
        {
          tree init = build2 (INIT_EXPR, TREE_TYPE (cref), cref, value);
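Note on the gimplify.cc hunk above: a RAW_DATA_CST of at most 32 bytes is expanded into per-element stores (skipping zero bytes when the object was already cleared), while a longer one becomes a single aggregate copy from a constant. The sketch below models that decision on plain byte buffers; it is an illustration under those assumptions, not the gimplifier code, and `init_from_raw` and `SMALL_RAW_LIMIT` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed threshold, mirroring the "<= 32" check in the hunk above.  */
#define SMALL_RAW_LIMIT 32

/* Initialize DST from LEN bytes at SRC.  CLEARED says DST is known to be
   all zeros already, so zero bytes need no store on the small path.
   Returns true if one block copy was used, false if individual byte
   stores were used; *STORES receives the number of individual stores.  */
bool
init_from_raw (unsigned char *dst, const unsigned char *src, size_t len,
               bool cleared, size_t *stores)
{
  *stores = 0;
  if (len > SMALL_RAW_LIMIT)
    {
      /* Large data: one aggregate copy.  */
      memcpy (dst, src, len);
      return true;
    }
  for (size_t i = 0; i < len; ++i)
    if (!cleared || src[i])
      {
        dst[i] = src[i];
        ++*stores;
      }
  return false;
}
```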
diff --git a/gcc/lto-streamer-out.cc b/gcc/lto-streamer-out.cc
index 807b935537be..a464c2d5ddf1 100644
--- a/gcc/lto-streamer-out.cc
+++ b/gcc/lto-streamer-out.cc
@@ -1159,6 +1159,9 @@ DFS::DFS_write_tree_body (struct output_block *ob,
        }
     }
 
+  if (code == RAW_DATA_CST)
+    DFS_follow_tree_edge (RAW_DATA_OWNER (expr));
+
   if (code == OMP_CLAUSE)
     {
       int i;
diff --git a/gcc/testsuite/c-c++-common/cpp/embed-20.c b/gcc/testsuite/c-c++-common/cpp/embed-20.c
new file mode 100644
index 000000000000..91a16e5ec9d7
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/embed-20.c
@@ -0,0 +1,24 @@
+/* { dg-do run } */
+/* { dg-options "" } */
+/* { dg-additional-options "-std=c23" { target c } } */
+
+unsigned char a[] = {
+#embed __FILE__
+};
+struct S { unsigned char h[(sizeof (a) - 7) / 2]; short int i; unsigned char j[sizeof (a) - 7 - (sizeof (a) - 7) / 2]; };
+struct T { int a, b, c; struct S d; long long e; double f; long long g; };
+struct T b = {
+#embed __FILE__
+};
+
+int
+main ()
+{
+  if (b.a != a[0] || b.b != a[1] || b.c != a[2]
+      || __builtin_memcmp (b.d.h, a + 3, sizeof (b.d.h))
+      || b.d.i != a[3 + sizeof (b.d.h)]
+      || __builtin_memcmp (b.d.j, a + 4 + sizeof (b.d.h), sizeof (b.d.j))
+      || b.e != a[sizeof (a) - 3] || b.f != a[sizeof (a) - 2]
+      || b.g != a[sizeof (a) - 1])
+    __builtin_abort ();
+}
diff --git a/gcc/testsuite/c-c++-common/cpp/embed-21.c b/gcc/testsuite/c-c++-common/cpp/embed-21.c
new file mode 100644
index 000000000000..e388b6ccac91
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/embed-21.c
@@ -0,0 +1,56 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -Wno-psabi" } */
+/* { dg-additional-options "-std=c23" { target c } } */
+
+unsigned char a[] = {
+#embed __FILE__
+};
+const unsigned char b[] = {
+#embed __FILE__
+};
+unsigned char c[] = {
+  0, 1, 2, 3, 4, 5, 6, 7,
+#embed __FILE__ limit(128) suffix (,)
+#embed __FILE__ limit(128) suffix (,)
+#embed __FILE__
+};
+const unsigned char d[] = {
+  0, 1, 2, 3, 4, 5, 6, 7,
+#embed __FILE__ limit(128) suffix (,)
+#embed __FILE__ limit(128) suffix (,)
+#embed __FILE__
+};
+typedef char V __attribute__((vector_size (16), may_alias));
+struct __attribute__((may_alias)) S { int a, b, c, d; };
+
+__attribute__((noipa)) int
+foo (V x, V y)
+{
+  return __builtin_memcmp (&x, &y, sizeof (x));
+}
+
+__attribute__((noipa)) int
+bar (struct S x, struct S y)
+{
+  return x.a != y.a || x.b != y.b || x.c != y.c || x.d != y.d;
+}
+
+int
+main ()
+{
+  if (a[0] != b[0]
+      || a[42] != b[42]
+      || a[sizeof (a) - 5] != b[sizeof (a) - 5]
+      || a[sizeof (a) - 1] != b[sizeof (a) - 1])
+    __builtin_abort ();
+  if (foo (((V *) a)[0], ((V *) b)[0])
+      || foo (((V *) a)[1], ((V *) b)[1])
+      || foo (((V *) c)[8], ((V *) d)[8])
+      || foo (((V *) c)[9], ((V *) d)[9]))
+    __builtin_abort ();
+  if (bar (((struct S *) a)[0], ((struct S *) b)[0])
+      || bar (((struct S *) a)[1], ((struct S *) b)[1])
+      || bar (((struct S *) c)[8], ((struct S *) d)[8])
+      || bar (((struct S *) c)[9], ((struct S *) d)[9]))
+    __builtin_abort ();
+}
diff --git a/gcc/testsuite/c-c++-common/cpp/embed-28.c b/gcc/testsuite/c-c++-common/cpp/embed-28.c
new file mode 100644
index 000000000000..a917644ba966
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/embed-28.c
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-options "--embed-directory ${srcdir}/c-c++-common/cpp/embed-dir" } */
+/* { dg-additional-options "-std=gnu99" { target c } } */
+
+const unsigned char magna_carta[] = {
+#embed <magna-carta.txt> limit (256)
+};
+
+typedef unsigned char V __attribute__((vector_size (256)));
+#define TEN(x) x##0, x##1, x##2, x##3, x##4, x##5, x##6, x##7, x##8, x##9
+
+int
+main ()
+{
+  if (__CHAR_BIT__ != 8)
+    return 0;
+  V a = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
+          TEN (1), TEN (2), TEN (3), TEN (4),
+          TEN (5), TEN (6), TEN (7), TEN (8),
+          TEN (9), TEN (10), TEN (11), TEN (12),
+          TEN (13), TEN (14), TEN (15), TEN (16),
+          TEN (17), TEN (18), TEN (19), TEN (20),
+          TEN (21), TEN (22), TEN (23), TEN (24),
+          250, 251, 252, 253, 254, 255 };
+  V b = __builtin_shufflevector (a, a,
+#embed <magna-carta.txt> limit (256)
+                                );
+  V c = __builtin_shufflevector (b, b,
+#embed <magna-carta.txt> limit (256)
+                                );
+  int i;
+  for (i = 0; i < 256; ++i)
+    if (b[i] != magna_carta[i])
+      __builtin_abort ();
+  for (i = 0; i < 256; ++i)
+    if (c[i] != magna_carta[magna_carta[i]])
+      __builtin_abort ();
+}
diff --git a/gcc/testsuite/gcc.dg/cpp/embed-10.c b/gcc/testsuite/gcc.dg/cpp/embed-10.c
new file mode 100644
index 000000000000..867de7520625
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/embed-10.c
@@ -0,0 +1,7 @@
+/* This is a comment with some UTF-8 non-ASCII characters: áéíóú.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c23 -Wconversion" } */
+
+signed char a[] = {
+#embed __FILE__        /* { dg-warning "conversion from 'int' to 'signed char' changes value from '\[12]\[0-9]\[0-9]' to '-\[0-9]\[0-9]*'" } */
+};
diff --git a/gcc/testsuite/gcc.dg/cpp/embed-11.c b/gcc/testsuite/gcc.dg/cpp/embed-11.c
new file mode 100644
index 000000000000..829239aa8f9e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/embed-11.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c23" } */
+
+struct __attribute__((designated_init)) S {
+  int a, b, c, d;
+  unsigned char e[128];
+};
+
+struct S s = { .a = 1, .b =
+#embed __FILE__ limit(128)     /* { dg-warning "positional initialization of field in 'struct' declared with 'designated_init' attribute" } */
+};                             /* { dg-message "near initialization" "" { target *-*-* } .-1 } */
diff --git a/gcc/testsuite/gcc.dg/cpp/embed-12.c b/gcc/testsuite/gcc.dg/cpp/embed-12.c
new file mode 100644
index 000000000000..f4203ec1b975
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/embed-12.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-options "-std=c23" } */
+
+int
+foo (int x)
+{
+  if (x != 3)
+    __builtin_abort ();
+  return 1;
+}
+
+int
+main ()
+{
+  unsigned char a[] = {
+    [5] = foo (0),
+    [7] = foo (1),
+    [42] = foo (2),
+    #embed __FILE__ prefix([0] = ) suffix (,) /* { dg-warning "initialized field with side-effects overwritten" } */
+    [12] = foo (3) /* { dg-message "near initialization" "" { target *-*-* } .-1 } */
+
+  };
+  const unsigned char b[] = {
+    #embed __FILE__
+  };
+  if (sizeof (a) != sizeof (b)
+      || __builtin_memcmp (a, b, 12)
+      || a[12] != 1
+      || __builtin_memcmp (a + 13, b + 13, sizeof (a) - 13))
+    __builtin_abort ();
+}
diff --git a/gcc/testsuite/gcc.dg/cpp/embed-13.c b/gcc/testsuite/gcc.dg/cpp/embed-13.c
new file mode 100644
index 000000000000..3a10f1da8afe
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/embed-13.c
@@ -0,0 +1,39 @@
+/* { dg-do run } */
+/* { dg-options "-std=c23 -Wunused-value" } */
+
+#include <stdarg.h>
+
+const unsigned char a[] = {
+  #embed __FILE__ limit (128)
+};
+
+int
+foo (...)
+{
+  va_list ap;
+  va_start (ap);
+  for (int i = 0; i < 128; ++i)
+    if (va_arg (ap, int) != a[i])
+      {
+       va_end (ap);
+       return 1;
+      }
+  va_end (ap);
+  return 0;
+}
+
+int b, c;
+
+int
+main ()
+{
+  if (foo (
+#embed __FILE__ limit (128)
+      ))
+    __builtin_abort ();
+  b = (
+#embed __FILE__ limit (128) prefix (c = 2 * ) suffix ( + 6)    /* { dg-warning "right-hand operand of comma expression has no effect" } */
+  );
+  if (b != a[127] + 6 || c != 2 * a[0])
+    __builtin_abort ();
+}
diff --git a/gcc/testsuite/gcc.dg/cpp/embed-14.c b/gcc/testsuite/gcc.dg/cpp/embed-14.c
new file mode 100644
index 000000000000..eb381a7ae9d9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/embed-14.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-std=gnu23 -Wnonnull" } */
+
+#define A(n) int *p##n
+#define B(n) A(n##0), A(n##1), A(n##2), A(n##3), A(n##4), A(n##5), A(n##6), A(n##7)
+#define C(n) B(n##0), B(n##1), B(n##2), B(n##3), B(n##4), B(n##5), B(n##6), B(n##7)
+#define D C(0), C(1), C(2), C(3)
+
+void foo (D) __attribute__((nonnull (  /* { dg-message "in a call to function 'foo' declared 'nonnull'" } */
+#embed __FILE__ limit (128)
+)));
+[[gnu::nonnull (
+#embed __FILE__ limit (128)
+)]] void bar (D);      /* { dg-message "in a call to function 'bar' declared 'nonnull'" } */
+
+#undef A
+#define A(n) nullptr
+
+void
+baz ()
+{
+  foo (D);     /* { dg-warning "argument \[0-9]\+ null where non-null expected" } */
+  bar (D);     /* { dg-warning "argument \[0-9]\+ null where non-null expected" } */
+}
diff --git a/gcc/testsuite/gcc.dg/cpp/embed-15.c b/gcc/testsuite/gcc.dg/cpp/embed-15.c
new file mode 100644
index 000000000000..ad52b3452f1b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/embed-15.c
@@ -0,0 +1,18 @@
+/* { dg-do run } */
+/* { dg-options "-std=gnu23 -O2" } */
+
+const unsigned char a[] = {
+#embed __FILE__
+};
+const unsigned char b[] = {
+  [10] = 2, [5] = 3, [13] = 4, [17] = 5, [0] =
+#embed __FILE__ suffix(,) limit (256)
+  [18] = a[18]
+};
+
+int
+main ()
+{
+  if (sizeof (b) != 256 || __builtin_memcmp (b, a, 256))
+    __builtin_abort ();
+}
diff --git a/gcc/testsuite/gcc.dg/cpp/embed-16.c b/gcc/testsuite/gcc.dg/cpp/embed-16.c
new file mode 100644
index 000000000000..0bb4d35bc500
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/embed-16.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-std=gnu23 -Woverride-init" } */
+
+const unsigned char a[] = {
+#embed __FILE__
+};
+const unsigned char b[] = {
+  [10] = 2, [5] = 3, [13] = 4, [17] = 5, [0] =
+#embed __FILE__ suffix(,) limit (256)  /* { dg-warning "initialized field overwritten" } */
+  [18] = a[18]                         /* { dg-warning "initialized field overwritten" } */
+};
diff --git a/gcc/testsuite/gcc.dg/cpp/embed-8.c b/gcc/testsuite/gcc.dg/cpp/embed-8.c
new file mode 100644
index 000000000000..414e026b535a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/embed-8.c
@@ -0,0 +1,87 @@
+/* { dg-do run } */
+/* { dg-options "-std=c23" } */
+
+unsigned char a[] = {
+#embed __FILE__
+};
+unsigned char b[] = {
+  [26] =
+#embed __FILE__
+};
+unsigned char c[] = {
+#embed __FILE__ suffix (,)
+  [sizeof (a) / 4] = 0,
+  [sizeof (a) / 2] = 1,
+  [1] = 2,
+  [sizeof (a) - 2] = 3
+};
+unsigned char d[] = {
+  [1] = 4,
+  [26] = 5,
+  [sizeof (a) / 4] = 6,
+  [sizeof (a) / 2] = 7,
+  [sizeof (a) - 2] = 8,
+#embed __FILE__ prefix ([0] = )
+};
+unsigned char e[] = {
+#embed __FILE__ suffix (,)
+  [2] = 9,
+  [sizeof (a) - 3] = 10
+};
+unsigned char f[] = {
+  [23] = 11,
+  [sizeof (a) / 4 - 1] = 12,
+#embed __FILE__ limit (128) prefix ([sizeof (a) / 4 - 1] = ) suffix (,)
+#embed __FILE__ limit (130) prefix ([sizeof (a) / 4 - 2] = ) suffix (,)
+#embed __FILE__ prefix ([sizeof (a) / 4 + 10] = ) suffix (,)
+#embed __FILE__ limit (128) prefix ([sizeof (a) + sizeof (a) / 4 - 30] = ) suffix (,)
+#embed __FILE__ limit (128) prefix ([sizeof (a) / 4 + 96] = ) suffix (,)
+};
+const unsigned char g[] = {
+#embed __FILE__ limit (128) prefix (  [10] = 2, [5] = 3, [13] = 4, [17] = 5, [0] = )
+};
+unsigned char z[sizeof (a) / 4] = {
+};
+
+int
+main ()
+{
+  if (sizeof (b) != sizeof (a) + 26
+      || __builtin_memcmp (a, b + 26, sizeof (a)))
+    __builtin_abort ();
+  if (sizeof (c) != sizeof (a)
+      || a[0] != c[0]
+      || c[1] != 2
+      || __builtin_memcmp (a + 2, c + 2, sizeof (a) / 4 - 2)
+      || c[sizeof (a) / 4] != 0
+      || __builtin_memcmp (a + sizeof (a) / 4 + 1, c + sizeof (a) / 4 + 1, sizeof (a) / 2 - sizeof (a) / 4 - 1)
+      || c[sizeof (a) / 2] != 1
+      || __builtin_memcmp (a + sizeof (a) / 2 + 1, c + sizeof (a) / 2 + 1, sizeof (a) - sizeof (a) / 2 - 3)
+      || c[sizeof (a) - 2] != 3
+      || a[sizeof (a) - 1] != c[sizeof (a) - 1])
+    __builtin_abort ();
+  if (sizeof (d) != sizeof (a)
+      || __builtin_memcmp (a, d, sizeof (a)))
+    __builtin_abort ();
+  if (sizeof (e) != sizeof (a)
+      || a[0] != e[0]
+      || a[1] != e[1]
+      || e[2] != 9
+      || __builtin_memcmp (a + 3, e + 3, sizeof (a) - 6)
+      || e[sizeof (a) - 3] != 10
+      || a[sizeof (a) - 2] != e[sizeof (a) - 2]
+      || a[sizeof (a) - 1] != e[sizeof (a) - 1])
+    __builtin_abort ();
+  if (sizeof (f) != sizeof (a) + sizeof (a) / 4 - 30 + 128
+      || __builtin_memcmp (z, f, 23)
+      || f[23] != 11
+      || __builtin_memcmp (z, f + 24, sizeof (a) / 4 - 2 - 24)
+      || __builtin_memcmp (f + sizeof (a) / 4 - 2, a, 12)
+      || __builtin_memcmp (f + sizeof (a) / 4 + 10, a, 86)
+      || __builtin_memcmp (f + sizeof (a) / 4 + 96, a, 128)
+      || __builtin_memcmp (f + sizeof (a) / 4 + 96 + 128, a + 86 + 128, sizeof (a) - 86 - 128 - 40)
+      || __builtin_memcmp (f + sizeof (a) + sizeof (a) / 4 - 30, a, 128))
+    __builtin_abort ();
+  if (sizeof (g) != 128 || __builtin_memcmp (g, a, 128))
+    __builtin_abort ();
+}
diff --git a/gcc/testsuite/gcc.dg/cpp/embed-9.c b/gcc/testsuite/gcc.dg/cpp/embed-9.c
new file mode 100644
index 000000000000..677540a14398
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/embed-9.c
@@ -0,0 +1,42 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c23 -Woverride-init" } */
+
+unsigned char a[] = {
+#embed __FILE__
+};
+unsigned char b[] = {
+  [26] =
+#embed __FILE__
+};
+unsigned char c[] = {
+#embed __FILE__ suffix (,)
+  [sizeof (a) / 4] = 0,                /* { dg-warning "initialized field overwritten" } */
+  [sizeof (a) / 2] = 1,                /* { dg-warning "initialized field overwritten" } */
+  [1] = 2,                     /* { dg-warning "initialized field overwritten" } */
+  [sizeof (a) - 2] = 3         /* { dg-warning "initialized field overwritten" } */
+};
+unsigned char d[] = {
+  [1] = 4,
+  [26] = 5,
+  [sizeof (a) / 4] = 6,
+  [sizeof (a) / 2] = 7,
+  [sizeof (a) - 2] = 8,
+#embed __FILE__ prefix ([0] = )        /* { dg-warning "initialized field overwritten" } */
+};
+unsigned char e[] = {
+#embed __FILE__ suffix (,)
+  [2] = 9,                     /* { dg-warning "initialized field overwritten" } */
+  [sizeof (a) - 3] = 10                /* { dg-warning "initialized field overwritten" } */
+};
+unsigned char f[] = {
+  [23] = 11,
+  [sizeof (a) / 4 - 1] = 12,
+#embed __FILE__ limit (128) prefix ([sizeof (a) / 4 - 1] = ) suffix (,)                /* { dg-warning "initialized field overwritten" } */
+#embed __FILE__ limit (130) prefix ([sizeof (a) / 4 - 2] = ) suffix (,)                /* { dg-warning "initialized field overwritten" } */
+#embed __FILE__ prefix ([sizeof (a) / 4 + 10] = ) suffix (,)                   /* { dg-warning "initialized field overwritten" } */
+#embed __FILE__ limit (128) prefix ([sizeof (a) + sizeof (a) / 4 - 30] = ) suffix (,) /* { dg-warning "initialized field overwritten" } */
+#embed __FILE__ limit (128) prefix ([sizeof (a) / 4 + 96] = ) suffix (,)       /* { dg-warning "initialized field overwritten" } */
+};
+const unsigned char g[] = {
+#embed __FILE__ limit (128) prefix (  [10] = 2, [5] = 3, [13] = 4, [17] = 5, [0] = )   /* { dg-warning "initialized field overwritten" } */
+};
diff --git a/gcc/testsuite/gcc.dg/lto/embed-1_0.c b/gcc/testsuite/gcc.dg/lto/embed-1_0.c
new file mode 100644
index 000000000000..f74a0300926c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/embed-1_0.c
@@ -0,0 +1,19 @@
+/* { dg-lto-do run } */
+/* { dg-lto-options { { -std=c23 -flto } } } */
+
+extern const unsigned char a[];
+extern const int asz;
+
+const unsigned char b[] = {
+  #embed "../../c-c++-common/cpp/embed-dir/magna-carta.txt"
+};
+
+int
+main ()
+{
+  if (asz != sizeof (b)
+      || __builtin_memcmp (a, b, 256)
+      || a[256] != 42
+      || __builtin_memcmp (a + 257, b + 257, asz - 257))
+    __builtin_abort ();
+}
diff --git a/gcc/testsuite/gcc.dg/lto/embed-1_1.c b/gcc/testsuite/gcc.dg/lto/embed-1_1.c
new file mode 100644
index 000000000000..65f0c821f791
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/embed-1_1.c
@@ -0,0 +1,5 @@
+const unsigned char a[] = {
+  #embed "../../c-c++-common/cpp/embed-dir/magna-carta.txt" prefix ([0] = ) suffix (,)
+  [256] = 42
+};
+const int asz = sizeof (a);
diff --git a/gcc/testsuite/gcc.dg/pch/embed-1.c b/gcc/testsuite/gcc.dg/pch/embed-1.c
new file mode 100644
index 000000000000..76507c76c05c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pch/embed-1.c
@@ -0,0 +1,17 @@
+/* { dg-options "-std=c23 --embed-dir=${srcdir}/c-c++-common/cpp/embed-dir" } */
+
+#include "embed-1.h"
+
+const unsigned char b[] = {
+  #embed <magna-carta.txt>
+};
+
+int
+main ()
+{
+  if (sizeof (a) != sizeof (b)
+      || __builtin_memcmp (a, b, 256)
+      || a[256] != 42
+      || __builtin_memcmp (a + 257, b + 257, sizeof (a) - 257))
+    __builtin_abort ();
+}
diff --git a/gcc/testsuite/gcc.dg/pch/embed-1.hs b/gcc/testsuite/gcc.dg/pch/embed-1.hs
new file mode 100644
index 000000000000..96575401eb09
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pch/embed-1.hs
@@ -0,0 +1,6 @@
+/* { dg-options "-std=c23 --embed-dir=${srcdir}/c-c++-common/cpp/embed-dir" } */
+
+const unsigned char a[] = {
+  #embed <magna-carta.txt> prefix ([0] = ) suffix (,)
+  [256] = 42
+};
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 4ba63ebd4f1b..e82f3a5f4f5b 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1515,6 +1515,13 @@ struct GTY(()) tree_string {
   char str[1];
 };
 
+struct GTY((user)) tree_raw_data {
+  struct tree_typed typed;
+  tree owner;
+  const char *str;
+  int length;
+};
+
 struct GTY(()) tree_complex {
   struct tree_typed typed;
   tree real;
@@ -2105,6 +2112,7 @@ union GTY ((ptr_alias (union lang_tree_node),
   struct tree_fixed_cst GTY ((tag ("TS_FIXED_CST"))) fixed_cst;
   struct tree_vector GTY ((tag ("TS_VECTOR"))) vector;
   struct tree_string GTY ((tag ("TS_STRING"))) string;
+  struct tree_raw_data GTY ((tag ("TS_RAW_DATA_CST"))) raw_data_cst;
   struct tree_complex GTY ((tag ("TS_COMPLEX"))) complex;
   struct tree_identifier GTY ((tag ("TS_IDENTIFIER"))) identifier;
   struct tree_decl_minimal GTY((tag ("TS_DECL_MINIMAL"))) decl_minimal;
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index b378ffbfb4ca..cf36b3716910 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -2519,6 +2519,28 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
       }
       break;
 
+    case RAW_DATA_CST:
+      for (unsigned i = 0; i < (unsigned) RAW_DATA_LENGTH (node); ++i)
+       {
+         if (TYPE_UNSIGNED (TREE_TYPE (node))
+             || TYPE_PRECISION (TREE_TYPE (node)) > CHAR_BIT)
+           pp_decimal_int (pp, ((const unsigned char *)
+                                RAW_DATA_POINTER (node))[i]);
+         else
+           pp_decimal_int (pp, ((const signed char *)
+                                RAW_DATA_POINTER (node))[i]);
+         if (i == RAW_DATA_LENGTH (node) - 1U)
+           break;
+         else if (i == 9 && RAW_DATA_LENGTH (node) > 20)
+           {
+             pp_string (pp, ", ..., ");
+             i = RAW_DATA_LENGTH (node) - 11;
+           }
+         else
+           pp_string (pp, ", ");
+       }
+      break;
+
     case FUNCTION_TYPE:
     case METHOD_TYPE:
       dump_generic_node (pp, TREE_TYPE (node), spc, flags, false);
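Note on the tree-pretty-print.cc hunk above: for a RAW_DATA_CST longer than 20 bytes the dump shows the first ten values, then ", ..., ", then the last ten, by jumping the loop index from 9 to length - 11. The following standalone sketch reproduces only that index logic for unsigned bytes; `format_raw_bytes` is a hypothetical helper that writes into a caller-supplied buffer instead of a pretty_printer.

```c
#include <stddef.h>
#include <stdio.h>

/* Format LEN bytes at DATA into OUT (capacity OUTSZ) as decimal values
   separated by ", ".  For more than 20 bytes, only the first 10 and the
   last 10 are printed, with ", ..., " in between.  Returns the number of
   characters written, stopping early if OUT is too small.  */
size_t
format_raw_bytes (const unsigned char *data, unsigned len,
                  char *out, size_t outsz)
{
  size_t used = 0;
  if (outsz)
    out[0] = '\0';
  for (unsigned i = 0; i < len; ++i)
    {
      int elide = (i == 9 && len > 20);
      const char *sep;
      if (i == len - 1U)
        sep = "";
      else if (elide)
        sep = ", ..., ";
      else
        sep = ", ";
      int n = snprintf (out + used, outsz - used, "%u%s",
                        (unsigned) data[i], sep);
      if (n < 0 || (size_t) n >= outsz - used)
        return used;    /* Output would not fit; stop here.  */
      used += (size_t) n;
      if (elide)
        i = len - 11;   /* The loop's ++i resumes at len - 10.  */
    }
  return used;
}
```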
diff --git a/gcc/tree-streamer-in.cc b/gcc/tree-streamer-in.cc
index 329d218e7d4e..2fa998d37f85 100644
--- a/gcc/tree-streamer-in.cc
+++ b/gcc/tree-streamer-in.cc
@@ -636,6 +636,19 @@ streamer_alloc_tree (class lto_input_block *ib, class data_in *data_in,
        = (enum omp_clause_code) streamer_read_uhwi (ib);
       return build_omp_clause (UNKNOWN_LOCATION, subcode);
     }
+  else if (code == RAW_DATA_CST)
+    {
+      unsigned HOST_WIDE_INT len = streamer_read_uhwi (ib);
+      if (len == 0)
+       result = streamer_read_string_cst (data_in, ib);
+      else
+       {
+         unsigned HOST_WIDE_INT off = streamer_read_uhwi (ib);
+         result = make_node (code);
+         RAW_DATA_LENGTH (result) = len;
+         RAW_DATA_POINTER (result) = (const char *) (uintptr_t) off;
+       }
+    }
   else
     {
       /* For all other nodes, materialize the tree with a raw
@@ -1048,6 +1061,22 @@ lto_input_ts_constructor_tree_pointers (class lto_input_block *ib,
 }
 
 
+/* Read all pointer fields in the TS_RAW_DATA_CST structure of EXPR from
+   input block IB.  DATA_IN contains tables and descriptors for the
+   file being read.  */
+
+static void
+lto_input_ts_raw_data_cst_tree_pointers (class lto_input_block *ib,
+                                        class data_in *data_in, tree expr)
+{
+  RAW_DATA_OWNER (expr) = stream_read_tree_ref (ib, data_in);
+  gcc_checking_assert (RAW_DATA_OWNER (expr)
+                      && TREE_CODE (RAW_DATA_OWNER (expr)) == STRING_CST);
+  RAW_DATA_POINTER (expr) = (TREE_STRING_POINTER (RAW_DATA_OWNER (expr))
+                            + (uintptr_t) RAW_DATA_POINTER (expr));
+}
+
+
 /* Read all pointer fields in the TS_OMP_CLAUSE structure of EXPR from
    input block IB.  DATA_IN contains tables and descriptors for the
    file being read.  */
@@ -1129,6 +1158,9 @@ streamer_read_tree_body (class lto_input_block *ib, class data_in *data_in,
   if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR))
     lto_input_ts_constructor_tree_pointers (ib, data_in, expr);
 
+  if (code == RAW_DATA_CST)
+    lto_input_ts_raw_data_cst_tree_pointers (ib, data_in, expr);
+
   if (code == OMP_CLAUSE)
     lto_input_ts_omp_clause_tree_pointers (ib, data_in, expr);
 }
diff --git a/gcc/tree-streamer-out.cc b/gcc/tree-streamer-out.cc
index 81f5aeb30a6d..2fdd914b1663 100644
--- a/gcc/tree-streamer-out.cc
+++ b/gcc/tree-streamer-out.cc
@@ -880,6 +880,19 @@ write_ts_constructor_tree_pointers (struct output_block *ob, tree expr)
 }
 
 
+/* Write all pointer fields in the RAW_DATA_CST/TS_RAW_DATA_CST structure of
+   EXPR to output block OB.  */
+
+static void
+write_ts_raw_data_cst_tree_pointers (struct output_block *ob, tree expr)
+{
+  /* Only write this for non-NULL RAW_DATA_OWNER.  RAW_DATA_CST with
+     NULL RAW_DATA_OWNER is streamed to be read back as STRING_CST.  */
+  if (RAW_DATA_OWNER (expr) != NULL_TREE)
+    stream_write_tree_ref (ob, RAW_DATA_OWNER (expr));
+}
+
+
 /* Write all pointer fields in the TS_OMP_CLAUSE structure of EXPR
    to output block OB.  If REF_P is true, write a reference to EXPR's
    pointer fields.  */
@@ -973,6 +986,9 @@ streamer_write_tree_body (struct output_block *ob, tree expr)
   if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR))
     write_ts_constructor_tree_pointers (ob, expr);
 
+  if (code == RAW_DATA_CST)
+    write_ts_raw_data_cst_tree_pointers (ob, expr);
+
   if (code == OMP_CLAUSE)
     write_ts_omp_clause_tree_pointers (ob, expr);
 }
@@ -1028,6 +1044,35 @@ streamer_write_tree_header (struct output_block *ob, tree expr)
     streamer_write_uhwi (ob, call_expr_nargs (expr));
   else if (TREE_CODE (expr) == OMP_CLAUSE)
     streamer_write_uhwi (ob, OMP_CLAUSE_CODE (expr));
+  else if (TREE_CODE (expr) == RAW_DATA_CST)
+    {
+      if (RAW_DATA_OWNER (expr) == NULL_TREE)
+       {
+         /* RAW_DATA_CST with NULL RAW_DATA_OWNER is an owner of other
+            RAW_DATA_CST's data.  This should be streamed out so that
+            it can be streamed back in as a STRING_CST instead, but without
+            the need to duplicate the possibly large data.  */
+         streamer_write_uhwi (ob, 0);
+         streamer_write_string_with_length (ob, ob->main_stream,
+                                            RAW_DATA_POINTER (expr),
+                                            RAW_DATA_LENGTH (expr), true);
+       }
+      else
+       {
+         streamer_write_uhwi (ob, RAW_DATA_LENGTH (expr));
+         tree owner = RAW_DATA_OWNER (expr);
+         unsigned HOST_WIDE_INT off;
+         if (TREE_CODE (owner) == STRING_CST)
+           off = RAW_DATA_POINTER (expr) - TREE_STRING_POINTER (owner);
+         else
+           {
+             gcc_checking_assert (TREE_CODE (owner) == RAW_DATA_CST
+                                  && RAW_DATA_OWNER (owner) == NULL_TREE);
+             off = RAW_DATA_POINTER (expr) - RAW_DATA_POINTER (owner);
+           }
+         streamer_write_uhwi (ob, off);
+       }
+    }
   else if (CODE_CONTAINS_STRUCT (code, TS_INT_CST))
     {
       gcc_checking_assert (TREE_INT_CST_NUNITS (expr));
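Note on the LTO streaming hunks above: a RAW_DATA_CST that has an owner is written as its length plus the offset of its data within the owner, and the reader later rebuilds RAW_DATA_POINTER by adding that offset to the start of the owner's data as read back. The sketch below shows the same offset round trip on plain buffers; it is an illustration only, and `struct raw_view`, `raw_view_offset` and `raw_view_rebase` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view: LENGTH bytes at STR, where STR points into a buffer
   owned by something else.  */
struct raw_view
{
  const char *str;
  int length;
};

/* Writer side: express V's data pointer as an offset from the start of
   the owner's buffer, so no absolute address needs to be stored.  */
uintptr_t
raw_view_offset (const struct raw_view *v, const char *owner_base)
{
  return (uintptr_t) (v->str - owner_base);
}

/* Reader side: rebuild the view on top of wherever the owner's buffer
   now lives.  */
struct raw_view
raw_view_rebase (const char *new_owner_base, uintptr_t off, int length)
{
  struct raw_view v = { new_owner_base + off, length };
  return v;
}
```

Storing an offset rather than the bytes means the possibly large data is streamed once, with the owner, which matches the stated intent of the comment in streamer_write_tree_header.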
diff --git a/gcc/tree-streamer.cc b/gcc/tree-streamer.cc
index f4e1290b20f5..72e967a9861b 100644
--- a/gcc/tree-streamer.cc
+++ b/gcc/tree-streamer.cc
@@ -60,6 +60,7 @@ streamer_check_handled_ts_structures (void)
   handled_p[TS_FIXED_CST] = true;
   handled_p[TS_VECTOR] = true;
   handled_p[TS_STRING] = true;
+  handled_p[TS_RAW_DATA_CST] = true;
   handled_p[TS_COMPLEX] = true;
   handled_p[TS_IDENTIFIER] = true;
   handled_p[TS_DECL_MINIMAL] = true;
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 095c02c54741..392c3dc879e7 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -513,6 +513,7 @@ tree_node_structure_for_code (enum tree_code code)
     case STRING_CST:           return TS_STRING;
     case VECTOR_CST:           return TS_VECTOR;
     case VOID_CST:             return TS_TYPED;
+    case RAW_DATA_CST:         return TS_RAW_DATA_CST;
 
       /* tcc_exceptional cases.  */
     case BLOCK:                        return TS_BLOCK;
@@ -571,6 +572,7 @@ initialize_tree_contains_struct (void)
        case TS_FIXED_CST:
        case TS_VECTOR:
        case TS_STRING:
+       case TS_RAW_DATA_CST:
        case TS_COMPLEX:
        case TS_SSA_NAME:
        case TS_CONSTRUCTOR:
@@ -1026,6 +1028,7 @@ tree_code_size (enum tree_code code)
        case REAL_CST:          return sizeof (tree_real_cst);
        case FIXED_CST:         return sizeof (tree_fixed_cst);
        case COMPLEX_CST:       return sizeof (tree_complex);
+       case RAW_DATA_CST:      return sizeof (tree_raw_data);
        case VECTOR_CST:        gcc_unreachable ();
        case STRING_CST:        gcc_unreachable ();
        default:
@@ -10458,6 +10461,15 @@ initializer_zerop (const_tree init, bool *nonzero /* = NULL */)
       *nonzero = true;
       return false;
 
+    case RAW_DATA_CST:
+      for (unsigned int i = 0; i < (unsigned int) RAW_DATA_LENGTH (init); ++i)
+       if (RAW_DATA_POINTER (init)[i])
+         {
+           *nonzero = true;
+           return false;
+         }
+      return true;
+
     case CONSTRUCTOR:
       {
        if (TREE_CLOBBER_P (init))
@@ -15219,6 +15231,50 @@ tree_cc_finalize (void)
   vec_free (bitint_type_cache);
 }
 
+void
+gt_ggc_mx (tree_raw_data *x)
+{
+  gt_ggc_m_9tree_node (x->typed.type);
+  gt_ggc_m_9tree_node (x->owner);
+}
+
+void
+gt_pch_nx (tree_raw_data *x)
+{
+  gt_pch_n_9tree_node (x->typed.type);
+  gt_pch_n_9tree_node (x->owner);
+}
+
+/* For PCH we guarantee that RAW_DATA_CST's RAW_DATA_OWNER is a STRING_CST and
+   RAW_DATA_POINTER points into it.  We don't want to save/restore
+   RAW_DATA_POINTER on its own but want to restore it pointing at the same
+   offset of the STRING_CST as before.  */
+
+void
+gt_pch_nx (tree_raw_data *x, gt_pointer_operator op, void *cookie)
+{
+  op (&x->typed.type, NULL, cookie);
+  gcc_checking_assert (x->owner
+                      && TREE_CODE (x->owner) == STRING_CST
+                      && x->str >= TREE_STRING_POINTER (x->owner)
+                      && (x->str + x->length
+                          <= (TREE_STRING_POINTER (x->owner)
+                              + TREE_STRING_LENGTH (x->owner))));
+  ptrdiff_t off = x->str - (const char *) (x->owner);
+  tree owner = x->owner;
+  op (&x->owner, NULL, cookie);
+  x->owner = owner;
+  /* The above op call relocates x->owner and remembers the address
+     for relocation e.g. if the compiler is position independent.
+     We then restore x->owner back to its previous value and call
+     op again, for x->owner itself this just repeats (uselessly) what
+     the first call did, but as the second argument is now non-NULL
+     and different, it also arranges for &x->str to be noted for the
+     PIE relocation.  */
+  op (&x->owner, &x->str, cookie);
+  x->str = (const char *) (x->owner) + off;
+}
+
 #if CHECKING_P
 
 namespace selftest {
diff --git a/gcc/tree.def b/gcc/tree.def
index 85ab182c6f5c..dd60d1ecde71 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -309,6 +309,14 @@ DEFTREECODE (VECTOR_CST, "vector_cst", tcc_constant, 0)
 /* Contents are TREE_STRING_LENGTH and the actual contents of the string.  */
 DEFTREECODE (STRING_CST, "string_cst", tcc_constant, 0)
 
+/* Contents are RAW_DATA_LENGTH and the actual content
+   of the raw data, plus RAW_DATA_OWNER for owner of the
+   data.  That can be either a STRING_CST, used e.g. when writing
+   PCH header, or another RAW_DATA_CST representing data owned by
+   libcpp and representing the original range (if possible).
+   TREE_TYPE is the type of each of the RAW_DATA_LENGTH elements.  */
+DEFTREECODE (RAW_DATA_CST, "raw_data_cst", tcc_constant, 0)
+
 /* Declarations.  All references to names are represented as ..._DECL
    nodes.  The decls in one binding context are chained through the
    TREE_CHAIN field.  Each DECL has a DECL_NAME field which contains
diff --git a/gcc/tree.h b/gcc/tree.h
index 75efc760a163..d324a3f42a67 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1165,6 +1165,14 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
 #define TREE_STRING_POINTER(NODE) \
   ((const char *)(STRING_CST_CHECK (NODE)->string.str))
 
+/* In a RAW_DATA_CST */
+#define RAW_DATA_LENGTH(NODE) \
+  (RAW_DATA_CST_CHECK (NODE)->raw_data_cst.length)
+#define RAW_DATA_POINTER(NODE) \
+  (RAW_DATA_CST_CHECK (NODE)->raw_data_cst.str)
+#define RAW_DATA_OWNER(NODE) \
+  (RAW_DATA_CST_CHECK (NODE)->raw_data_cst.owner)
+
 /* In a COMPLEX_CST node.  */
 #define TREE_REALPART(NODE) (COMPLEX_CST_CHECK (NODE)->complex.real)
 #define TREE_IMAGPART(NODE) (COMPLEX_CST_CHECK (NODE)->complex.imag)
@@ -6757,6 +6765,9 @@ extern location_t set_block (location_t loc, tree block);
 extern void gt_ggc_mx (tree &);
 extern void gt_pch_nx (tree &);
 extern void gt_pch_nx (tree &, gt_pointer_operator, void *);
+extern void gt_ggc_mx (tree_raw_data *);
+extern void gt_pch_nx (tree_raw_data *);
+extern void gt_pch_nx (tree_raw_data *, gt_pointer_operator, void *);
 
 extern bool nonnull_arg_p (const_tree);
 extern bool is_empty_type (const_tree);
diff --git a/gcc/treestruct.def b/gcc/treestruct.def
index d5d0a60415b3..0f73a9747ba9 100644
--- a/gcc/treestruct.def
+++ b/gcc/treestruct.def
@@ -39,6 +39,7 @@ DEFTREESTRUCT(TS_REAL_CST, "real cst")
 DEFTREESTRUCT(TS_FIXED_CST, "fixed cst")
 DEFTREESTRUCT(TS_VECTOR, "vector")
 DEFTREESTRUCT(TS_STRING, "string")
+DEFTREESTRUCT(TS_RAW_DATA_CST, "raw data cst")
 DEFTREESTRUCT(TS_COMPLEX, "complex")
 DEFTREESTRUCT(TS_IDENTIFIER, "identifier")
 DEFTREESTRUCT(TS_DECL_MINIMAL, "decl minimal")
diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 4426e7ce6c65..92b105a4089a 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -3194,6 +3194,11 @@ const_hash_1 (const tree exp)
        return hi;
       }
 
+    case RAW_DATA_CST:
+      p = RAW_DATA_POINTER (exp);
+      len = RAW_DATA_LENGTH (exp);
+      break;
+
     case CONSTRUCTOR:
       {
        unsigned HOST_WIDE_INT idx;
@@ -4887,6 +4892,7 @@ initializer_constant_valid_p_1 (tree value, tree endtype, tree *cache)
     case FIXED_CST:
     case STRING_CST:
     case COMPLEX_CST:
+    case RAW_DATA_CST:
       return null_pointer_node;
 
     case ADDR_EXPR:
@@ -5480,6 +5486,9 @@ array_size_for_constructor (tree val)
     {
       if (TREE_CODE (index) == RANGE_EXPR)
        index = TREE_OPERAND (index, 1);
+      if (value && TREE_CODE (value) == RAW_DATA_CST)
+       index = size_binop (PLUS_EXPR, index,
+                           size_int (RAW_DATA_LENGTH (value) - 1));
       if (max_index == NULL_TREE || tree_int_cst_lt (max_index, index))
        max_index = index;
     }
@@ -5671,6 +5680,12 @@ output_constructor_regular_field (oc_local_state *local)
   /* Output the element's initial value.  */
   if (local->val == NULL_TREE)
     assemble_zeros (fieldsize);
+  else if (local->val && TREE_CODE (local->val) == RAW_DATA_CST)
+    {
+      fieldsize *= RAW_DATA_LENGTH (local->val);
+      assemble_string (RAW_DATA_POINTER (local->val),
+                      RAW_DATA_LENGTH (local->val));
+    }
   else
     fieldsize = output_constant (local->val, fieldsize, align2,
                                 local->reverse, false);
diff --git a/libcpp/files.cc b/libcpp/files.cc
index 8f9a5a4f7972..fbbd59e62a3d 100644
--- a/libcpp/files.cc
+++ b/libcpp/files.cc
@@ -1239,15 +1239,19 @@ finish_embed (cpp_reader *pfile, _cpp_file *file,
   if (params->limit < limit)
     limit = params->limit;
 
-  /* For sizes larger than say 64 bytes, this is just a temporary
-     solution, we should emit a single new token which the FEs will
-     handle as an optimization.  */
+  size_t embed_tokens = 0;
+  if (!CPP_OPTION (pfile, cplusplus)
+      && CPP_OPTION (pfile, lang) != CLK_ASM
+      && limit >= 64)
+    embed_tokens = ((limit - 2) / INT_MAX) + (((limit - 2) % INT_MAX) != 0);
+
   size_t max = INTTYPE_MAXIMUM (size_t) / sizeof (cpp_token);
-  if (limit > max / 2
+  if ((embed_tokens ? (embed_tokens > (max - 3) / 2) : (limit > max / 2))
       || (limit
          ? (params->prefix.count > max
             || params->suffix.count > max
-            || (limit * 2 - 1 + params->prefix.count
+            || ((embed_tokens ? embed_tokens * 2 + 3 : limit * 2 - 1)
+                + params->prefix.count
                 + params->suffix.count > max))
          : params->if_empty.count > max))
     {
@@ -1281,13 +1285,16 @@ finish_embed (cpp_reader *pfile, _cpp_file *file,
                        "%s is too large", file->path);
          return 0;
        }
+      if (embed_tokens && i == 0)
+       i = limit - 2;
     }
   uchar *s = len ? _cpp_unaligned_alloc (pfile, len) : NULL;
   _cpp_buff *tok_buff = NULL;
   cpp_token *tok = &pfile->directive_result, *toks = tok;
   size_t count = 0;
   if (limit)
-    count = (params->prefix.count + limit * 2 - 1
+    count = (params->prefix.count
+            + (embed_tokens ? embed_tokens * 2 + 3 : limit * 2 - 1)
             + params->suffix.count) - 1;
   else if (params->if_empty.count)
     count = params->if_empty.count - 1;
@@ -1339,6 +1346,34 @@ finish_embed (cpp_reader *pfile, _cpp_file *file,
          tok->flags = NO_EXPAND;
          tok++;
        }
+      if (i == 0 && embed_tokens)
+       {
+         ++i;
+         for (size_t j = 0; j < embed_tokens; ++j)
+           {
+             tok->src_loc = params->loc;
+             tok->type = CPP_EMBED;
+             tok->flags = NO_EXPAND;
+             tok->val.str.text = &buffer[i];
+             tok->val.str.len
+               = limit - 1 - i > INT_MAX ? INT_MAX : limit - 1 - i;
+             i += tok->val.str.len;
+             if (tok->val.str.len < 32 && j)
+               {
+                 /* Avoid a CPP_EMBED with fewer than 32 bytes: shrink the
+                    previous CPP_EMBED by 64 and grow this one by 64.  */
+                 tok[-2].val.str.len -= 64;
+                 tok->val.str.text -= 64;
+                 tok->val.str.len += 64;
+               }
+             tok++;
+             tok->src_loc = params->loc;
+             tok->type = CPP_COMMA;
+             tok->flags = NO_EXPAND;
+             tok++;
+           }
+         --i;
+       }
     }
   if (limit && params->suffix.count)
     {
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 3f05d085fcf9..0d11d076dcbc 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -144,6 +144,8 @@ class rich_location;
   TK(STRING32_USERDEF, LITERAL) /* U"string"_suffix - C++11 */         \
   TK(UTF8STRING_USERDEF,LITERAL) /* u8"string"_suffix - C++11 */       \
                                                                        \
+  TK(EMBED,            LITERAL) /* #embed - C23 */                     \
+                                                                       \
   TK(COMMENT,          LITERAL) /* Only if output comments.  */        \
                                 /* SPELL_LITERAL happens to DTRT.  */  \
  TK(MACRO_ARG,                NONE)    /* Macro argument.  */                 \
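
The token arithmetic in the libcpp/files.cc hunks above can be checked in isolation. The following is only a minimal sketch, not code from the patch: CHUNK_MAX is a hypothetical small stand-in for INT_MAX so that the split is visible on small inputs, and embed_token_count, total_token_count and split_chunks are illustrative names that do not exist in libcpp. It assumes the C (non-C++, non-asm) path and ignores prefix/suffix tokens.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for INT_MAX (the real code caps each chunk at
   INT_MAX bytes).  */
#define CHUNK_MAX 1000

/* Number of CPP_EMBED tokens for LIMIT bytes.  The first and last bytes
   stay ordinary CPP_NUMBER tokens, so LIMIT - 2 bytes are split into
   chunks of at most CHUNK_MAX bytes (rounded up).  Returns 0 when the
   data is too short for the optimization (fewer than 64 bytes).  */
static size_t
embed_token_count (size_t limit)
{
  if (limit < 64)
    return 0;
  return (limit - 2) / CHUNK_MAX + ((limit - 2) % CHUNK_MAX != 0);
}

/* Total tokens for the data itself: NUMBER COMMA, then EMBED COMMA per
   chunk, then NUMBER; or LIMIT numbers separated by commas when the
   optimization is not used.  Assumes LIMIT is nonzero.  */
static size_t
total_token_count (size_t limit)
{
  size_t n = embed_token_count (limit);
  return n ? n * 2 + 3 : limit * 2 - 1;
}

/* Store the byte length of each chunk into LENS and return the number
   of chunks.  Mirrors the loop in the hunk: a trailing chunk shorter
   than 32 bytes borrows 64 bytes from the previous chunk.  */
static size_t
split_chunks (size_t limit, size_t *lens, size_t max_chunks)
{
  size_t n = embed_token_count (limit);
  size_t i = 1;			/* Byte 0 is the leading CPP_NUMBER.  */
  assert (n <= max_chunks);
  for (size_t j = 0; j < n; ++j)
    {
      size_t left = limit - 1 - i;	/* Last byte is the trailing CPP_NUMBER.  */
      lens[j] = left > CHUNK_MAX ? CHUNK_MAX : left;
      i += lens[j];
      if (lens[j] < 32 && j)
	{
	  lens[j - 1] -= 64;
	  lens[j] += 64;
	}
    }
  return n;
}
```

With CHUNK_MAX at 1000, a 2012-byte file has 2010 middle bytes, which would naively split as 1000 + 1000 + 10; the final 10-byte chunk is below 32 bytes, so the split becomes 1000 + 936 + 74 and the data expands to 9 tokens instead of 4023.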
