On Thu, Sep 14, 2023 at 10:54 AM Joseph Myers <jos...@codesourcery.com> wrote: > > On Wed, 13 Sep 2023, Ken Matsui via Gcc-patches wrote: > > > diff --git a/gcc/c/c-parser.h b/gcc/c/c-parser.h > > index 545f0f4d9eb..eed6deaf0f8 100644 > > --- a/gcc/c/c-parser.h > > +++ b/gcc/c/c-parser.h > > @@ -51,14 +51,14 @@ enum c_id_kind { > > /* A single C token after string literal concatenation and conversion > > of preprocessing tokens to tokens. */ > > struct GTY (()) c_token { > > + /* If this token is a keyword, this value indicates which keyword. > > + Otherwise, this value is RID_MAX. */ > > + ENUM_BITFIELD (rid) keyword : 16; > > /* The kind of token. */ > > ENUM_BITFIELD (cpp_ttype) type : 8; > > /* If this token is a CPP_NAME, this value indicates whether also > > declared as some kind of type. Otherwise, it is C_ID_NONE. */ > > ENUM_BITFIELD (c_id_kind) id_kind : 8; > > - /* If this token is a keyword, this value indicates which keyword. > > - Otherwise, this value is RID_MAX. */ > > - ENUM_BITFIELD (rid) keyword : 8; > > /* If this token is a CPP_PRAGMA, this indicates the pragma that > > was seen. Otherwise it is PRAGMA_NONE. */ > > ENUM_BITFIELD (pragma_kind) pragma_kind : 8; > > If you want to optimize layout, I'd expect flags to move so it can share > the same 32-bit unit as the pragma_kind bit-field (not sure if any changes > should be made to the declaration of flags to maximise the chance of such > sharing across different host bit-field ABIs). >
Thank you for your review! I did not make this change aggressively, but we can do the following to minimize the fragmentation: struct GTY (()) c_token { tree value; /* pointer, depends, but 4 or 8 bytes as usual */ location_t location; /* unsigned int, at least 2 bytes, 4 bytes as usual */ ENUM_BITFIELD (rid) keyword : 16; /* 2 bytes */ ENUM_BITFIELD (cpp_ttype) type : 8; /* 1 byte */ ENUM_BITFIELD (c_id_kind) id_kind : 8; /* 1 byte */ ENUM_BITFIELD (pragma_kind) pragma_kind : 8; /* 1 byte */ unsigned char flags; /* 1 byte */ } Supposing a pointer size is 8 bytes and int is 4 bytes, the struct size would be 24 bytes. The internal fragmentation would be 0 bytes, and the external fragmentation is 6 bytes since the overall struct alignment requirement is $K_{max} = 8$ from the pointer. Here is the original struct before making keyword 16-bit. The overall struct alignment requirement is $K_{max} = 8$ from the pointer. This struct size would be 24 bytes since the internal fragmentation is 4 bytes (after location), and the external fragmentation is 3 bytes. struct GTY (()) c_token { ENUM_BITFIELD (cpp_ttype) type : 8; /* 1 byte */ ENUM_BITFIELD (c_id_kind) id_kind : 8; /* 1 byte */ ENUM_BITFIELD (rid) keyword : 8; /* 1 byte */ ENUM_BITFIELD (pragma_kind) pragma_kind : 8; /* 1 byte */ location_t location; /* unsigned int, at least 2 bytes, 4 bytes as usual */ tree value; /* pointer, depends, but 4 or 8 bytes as usual */ unsigned char flags; /* 1 byte */ } If we keep the original order with the 16-bit keyword, the struct size would be 32 bytes (my current implementation as well, I will update this patch). struct GTY (()) c_token { ENUM_BITFIELD (cpp_ttype) type : 8; /* 1 byte */ ENUM_BITFIELD (c_id_kind) id_kind : 8; /* 1 byte */ ENUM_BITFIELD (rid) keyword : 16; /* 2 bytes */ ENUM_BITFIELD (pragma_kind) pragma_kind : 8; /* 1 byte */ location_t location; /* unsigned int, at least 2 bytes, 4 bytes as usual */ tree value; /* pointer, depends, but 4 or 8 bytes as usual */ unsigned char flags; /* 1 byte */ } Likewise, the overall struct alignment requirement is $K_{max} = 8$ from the pointer. The internal fragmentation would be 7 bytes (3 bytes after pragma_kind + 4 bytes after location), and the external fragmentation would be 7 bytes. I think optimizing the size is worth doing unless this breaks GCC. > > diff --git a/gcc/cp/parser.h b/gcc/cp/parser.h > > index 6cbb9a8e031..3c3c482c6ce 100644 > > --- a/gcc/cp/parser.h > > +++ b/gcc/cp/parser.h > > @@ -40,11 +40,11 @@ struct GTY(()) tree_check { > > /* A C++ token. */ > > > > struct GTY (()) cp_token { > > - /* The kind of token. */ > > - enum cpp_ttype type : 8; > > /* If this token is a keyword, this value indicates which keyword. > > Otherwise, this value is RID_MAX. */ > > - enum rid keyword : 8; > > + enum rid keyword : 16; > > + /* The kind of token. */ > > + enum cpp_ttype type : 8; > > /* Token flags. */ > > unsigned char flags; > > /* True if this token is from a context where it is implicitly extern > > "C" */ > > You're missing an update to the "3 unused bits." comment further down. > > > @@ -988,7 +988,7 @@ struct GTY(()) cpp_hashnode { > > unsigned int directive_index : 7; /* If is_directive, > > then index into directive table. > > Otherwise, a NODE_OPERATOR. */ > > - unsigned int rid_code : 8; /* Rid code - for front ends. */ > > + unsigned int rid_code : 16; /* Rid code - for front ends. > > */ > > unsigned int flags : 9; /* CPP flags. */ > > ENUM_BITFIELD(node_type) type : 2; /* CPP node type. */ > > You're missing an update to the "5 bits spare." comment further down. > Thank you! > Do you have any figures for the effects on compilation time or memory > usage from the increase in size of these structures? > Regarding only c_token, we will have the same size if we optimize the size. Although I did not calculate the size of other structs, we might not see any significant performance change? I am taking benchmarks and will let you know once it is done. > -- > Joseph S. Myers > jos...@codesourcery.com