On Mon, Jun 19, 2023 at 08:40:58AM +0000, Richard Biener wrote:
> You also have to fix bp_pack_machine_mode/bp_unpack_machine_mode which
> streams exactly values in [0, 1<<8 - 1].
> 
> CCing Jakub who invented this code.

For stream-out, all it stores is a bool flag whether the mode is streamed
out, and on stream in it contains a mapping table between host and
offloading modes.
For stream in, it actually isn't used despite the comment maybe suggesting
it is, so I guess using MAX_MACHINE_MODE for it is ok.

As you said,
inline void
bp_pack_machine_mode (struct bitpack_d *bp, machine_mode mode)
{
  streamer_mode_table[mode] = 1;
  bp_pack_enum (bp, machine_mode, 1 << 8, mode);
}

inline machine_mode
bp_unpack_machine_mode (struct bitpack_d *bp)
{
  return (machine_mode)
           ((class lto_input_block *)
            bp->stream)->mode_table[bp_unpack_enum (bp, machine_mode, 1 << 8)];
}
needs changing for the case when MAX_MACHINE_MODE > 256, but far more
places make similar assumptions:
E.g. lto_write_mode_table has
          bp_pack_value (&bp, m, 8);
(if MAX_MACHINE_MODE > 256, this can't encode all modes),
          bp_pack_value (&bp, GET_MODE_INNER (m), 8);
(ditto).
lto_input_mode_table has e.g.
  unsigned char *table = ggc_cleared_vec_alloc<unsigned char> (1 << 8);
  file_data->mode_table = table;
Here we need to decide if we keep requiring that offloading architectures
still have MAX_MACHINE_MODE <= 256 or not.  Currently the offloading
arches are nvptx and amdgcn, intelmic support has been removed.
If yes, table can have unsigned char elements, but its size actually depends
on the number of modes on the host side, so lto_write_mode_table would need
to stream out the host MAX_MACHINE_MODE value and lto_input_mode_table
stream it in and use instead of the 1 << 8 size above.
If not, mode_table and unsigned char * would need to change to unsigned
short *, or just conditionally depending on if MAX_MACHINE_MODE <= 256 or
not.
Then
  while ((m = bp_unpack_value (&bp, 8)) != VOIDmode)
    {
...
      machine_mode inner = (machine_mode) bp_unpack_value (&bp, 8);
again hardcode 8 bits, that needs to match how many bits packs the host
compiler in lto_write_mode_table.

        Jakub

Reply via email to