On Mon, Aug 12, 2024 at 10:36 PM Prathamesh Kulkarni
<prathame...@nvidia.com> wrote:
>
> Hi,
> As mentioned in:
> https://gcc.gnu.org/pipermail/gcc/2024-August/244581.html
>
> AArch64 cl_optimization_stream_out streams out target-specific optimization 
> options like flag_aarch64_early_ldp_fusion, aarch64_early_ra etc, which 
> breaks AArch64/nvptx offloading,
> since nvptx cl_optimization_stream_in doesn't have corresponding stream-in 
> for these options and ends up setting invalid values for ptr->explicit_mask 
> (and subsequent data structures).
>
> This makes even a trivial test like the following cause an ICE in
> lto_read_decls with -O3 -fopenmp -foffload=nvptx-none:
>
> int main()
> {
>   int x;
>   #pragma omp target map(x)
>     x;
> }
>
> The attached patch modifies optc-save-gen.awk to generate an
> if (!lto_stream_offload_p) check before streaming out each target-specific
> option in cl_optimization_stream_out, which fixes the issue.
> cl_optimization_stream_out after the patch (last few entries):
>
>   bp_pack_var_len_int (bp, ptr->x_flag_wrapv_pointer);
>   bp_pack_var_len_int (bp, ptr->x_debug_nonbind_markers_p);
>   if (!lto_stream_offload_p)
>   bp_pack_var_len_int (bp, ptr->x_flag_aarch64_early_ldp_fusion);
>   if (!lto_stream_offload_p)
>   bp_pack_var_len_int (bp, ptr->x_aarch64_early_ra);
>   if (!lto_stream_offload_p)
>   bp_pack_var_len_int (bp, ptr->x_flag_aarch64_late_ldp_fusion);
>   if (!lto_stream_offload_p)
>   bp_pack_var_len_int (bp, ptr->x_flag_mlow_precision_div);
>   if (!lto_stream_offload_p)
>   bp_pack_var_len_int (bp, ptr->x_flag_mrecip_low_precision_sqrt);
>   if (!lto_stream_offload_p)
>   bp_pack_var_len_int (bp, ptr->x_flag_mlow_precision_sqrt);
>   for (size_t i = 0; i < ARRAY_SIZE (ptr->explicit_mask); i++)
>     bp_pack_value (bp, ptr->explicit_mask[i], 64);
>
> For target-specific options, streaming out is gated on the
> !lto_stream_offload_p check.
>
> The patch also fixes failures due to the same issue with x86_64->nvptx
> offloading for target-print-1.f90 (and a couple more).
> Does the patch look OK?
I think it is on the right track. One thing that is also going to be
an issue is streaming in: a target option on the offload side that is
marked as Optimization might also cause issues. We should check to
make sure that also gets fixed here. Alternatively, we could error out
during the build if an offloading target has target options marked as
Optimization.
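
To make the mismatch concrete, here is a toy sketch of the writer-side
problem and the gating fix. It is not the generated code: the stream type,
the option structs and every function name below are hypothetical
stand-ins, and only the idea of skipping target-specific fields when an
offload flag is set comes from the patch description. The mirror case
(the offload side reading a target option the host never wrote) is not
modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bitpack stream: a flat array of values.  */
struct stream { uint64_t data[16]; size_t len, pos; };

static void put (struct stream *s, uint64_t v)
{
  assert (s->len < 16);
  s->data[s->len++] = v;
}

static uint64_t get (struct stream *s)
{
  assert (s->pos < s->len);
  return s->data[s->pos++];
}

/* Hypothetical option records, not GCC's real cl_optimization.  */
struct host_opts
{
  int wrapv_pointer;       /* common option */
  int early_ldp_fusion;    /* host-only (target-specific) option */
  int early_ra;            /* host-only (target-specific) option */
  uint64_t explicit_mask;
};

struct accel_opts
{
  int wrapv_pointer;       /* common option */
  uint64_t explicit_mask;
};

/* Host writer: target-specific fields are skipped when offload_p is set.  */
static void host_stream_out (struct stream *s, const struct host_opts *o,
                             int offload_p)
{
  put (s, (uint64_t) o->wrapv_pointer);
  if (!offload_p)
    put (s, (uint64_t) o->early_ldp_fusion);
  if (!offload_p)
    put (s, (uint64_t) o->early_ra);
  put (s, o->explicit_mask);
}

/* Accelerator reader: it only knows about the common fields.  */
static void accel_stream_in (struct stream *s, struct accel_opts *o)
{
  o->wrapv_pointer = (int) get (s);
  o->explicit_mask = get (s);
}

/* Returns 1 if the accelerator recovers the host's explicit_mask.  */
int mask_survives (int offload_p)
{
  struct stream s = { { 0 }, 0, 0 };
  struct host_opts h = { 1, 1, 1, 0xABCDu };
  struct accel_opts a;

  host_stream_out (&s, &h, offload_p);
  accel_stream_in (&s, &a);
  return a.explicit_mask == h.explicit_mask;
}
```

Without gating, the reader in this sketch picks up a host-only option
value where it expects explicit_mask; with gating, both sides agree on
the layout.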

Thanks,
Andrew Pinski

>
> Signed-off-by: Prathamesh Kulkarni <prathame...@nvidia.com>
>
> Thanks,
> Prathamesh
