https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108895

            Bug ID: 108895
           Summary: [13.0.1 (exp)] Fortran + gfx90a !$acc update device
                    produces a segfault.
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hberre3 at gatech dot edu
                CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Created attachment 54510
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54510&action=edit
The minimal reproducible sample (fortran + openacc).

gfortran configuration:

> [hberre3@96:instinct]:gcc-acc-test $ gfortran -v
> Using built-in specs.
> COLLECT_GCC=gfortran
> COLLECT_LTO_WRAPPER=/nethome/hberre3/gcc-acc/libexec/gcc/x86_64-pc-linux-gnu/13.0.1/lto-wrapper
> OFFLOAD_TARGET_NAMES=amdgcn-amdhsa
> Target: x86_64-pc-linux-gnu
> Configured with: 
> /nethome/hberre3/temp-gcc-acc-work/build-gcc-amdgpu//gcc/configure 
> --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu 
> --target=x86_64-pc-linux-gnu 
> --enable-offload-targets=amdgcn-amdhsa=/nethome/hberre3/gcc-acc//amdgcn-amdhsa
>  --enable-languages=c,c++,fortran,lto --disable-multilib 
> --prefix=/nethome/hberre3/gcc-acc/
> Thread model: posix
> Supported LTO compression algorithms: zlib zstd
> gcc version 13.0.1 20230219 (experimental) (GCC)

Compiled on:

> GNU:  0263e9d5d84b4abbb53e73fbc8d72fd233764fc8 (master)
> LLVM: llvmorg-13.0.1 (GitHub release)

Minimal reproducible (also found attached):

> ! Henry Le Berre <hber...@gatech.edu> 
> 
> program p_main
> 
>     real(kind(0d0)), allocatable, dimension(:) :: arrs
>     !$acc declare create(arrs)
> 
>     allocate(arrs(1000))
>     !$acc enter data create(arrs(1000))
>     !$acc update device(arrs(1:1000))
> 
> end program

Compiled with:

> gfortran -g -fopenacc -foffload-options=-march=gfx90a sample.f90 -o sample

Produces:

> [hberre3@102:instinct]:gcc-acc-test $ ./sample
> 
> Program received signal SIGSEGV: Segmentation fault - invalid memory 
> reference.
>
> Backtrace for this error:
> #0  0x7fd01c643b1f in ???
> #1  0x7fd01c6c4ee9 in ???
> #2  0x7fd01bdf6007 in ???
> #3  0x7fd01bdd921f in ???
> #4  0x7fd01c1e5088 in hsa_memory_copy_wrapper
>       at 
> /nethome/hberre3/temp-gcc-acc-work/build-gcc-amdgpu//gcc/libgomp/plugin/plugin-gcn.c:2958
> #5  0x7fd01c1eb1eb in GOMP_OFFLOAD_host2dev
>       at 
> /nethome/hberre3/temp-gcc-acc-work/build-gcc-amdgpu//gcc/libgomp/plugin/plugin-gcn.c:3796
> #6  0x7fd01ce25cba in gomp_device_copy
>       at 
> /nethome/hberre3/temp-gcc-acc-work/build-gcc-amdgpu//gcc/libgomp/target.c:234
> #7  0x7fd01ce25cba in gomp_copy_host2dev
>       at 
> /nethome/hberre3/temp-gcc-acc-work/build-gcc-amdgpu//gcc/libgomp/target.c:433
> #8  0x7fd01ce35596 in update_dev_host
>       at 
> /nethome/hberre3/temp-gcc-acc-work/build-gcc-amdgpu//gcc/libgomp/oacc-mem.c:877
> #9  0x7fd01ce33142 in GOACC_update
>       at 
> /nethome/hberre3/temp-gcc-acc-work/build-gcc-amdgpu//gcc/libgomp/oacc-parallel.c:678
> #10  0x400cad in p_main
>       at /nethome/hberre3/gcc-acc-test/sample.f90:10
> #11  0x400ced in main
>       at /nethome/hberre3/gcc-acc-test/sample.f90:3
> Segmentation fault (core dumped)

Observations:

1) If the length/size of the array were smaller (say 10 or 100) no segmentation
fault is observed, possibly indicating silent R/W operations to memory we don't
own.

2) On ORNL Summit's GCC 8.3.1 (nvptx), this sample does not produce a segfault.
It was configured with:

> [henrylb@login4.summit ~]$ gcc -v
> Using built-in specs.
> COLLECT_GCC=gcc
> COLLECT_LTO_WRAPPER=/usr/libexec/gcc/ppc64le-redhat-linux/8/lto-wrapper
> OFFLOAD_TARGET_NAMES=nvptx-none
> OFFLOAD_TARGET_DEFAULT=1
> Target: ppc64le-redhat-linux
> Configured with: ../configure --enable-bootstrap 
> --enable-languages=c,c++,fortran,lto --prefix=/usr --mandir=/usr/share/man 
> --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla 
> --enable-shared --enable-threads=posix --enable-checking=release 
> --enable-targets=powerpcle-linux --disable-multilib --with-system-zlib 
> --enable-__cxa_atexit --disable-libunwind-exceptions 
> --enable-gnu-unique-object --enable-linker-build-id 
> --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin 
> --enable-initfini-array --with-isl --disable-libmpx 
> --enable-offload-targets=nvptx-none --without-cuda-driver 
> --enable-gnu-indirect-function --enable-secureplt --with-long-double-128 
> --with-cpu-32=power8 --with-tune-32=power8 --with-cpu-64=power8 
> --with-tune-64=power8 --build=ppc64le-redhat-linux
> Thread model: posix
> gcc version 8.3.1 20191121 (Red Hat 8.3.1-5) (GCC)

3) If I translate this sample to C, no matter how large the array is, a
segfault is not produced. Please excuse me if this C/OpenACC sample is invalid
as I only use hip/Cuda when writing offloaded code in C/C++. This might
indicate it is not an issue with libomp but I am not sure.

> #include <stdlib.h>
> 
> double* arrs;
> #pragma acc declare create(arrs)
> 
> int main() {
>     arrs = malloc(sizeof(double)*100000);
> #pragma acc enter data create(arrs[1:100000])
> #pragma acc update device(arrs[1:100000])
> }

System: The gfx90a system I used for testing has AMD MI 210 GPUs and the nvptx
ones have NVIDIA A100s/V100s.

Please let me know if there is anything more I can provide you with. I thank
you in advance for your help!

Reply via email to