Do you know if there are plans for NCCL support in PETSc?

On Tue, Apr 16, 2024, 10:41 PM Junchao Zhang <junchao.zh...@gmail.com> wrote:
> Glad to hear you found a way. Did you use Frontera at TACC? If yes, I
> could have a try.
>
> --Junchao Zhang
>
> On Tue, Apr 16, 2024 at 8:35 PM Sreeram R Venkat <srven...@utexas.edu> wrote:
>
>> I finally figured out a way to make it work. I had to build PETSc and
>> my application using the (non-GPU-aware) Intel MPI. Then, before
>> running, I switch to MVAPICH2-GDR. I'm not sure why that works, but
>> it's the only way I've found to compile and run successfully without
>> throwing any errors about not having a GPU-aware MPI.
>>
>> On Fri, Dec 8, 2023 at 5:30 PM Mark Adams <mfad...@lbl.gov> wrote:
>>
>>> You may need to set some environment variables. This can be system
>>> specific, so you might want to look at the docs or ask TACC how to
>>> run with GPU-aware MPI.
>>>
>>> Mark
>>>
>>> On Fri, Dec 8, 2023 at 5:17 PM Sreeram R Venkat <srven...@utexas.edu> wrote:
>>>
>>>> Actually, when I compile my program with this build of PETSc and
>>>> run, I still get the error:
>>>>
>>>> PETSC ERROR: PETSc is configured with GPU support, but your MPI is
>>>> not GPU-aware. For better performance, please use a GPU-aware MPI.
>>>>
>>>> I have the mvapich2-gdr module loaded and MV2_USE_CUDA=1.
>>>>
>>>> Is there anything else I need to do?
>>>>
>>>> Thanks,
>>>> Sreeram
>>>>
>>>> On Fri, Dec 8, 2023 at 3:29 PM Sreeram R Venkat <srven...@utexas.edu> wrote:
>>>>
>>>>> Thank you, changing to CUDA 11.4 fixed the issue. The mvapich2-gdr
>>>>> module didn't require CUDA 11.4 as a dependency, so I was using 12.0.
>>>>>
>>>>> On Fri, Dec 8, 2023 at 1:15 PM Satish Balay <ba...@mcs.anl.gov> wrote:
>>>>>
>>>>>> Executing: mpicc -show
>>>>>> stdout: icc -I/opt/apps/cuda/11.4/include -I/opt/apps/cuda/11.4/include
>>>>>> -lcuda -L/opt/apps/cuda/11.4/lib64/stubs -L/opt/apps/cuda/11.4/lib64
>>>>>> -lcudart -lrt -Wl,-rpath,/opt/apps/cuda/11.4/lib64
>>>>>> -Wl,-rpath,XORIGIN/placeholder -Wl,--build-id -L/opt/apps/cuda/11.4/lib64/
>>>>>> -lm -I/opt/apps/intel19/mvapich2-gdr/2.3.7/include
>>>>>> -L/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,-rpath
>>>>>> -Wl,/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,--enable-new-dtags -lmpi
>>>>>>
>>>>>> Checking for program /opt/apps/cuda/12.0/bin/nvcc...found
>>>>>>
>>>>>> It looks like you are trying to mix two different CUDA versions in
>>>>>> this build.
>>>>>>
>>>>>> Perhaps you need to use cuda-11.4 with this install of mvapich.
>>>>>>
>>>>>> Satish
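
[A minimal sketch of the consistency check Satish's diagnosis points to,
run before configuring PETSc; the module names here are assumptions and
need to be matched to the local site:

  module list                             # see which cuda module is loaded
  mpicc -show | tr ' ' '\n' | grep cuda   # CUDA paths baked into mvapich2-gdr (11.4 above)
  which nvcc && nvcc --version            # CUDA toolkit that configure will pick up (12.0 above)
  # If the two disagree, load the toolkit matching the MPI build, e.g.
  module swap cuda/12.0 cuda/11.4         # assumed module names

With both pointing at the same CUDA 11.4 installation, nvcc and the
mpic++ wrapper see a single set of CUDA headers.]
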
>>>>>> On Fri, 8 Dec 2023, Matthew Knepley wrote:
>>>>>>
>>>>>> > On Fri, Dec 8, 2023 at 1:54 PM Sreeram R Venkat <srven...@utexas.edu> wrote:
>>>>>> >
>>>>>> > > I am trying to build PETSc with CUDA using the CUDA-aware
>>>>>> > > MVAPICH2-GDR.
>>>>>> > >
>>>>>> > > Here is my configure command:
>>>>>> > >
>>>>>> > > ./configure PETSC_ARCH=linux-c-debug-mvapich2-gdr --download-hypre
>>>>>> > > --with-cuda=true --cuda-dir=$TACC_CUDA_DIR --with-hdf5=true
>>>>>> > > --with-hdf5-dir=$TACC_PHDF5_DIR --download-elemental --download-metis
>>>>>> > > --download-parmetis --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90
>>>>>> > >
>>>>>> > > which errors with:
>>>>>> > >
>>>>>> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details):
>>>>>> > > ---------------------------------------------------------------------------------------------
>>>>>> > > CUDA compile failed with arch flags "-ccbin mpic++ -std=c++14
>>>>>> > > -Xcompiler -fPIC -Xcompiler -fvisibility=hidden -g -lineinfo
>>>>>> > > -gencode arch=compute_80,code=sm_80"
>>>>>> > > generated from "--with-cuda-arch=80"
>>>>>> > >
>>>>>> > > The same configure command works when I use the Intel MPI, and I can
>>>>>> > > build with CUDA. The full config.log file is attached. Please let me
>>>>>> > > know if you need any other information. I appreciate your help with
>>>>>> > > this.
>>>>>> >
>>>>>> > The proximate error is
>>>>>> >
>>>>>> > Executing: nvcc -c -o /tmp/petsc-kn3f29gl/config.packages.cuda/conftest.o
>>>>>> > -I/tmp/petsc-kn3f29gl/config.setCompilers -I/tmp/petsc-kn3f29gl/config.types
>>>>>> > -I/tmp/petsc-kn3f29gl/config.packages.cuda -ccbin mpic++ -std=c++14
>>>>>> > -Xcompiler -fPIC -Xcompiler -fvisibility=hidden -g -lineinfo -gencode
>>>>>> > arch=compute_80,code=sm_80 /tmp/petsc-kn3f29gl/config.packages.cuda/conftest.cu
>>>>>> > stdout:
>>>>>> > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one
>>>>>> > instance of overloaded function "__nv_associate_access_property_impl"
>>>>>> > has "C" linkage
>>>>>> > 1 error detected in the compilation of
>>>>>> > "/tmp/petsc-kn3f29gl/config.packages.cuda/conftest.cu".
>>>>>> > Possible ERROR while running compiler: exit code 1
>>>>>> > stderr:
>>>>>> > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one
>>>>>> > instance of overloaded function "__nv_associate_access_property_impl"
>>>>>> > has "C" linkage
>>>>>> >
>>>>>> > 1 error detected in the compilation of
>>>>>> > "/tmp/petsc-kn3f29gl/config.packages.cuda/conftest.cu".
>>>>>> >
>>>>>> > This looks like screwed-up headers to me, but I will let someone who
>>>>>> > understands CUDA compilation reply.
>>>>>> >
>>>>>> > Thanks,
>>>>>> >
>>>>>> > Matt
>>>>>> >
>>>>>> > > Thanks,
>>>>>> > > Sreeram
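
[A rough sketch of the build-then-swap workaround Sreeram describes at the
top of the thread; the module names, application name, and launch command
are assumptions to adapt to the system, and the configure line is trimmed
to the CUDA-related options:

  # Build PETSc and the application against the non-GPU-aware Intel MPI
  # (external-package options from the original configure line omitted here).
  module load intel impi cuda/11.4        # assumed module names
  ./configure PETSC_ARCH=linux-c-debug-mvapich2-gdr --with-cuda=true \
      --cuda-dir=$TACC_CUDA_DIR --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90
  make PETSC_ARCH=linux-c-debug-mvapich2-gdr all

  # Only at run time, switch to the GPU-aware MPI and enable its CUDA path.
  module swap impi mvapich2-gdr/2.3.7     # assumed module names
  export MV2_USE_CUDA=1                   # from the thread above
  ibrun ./my_app                          # TACC launcher; adjust to the local system

As noted in the thread, building against the non-GPU-aware MPI and then
launching under MVAPICH2-GDR worked, although it is not clear why.]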