I can try on other systems. I'll get back to you on this. Running on the CPU with mpiaij and vec mpi works correctly on Frontier.

Thank you,
M

________________________________
From: Barry Smith <bsm...@petsc.dev>
Sent: Friday, July 19, 2024 7:58 PM
To: Vanella, Marcos (Fed) <marcos.vane...@nist.gov>
Cc: petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>; Patel, Saumil Sudhir <spa...@anl.gov>
Subject: Re: [petsc-users] compilation error with latest petsc source
It is unlikely, though of course possible, that the problem comes from the Fortran code. Is there any way to ./configure/build the code in the same way on another system that is easier to debug on? Or with fewer options on Frontier (for example, without the optimization flags and the extra -lxpmem etc.) and see if it still crashes in the same way? Frontier is very flaky.

Barry

On Jul 19, 2024, at 3:37 PM, Vanella, Marcos (Fed) <marcos.vane...@nist.gov> wrote:

Hi Barry, with the changes in place for my Fortran calls I'm now picking up the following error running PC + gamg preconditioner and mpiaijkokkos, kokkos vec:

[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 7 BUS: Bus Error, possibly illegal memory access
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and https://petsc.org/release/faq/
[0]PETSC ERROR: --------------------- Stack Frames ------------------------------------
[0]PETSC ERROR: The line numbers in the error traceback are not always exact.
[0]PETSC ERROR: #1 MPI function
[0]PETSC ERROR: #2 PetscSFLinkFinishCommunication_Default() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/vec/is/sf/impls/basic/sfmpi.c:13
[0]PETSC ERROR: #3 PetscSFLinkFinishCommunication() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/include/../src/vec/is/sf/impls/basic/sfpack.h:291
[0]PETSC ERROR: #4 PetscSFBcastEnd_Basic() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/vec/is/sf/impls/basic/sfbasic.c:373
[0]PETSC ERROR: #5 PetscSFBcastEnd() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/vec/is/sf/interface/sf.c:1540
[0]PETSC ERROR: #6 VecScatterEnd_Internal() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/vec/is/sf/interface/vscat.c:95
[0]PETSC ERROR: #7 VecScatterEnd() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/vec/is/sf/interface/vscat.c:1352
[0]PETSC ERROR: #8 MatDiagonalScale_MPIAIJ() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/mat/impls/aij/mpi/mpiaij.c:1990
[0]PETSC ERROR: #9 MatDiagonalScale() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/mat/interface/matrix.c:5691
[0]PETSC ERROR: #10 MatCreateGraph_Simple_AIJ() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/mat/impls/aij/mpi/mpiaij.c:8026
[0]PETSC ERROR: #11 MatCreateGraph() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/mat/interface/matrix.c:11426
[0]PETSC ERROR: #12 PCGAMGCreateGraph_AGG() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/pc/impls/gamg/agg.c:663
[0]PETSC ERROR: #13 PCGAMGCreateGraph() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/pc/impls/gamg/gamg.c:2041
[0]PETSC ERROR: #14 PCSetUp_GAMG() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/pc/impls/gamg/gamg.c:695
[0]PETSC ERROR: #15 PCSetUp() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/pc/interface/precon.c:1077
[0]PETSC ERROR: #16 KSPSetUp() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/ksp/interface/itfunc.c:415
[0]PETSC ERROR: #17 KSPSolve_Private() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/ksp/interface/itfunc.c:826
[0]PETSC ERROR: #18 KSPSolve() at /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/ksp/interface/itfunc.c:1073

MPICH ERROR [Rank 0] [job id 2109802.0] [Fri Jul 19 15:31:18 2024] [frontier03726] - Abort(59) (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0

I set up PETSc with the gnu compilers like this on Frontier:

./configure COPTFLAGS="-O2" CXXOPTFLAGS="-O2" FOPTFLAGS="-O2" FCOPTFLAGS="-O2" HIPOPTFLAGS="-O2 --offload-arch=gfx90a" --with-debugging=1 --with-cc=cc --with-cxx=CC --with-fc=ftn --with-hip --with-hip-arch=gfx908 --with-hipc=hipcc --LIBS="-L${MPICH_DIR}/lib -lmpi ${CRAY_XPMEM_POST_LINK_OPTS} -lxpmem ${PE_MPICH_GTL_DIR_amd_gfx90a} ${PE_MPICH_GTL_LIBS_amd_gfx90a}" --download-kokkos --download-kokkos-kernels --download-hypre --download-suitesparse --download-cmake --force

Have you guys come across this before? Thank you for your time,
Marcos

________________________________
From: Vanella, Marcos (Fed) <marcos.vane...@nist.gov>
Sent: Friday, July 19, 2024 12:54 PM
To: Barry Smith <bsm...@petsc.dev>
Cc: petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>; Patel, Saumil Sudhir <spa...@anl.gov>
Subject: Re: [petsc-users] compilation error with latest petsc source

Thank you Barry! We'll address the change accordingly.
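As a starting point for Barry's suggestion above, a stripped-down configure that drops the -O2/HIP optimization flags and the extra xpmem/GTL link options (while keeping the compiler wrappers and downloads) might look like the following. This is only a sketch of the reduced option set, not a configure line that has been verified on Frontier:

```shell
./configure --with-debugging=1 \
  --with-cc=cc --with-cxx=CC --with-fc=ftn \
  --with-hip --with-hip-arch=gfx908 --with-hipc=hipcc \
  --download-kokkos --download-kokkos-kernels \
  --download-hypre --download-suitesparse \
  --download-cmake --force
```

If the bus error disappears with this build, that would point at the optimization flags or the extra link libraries rather than the application code.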
M

________________________________
From: Barry Smith <bsm...@petsc.dev>
Sent: Friday, July 19, 2024 12:42 PM
To: Vanella, Marcos (Fed) <marcos.vane...@nist.gov>
Cc: petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>
Subject: Re: [petsc-users] compilation error with latest petsc source

We made some superficial changes to the Fortran API to better support Fortran and its error checking. See the bottom of https://petsc.org/main/changes/dev/

Basically, you have to respect Fortran's pickiness about passing the correct dimension (or lack of dimension) of arguments. In the error below, you need to pass PETSC_NULL_INTEGER_ARRAY.

On Jul 19, 2024, at 12:20 PM, Vanella, Marcos (Fed) via petsc-users <petsc-users@mcs.anl.gov> wrote:

Hi, I did an update and compiled PETSc on Frontier with the gnu compilers. When compiling my code with PETSc I see this new error pop up:

Building mpich_gnu_frontier
ftn -c -m64 -O2 -g -std=f2018 -frecursive -ffpe-summary=none -fall-intrinsics -cpp -DGITHASH_PP=\"FDS-6.9.1-894-g0b77ae0-FireX\" -DGITDATE_PP=\""Thu Jul 11 16:05:44 2024 -0400\"" -DBUILDDATE_PP=\""Jul 19, 2024 12:13:39\"" -DWITH_PETSC -I"/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/" -I"/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc2/include" -fopenmp ../../Source/pres.f90
../../Source/pres.f90:2799:65:

 2799 | CALL MATCREATESEQAIJ(PETSC_COMM_SELF,ZM%NUNKH,ZM%NUNKH,NNZ_7PT_H,PETSC_NULL_INTEGER,ZM%PETSC_MZ%A_H,PETSC_IERR)
      |                                                                 1
Error: Rank mismatch in argument 'e' at (1) (rank-1 and scalar)

It seems the use of PETSC_NULL_INTEGER is causing an issue now.
From the PETSc docs, this entry is nnz, which can be an array or NULL. Has there been any change in the API for this routine?

Thanks,
Marcos

PS: I see some other errors in calls to PETSc routines, of the same type.
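Given Barry's pointer to PETSC_NULL_INTEGER_ARRAY, the offending call would presumably become something like the following. This is only a sketch, reusing the variable names from the original pres.f90 line; it has not been compiled against the updated PETSc:

```fortran
! Sketch of the corrected call: the scalar PETSC_NULL_INTEGER is replaced
! by PETSC_NULL_INTEGER_ARRAY, which has the rank-1 shape the Fortran
! interface now expects for the nnz argument. All other arguments are
! unchanged and come from the original code in pres.f90.
CALL MatCreateSeqAIJ(PETSC_COMM_SELF, ZM%NUNKH, ZM%NUNKH, NNZ_7PT_H, &
                     PETSC_NULL_INTEGER_ARRAY, ZM%PETSC_MZ%A_H, PETSC_IERR)
```

The same substitution would apply to the other calls mentioned in the PS: wherever a NULL is passed for an argument the interface declares as an array, the corresponding PETSC_NULL_*_ARRAY constant is needed instead of the scalar one.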