It is unlikely, though of course possible, that the problem comes from the 
Fortran code. Frontier is very flaky. Is there any way to ./configure and build 
the code in the same way on another system that is easier to debug on? Or with 
fewer options on Frontier (for example, without the optimization flags and the 
extra -lxpmem etc.) and see if it still crashes in the same way?
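   Something along these lines would be a reasonable stripped-down build to 
try (an untested sketch, keeping your compilers and --download packages but 
dropping the optimization flags and the extra LIBS):

./configure --with-debugging=1 --with-cc=cc --with-cxx=CC --with-fc=ftn \
  --with-hip --with-hipc=hipcc \
  --download-kokkos --download-kokkos-kernels --download-hypre \
  --download-suitesparse --download-cmake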

   Barry


> On Jul 19, 2024, at 3:37 PM, Vanella, Marcos (Fed) <marcos.vane...@nist.gov> 
> wrote:
> 
> Hi Barry, with the changes in place for my Fortran calls, I'm now picking up 
> the following error when running the gamg preconditioner with mpiaijkokkos 
> matrices and kokkos vectors:
> 
> [0]PETSC ERROR: 
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 7 BUS: Bus Error, possibly illegal 
> memory access
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and 
> https://petsc.org/release/faq/
> [0]PETSC ERROR: ---------------------  Stack Frames 
> ------------------------------------
> [0]PETSC ERROR: The line numbers in the error traceback are not always exact.
> [0]PETSC ERROR: #1 MPI function
> [0]PETSC ERROR: #2 PetscSFLinkFinishCommunication_Default() at 
> /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/vec/is/sf/impls/basic/sfmpi.c:13
> [0]PETSC ERROR: #3 PetscSFLinkFinishCommunication() at 
> /autofs/nccs-svm1_home1/vanellam/Software/petsc/include/../src/vec/is/sf/impls/basic/sfpack.h:291
> [0]PETSC ERROR: #4 PetscSFBcastEnd_Basic() at 
> /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/vec/is/sf/impls/basic/sfbasic.c:373
> [0]PETSC ERROR: #5 PetscSFBcastEnd() at 
> /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/vec/is/sf/interface/sf.c:1540
> [0]PETSC ERROR: #6 VecScatterEnd_Internal() at 
> /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/vec/is/sf/interface/vscat.c:95
> [0]PETSC ERROR: #7 VecScatterEnd() at 
> /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/vec/is/sf/interface/vscat.c:1352
> [0]PETSC ERROR: #8 MatDiagonalScale_MPIAIJ() at 
> /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/mat/impls/aij/mpi/mpiaij.c:1990
> [0]PETSC ERROR: #9 MatDiagonalScale() at 
> /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/mat/interface/matrix.c:5691
> [0]PETSC ERROR: #10 MatCreateGraph_Simple_AIJ() at 
> /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/mat/impls/aij/mpi/mpiaij.c:8026
> [0]PETSC ERROR: #11 MatCreateGraph() at 
> /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/mat/interface/matrix.c:11426
> [0]PETSC ERROR: #12 PCGAMGCreateGraph_AGG() at 
> /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/pc/impls/gamg/agg.c:663
> [0]PETSC ERROR: #13 PCGAMGCreateGraph() at 
> /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/pc/impls/gamg/gamg.c:2041
> [0]PETSC ERROR: #14 PCSetUp_GAMG() at 
> /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/pc/impls/gamg/gamg.c:695
> [0]PETSC ERROR: #15 PCSetUp() at 
> /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/pc/interface/precon.c:1077
> [0]PETSC ERROR: #16 KSPSetUp() at 
> /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/ksp/interface/itfunc.c:415
> [0]PETSC ERROR: #17 KSPSolve_Private() at 
> /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/ksp/interface/itfunc.c:826
> [0]PETSC ERROR: #18 KSPSolve() at 
> /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/ksp/ksp/interface/itfunc.c:1073
> MPICH ERROR [Rank 0] [job id 2109802.0] [Fri Jul 19 15:31:18 2024] 
> [frontier03726] - Abort(59) (rank 0 in comm 0): application called 
> MPI_Abort(MPI_COMM_WORLD, 59) - process 0
> 
> I set up PETSc with the GNU compilers like this on Frontier:
> ./configure COPTFLAGS="-O2" CXXOPTFLAGS="-O2" FOPTFLAGS="-O2" 
> FCOPTFLAGS="-O2" HIPOPTFLAGS="-O2 --offload-arch=gfx90a" --with-debugging=1 
> --with-cc=cc --with-cxx=CC --with-fc=ftn --with-hip --with-hip-arch=gfx908 
> --with-hipc=hipcc   --LIBS="-L${MPICH_DIR}/lib -lmpi 
> ${CRAY_XPMEM_POST_LINK_OPTS} -lxpmem ${PE_MPICH_GTL_DIR_amd_gfx90a} 
> ${PE_MPICH_GTL_LIBS_amd_gfx90a}" --download-kokkos --download-kokkos-kernels 
> --download-hypre --download-suitesparse --download-cmake --force
>  
> Have you guys come across this before? Thank you for your time,
> Marcos
> 
> From: Vanella, Marcos (Fed) <marcos.vane...@nist.gov>
> Sent: Friday, July 19, 2024 12:54 PM
> To: Barry Smith <bsm...@petsc.dev>
> Cc: petsc-users@mcs.anl.gov; Patel, Saumil Sudhir <spa...@anl.gov>
> Subject: Re: [petsc-users] compilation error with latest petsc source
>  
> Thank you, Barry! We'll make the changes accordingly.
> M
> From: Barry Smith <bsm...@petsc.dev>
> Sent: Friday, July 19, 2024 12:42 PM
> To: Vanella, Marcos (Fed) <marcos.vane...@nist.gov>
> Cc: petsc-users@mcs.anl.gov
> Subject: Re: [petsc-users] compilation error with latest petsc source
>  
> 
>    We made some superficial changes to the Fortran API to better support 
> Fortran and its error checking. See the bottom of 
> https://petsc.org/main/changes/dev/
>    Basically, you have to respect Fortran's pickiness about passing the 
> correct dimension (or lack of dimension) of arguments. In the error below, 
> you need to pass PETSC_NULL_INTEGER_ARRAY instead of PETSC_NULL_INTEGER.
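> 
>    For example, the call from your build log would become (a sketch using 
> the identifiers from your source; only the nnz argument changes):
> 
>       CALL MATCREATESEQAIJ(PETSC_COMM_SELF,ZM%NUNKH,ZM%NUNKH,NNZ_7PT_H, &
>                            PETSC_NULL_INTEGER_ARRAY,ZM%PETSC_MZ%A_H,PETSC_IERR)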
> 
>> On Jul 19, 2024, at 12:20 PM, Vanella, Marcos (Fed) via petsc-users 
>> <petsc-users@mcs.anl.gov> wrote:
>> 
>> Hi, I did an update and compiled PETSc on Frontier with the GNU compilers. 
>> When compiling my code with PETSc I see this new error pop up:
>> 
>> Building mpich_gnu_frontier
>> ftn -c -m64 -O2 -g  -std=f2018 -frecursive -ffpe-summary=none 
>> -fall-intrinsics -cpp -DGITHASH_PP=\"FDS-6.9.1-894-g0b77ae0-FireX\" 
>> -DGITDATE_PP=\""Thu Jul 11 16:05:44 2024 -0400\"" -DBUILDDATE_PP=\""Jul 19, 
>> 2024  12:13:39\""   -DWITH_PETSC 
>> -I"/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/" 
>> -I"/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc2/include"
>>   -fopenmp ../../Source/pres.f90
>> ../../Source/pres.f90:2799:65:
>> 
>>  2799 | CALL 
>> MATCREATESEQAIJ(PETSC_COMM_SELF,ZM%NUNKH,ZM%NUNKH,NNZ_7PT_H,PETSC_NULL_INTEGER,ZM%PETSC_MZ%A_H,PETSC_IERR)
>>       |                                                                 1
>> Error: Rank mismatch in argument ‘e’ at (1) (rank-1 and scalar)
>> 
>> It seems the use of PETSC_NULL_INTEGER is causing an issue now. From the 
>> PETSc docs this entry is nnz, which can be an array or NULL. Has there been 
>> any change to the API for this routine?
>> 
>> Thanks,
>> Marcos
>> 
>> PS: I see some other errors of the same type in calls to PETSc routines.
