On Tue, Jun 30, 2020 at 7:59 AM Andrea Iob <andrea_...@hotmail.com> wrote:
> Hi,
>
> I'm trying to solve a linear system using HYPRE/BoomerAMG as a
> preconditioner. The system comes from a two-dimensional adjoint
> Navier-Stokes problem.

Do you expect the system you are solving to be elliptic? BoomerAMG is
designed for elliptic systems, and can easily fail if applied to a more
general system, say one with zeros on the diagonal.
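A quick way to test for zeros on the diagonal is to inspect the diagonal of
the assembled matrix before the solve. The following is a minimal sketch
using standard PETSc calls; CheckDiagonal is a hypothetical helper name, and
A is assumed to be your already-assembled system matrix:

#include <petscmat.h>

/* Hypothetical helper: report the smallest |A_ii| of the assembled
   system matrix, to test for (near-)zero diagonal entries. */
PetscErrorCode CheckDiagonal(Mat A)
{
  Vec            d;
  PetscReal      dmin;
  PetscInt       row;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatCreateVecs(A, &d, NULL);CHKERRQ(ierr);
  ierr = MatGetDiagonal(A, d);CHKERRQ(ierr);    /* d_i = A_ii          */
  ierr = VecAbs(d);CHKERRQ(ierr);               /* d_i = |A_ii|        */
  ierr = VecMin(d, &row, &dmin);CHKERRQ(ierr);  /* smallest magnitude and its row */
  ierr = PetscPrintf(PETSC_COMM_WORLD, "min |A_ii| = %g at row %D\n",
                     (double)dmin, row);CHKERRQ(ierr);
  ierr = VecDestroy(&d);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

If the reported minimum is zero, or tiny relative to the typical diagonal
entry, any smoother that divides by the diagonal (Gauss-Seidel, for
instance) can immediately produce the kind of NaN/Inf failure shown in the
log below.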
  Thanks,

     Matt

> The mesh is structured (6400 cells) and there are 24 DoFs for each cell
> (the matrix has a total of 153600 rows). The condition number of the
> matrix should be on the order of 1e9. Using an ILU preconditioner and
> FGMRES, the system is correctly solved (it takes 800 iterations to reach
> a residual of 1e-12). However, replacing ILU with HYPRE/BoomerAMG, the
> solver stops right after the first iteration (DIVERGED_NANORINF). I've
> tried different BoomerAMG options and different solvers (including
> Richardson, to use BoomerAMG without a Krylov method), but I always get
> the same DIVERGED_NANORINF error.
>
> Using the same HYPRE/BoomerAMG + FGMRES setup on another problem (a
> Laplacian on a highly stretched grid), the solver reaches convergence
> very quickly without any problems.
>
> At the bottom of this mail, you'll find the log of a run that stops with
> DIVERGED_NANORINF. Do you spot something in my setup that may explain
> why the solver is not converging?
>
> Thanks. Best regards,
> Andrea
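For reference, a comparison like the one described above can be reproduced
with a small stand-alone driver that loads the matrix and right-hand side
from a PETSc binary dump and takes the entire solver configuration from the
command line, e.g. -ksp_type fgmres -pc_type ilu versus -ksp_type fgmres
-pc_type hypre -pc_hypre_type boomeramg. The sketch below is not the
bitpit_system_solver from the quoted log, and the file name "system.bin" is
a placeholder:

#include <petscksp.h>

/* Minimal driver sketch: load A and b from a PETSc binary file
   ("system.bin" is a placeholder) and solve, with the KSP and PC
   fully configured from the command line via -ksp_* / -pc_* options. */
int main(int argc, char **argv)
{
  Mat            A;
  Vec            b, x;
  KSP            ksp;
  PetscViewer    viewer;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;

  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "system.bin", FILE_MODE_READ, &viewer);CHKERRQ(ierr);
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatLoad(A, viewer);CHKERRQ(ierr);      /* matrix stored first in the file */
  ierr = VecCreate(PETSC_COMM_WORLD, &b);CHKERRQ(ierr);
  ierr = VecLoad(b, viewer);CHKERRQ(ierr);      /* right-hand side stored second */
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  ierr = VecDuplicate(b, &x);CHKERRQ(ierr);

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);  /* picks up -ksp_* and -pc_* options */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}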
> -----------------------------------------
> Assembly started...
> Reading matrix file...
>  - Number of rows (N)               = 153600
>  - Number of non-zero entries (NNZ) = 18247680
> Assembly completed.
> Time elapsed 83.3231s
> Set up started...
> Set up completed.
> Time elapsed 6.62441s
> Solution started...
>   0 KSP unpreconditioned resid norm 7.736650641501e-01 true resid norm 7.736650641501e-01 ||r(i)||/||b|| 1.000000000000e+00
>   0 KSP Residual norm 7.736650641501e-01 % max 1.000000000000e+00 min 1.000000000000e+00 max/min 1.000000000000e+00
> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [0]PETSC ERROR:
> [0]PETSC ERROR: KSPSolve has not converged due to Nan or Inf inner product
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.10.3, unknown
> [0]PETSC ERROR: bitpit_system_solver on a named xxx by xxx Tue Jun 30 09:42:09 2020
> [0]PETSC ERROR: Configure options PETSC_ARCH=arch-linux2-c-opt-gcc7-hypre --with-blaslapack-dir=/opt/lapack/3.8.0-gcc.7.4.0/lib64 --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-valgrind=1 --with-valgrind-dir=/opt/valgrind/3.14.0/ --prefix=/opt/petsc/3.10.3_gcc-7.4.0 --with-mpi=1 --download-hypre
> [0]PETSC ERROR: #1 KSPGMRESClassicalGramSchmidtOrthogonalization() line 67 in /root/InstallSources/petsc/src/ksp/ksp/impls/gmres/borthog2.c
> [0]PETSC ERROR: #2 KSPFGMRESCycle() line 175 in /root/InstallSources/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
> [0]PETSC ERROR: #3 KSPSolve_FGMRES() line 291 in /root/InstallSources/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
> [0]PETSC ERROR: #4 KSPSolve() line 780 in /root/InstallSources/petsc/src/ksp/ksp/interface/itfunc.c
> KSP Object: 1 MPI processes
>   type: fgmres
>     restart=45, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>     happy breakdown tolerance 1e-30
>   maximum iterations=10000, nonzero initial guess
>   tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
>   right preconditioning
>   using UNPRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
>   type: hypre
>     HYPRE BoomerAMG preconditioning
>       Cycle type V
>       Maximum number of levels 25
>       Maximum number of iterations PER hypre call 1
>       Convergence tolerance PER hypre call 0.
>       Threshold for strong coupling 0.7
>       Interpolation truncation factor 0.3
>       Interpolation: max elements per row 2
>       Number of levels of aggressive coarsening 4
>       Number of paths for aggressive coarsening 5
>       Maximum row sums 0.9
>       Sweeps down         1
>       Sweeps up           1
>       Sweeps on coarse    1
>       Relax down          sequential-Gauss-Seidel
>       Relax up            sequential-Gauss-Seidel
>       Relax on coarse     Gaussian-elimination
>       Relax weight  (all)      1.
>       Outer relax weight (all) 1.
>       Using CF-relaxation
>       Smooth type          Euclid
>       Smooth num levels    25
>       Euclid ILU(k) levels 0
>       Euclid ILU(k) drop tolerance 0.
>       Euclid ILU use Block-Jacobi? 1
>       Measure type        local
>       Coarsen type        HMIS
>       Interpolation type  ext+i
>   linear system matrix = precond matrix:
>   Mat Object: Mat_0x1e88040_0 1 MPI processes
>     type: seqaij
>     rows=153600, cols=153600, bs=24
>     total: nonzeros=18247680, allocated nonzeros=18247680
>     total number of mallocs used during MatSetValues calls =0
>       using I-node routines: found 32000 nodes, limit used is 5
> Solution completed.
> Time elapsed 6.93336s
> Solution information...
>   Error (L2)   : 32.8359
>   Error (Linf) : 1.93278
>
> ************************************************************************************************************************
> ***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
> ************************************************************************************************************************
>
> ---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
>
> bitpit_system_solver on a named xxx with 1 processor, by xxx Tue Jun 30 09:43:46 2020
> Using Petsc Release Version 3.10.3, unknown
>
>                          Max       Max/Min     Avg       Total
> Time (sec):           9.718e+01      1.000   9.718e+01
> Objects:              5.300e+01      1.000   5.300e+01
> Flop:                 1.107e+08      1.000   1.107e+08  1.107e+08
> Flop/sec:             1.139e+06      1.000   1.139e+06  1.139e+06
> MPI Messages:         0.000e+00      0.000   0.000e+00  0.000e+00
> MPI Message Lengths:  0.000e+00      0.000   0.000e+00  0.000e+00
> MPI Reductions:       0.000e+00      0.000
>
> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
>                             e.g., VecAXPY() for real vectors of length N --> 2N flop
>                             and VecAXPY() for complex vectors of length N --> 8N flop
>
> Summary of Stages:   ----- Time ------  ----- Flop ------  --- Messages ---  -- Message Lengths --  -- Reductions --
>                         Avg     %Total     Avg     %Total    Count   %Total     Avg         %Total    Count   %Total
>  0:      Main Stage: 9.7178e+01 100.0%  1.1071e+08 100.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
>
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on interpreting output.
> Phase summary info:
>    Count: number of times phase was executed
>    Time and Flop: Max - maximum over all processors
>                   Ratio - ratio of maximum to minimum over all processors
>    Mess: number of messages sent
>    AvgLen: average message length (bytes)
>    Reduct: number of global reductions
>    Global: entire computation
>    Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
>       %T - percent time in this phase         %F - percent flop in this phase
>       %M - percent messages in this phase     %L - percent message lengths in this phase
>       %R - percent reductions in this phase
>    Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
> ------------------------------------------------------------------------------------------------------------------------
> Event                Count      Time (sec)     Flop                              --- Global ---  --- Stage ----  Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   AvgLen  Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
> BuildTwoSidedF         2 1.0 1.8471e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatMult                3 1.0 6.8688e-02 1.0 1.09e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0 98  0  0  0   0 98  0  0  0  1587
> MatConvert             2 1.0 3.8534e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyBegin       3 1.0 2.8112e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyEnd         3 1.0 7.8563e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetRowIJ            2 1.0 3.6100e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatView                2 1.0 5.4618e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 56  0  0  0  0  56  0  0  0  0     0
> VecView                2 1.0 7.6981e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> VecMDot                1 1.0 1.7635e-04 1.0 3.07e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1742
> VecNorm                3 1.0 5.4832e-04 1.0 9.22e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  1681
> VecScale               1 1.0 1.5216e-04 1.0 1.54e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1009
> VecCopy                1 1.0 1.5406e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecSet                21 1.0 7.1106e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAYPX                1 1.0 1.9940e-04 1.0 1.54e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   770
> VecWAXPY               1 1.0 7.8802e-04 1.0 1.54e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   195
> KSPSetUp               2 1.0 6.8995e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> PCSetUp                2 1.0 1.3190e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 14  0  0  0  0  14  0  0  0  0     0
> PCApply                1 1.0 2.2728e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> ------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
> Object Type          Creations   Destructions     Memory  Descendants' Mem.
> Reports information only for process 0.
>
> --- Event Stage 0: Main Stage
>
>               Matrix     3              1         2872     0.
>               Vector    32             20     17234800     0.
>            Index Set     4              4         3200     0.
>    IS L to G Mapping     2              0            0     0.
>          Vec Scatter     2              0            0     0.
>               Viewer     6              4         3392     0.
>        Krylov Solver     2              1        61676     0.
>       Preconditioner     2              1         1432     0.
> ========================================================================================================================
> Average time to get PetscTime(): 3.36e-08
> #PETSc Option Table entries:
> -ksp_converged_reason
> -ksp_error_if_not_converged
> -ksp_monitor_singular_value
> -ksp_monitor_true_residual
> -log_view
> -pc_hypre_boomeramg_agg_nl 4
> -pc_hypre_boomeramg_agg_num_paths 5
> -pc_hypre_boomeramg_coarsen_type HMIS
> -pc_hypre_boomeramg_eu_bj true
> -pc_hypre_boomeramg_interp_type ext+i
> -pc_hypre_boomeramg_P_max 2
> -pc_hypre_boomeramg_relax_type_all sequential-Gauss-Seidel
> -pc_hypre_boomeramg_smooth_type Euclid
> -pc_hypre_boomeramg_strong_threshold 0.7
> -pc_hypre_boomeramg_truncfactor 0.3
> #End of PETSc Option Table entries
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> Configure options: PETSC_ARCH=arch-linux2-c-opt-gcc7-hypre --with-blaslapack-dir=/opt/lapack/3.8.0-gcc.7.4.0/lib64 --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-valgrind=1 --with-valgrind-dir=/opt/valgrind/3.14.0/ --prefix=/opt/petsc/3.10.3_gcc-7.4.0 --with-mpi=1 --download-hypre
> -----------------------------------------
> Libraries compiled on 2020-06-26 12:35:46 on xxx
> Machine characteristics: Linux-3.10.0-1127.8.2.el7.x86_64-x86_64-with-centos-7.6.1810-Core
> Using PETSc directory: /opt/petsc/3.10.3_gcc-7.4.0
> Using PETSc arch:
> -----------------------------------------
>
> Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -O3
> Using Fortran compiler: mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -O3
> -----------------------------------------
>
> Using include paths: -I/opt/petsc/3.10.3_gcc-7.4.0/include -I/opt/valgrind/3.14.0/include
> -----------------------------------------
>
> Using C linker: mpicc
> Using Fortran linker: mpif90
> Using libraries: -Wl,-rpath,/opt/petsc/3.10.3_gcc-7.4.0/lib -L/opt/petsc/3.10.3_gcc-7.4.0/lib -lpetsc -Wl,-rpath,/opt/petsc/3.10.3_gcc-7.4.0/lib -L/opt/petsc/3.10.3_gcc-7.4.0/lib -Wl,-rpath,/opt/lapack/3.8.0-gcc.7.4.0/lib64 -L/opt/lapack/3.8.0-gcc.7.4.0/lib64 -Wl,-rpath,/opt/openmpi4.0.0-gcc7.4.0_slurm/lib -L/opt/openmpi4.0.0-gcc7.4.0_slurm/lib -Wl,-rpath,/opt/gnu/7.4/lib/gcc/x86_64-pc-linux-gnu/7.4.0 -L/opt/gnu/7.4/lib/gcc/x86_64-pc-linux-gnu/7.4.0 -Wl,-rpath,/opt/gnu/7.4/lib/gcc -L/opt/gnu/7.4/lib/gcc -Wl,-rpath,/opt/gnu/7.4/lib64 -L/opt/gnu/7.4/lib64 -Wl,-rpath,/opt/gnu/7.4/lib -L/opt/gnu/7.4/lib -lHYPRE -llapack -lblas -lm -lX11 -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++ -ldl
> -----------------------------------------

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
  -- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/