Use Valgrind. It will show the exact lines where memory is allocated but never 
freed. My guess is that some memory allocated within MUMPS is not being 
properly freed.
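
A minimal invocation sketch (the binary name ./myapp is a placeholder for your 
executable; running a single MPI rank keeps the log simple, and --log-file with 
%p gives one log per process if you use more ranks):

```shell
# Run one rank under Valgrind's Memcheck; --leak-check=full reports the
# allocation stack of every block that is never freed.
mpiexec -n 1 valgrind --leak-check=full --show-leak-kinds=definite \
    --num-callers=20 --log-file=valgrind.%p.log \
    ./myapp -ksp_type preonly -pc_type lu -pc_factor_mat_solver_type mumps
```

Expect it to run much slower than normal; look for "definitely lost" records 
whose stacks pass through MUMPS routines.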

   Barry


> On Sep 5, 2024, at 1:33 PM, Corbijn van Willenswaard, Lars (UT) via 
> petsc-users <petsc-users@mcs.anl.gov> wrote:
> 
> Dear PETSc,
> 
> For the last months I’ve struggled with a solver that I wrote for a FEM 
> eigenvalue problem running out of memory. I’ve traced it to KSPSolve + MUMPS 
> being the issue, but I'm getting stuck on digging deeper.
> 
> The reason I suspect the KSPSolve/MUMPS is that when commenting out the 
> KSPSolve the memory stays constant while running the rest of the algorithm. 
> Of course, the algorithm also converges to a different result in this setup. 
> When changing the KSP statement to 
> for(int i = 0; i < 100000000; i++) KSPSolve(A_, vec1_, vec2_);
> the memory grows faster than when running the algorithm. Logging shows that 
> the program never reaches the terminating i=100M. Measuring the memory growth using 
> ps (started debugging before I knew of PETSc's features) I see a growth in 
> the RSS on a single compute node of up to 300MB/min for this artificial case. 
> Real cases grow more like 60MB/min/node, which causes a kill due to memory 
> exhaustion after about 2-3 days.
> 
> Locally (Mac) I've been able to reproduce this both with 6 MPI processes and 
> with a single one. Instrumenting the code to show differences in 
> PetscMemoryGetCurrentUsage (full code below), shows that the memory increases 
> every step at the start, but also does at later iterations (small excerpt 
> from the output):
> rank step        memory (increase since prev step)
> 0   6544 current 39469056(  8192)
> 0   7086 current 39477248(  8192)
> 0   7735 current 39497728( 20480)
> 0   9029 current 39501824(  4096)
> A similar output is visible in a run with 6 ranks, where there does not seem 
> to be a pattern as to which of the ranks increases at which step. (Note I've 
> checked PetscMallocGetCurrentUsage, but that is constant)
> 
> Switching to PETSc's own solver on a single rank does not show a memory 
> increase after the first solve. Changing the solve to overwrite the input 
> vector, i.e. VecCopy(vec2_, vec1_); KSPSolve(A_, vec1_, vec1_);, results in 
> a few increases after the first solve, but these do not seem to repeat.
> 
> Does anyone have an idea on how to further dig into this problem?
> 
> Kind regards,
> Lars Corbijn
> 
> 
> Instrumentation:
> 
> #include <iomanip>
> #include <iostream>
> 
> PetscLogDouble lastCurrent, current;
> int rank;
> MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
> for(int i = 0; i < 100000000; ++i) {
>    PetscMemoryGetCurrentUsage(&lastCurrent);
>    KSPSolve(A_, vec1_, vec2_);
>    PetscMemoryGetCurrentUsage(&current);
>    if(current != lastCurrent) {
>        std::cout << std::setw(2) << rank << " " << std::setw(6) << i
>                  << " current " << std::setw(8) << (int) current << std::right
>                  << "(" << std::setw(6) << (int)(current - lastCurrent) << ")"
>                  << std::endl;
>    }
>    lastCurrent = current;
> }
> 
> 
> Matrix details
> The matrix A_ in question is created from a complex-valued matrix C_ (type 
> MATAIJ) using the following code (modulo renames). Theoretically it should 
> be a Laplacian with phase-shift periodic boundary conditions:
> MatHermitianTranspose(C_, MAT_INITIAL_MATRIX, &Y_);
> MatProductCreate(C_, Y_, NULL, & A_);
> MatProductSetType(A_, MATPRODUCT_AB);
> MatProductSetFromOptions(A_);
> MatProductSymbolic(A_);
> MatProductNumeric(A_);
> 
> Petsc arguments: -log_view_memory -log_view :petsc.log -ksp_type preonly 
> -pc_type lu -pc_factor_mat_solver_type mumps -bv_matmult vecs -memory_view
> 

Reply via email to