Use Valgrind. It will show the exact lines where memory is allocated but never freed. I am guessing some memory allocation within MUMPS is not being properly freed.
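For instance, something along these lines (a sketch only; `./my_solver` is a placeholder for your actual binary, and the PETSc options are the ones from the end of your message):

```shell
# --leak-check=full reports the full allocation stack for each leaked block;
# --show-leak-kinds=all also lists "still reachable" blocks, which matters
# here since the process is killed while memory is still growing.
valgrind --leak-check=full --show-leak-kinds=all --num-callers=20 \
    ./my_solver -ksp_type preonly -pc_type lu -pc_factor_mat_solver_type mumps

# Under MPI, put valgrind after the launcher so each rank gets its own
# instance, and write per-rank logs (%p expands to the process id):
mpiexec -n 6 valgrind --leak-check=full --log-file=valgrind.%p.log ./my_solver
```

Since the growth shows up even on one rank on your Mac, the serial run should be enough, and it will be far faster than the MPI one.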
Barry

> On Sep 5, 2024, at 1:33 PM, Corbijn van Willenswaard, Lars (UT) via petsc-users <petsc-users@mcs.anl.gov> wrote:
>
> Dear PETSc,
>
> For the last months I've struggled with a solver that I wrote for a FEM eigenvalue problem running out of memory. I've traced it to KSPSolve + MUMPS being the issue, but I'm getting stuck on digging deeper.
>
> The reason I suspect KSPSolve/MUMPS is that when I comment out the KSPSolve, the memory stays constant while running the rest of the algorithm. Of course, the algorithm also converges to a different result in this setup. When changing the KSP statement to
>
>     for(int i = 0; i < 100000000; i++) KSPSolve(A_, vec1_, vec2_);
>
> the memory grows faster than when running the algorithm. Logging shows that the program never reaches the terminating i=100M. Measuring the memory growth using ps (I started debugging before I knew of PETSc's features), I see a growth in the RSS on a single compute node of up to 300MB/min for this artificial case. Real cases grow more like 60MB/min/node, which causes a kill due to memory exhaustion after about 2-3 days.
>
> Locally (Mac) I've been able to reproduce this both with 6 MPI processes and with a single one. Instrumenting the code to show differences in PetscMemoryGetCurrentUsage (full code below) shows that the memory increases every step at the start, but also does so at later iterations (small excerpt from the output):
>
>     rank   step  memory (increase since prev step)
>        0   6544  current 39469056(  8192)
>        0   7086  current 39477248(  8192)
>        0   7735  current 39497728( 20480)
>        0   9029  current 39501824(  4096)
>
> Similar output is visible in a run with 6 ranks, where there does not seem to be a pattern as to which of the ranks increases at which step. (Note that I've checked PetscMallocGetCurrentUsage, but that is constant.)
>
> Switching to PETSc's own solver on a single rank does not show a memory increase after the first solve.
> Changing the solve to overwrite the input vector results in a few increases after the first solve, but these do not seem to repeat. That is, changes like:
>
>     VecCopy(vec2_, vec1_);
>     KSPSolve(A_, vec1_, vec1_);
>
> Does anyone have an idea on how to dig further into this problem?
>
> Kind regards,
> Lars Corbijn
>
>
> Instrumentation:
>
>     PetscLogDouble lastCurrent, current;
>     int rank;
>     MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
>     for(int i = 0; i < 100000000; ++i) {
>         PetscMemoryGetCurrentUsage(&lastCurrent);
>         KSPSolve(A_, vec1_, vec2_);
>         PetscMemoryGetCurrentUsage(&current);
>         if(current != lastCurrent) {
>             std::cout << std::setw(2) << rank << " " << std::setw(6) << i
>                       << " current " << std::setw(8) << (int) current << std::right
>                       << "(" << std::setw(6) << (int)(current - lastCurrent) << ")"
>                       << std::endl;
>         }
>         lastCurrent = current;
>     }
>
>
> Matrix details:
> The matrix A_ in question is created from a complex-valued matrix C_ (type mataij) using the following code (modulo renames). Theoretically it should be a Laplacian with phase-shift periodic boundary conditions.
>
>     MatHermitianTranspose(C_, MAT_INITIAL_MATRIX, &Y_);
>     MatProductCreate(C_, Y_, NULL, &A_);
>     MatProductSetType(A_, MATPRODUCT_AB);
>     MatProductSetFromOptions(A_);
>     MatProductSymbolic(A_);
>     MatProductNumeric(A_);
>
> PETSc arguments: -log_view_memory -log_view :petsc.log -ksp_type preonly -pc_type lu -pc_factor_mat_solver_type mumps -bv_matmult vecs -memory_view