On Thu, Sep 5, 2024 at 2:46 PM Corbijn van Willenswaard, Lars (UT) <l.j.corbijnvanwillenswa...@utwente.nl> wrote:
> Thank you, that makes testing so much easier. So far, I’ve been able to
> shrink the matrix (now only 64x64) and see that it still has growing
> memory usage over time. Unfortunately, I’ve no access to a Linux machine
> right now, so running it through valgrind, as Barry suggested, has to
> wait.

Just to check: your matrix, with that simple code, produces growing memory?
If you send the matrix, I will find the problem.

  Thanks,

     Matt
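If it is easier to write the matrix out from inside the application than
through command-line options, a minimal sketch using a PETSc binary viewer;
it assumes the operator handle A_ from the code quoted further down and
reuses the mat.bin filename from the example below:

  /* Sketch: write the assembled operator to a PETSc binary file so it
     can be read back with MatLoad() in a standalone reproducer. */
  PetscViewer viewer;

  PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, "mat.bin", FILE_MODE_WRITE, &viewer));
  PetscCall(MatView(A_, viewer));
  PetscCall(PetscViewerDestroy(&viewer));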
> Lars
>
> From: Matthew Knepley <knep...@gmail.com>
> Date: Thursday, 5 September 2024 at 19:56
> To: "Corbijn van Willenswaard, Lars (UT)" <l.j.corbijnvanwillenswa...@utwente.nl>
> Cc: "petsc-users@mcs.anl.gov" <petsc-users@mcs.anl.gov>
> Subject: Re: [petsc-users] KSPSolve + MUMPS memory growth issues
>
> On Thu, Sep 5, 2024 at 1:40 PM Corbijn van Willenswaard, Lars (UT) via
> petsc-users <petsc-users@mcs.anl.gov> wrote:
>
> Dear PETSc,
>
> For the last months I’ve struggled with a solver that I wrote for a FEM
> eigenvalue problem running out of memory. I’ve traced it to KSPSolve +
> MUMPS being the issue, but I’m getting stuck on digging deeper.
>
> The reason I suspect KSPSolve/MUMPS is that when I comment out the
> KSPSolve, the memory stays constant while running the rest of the
> algorithm. Of course, the algorithm also converges to a different result
> in this setup. When changing the KSP statement to
>
>     for (int i = 0; i < 100000000; i++) KSPSolve(A_, vec1_, vec2_);
>
> the memory grows faster than when running the full algorithm. Logging
> shows that the program never reaches the terminating i = 100M. Measuring
> the memory growth using ps (I started debugging before I knew of PETSc's
> features), I see a growth in the RSS on a single compute node of up to
> 300 MB/min for this artificial case. Real cases grow more like
> 60 MB/min/node, which causes a kill due to memory exhaustion after about
> 2-3 days.
>
> Locally (Mac) I've been able to reproduce this both with 6 MPI processes
> and with a single one. Instrumenting the code to show differences in
> PetscMemoryGetCurrentUsage (full code below) shows that the memory
> increases at every step at the start, but also at later iterations (a
> small excerpt from the output):
>
>     rank   step   memory (increase since prev step)
>        0   6544   current 39469056(  8192)
>        0   7086   current 39477248(  8192)
>        0   7735   current 39497728( 20480)
>        0   9029   current 39501824(  4096)
>
> A similar output is visible in a run with 6 ranks, where there does not
> seem to be a pattern as to which of the ranks increases at which step.
> (Note that I've checked PetscMallocGetCurrentUsage, but that is constant.)
>
> Switching to PETSc's own solver on a single rank does not show a memory
> increase after the first solve. Changing the solve to overwrite the
> vector results in a few increases after the first solve, but these do not
> seem to repeat. That is, changes like
>
>     VecCopy(vec2_, vec1_);
>     KSPSolve(A_, vec1_, vec1_);
>
> Does anyone have an idea on how to further dig into this problem?
>
> I think the best way is to construct the simplest code that reproduces
> your problem. For example, we could save your matrix in a binary file
>
>     -ksp_view_mat binary:mat.bin
>
> and then use a very simple code:
>
> #include <petsc.h>
>
> int main(int argc, char **argv)
> {
>   PetscViewer viewer;
>   Mat         A;
>   Vec         b, x;
>   KSP         ksp;
>
>   PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
>   /* Load the matrix saved with -ksp_view_mat binary:mat.bin */
>   PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, "mat.bin", FILE_MODE_READ, &viewer));
>   PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
>   PetscCall(MatSetFromOptions(A));
>   PetscCall(MatLoad(A, viewer));
>   PetscCall(PetscViewerDestroy(&viewer));
>   PetscCall(MatCreateVecs(A, &x, &b));
>   PetscCall(VecSet(b, 1.));
>
>   /* Solve repeatedly with the solver chosen on the command line */
>   PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
>   PetscCall(KSPSetOperators(ksp, A, A));
>   PetscCall(KSPSetFromOptions(ksp));
>   for (PetscInt i = 0; i < 100000; ++i) PetscCall(KSPSolve(ksp, b, x));
>   PetscCall(KSPDestroy(&ksp));
>
>   PetscCall(MatDestroy(&A));
>   PetscCall(VecDestroy(&b));
>   PetscCall(VecDestroy(&x));
>   PetscCall(PetscFinalize());
>   return 0;
> }
>
> and see if you get memory increase.
>
>   Thanks,
>
>      Matt
>
> Kind regards,
> Lars Corbijn
>
>
> Instrumentation:
>
> PetscLogDouble lastCurrent, current;
> int rank;
> MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
> for (int i = 0; i < 100000000; ++i) {
>     PetscMemoryGetCurrentUsage(&lastCurrent);
>     KSPSolve(A_, vec1_, vec2_);
>     PetscMemoryGetCurrentUsage(&current);
>     if (current != lastCurrent) {
>         std::cout << std::setw(2) << rank << " " << std::setw(6) << i
>                   << " current " << std::setw(8) << (int) current << std::right
>                   << "(" << std::setw(6) << (int) (current - lastCurrent) << ")"
>                   << std::endl;
>     }
>     lastCurrent = current;
> }
>
>
> Matrix details
>
> The matrix A_ in question is created from a complex-valued matrix C_
> (type mataij) using the following code (modulo renames). Theoretically it
> should be a Laplacian with phase-shift periodic boundary conditions.
>
> MatHermitianTranspose(C_, MAT_INITIAL_MATRIX, &Y_);
> MatProductCreate(C_, Y_, NULL, &A_);
> MatProductSetType(A_, MATPRODUCT_AB);
> MatProductSetFromOptions(A_);
> MatProductSymbolic(A_);
> MatProductNumeric(A_);
>
> PETSc arguments: -log_view_memory -log_view :petsc.log -ksp_type preonly
> -pc_type lu -pc_factor_mat_solver_type mumps -bv_matmult vecs -memory_view

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
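A minimal sketch of how the solve loop in the reproducer above could report
per-solve RSS changes the same way Lars' instrumentation does; it assumes
the ksp, b, and x handles from that example, reuses the same
PetscMemoryGetCurrentUsage call, and assumes PetscPrintf for the output. The
reproducer would be run with the options Lars listed (-ksp_type preonly
-pc_type lu -pc_factor_mat_solver_type mumps) so that the MUMPS path is
exercised:

  /* Drop-in replacement for the solve loop: print the resident set size
     whenever it changes between consecutive solves. */
  PetscLogDouble last, now;

  PetscCall(PetscMemoryGetCurrentUsage(&last));
  for (PetscInt i = 0; i < 100000; ++i) {
    PetscCall(KSPSolve(ksp, b, x));
    PetscCall(PetscMemoryGetCurrentUsage(&now));
    if (now != last) PetscCall(PetscPrintf(PETSC_COMM_SELF, "%6d current %.0f (%+.0f)\n", (int)i, now, now - last));
    last = now;
  }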