Dear PETSc community,

Recently I encountered a memory leak while running Valgrind (3.25.1) with MPI (2 processes, MPICH 4.21.1) on a PETSc-based code, a variant of one of PETSc's built-in examples:
#include <petscksp.h>

static char help[] = "Demonstrate PCFIELDSPLIT after MatZeroRowsColumns() inside PCREDISTRIBUTE";

int main(int argc, char **argv)
{
  PetscMPIInt rank, size;
  Mat         A;

  PetscCall(PetscInitialize(&argc, &argv, NULL, help));
  PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size));
  PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
  PetscCheck(size == 2, PETSC_COMM_WORLD, PETSC_ERR_WRONG_MPI_SIZE, "Must be run with 2 MPI processes");

  // Set up a small problem with 2 dofs on rank 0 and 4 on rank 1
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, !rank ? 2 : 4, !rank ? 2 : 4, PETSC_DETERMINE, PETSC_DETERMINE));
  PetscCall(MatSetFromOptions(A));
  if (rank == 0) {
    PetscCall(MatSetValue(A, 0, 0, 2.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 0, 1, -1.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 1, 1, 3.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 1, 2, -1.0, ADD_VALUES));
  } else if (rank == 1) {
    PetscCall(MatSetValue(A, 1, 2, 40.0, ADD_VALUES)); // Additional line added
    PetscCall(MatSetValue(A, 2, 2, 4.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 2, 3, -1.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 3, 3, 5.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 3, 4, -1.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 4, 4, 6.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 4, 5, -1.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 5, 5, 7.0, ADD_VALUES));
    PetscCall(MatSetValue(A, 5, 4, -0.5, ADD_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatView(A, PETSC_VIEWER_STDOUT_WORLD));
  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}

Ranks 0 and 1 own 2 local rows (0 to 1) and 4 local rows (2 to 5), respectively. I added the line

  PetscCall(MatSetValue(A, 1, 2, 40.0, ADD_VALUES));

on rank 1, so row 1, which is owned only by rank 0, is now also modified from rank 1. After adding this line, Valgrind reports a memory leak (see below). Does this imply that we cannot assign values to entries owned by other processes? In my actual application I am assembling a global matrix from a DMPlex. With overlap = 0, it seems necessary to call MatSetValues on rows owned by other processes. I'm not certain whether these two scenarios are equivalent, but both appear to trigger the same memory leak. Did I miss something?
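For completeness, the DMPlex case reduces to the same pattern: a rank passes global row indices it does not own to MatSetValues with ADD_VALUES, and those contributions are buffered in the matrix stash and communicated during assembly (the MatStashScatterBegin_BTS path visible in the Valgrind trace below). A minimal sketch of that form, meant to stand in for the single MatSetValue call above; the rows/cols/vals arrays are just illustrative placeholders, not my actual DMPlex assembly code:

  // Sketch only: off-process insertion via MatSetValues, equivalent to the
  // MatSetValue(A, 1, 2, 40.0, ADD_VALUES) line added above.
  if (rank == 1) {
    const PetscInt    rows[] = {1};    // global row 1 is owned by rank 0
    const PetscInt    cols[] = {2};
    const PetscScalar vals[] = {40.0};
    PetscCall(MatSetValues(A, 1, rows, 1, cols, vals, ADD_VALUES)); // stashed, not local
  }
  // Off-process (stashed) values are communicated during assembly:
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));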
Thanks a lot,
Xiaodong

==3932339== 96 bytes in 1 blocks are definitely lost in loss record 2 of 3
==3932339==    at 0x4C392E1: malloc (vg_replace_malloc.c:446)
==3932339==    by 0xA267A97: MPL_malloc (mpl_trmem.h:373)
==3932339==    by 0xA267CE4: MPIR_Datatype_set_contents (mpir_datatype.h:420)
==3932339==    by 0xA26E26E: MPIR_Type_create_struct_impl (type_create.c:919)
==3932339==    by 0xA068F45: internal_Type_create_struct (c_binding.c:36491)
==3932339==    by 0xA06911F: PMPI_Type_create_struct (c_binding.c:36551)
==3932339==    by 0x4E78222: PMPI_Type_create_struct (libmpiwrap.c:2752)
==3932339==    by 0x6F5C7D6: MatStashBlockTypeSetUp (matstash.c:772)
==3932339==    by 0x6F61162: MatStashScatterBegin_BTS (matstash.c:838)
==3932339==    by 0x6F54511: MatStashScatterBegin_Private (matstash.c:437)
==3932339==    by 0x60BA9BD: MatAssemblyBegin_MPI_Hash (mpihashmat.h:59)
==3932339==    by 0x6E8D3AE: MatAssemblyBegin (matrix.c:5749)
==3932339==
==3932339== 96 bytes in 1 blocks are definitely lost in loss record 3 of 3
==3932339==    at 0x4C392E1: malloc (vg_replace_malloc.c:446)
==3932339==    by 0xA2385FF: MPL_malloc (mpl_trmem.h:373)
==3932339==    by 0xA2395F8: MPII_Dataloop_alloc_and_copy (dataloop.c:400)
==3932339==    by 0xA2393DC: MPII_Dataloop_alloc (dataloop.c:319)
==3932339==    by 0xA23B239: MPIR_Dataloop_create_contiguous (dataloop_create_contig.c:56)
==3932339==    by 0xA23BFF9: MPIR_Dataloop_create_indexed (dataloop_create_indexed.c:89)
==3932339==    by 0xA23D5DC: create_basic_all_bytes_struct (dataloop_create_struct.c:252)
==3932339==    by 0xA23D178: MPIR_Dataloop_create_struct (dataloop_create_struct.c:146)
==3932339==    by 0xA25D4DB: MPIR_Typerep_commit (typerep_dataloop_commit.c:284)
==3932339==    by 0xA261549: MPIR_Type_commit_impl (datatype_impl.c:185)
==3932339==    by 0xA0624CA: internal_Type_commit (c_binding.c:34506)
==3932339==    by 0xA062679: PMPI_Type_commit (c_binding.c:34553)