Hello Matt,

I have attached the output with mat_view for 8 and 40 processors.
I am unsure what is meant by the matrix communicator and the partitioning. I am using the default behaviour in every case. How can I find this information? I have attached the log view as well, in case that helps.

Thanks,
Matt

On 23 Jul 2024, at 9:24 PM, Matthew Knepley <knep...@gmail.com> wrote:

> Also, you could run with -mat_view ::ascii_info_detail and send the output for both cases. The storage of matrix values is not redundant, so something else is going on.
>
> First, what communicator do you use for the matrix, and what partitioning?
>
>   Thanks,
>
>      Matt
>
> On Mon, Jul 22, 2024 at 10:27 PM Barry Smith <bsm...@petsc.dev> wrote:
>
>> Send the code.
>>
>> On Jul 22, 2024, at 9:18 PM, Matthew Thomas via petsc-users <petsc-users@mcs.anl.gov> wrote:
>>
>>> Hello,
>>>
>>> I am using petsc and slepc to solve an eigenvalue problem for sparse matrices. When I run my code with double the number of processors, the memory usage also doubles.
>>>
>>> I am able to reproduce this behaviour with ex1 of slepc's hands-on exercises. The issue occurs with petsc rather than slepc, since it persists when I remove the solve step and just create and assemble the petsc matrix.
>>>
>>> With n=100000, this uses ~1 GB with 8 processors but ~5 GB with 40 processors.
>>>
>>> This was done with petsc 3.21.3 on Linux, compiled with Intel using Intel MPI.
>>>
>>> Is this the expected behaviour? If not, how can I debug it?
>>>
>>> Thanks,
>>> Matt
>
> --
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
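The questions above about the matrix communicator and the partitioning can be checked directly in the program. Below is a minimal C sketch, not the actual SLEPc ex1 source; the n value and the tridiagonal stencil are assumptions chosen to match the nonzero counts in the attached output. It assembles the matrix with the defaults, i.e. on PETSC_COMM_WORLD with a PETSC_DECIDE row layout, then prints the communicator size and the contiguous block of rows each rank owns using PetscObjectGetComm() and MatGetOwnershipRange().

  /* Hypothetical sketch: assemble an n x n tridiagonal MPIAIJ matrix with the
     default communicator and row partitioning, then report who owns what. */
  #include <petscmat.h>

  int main(int argc, char **argv)
  {
    Mat         A;
    PetscInt    n = 100000, Istart, Iend, i;
    PetscMPIInt size, rank;
    MPI_Comm    comm;

    PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
    PetscCall(MatCreate(PETSC_COMM_WORLD, &A));                  /* default communicator */
    PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n)); /* default partitioning */
    PetscCall(MatSetFromOptions(A));
    PetscCall(MatSetUp(A));

    /* Which communicator does the matrix live on, and which rows does this rank own? */
    PetscCall(PetscObjectGetComm((PetscObject)A, &comm));
    PetscCallMPI(MPI_Comm_size(comm, &size));
    PetscCallMPI(MPI_Comm_rank(comm, &rank));
    PetscCall(MatGetOwnershipRange(A, &Istart, &Iend));
    PetscCall(PetscSynchronizedPrintf(comm, "[%d/%d] owns rows %" PetscInt_FMT " to %" PetscInt_FMT "\n",
                                      rank, size, Istart, Iend));
    PetscCall(PetscSynchronizedFlush(comm, PETSC_STDOUT));

    /* Tridiagonal fill, in the style of the 1-D Laplacian used by the hands-on ex1 */
    for (i = Istart; i < Iend; i++) {
      if (i > 0) PetscCall(MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES));
      if (i < n - 1) PetscCall(MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES));
      PetscCall(MatSetValue(A, i, i, 2.0, INSERT_VALUES));
    }
    PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
    PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

    PetscCall(MatDestroy(&A));
    PetscCall(PetscFinalize());
    return 0;
  }

With that default layout each of the P ranks owns roughly n/P consecutive rows (12500 rows per rank on 8 processes, 2500 on 40), which matches the Local rows figures in the mat_view output below.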
Mat Object: 8 MPI processes
  type: mpiaij
  rows=100000, cols=100000
  total: nonzeros=299998, allocated nonzeros=299998
  total number of mallocs used during MatSetValues calls=0
    [0] Local rows 12500 nz 37499 nz alloced 37499 mem 0., not using I-node routines
    [0] on-diagonal part: nz 37498
    [0] off-diagonal part: nz 1
    [1] Local rows 12500 nz 37500 nz alloced 37500 mem 0., not using I-node routines
    [1] on-diagonal part: nz 37498
    [1] off-diagonal part: nz 2
    [2] Local rows 12500 nz 37500 nz alloced 37500 mem 0., not using I-node routines
    [2] on-diagonal part: nz 37498
    [2] off-diagonal part: nz 2
    [3] Local rows 12500 nz 37500 nz alloced 37500 mem 0., not using I-node routines
    [3] on-diagonal part: nz 37498
    [3] off-diagonal part: nz 2
    [4] Local rows 12500 nz 37500 nz alloced 37500 mem 0., not using I-node routines
    [4] on-diagonal part: nz 37498
    [4] off-diagonal part: nz 2
    [5] Local rows 12500 nz 37500 nz alloced 37500 mem 0., not using I-node routines
    [5] on-diagonal part: nz 37498
    [5] off-diagonal part: nz 2
    [6] Local rows 12500 nz 37500 nz alloced 37500 mem 0., not using I-node routines
    [6] on-diagonal part: nz 37498
    [6] off-diagonal part: nz 2
    [7] Local rows 12500 nz 37499 nz alloced 37499 mem 0., not using I-node routines
    [7] on-diagonal part: nz 37498
    [7] off-diagonal part: nz 1
  Information on VecScatter used in matrix-vector product:
  PetscSF Object: 8 MPI processes
    type: basic
    [0] Number of roots=12500, leaves=1, remote ranks=1
    [0] 0 <- (1,0)
    [1] Number of roots=12500, leaves=2, remote ranks=2
    [1] 0 <- (0,12499)
    [1] 1 <- (2,0)
    [2] Number of roots=12500, leaves=2, remote ranks=2
    [2] 0 <- (1,12499)
    [2] 1 <- (3,0)
    [3] Number of roots=12500, leaves=2, remote ranks=2
    [3] 0 <- (2,12499)
    [3] 1 <- (4,0)
    [4] Number of roots=12500, leaves=2, remote ranks=2
    [4] 0 <- (3,12499)
    [4] 1 <- (5,0)
    [5] Number of roots=12500, leaves=2, remote ranks=2
    [5] 0 <- (4,12499)
    [5] 1 <- (6,0)
    [6] Number of roots=12500, leaves=2, remote ranks=2
    [6] 0 <- (5,12499)
    [6] 1 <- (7,0)
    [7] Number of roots=12500, leaves=1, remote ranks=1
    [7] 0 <- (6,12499)
    [0] Roots referenced by my leaves, by rank
    [0] 1: 1 edges
    [0] 0 <- 0
    [1] Roots referenced by my leaves, by rank
    [1] 0: 1 edges
    [1] 0 <- 12499
    [1] 2: 1 edges
    [1] 1 <- 0
    [2] Roots referenced by my leaves, by rank
    [2] 1: 1 edges
    [2] 0 <- 12499
    [2] 3: 1 edges
    [2] 1 <- 0
    [3] Roots referenced by my leaves, by rank
    [3] 2: 1 edges
    [3] 0 <- 12499
    [3] 4: 1 edges
    [3] 1 <- 0
    [4] Roots referenced by my leaves, by rank
    [4] 3: 1 edges
    [4] 0 <- 12499
    [4] 5: 1 edges
    [4] 1 <- 0
    [5] Roots referenced by my leaves, by rank
    [5] 4: 1 edges
    [5] 0 <- 12499
    [5] 6: 1 edges
    [5] 1 <- 0
    [6] Roots referenced by my leaves, by rank
    [6] 5: 1 edges
    [6] 0 <- 12499
    [6] 7: 1 edges
    [6] 1 <- 0
    [7] Roots referenced by my leaves, by rank
    [7] 6: 1 edges
    [7] 0 <- 12499
    MultiSF sort=rank-order

======================================================================================
                  Resource Usage on 2024-07-24 09:36:33:
   Job Id:             121558447.gadi-pbs
   Project:            y08
   Exit Status:        0
   Service Units:      0.04
   NCPUs Requested:    8
   NCPUs Used:         8
   CPU Time Used:      00:00:42
   Memory Requested:   8.0GB
   Memory Used:        919.55MB
   Walltime requested: 00:05:00
   Walltime Used:      00:00:08
   JobFS requested:    100.0MB
   JobFS used:         0B
======================================================================================
Mat Object: 40 MPI processes
  type: mpiaij
  rows=100000, cols=100000
  total: nonzeros=299998, allocated nonzeros=299998
  total number of mallocs used during MatSetValues calls=0
    [0] Local rows 2500 nz 7499 nz alloced 7499 mem 0., not using I-node routines
    [0] on-diagonal part: nz 7498
    [0] off-diagonal part: nz 1
    [1] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [1] on-diagonal part: nz 7498
    [1] off-diagonal part: nz 2
    [2] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [2] on-diagonal part: nz 7498
    [2] off-diagonal part: nz 2
    [3] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [3] on-diagonal part: nz 7498
    [3] off-diagonal part: nz 2
    [4] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [4] on-diagonal part: nz 7498
    [4] off-diagonal part: nz 2
    [5] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [5] on-diagonal part: nz 7498
    [5] off-diagonal part: nz 2
    [6] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [6] on-diagonal part: nz 7498
    [6] off-diagonal part: nz 2
    [7] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [7] on-diagonal part: nz 7498
    [7] off-diagonal part: nz 2
    [8] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [8] on-diagonal part: nz 7498
    [8] off-diagonal part: nz 2
    [9] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [9] on-diagonal part: nz 7498
    [9] off-diagonal part: nz 2
    [10] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [10] on-diagonal part: nz 7498
    [10] off-diagonal part: nz 2
    [11] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [11] on-diagonal part: nz 7498
    [11] off-diagonal part: nz 2
    [12] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [12] on-diagonal part: nz 7498
    [12] off-diagonal part: nz 2
    [13] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [13] on-diagonal part: nz 7498
    [13] off-diagonal part: nz 2
    [14] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [14] on-diagonal part: nz 7498
    [14] off-diagonal part: nz 2
    [15] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [15] on-diagonal part: nz 7498
    [15] off-diagonal part: nz 2
    [16] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [16] on-diagonal part: nz 7498
    [16] off-diagonal part: nz 2
    [17] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [17] on-diagonal part: nz 7498
    [17] off-diagonal part: nz 2
    [18] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [18] on-diagonal part: nz 7498
    [18] off-diagonal part: nz 2
    [19] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [19] on-diagonal part: nz 7498
    [19] off-diagonal part: nz 2
    [20] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [20] on-diagonal part: nz 7498
    [20] off-diagonal part: nz 2
    [21] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [21] on-diagonal part: nz 7498
    [21] off-diagonal part: nz 2
    [22] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [22] on-diagonal part: nz 7498
    [22] off-diagonal part: nz 2
    [23] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [23] on-diagonal part: nz 7498
    [23] off-diagonal part: nz 2
    [24] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [24] on-diagonal part: nz 7498
    [24] off-diagonal part: nz 2
    [25] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [25] on-diagonal part: nz 7498
    [25] off-diagonal part: nz 2
    [26] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [26] on-diagonal part: nz 7498
    [26] off-diagonal part: nz 2
    [27] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [27] on-diagonal part: nz 7498
    [27] off-diagonal part: nz 2
    [28] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [28] on-diagonal part: nz 7498
    [28] off-diagonal part: nz 2
    [29] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [29] on-diagonal part: nz 7498
    [29] off-diagonal part: nz 2
    [30] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [30] on-diagonal part: nz 7498
    [30] off-diagonal part: nz 2
    [31] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [31] on-diagonal part: nz 7498
    [31] off-diagonal part: nz 2
    [32] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [32] on-diagonal part: nz 7498
    [32] off-diagonal part: nz 2
    [33] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [33] on-diagonal part: nz 7498
    [33] off-diagonal part: nz 2
    [34] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [34] on-diagonal part: nz 7498
    [34] off-diagonal part: nz 2
    [35] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [35] on-diagonal part: nz 7498
    [35] off-diagonal part: nz 2
    [36] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [36] on-diagonal part: nz 7498
    [36] off-diagonal part: nz 2
    [37] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [37] on-diagonal part: nz 7498
    [37] off-diagonal part: nz 2
    [38] Local rows 2500 nz 7500 nz alloced 7500 mem 0., not using I-node routines
    [38] on-diagonal part: nz 7498
    [38] off-diagonal part: nz 2
    [39] Local rows 2500 nz 7499 nz alloced 7499 mem 0., not using I-node routines
    [39] on-diagonal part: nz 7498
    [39] off-diagonal part: nz 1
  Information on VecScatter used in matrix-vector product:
  PetscSF Object: 40 MPI processes
    type: basic
    [0] Number of roots=2500, leaves=1, remote ranks=1
    [0] 0 <- (1,0)
    [1] Number of roots=2500, leaves=2, remote ranks=2
    [1] 0 <- (0,2499)
    [1] 1 <- (2,0)
    [2] Number of roots=2500, leaves=2, remote ranks=2
    [2] 0 <- (1,2499)
    [2] 1 <- (3,0)
    [3] Number of roots=2500, leaves=2, remote ranks=2
    [3] 0 <- (2,2499)
    [3] 1 <- (4,0)
    [4] Number of roots=2500, leaves=2, remote ranks=2
    [4] 0 <- (3,2499)
    [4] 1 <- (5,0)
    [5] Number of roots=2500, leaves=2, remote ranks=2
    [5] 0 <- (4,2499)
    [5] 1 <- (6,0)
    [6] Number of roots=2500, leaves=2, remote ranks=2
    [6] 0 <- (5,2499)
    [6] 1 <- (7,0)
    [7] Number of roots=2500, leaves=2, remote ranks=2
    [7] 0 <- (6,2499)
    [7] 1 <- (8,0)
    [8] Number of roots=2500, leaves=2, remote ranks=2
    [8] 0 <- (7,2499)
    [8] 1 <- (9,0)
    [9] Number of roots=2500, leaves=2, remote ranks=2
    [9] 0 <- (8,2499)
    [9] 1 <- (10,0)
    [10] Number of roots=2500, leaves=2, remote ranks=2
    [10] 0 <- (9,2499)
    [10] 1 <- (11,0)
    [11] Number of roots=2500, leaves=2, remote ranks=2
    [11] 0 <- (10,2499)
    [11] 1 <- (12,0)
    [12] Number of roots=2500, leaves=2, remote ranks=2
    [12] 0 <- (11,2499)
    [12] 1 <- (13,0)
    [13] Number of roots=2500, leaves=2, remote ranks=2
    [13] 0 <- (12,2499)
    [13] 1 <- (14,0)
    [14] Number of roots=2500, leaves=2, remote ranks=2
    [14] 0 <- (13,2499)
    [14] 1 <- (15,0)
    [15] Number of roots=2500, leaves=2, remote ranks=2
    [15] 0 <- (14,2499)
    [15] 1 <- (16,0)
    [16] Number of roots=2500, leaves=2, remote ranks=2
    [16] 0 <- (15,2499)
    [16] 1 <- (17,0)
    [17] Number of roots=2500, leaves=2, remote ranks=2
    [17] 0 <- (16,2499)
    [17] 1 <- (18,0)
    [18] Number of roots=2500, leaves=2, remote ranks=2
    [18] 0 <- (17,2499)
    [18] 1 <- (19,0)
    [19] Number of roots=2500, leaves=2, remote ranks=2
    [19] 0 <- (18,2499)
    [19] 1 <- (20,0)
    [20] Number of roots=2500, leaves=2, remote ranks=2
    [20] 0 <- (19,2499)
    [20] 1 <- (21,0)
    [21] Number of roots=2500, leaves=2, remote ranks=2
    [21] 0 <- (20,2499)
    [21] 1 <- (22,0)
    [22] Number of roots=2500, leaves=2, remote ranks=2
    [22] 0 <- (21,2499)
    [22] 1 <- (23,0)
    [23] Number of roots=2500, leaves=2, remote ranks=2
    [23] 0 <- (22,2499)
    [23] 1 <- (24,0)
    [24] Number of roots=2500, leaves=2, remote ranks=2
    [24] 0 <- (23,2499)
    [24] 1 <- (25,0)
    [25] Number of roots=2500, leaves=2, remote ranks=2
    [25] 0 <- (24,2499)
    [25] 1 <- (26,0)
    [26] Number of roots=2500, leaves=2, remote ranks=2
    [26] 0 <- (25,2499)
    [26] 1 <- (27,0)
    [27] Number of roots=2500, leaves=2, remote ranks=2
    [27] 0 <- (26,2499)
    [27] 1 <- (28,0)
    [28] Number of roots=2500, leaves=2, remote ranks=2
    [28] 0 <- (27,2499)
    [28] 1 <- (29,0)
    [29] Number of roots=2500, leaves=2, remote ranks=2
    [29] 0 <- (28,2499)
    [29] 1 <- (30,0)
    [30] Number of roots=2500, leaves=2, remote ranks=2
    [30] 0 <- (29,2499)
    [30] 1 <- (31,0)
    [31] Number of roots=2500, leaves=2, remote ranks=2
    [31] 0 <- (30,2499)
    [31] 1 <- (32,0)
    [32] Number of roots=2500, leaves=2, remote ranks=2
    [32] 0 <- (31,2499)
    [32] 1 <- (33,0)
    [33] Number of roots=2500, leaves=2, remote ranks=2
    [33] 0 <- (32,2499)
    [33] 1 <- (34,0)
    [34] Number of roots=2500, leaves=2, remote ranks=2
    [34] 0 <- (33,2499)
    [34] 1 <- (35,0)
    [35] Number of roots=2500, leaves=2, remote ranks=2
    [35] 0 <- (34,2499)
    [35] 1 <- (36,0)
    [36] Number of roots=2500, leaves=2, remote ranks=2
    [36] 0 <- (35,2499)
    [36] 1 <- (37,0)
    [37] Number of roots=2500, leaves=2, remote ranks=2
    [37] 0 <- (36,2499)
    [37] 1 <- (38,0)
    [38] Number of roots=2500, leaves=2, remote ranks=2
    [38] 0 <- (37,2499)
    [38] 1 <- (39,0)
    [39] Number of roots=2500, leaves=1, remote ranks=1
    [39] 0 <- (38,2499)
    [0] Roots referenced by my leaves, by rank
    [0] 1: 1 edges
    [0] 0 <- 0
    [1] Roots referenced by my leaves, by rank
    [1] 0: 1 edges
    [1] 0 <- 2499
    [1] 2: 1 edges
    [1] 1 <- 0
    [2] Roots referenced by my leaves, by rank
    [2] 1: 1 edges
    [2] 0 <- 2499
    [2] 3: 1 edges
    [2] 1 <- 0
    [3] Roots referenced by my leaves, by rank
    [3] 2: 1 edges
    [3] 0 <- 2499
    [3] 4: 1 edges
    [3] 1 <- 0
    [4] Roots referenced by my leaves, by rank
    [4] 3: 1 edges
    [4] 0 <- 2499
    [4] 5: 1 edges
    [4] 1 <- 0
    [5] Roots referenced by my leaves, by rank
    [5] 4: 1 edges
    [5] 0 <- 2499
    [5] 6: 1 edges
    [5] 1 <- 0
    [6] Roots referenced by my leaves, by rank
    [6] 5: 1 edges
    [6] 0 <- 2499
    [6] 7: 1 edges
    [6] 1 <- 0
    [7] Roots referenced by my leaves, by rank
    [7] 6: 1 edges
    [7] 0 <- 2499
    [7] 8: 1 edges
    [7] 1 <- 0
    [8] Roots referenced by my leaves, by rank
    [8] 7: 1 edges
    [8] 0 <- 2499
    [8] 9: 1 edges
    [8] 1 <- 0
    [9] Roots referenced by my leaves, by rank
    [9] 8: 1 edges
    [9] 0 <- 2499
    [9] 10: 1 edges
    [9] 1 <- 0
    [10] Roots referenced by my leaves, by rank
    [10] 9: 1 edges
    [10] 0 <- 2499
    [10] 11: 1 edges
    [10] 1 <- 0
    [11] Roots referenced by my leaves, by rank
    [11] 10: 1 edges
    [11] 0 <- 2499
    [11] 12: 1 edges
    [11] 1 <- 0
    [12] Roots referenced by my leaves, by rank
    [12] 11: 1 edges
    [12] 0 <- 2499
    [12] 13: 1 edges
    [12] 1 <- 0
    [13] Roots referenced by my leaves, by rank
    [13] 12: 1 edges
    [13] 0 <- 2499
    [13] 14: 1 edges
    [13] 1 <- 0
    [14] Roots referenced by my leaves, by rank
    [14] 13: 1 edges
    [14] 0 <- 2499
    [14] 15: 1 edges
    [14] 1 <- 0
    [15] Roots referenced by my leaves, by rank
    [15] 14: 1 edges
    [15] 0 <- 2499
    [15] 16: 1 edges
    [15] 1 <- 0
    [16] Roots referenced by my leaves, by rank
    [16] 15: 1 edges
    [16] 0 <- 2499
    [16] 17: 1 edges
    [16] 1 <- 0
    [17] Roots referenced by my leaves, by rank
    [17] 16: 1 edges
    [17] 0 <- 2499
    [17] 18: 1 edges
    [17] 1 <- 0
    [18] Roots referenced by my leaves, by rank
    [18] 17: 1 edges
    [18] 0 <- 2499
    [18] 19: 1 edges
    [18] 1 <- 0
    [19] Roots referenced by my leaves, by rank
    [19] 18: 1 edges
    [19] 0 <- 2499
    [19] 20: 1 edges
    [19] 1 <- 0
    [20] Roots referenced by my leaves, by rank
    [20] 19: 1 edges
    [20] 0 <- 2499
    [20] 21: 1 edges
    [20] 1 <- 0
    [21] Roots referenced by my leaves, by rank
    [21] 20: 1 edges
    [21] 0 <- 2499
    [21] 22: 1 edges
    [21] 1 <- 0
    [22] Roots referenced by my leaves, by rank
    [22] 21: 1 edges
    [22] 0 <- 2499
    [22] 23: 1 edges
    [22] 1 <- 0
    [23] Roots referenced by my leaves, by rank
    [23] 22: 1 edges
    [23] 0 <- 2499
    [23] 24: 1 edges
    [23] 1 <- 0
    [24] Roots referenced by my leaves, by rank
    [24] 23: 1 edges
    [24] 0 <- 2499
    [24] 25: 1 edges
    [24] 1 <- 0
    [25] Roots referenced by my leaves, by rank
    [25] 24: 1 edges
    [25] 0 <- 2499
    [25] 26: 1 edges
    [25] 1 <- 0
    [26] Roots referenced by my leaves, by rank
    [26] 25: 1 edges
    [26] 0 <- 2499
    [26] 27: 1 edges
    [26] 1 <- 0
    [27] Roots referenced by my leaves, by rank
    [27] 26: 1 edges
    [27] 0 <- 2499
    [27] 28: 1 edges
    [27] 1 <- 0
    [28] Roots referenced by my leaves, by rank
    [28] 27: 1 edges
    [28] 0 <- 2499
    [28] 29: 1 edges
    [28] 1 <- 0
    [29] Roots referenced by my leaves, by rank
    [29] 28: 1 edges
    [29] 0 <- 2499
    [29] 30: 1 edges
    [29] 1 <- 0
    [30] Roots referenced by my leaves, by rank
    [30] 29: 1 edges
    [30] 0 <- 2499
    [30] 31: 1 edges
    [30] 1 <- 0
    [31] Roots referenced by my leaves, by rank
    [31] 30: 1 edges
    [31] 0 <- 2499
    [31] 32: 1 edges
    [31] 1 <- 0
    [32] Roots referenced by my leaves, by rank
    [32] 31: 1 edges
    [32] 0 <- 2499
    [32] 33: 1 edges
    [32] 1 <- 0
    [33] Roots referenced by my leaves, by rank
    [33] 32: 1 edges
    [33] 0 <- 2499
    [33] 34: 1 edges
    [33] 1 <- 0
    [34] Roots referenced by my leaves, by rank
    [34] 33: 1 edges
    [34] 0 <- 2499
    [34] 35: 1 edges
    [34] 1 <- 0
    [35] Roots referenced by my leaves, by rank
    [35] 34: 1 edges
    [35] 0 <- 2499
    [35] 36: 1 edges
    [35] 1 <- 0
    [36] Roots referenced by my leaves, by rank
    [36] 35: 1 edges
    [36] 0 <- 2499
    [36] 37: 1 edges
    [36] 1 <- 0
    [37] Roots referenced by my leaves, by rank
    [37] 36: 1 edges
    [37] 0 <- 2499
    [37] 38: 1 edges
    [37] 1 <- 0
    [38] Roots referenced by my leaves, by rank
    [38] 37: 1 edges
    [38] 0 <- 2499
    [38] 39: 1 edges
    [38] 1 <- 0
    [39] Roots referenced by my leaves, by rank
    [39] 38: 1 edges
    [39] 0 <- 2499
    MultiSF sort=rank-order

======================================================================================
                  Resource Usage on 2024-07-24 09:34:31:
   Job Id:             121558351.gadi-pbs
   Project:            y08
   Exit Status:        0
   Service Units:      0.07
   NCPUs Requested:    40
   NCPUs Used:         40
   CPU Time Used:      00:00:36
   Memory Requested:   8.0GB
   Memory Used:        4.59GB
   Walltime requested: 00:05:00
   Walltime Used:      00:00:03
   JobFS requested:    100.0MB
   JobFS used:         0B
======================================================================================
****************************************************************************************************************************************************************
***                        WIDEN YOUR WINDOW TO 160 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document                                        ***
****************************************************************************************************************************************************************

------------------------------------------------------------------ PETSc Performance Summary: ------------------------------------------------------------------

      ##########################################################
      #                                                        #
      #                       WARNING!!!                       #
      #                                                        #
      #   This code was compiled with a debugging option.      #
      #   To get timing results run ./configure                #
      #   using --with-debugging=no, the performance will      #
      #   be generally two or three times faster.              #
      #                                                        #
      ##########################################################

/home/149/mt3516/island_damping/petsc_test/slepc_test/ex1 on a arch-linux-c-debug named gadi-cpu-clx-0898.gadi.nci.org.au with 8 processes, by mt3516 on Wed Jul 24 09:51:44 2024
Using Petsc Release Version 3.21.1, unknown

                         Max       Max/Min     Avg       Total
Time (sec):           3.080e-02     1.000   3.080e-02
Objects:              0.000e+00     0.000   0.000e+00
Flops:                0.000e+00     0.000   0.000e+00  0.000e+00
Flops/sec:            0.000e+00     0.000   0.000e+00  0.000e+00
Memory (bytes):       1.953e+06     1.000   1.953e+06  1.562e+07
MPI Msg Count:        4.000e+00     2.000   3.500e+00  2.800e+01
MPI Msg Len (bytes):  1.600e+01     2.000   4.000e+00  1.120e+02
MPI Reductions:       8.100e+01     1.000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flop ------  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total    Count   %Total     Avg         %Total    Count   %Total
 0:      Main Stage: 3.0757e-02  99.9%  0.0000e+00   0.0%  2.800e+01 100.0%  4.000e+00      100.0%  6.200e+01  76.5%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                  Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   AvgLen: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flop in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------

      ##########################################################
      #                                                        #
      #                       WARNING!!!                       #
      #                                                        #
      #   This code was compiled with a debugging option.      #
      #   To get timing results run ./configure                #
      #   using --with-debugging=no, the performance will      #
      #   be generally two or three times faster.              #
      #                                                        #
      ##########################################################

Event                Count      Time (sec)     Flop                              --- Global ---  --- Stage ----  Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   AvgLen  Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

BuildTwoSided          3 1.0 2.1506e-04 1.4 0.00e+00 0.0 1.4e+01 4.0e+00 6.0e+00  1  0 50 50  7   1  0 50 50 10     0
BuildTwoSidedF         2 1.0 1.9223e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  1  0  0  0  5   1  0  0  0  6     0
MatAssemblyBegin       1 1.0 2.5808e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  1  0  0  0  7   1  0  0  0 10     0
MatAssemblyEnd         1 1.0 1.0028e-02 1.0 0.00e+00 0.0 2.8e+01 4.0e+00 3.7e+01 33  0 100 100 46  33  0 100 100 60     0
SFSetGraph             1 1.0 1.6253e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  2   0  0  0  0  3     0
SFSetUp                1 1.0 2.0841e-04 1.0 0.00e+00 0.0 2.8e+01 4.0e+00 4.0e+00  1  0 100 100  5   1  0 100 100  6     0
------------------------------------------------------------------------------------------------------------------------

Object Type          Creations   Destructions. Reports information only for process 0.

--- Event Stage 0: Main Stage

              Matrix     3              3
              Vector     2              2
           Index Set     2              2
   Star Forest Graph     1              1
========================================================================================================================
Average time to get PetscTime(): 3.6601e-08
Average time for MPI_Barrier(): 1.48194e-05
Average time for zero size MPI_Send(): 6.92962e-06
#PETSc Option Table entries:
-log_view # (source: command line)
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 16 sizeof(PetscInt) 4
Configure options: --with-scalar-type=complex --with-mkl_cpardiso
-----------------------------------------
Libraries compiled on 2024-07-19 04:07:25 on gadi-login-09.gadi.nci.org.au
Machine characteristics: Linux-4.18.0-513.24.1.el8.nci.x86_64-x86_64-with-centos-8.9-Green_Obsidian
Using PETSc directory: /home/149/mt3516/island_damping/matrix_packages/petsc
Using PETSc arch: arch-linux-c-debug
-----------------------------------------
Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -fstack-protector -fvisibility=hidden -g3 -O0
Using Fortran compiler: mpif90 -fPIC -Wall -ffree-line-length-none -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -g -O0
-----------------------------------------
Using include paths: -I/home/149/mt3516/island_damping/matrix_packages/petsc/include -I/home/149/mt3516/island_damping/matrix_packages/petsc/arch-linux-c-debug/include
-----------------------------------------
Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/home/149/mt3516/island_damping/matrix_packages/petsc/arch-linux-c-debug/lib -L/home/149/mt3516/island_damping/matrix_packages/petsc/arch-linux-c-debug/lib -lpetsc -Wl,-rpath,/apps/intel-ct/2022.2.0/mkl/lib/intel64 -L/apps/intel-ct/2022.2.0/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_gnu_thread -lmkl_blacs_intelmpi_lp64 -lgomp -ldl -lpthread -lm -lX11 -lgfortran -lstdc++ -lquadmath
-----------------------------------------

      ##########################################################
      #                                                        #
      #                       WARNING!!!                       #
      #                                                        #
      #   This code was compiled with a debugging option.      #
      #   To get timing results run ./configure                #
      #   using --with-debugging=no, the performance will      #
      #   be generally two or three times faster.              #
      #                                                        #
      ##########################################################

======================================================================================
                  Resource Usage on 2024-07-24 09:51:48:
   Job Id:             121558958.gadi-pbs
   Project:            y08
   Exit Status:        0
   Service Units:      0.01
   NCPUs Requested:    8
   NCPUs Used:         8
   CPU Time Used:      00:00:03
   Memory Requested:   8.0GB
   Memory Used:        1.0GB
   Walltime requested: 00:05:00
   Walltime Used:      00:00:03
   JobFS requested:    100.0MB
   JobFS used:         0B
======================================================================================
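Since the question is where the resident memory is going, it may also help to measure it from inside the assembly test rather than relying only on the PBS accounting above. A hypothetical helper along these lines (ReportMemory is not a PETSc routine; PetscMemoryGetCurrentUsage() and PetscMallocGetCurrentUsage() are) prints, per rank, the resident set size and the portion of it obtained through PetscMalloc(), which separates PETSc's own allocations from MPI and library overhead; the -memory_view and -malloc_view options print related summaries at PetscFinalize() without any code changes.

  /* Hypothetical helper, assuming a recent PETSc (>= 3.19 for PETSC_SUCCESS):
     call it before and after MatAssemblyEnd() to see how much of the resident
     memory was actually allocated through PETSc. */
  #include <petscsys.h>

  static PetscErrorCode ReportMemory(MPI_Comm comm, const char *where)
  {
    PetscLogDouble rss, mallocd;

    PetscFunctionBeginUser;
    PetscCall(PetscMemoryGetCurrentUsage(&rss));      /* resident set size of this process */
    PetscCall(PetscMallocGetCurrentUsage(&mallocd));  /* bytes currently held via PetscMalloc */
    PetscCall(PetscSynchronizedPrintf(comm, "%s: RSS %.1f MB, PetscMalloc'd %.1f MB\n",
                                      where, rss / 1048576.0, mallocd / 1048576.0));
    PetscCall(PetscSynchronizedFlush(comm, PETSC_STDOUT));
    PetscFunctionReturn(PETSC_SUCCESS);
  }

For example, calling ReportMemory(PETSC_COMM_WORLD, "after assembly") right after MatAssemblyEnd() in the test above would show whether the ~1 GB (8 ranks) versus ~4.6 GB (40 ranks) reported by PBS is actually held by PETSc or comes from elsewhere in the processes.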