Subject: Re: [petsc-users] Extremely slow DMNetwork Jacobian assembly
From: Smith, Barry F. via petsc-users
Date: Wed, 08 May 2019 22:19:47 -0700

> On May 8, 2019, at 8:54 PM, Matthew Knepley via petsc-users <petsc-users@mcs.anl.gov> wrote:
>
> On Wed, May 8, 2019 at 9:00 PM Zhang, Hong <hzh...@mcs.anl.gov> wrote:
> Justin:
> Great, the issue is resolved.
> Why does MatSetOption(J,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE) not raise an error?
>
> Because it has PETSC_FALSE.

   Why would it EVER be set to false? Does DMNetwork preallocate appropriately? If not, why?

   Barry

> Matt,
>
> We usually prevent this with a structured SetValues API. For example, DMDA uses MatSetValuesStencil(), which cannot write outside the stencil you set. DMPlex uses MatSetValuesClosure(), which is guaranteed to be allocated. We should write one for DMNetwork. The allocation is just like Plex (I believe), where you allocate closure(star(p)), which would mean that setting values for a vertex gets the neighboring edges and their vertices, and setting values for an edge gets the covering vertices. Is that right for DMNetwork?
>
> Yes, DMNetwork behaves in this fashion.
>
> I cannot find MatSetValuesClosure() in petsc-master.
>
> I mean https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexMatSetClosure.html
>
> Can you provide detailed instructions on how to implement MatSetValuesClosure() for DMNetwork?
>
> It will just work as is for edges, but not for vertices, since you want to set the star, not the closure. You would just need to reverse exactly what is in that function.
>
> Thanks,
>
> Matt
>
> Note, DMNetwork is a subclass of DMPlex.
>
> Hong
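[A rough sketch, not part of the original exchange, of the star-based insertion Matt describes: gather the global indices of a vertex's own dofs plus the dofs of the vertices across its supporting edges, then make one MatSetValues() call. The query routines follow the DMNetwork API used in the SNES network examples of this era (DMNetworkGetSupportingEdges, DMNetworkGetConnectedVertices, DMNetworkGetVariableGlobalOffset); DMNetworkGetNumVariables and the fixed-size index buffers are assumptions for the sketch, so check the names against the PETSc version in use.]

#include <petscdmnetwork.h>

/* Sketch: insert one dense block covering the "star" of vertex v: the vertex
   itself and the vertices reachable across its supporting edges.  vals is the
   nrow x ncol block stored row-major. */
PetscErrorCode SetVertexStarValues(DM networkdm, Mat J, PetscInt v,
                                   const PetscScalar *vals, InsertMode mode)
{
  PetscErrorCode ierr;
  const PetscInt *edges, *cone;
  PetscInt       nedges, e, i, w, nrow, nv, roffset, coffset, ncol = 0;
  PetscInt       rows[6], cols[64];   /* sized for this sketch: at most 6 dofs per bus */

  PetscFunctionBegin;
  ierr = DMNetworkGetNumVariables(networkdm, v, &nrow);CHKERRQ(ierr);        /* assumed dof query */
  ierr = DMNetworkGetVariableGlobalOffset(networkdm, v, &roffset);CHKERRQ(ierr);
  for (i = 0; i < nrow; i++) rows[i] = roffset + i;

  /* the vertex couples to its own dofs ... */
  for (i = 0; i < nrow; i++) cols[ncol++] = roffset + i;

  /* ... and to the vertex on the other side of each supporting edge */
  ierr = DMNetworkGetSupportingEdges(networkdm, v, &nedges, &edges);CHKERRQ(ierr);
  for (e = 0; e < nedges; e++) {
    ierr = DMNetworkGetConnectedVertices(networkdm, edges[e], &cone);CHKERRQ(ierr);
    w    = (cone[0] == v) ? cone[1] : cone[0];
    ierr = DMNetworkGetNumVariables(networkdm, w, &nv);CHKERRQ(ierr);        /* assumed dof query */
    ierr = DMNetworkGetVariableGlobalOffset(networkdm, w, &coffset);CHKERRQ(ierr);
    for (i = 0; i < nv; i++) cols[ncol++] = coffset + i;
  }

  ierr = MatSetValues(J, nrow, rows, ncol, cols, vals, mode);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

[If edges carried their own dofs, their offsets would be appended to cols as well; the distribution solver under discussion keeps all variables on the bus vertices.]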
> On Wed, May 8, 2019 at 4:00 PM Dave May <dave.mayhe...@gmail.com> wrote:
>
> On Wed, 8 May 2019 at 20:34, Justin Chang via petsc-users <petsc-users@mcs.anl.gov> wrote:
> So here's the branch/repo to the working example I have:
>
> https://github.com/jychang48/petsc-dss/tree/single-bus-vertex
>
> Type 'make' to compile the dss; it should work with the latest petsc-dev.
>
> To test the performance, I've taken an existing IEEE 13-bus and duplicated it N times to create a long radial-like network. I have three sizes, where N = 100, 500, and 1000. Those test files are listed as:
>
> input/test_100.m
> input/test_500.m
> input/test_1000.m
>
> I also created another set of examples where the IEEE 13-bus is fully balanced (but the program will crash at the solve step because I used some unrealistic parameters for the Y-bus matrices and probably have some zeros somewhere). They are listed as:
>
> input/test2_100.m
> input/test2_500.m
> input/test2_1000.m
>
> The dof count and matrices for the test2_*.m files are slightly larger than their respective test_*.m files, but they have a bs=6.
>
> To run these tests, type the following:
>
> ./dpflow -input input/test_100.m
>
> I have a timer that shows how long it takes to compute the Jacobian. Attached are the log outputs I have for each of the six cases.
>
> It turns out that only the first call to SNESComputeJacobian() is slow; all the subsequent calls are fast, as I expect. This makes me think it still has something to do with matrix allocation.
>
> I think it is a preallocation issue. Looking at some of the output files (test_1000.out, test_100.out), under Mat Object I see this in the KSPView:
>
>   total number of mallocs used during MatSetValues calls =10000
>
> Thanks for the help everyone,
>
> Justin
>
> On Wed, May 8, 2019 at 12:36 PM Matthew Knepley <knep...@gmail.com> wrote:
> On Wed, May 8, 2019 at 2:30 PM Justin Chang <jychan...@gmail.com> wrote:
> Hi everyone,
>
> Yes, I have these lines in my code:
>
>   ierr = DMCreateMatrix(networkdm,&J);CHKERRQ(ierr);
>   ierr = MatSetOption(J,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE);CHKERRQ(ierr);
>
> Okay, it's not allocation. So maybe Hong is right that it's setting great big element matrices. We will see with the example.
>
> Thanks,
>
> Matt
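[A minimal, self-contained sketch, not part of the original exchange, of what the "total number of mallocs used during MatSetValues calls" counter measures: the matrix below is deliberately preallocated with one nonzero per row and then filled with three per row, so every extra entry forces a malloc, which MatGetInfo() then reports. Flipping MAT_NEW_NONZERO_ALLOCATION_ERR to PETSC_TRUE, instead of the PETSC_FALSE used in the snippet above, would turn each of those mallocs into an immediate error, which is one way to pinpoint the offending MatSetValues() call.]

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A;
  MatInfo        info;
  PetscInt       i, n = 100, cols[3];
  PetscScalar    vals[3] = {-1.0, 2.0, -1.0};
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  /* deliberately preallocate only 1 nonzero per row */
  ierr = MatCreateSeqAIJ(PETSC_COMM_SELF, n, n, 1, NULL, &A);CHKERRQ(ierr);
  /* allow (slow) mallocs instead of erroring, as in the thread's snippet */
  ierr = MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE);CHKERRQ(ierr);
  for (i = 1; i < n - 1; i++) {
    cols[0] = i - 1; cols[1] = i; cols[2] = i + 1;
    ierr = MatSetValues(A, 1, &i, 3, cols, vals, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  /* a nonzero malloc count means the preallocation was insufficient */
  ierr = MatGetInfo(A, MAT_LOCAL, &info);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_SELF, "mallocs during MatSetValues: %g\n",
                     (double)info.mallocs);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

[With -info, MatAssemblyEnd_SeqAIJ() reports the same counter, which is what shows up next in Justin's log.]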
> I tried -info and here's my output:
>
> [0] PetscInitialize(): PETSc successfully started: number of processors = 1
> [0] PetscInitialize(): Running on machine: jchang31606s.domain
> [0] PetscCommDuplicate(): Duplicating a communicator 4436504608 140550815662944 max tags = 2147483647
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436504608 140550815662944
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436504608 140550815662944
> Base power = 0.166667, numbus = 115000, numgen = 5000, numyl = 75000, numdl = 5000, numlbr = 109999, numtbr = 5000
>
> **** Power flow dist case ****
>
> Base power = 0.166667, nbus = 115000, ngen = 5000, nwye = 75000, ndelta = 5000, nbranch = 114999
> [0] PetscCommDuplicate(): Duplicating a communicator 4436505120 140550815683104 max tags = 2147483647
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 140550815683104
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 140550815683104
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 140550815683104
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 140550815683104
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 140550815683104
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 140550815683104
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 140550815683104
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 140550815683104
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 140550815683104
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 140550815683104
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 140550815683104
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 140550815683104
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 620000 X 620000; storage space: 0 unneeded, 10799928 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 28
> [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 620000) < 0.6. Do not use CompressedRow routines.
> [0] MatSeqAIJCheckInode(): Found 205000 nodes of 620000. Limit used: 5. Using Inode routines
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 140550815683104
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436504608 140550815662944
> [0] DMGetDMSNES(): Creating new DMSNES
> [0] DMGetDMKSP(): Creating new DMKSP
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 140550815683104
>     0 SNES Function norm 1155.45
>
> Nothing else -info related shows up as I'm iterating through the vertex loop.
>
> I'll have an MWE for you guys to play with shortly.
>
> Thanks,
> Justin
>
> On Wed, May 8, 2019 at 12:10 PM Smith, Barry F. <bsm...@mcs.anl.gov> wrote:
>
>    Justin,
>
>       Are you providing matrix entries that directly connect one vertex to another vertex ACROSS an edge? I don't think that is supported by the DMNetwork model. The assumption is that edges are only connected to vertices and vertices are only connected to neighboring edges.
>
>    Everyone,
>
>       I second Matt's reply.
>
>       How is DMNetwork preallocating for the Jacobian? Does it take into account coupling between neighboring vertices/edges, or does it assume no coupling, or full coupling? If it assumes no coupling and the user has a good amount of coupling, it will be very slow.
>
>       There would need to be a way for the user to provide the coupling information between neighboring vertices/edges if it assumes no coupling.
>
>    Barry
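[To make Barry's preallocation question concrete, here is a hedged sketch, not from the thread, of how an application could count nonzeros per row for a vertex-only DMNetwork Jacobian, including the coupling to the vertices across each supporting edge. It assumes J was created by the application (MatCreate/MatSetSizes/MatSetType) rather than by DMCreateMatrix, covers only the sequential case, and again treats DMNetworkGetNumVariables as an assumed dof query; a parallel run would also need an off-process count for MatMPIAIJSetPreallocation().]

#include <petscdmnetwork.h>

/* Sketch: count nonzeros per locally owned row, including vertex-to-vertex
   coupling across edges.  Sequential MATSEQAIJ case only. */
PetscErrorCode PreallocateNetworkJacobian(DM networkdm, Mat J)
{
  PetscErrorCode ierr;
  const PetscInt *edges, *cone;
  PetscInt       vStart, vEnd, v, e, nedges, i, nv, nw, roffset, rstart, rend, w;
  PetscInt       *dnz;

  PetscFunctionBegin;
  ierr = MatGetOwnershipRange(J, &rstart, &rend);CHKERRQ(ierr);
  ierr = PetscCalloc1(rend - rstart, &dnz);CHKERRQ(ierr);
  ierr = DMNetworkGetVertexRange(networkdm, &vStart, &vEnd);CHKERRQ(ierr);
  for (v = vStart; v < vEnd; v++) {
    ierr = DMNetworkGetNumVariables(networkdm, v, &nv);CHKERRQ(ierr);          /* assumed dof query */
    ierr = DMNetworkGetVariableGlobalOffset(networkdm, v, &roffset);CHKERRQ(ierr);
    /* diagonal block: the vertex couples to its own dofs */
    for (i = 0; i < nv; i++) dnz[roffset - rstart + i] += nv;
    /* off-diagonal blocks: coupling to the vertex across each supporting edge */
    ierr = DMNetworkGetSupportingEdges(networkdm, v, &nedges, &edges);CHKERRQ(ierr);
    for (e = 0; e < nedges; e++) {
      ierr = DMNetworkGetConnectedVertices(networkdm, edges[e], &cone);CHKERRQ(ierr);
      w    = (cone[0] == v) ? cone[1] : cone[0];
      ierr = DMNetworkGetNumVariables(networkdm, w, &nw);CHKERRQ(ierr);         /* assumed dof query */
      for (i = 0; i < nv; i++) dnz[roffset - rstart + i] += nw;
    }
  }
  ierr = MatSeqAIJSetPreallocation(J, 0, dnz);CHKERRQ(ierr);
  ierr = PetscFree(dnz);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}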
> > On May 8, 2019, at 7:44 AM, Matthew Knepley via petsc-users <petsc-users@mcs.anl.gov> wrote:
> >
> > On Wed, May 8, 2019 at 4:45 AM Justin Chang via petsc-users <petsc-users@mcs.anl.gov> wrote:
> > Hi guys,
> >
> > I have a fully working distribution system solver written using DMNetwork. The idea is that each electrical bus can have up to three phase nodes, and each phase node has two unknowns: voltage magnitude and angle. In a completely balanced system every bus has three nodes, but in an unbalanced system some of the buses can be either single-phase or two-phase.
> >
> > The working DMNetwork code I developed, loosely based on the SNES network/power.c example, essentially represents each vertex as a bus. The DMNetworkAddNumVariables() function adds either 2, 4, or 6 unknowns to each vertex. If every single bus had the same number of variables, the Mat block size would be 2, 4, or 6, and my code is both fast and scalable. However, if the number of unknowns per DMNetwork vertex is not the same across the network, then my SNESFormJacobian function becomes extremely slow. Specifically, the MatSetValues() calls where the col/row global indices contain an offset value that points to a neighboring bus vertex.
> >
> > I have never seen MatSetValues() be slow unless it is allocating. Did you confirm that you are not allocating, with -info?
> >
> > Thanks,
> >
> > Matt
> >
> > Why is that? Is it because I no longer have a uniform block structure and lose the speed/optimization benefits of iterating through an AIJ matrix? I see three potential workarounds:
> >
> > 1) Treat every vertex as a three-phase bus, "zero out" all the unused phase node dofs, and put a 1 on the diagonal. The problem I see with this is that I will have unnecessary degrees of freedom (aka nonzeros in the matrix). From the distribution systems I've seen, it's possible that anywhere from 1/3 to 1/2 of the buses will be two-phase or less, meaning I may have nearly twice as many dofs as necessary if I wanted to preserve the block size = 6 for the AIJ Mat.
> >
> > 2) Treat every phase node as a vertex, i.e., solve a single-phase power flow problem. That way I guarantee a block size = 2; this is what Domenico's former student did in his thesis work. The problem I see with this is that I have a larger graph, which can take more time to set up and parallelize.
> >
> > 3) Create a "fieldsplit" where I essentially have three "blocks" - one for buses with all three phases, another for buses with only two phases, and one for single-phase buses. This way each block/fieldsplit will have a consistent block size. I am not sure if this will solve the MatSetValues() issues, but can anyone give pointers on how to go about achieving this?
> >
> > Thanks,
> > Justin
> >
> > --
> > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> > -- Norbert Wiener
> >
> > https://www.cse.buffalo.edu/~knepley/
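[For Justin's workaround 3 above, a hedged sketch, not from the thread, of the PETSc calls that register per-phase-class splits with PCFIELDSPLIT. The index sets is3phase/is2phase/is1phase are hypothetical names for sets the application would build from its DMNetwork vertex offsets; only the wiring into the preconditioner is shown, and, as noted in the thread, this addresses the solver side rather than the MatSetValues() cost.]

#include <petscsnes.h>

/* Register three phase-class splits with PCFIELDSPLIT.  The index sets are
   assumed to have been built elsewhere: all dofs of 3-phase buses, of 2-phase
   buses, and of single-phase buses, respectively. */
PetscErrorCode SetupPhaseFieldSplit(SNES snes, IS is3phase, IS is2phase, IS is1phase)
{
  PetscErrorCode ierr;
  KSP            ksp;
  PC             pc;

  PetscFunctionBegin;
  ierr = SNESGetKSP(snes, &ksp);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCFIELDSPLIT);CHKERRQ(ierr);

  /* attach the uniform block size of each class so the sub-solves can use it */
  ierr = ISSetBlockSize(is3phase, 6);CHKERRQ(ierr);
  ierr = ISSetBlockSize(is2phase, 4);CHKERRQ(ierr);
  ierr = ISSetBlockSize(is1phase, 2);CHKERRQ(ierr);

  /* one split per phase class; each split now has a consistent block size */
  ierr = PCFieldSplitSetIS(pc, "threephase", is3phase);CHKERRQ(ierr);
  ierr = PCFieldSplitSetIS(pc, "twophase",   is2phase);CHKERRQ(ierr);
  ierr = PCFieldSplitSetIS(pc, "onephase",   is1phase);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

[Once the ISes are registered, each sub-solve can be controlled from the command line with the -fieldsplit_threephase_*, -fieldsplit_twophase_*, and -fieldsplit_onephase_* option prefixes.]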