Thanks Stefano.

Having read the manual pages a bit more carefully,
I think I can see what I should be doing, which is roughly to:

1. Set up target Seq vectors on PETSC_COMM_SELF.
2. Use ISCreateGeneral to create ISs for the target Vecs and for the source Vec, which will be an MPI Vec on PETSC_COMM_WORLD.
3. Create the scatter context with VecScatterCreate.
4. Call VecScatterBegin/End on each process (instead of using my prior routine).
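
The data movement implied by the four steps above can be pictured with a small serial model. This is only a plain-Python sketch of the index semantics (target[iy[k]] = source[ix[k]]) and not PETSc code: the function name scatter_forward, the per-process index lists, and the sample values are all hypothetical, and no communication actually takes place.

```python
# Serial model of a general index-set scatter (assumption: this only mimics
# the data movement; it makes no PETSc or MPI calls).

def scatter_forward(source, ix, target, iy):
    """Copy source[ix[k]] into target[iy[k]] for every k (insert mode)."""
    if len(ix) != len(iy):
        raise ValueError("index sets must have the same length")
    for src_idx, dst_idx in zip(ix, iy):
        target[dst_idx] = source[src_idx]
    return target

# Hypothetical global solution vector (stands in for the MPI Vec).
sol = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]

# Hypothetical per-process index sets: which global entries each process
# wants (ix) and where they land in its local target vector (iy).
wanted = {
    0: {"ix": [0, 1, 4], "iy": [0, 1, 2]},
    1: {"ix": [5, 2, 3, 0], "iy": [3, 0, 1, 2]},
}

# One sequential target per process (stands in for the Seq Vecs).
targets = {}
for rank, idx in wanted.items():
    local = [0.0] * len(idx["ix"])
    targets[rank] = scatter_forward(sol, idx["ix"], local, idx["iy"])

print(targets)
```

In the real setting the two index lists would be wrapped in ISs and the copy would be carried out by the scatter context, so the loop here stands in for the VecScatterBegin/End pair.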

Lingering questions:

a. Is there any performance advantage/disadvantage to creating a single parallel target Vec instead
of multiple target Seq Vecs (in terms of the scatter operation)?

b. The data that ends up in the target on each processor needs to be in an application array.  Is there a clever way to 'move' the data from the scatter target to the array (short
of just running a loop over it and copying)?

-sanjay



On 5/31/19 12:02 PM, Stefano Zampini wrote:


On May 31, 2019, at 9:50 PM, Sanjay Govindjee via petsc-users <petsc-users@mcs.anl.gov> wrote:

Matt,
  Here is the process as it currently stands:

1) I have a PETSc Vec (sol), which comes from a KSPSolve.

2) Each processor grabs its section of sol via VecGetOwnershipRange and VecGetArrayReadF90 and inserts parts of its section of sol in a local array (locarr) using a complex but easily computable mapping.

3) The routine you are looking at then exchanges various parts of the locarr between the processors.


You need a VecScatter object https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreate.html#VecScatterCreate

4) Each processor then does computations using its updated locarr.

Typing it out this way, I guess the answer to your question is "yes."  I have a global Vec and I want its values
sent in a complex but computable way to local vectors on each process.

-sanjay
On 5/31/19 3:37 AM, Matthew Knepley wrote:
On Thu, May 30, 2019 at 11:55 PM Sanjay Govindjee via petsc-users <petsc-users@mcs.anl.gov> wrote:

    Hi Junchao,
    Thanks for the hints below, they will take some time to absorb
    as the vectors that are being moved around
    are actually partly petsc vectors and partly local process vectors.


Is this code just doing a global-to-local map? Meaning, does it just map all the local unknowns to some global unknown on some process? We have an even simpler interface for that, where we make the VecScatter
automatically,

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/IS/ISLocalToGlobalMappingCreate.html#ISLocalToGlobalMappingCreate

Then you can use it with Vecs, Mats, etc.
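
As a rough picture of what such a mapping encodes, here is a plain-Python sketch. It is a serial model only and makes no PETSc calls; the function names, the l2g list, and the sample values are hypothetical.

```python
# Serial model of a local-to-global index mapping (assumption: illustrative
# only; it makes no PETSc calls).

def local_to_global(l2g, local_indices):
    """Translate local indices into global indices."""
    return [l2g[i] for i in local_indices]

def global_to_local_values(global_vec, l2g):
    """Fill a local array: local entry i takes global entry l2g[i]."""
    return [global_vec[g] for g in l2g]

# Hypothetical mapping for one process: local unknown i is global unknown l2g[i].
l2g = [4, 0, 5, 2]
global_vec = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]

locarr = global_to_local_values(global_vec, l2g)
print(local_to_global(l2g, [0, 3]))
print(locarr)
```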

  Thanks,

     Matt

    Attached is the modified routine that now works (no leaking
    memory) with OpenMPI.

    -sanjay
    On 5/30/19 8:41 PM, Zhang, Junchao wrote:

    Hi, Sanjay,
      Could you send your modified data exchange code (psetb.F)
    with MPI_Waitall? See other inlined comments below. Thanks.

    On Thu, May 30, 2019 at 1:49 PM Sanjay Govindjee via petsc-users <petsc-users@mcs.anl.gov> wrote:

        Lawrence,
        Thanks for taking a look!  This is what I had been wondering about -- my
        knowledge of MPI is pretty minimal, and the origins of the routine were
        from a programmer we hired a decade+ back from NERSC.  I'll have to look
        into VecScatter.  It will be great to dispense with our roll-your-own
        routines (we even have our own reduceALL scattered around the code).

    PETSc VecScatter has a very simple interface and you should definitely
    go with it.  With VecScatter, you can think in terms of familiar
    vectors and indices instead of the low-level MPI_Send/Recv.
    Besides that, PETSc has optimized VecScatter so that
    communication is efficient.


        Interestingly, the MPI_Waitall has solved the problem when using OpenMPI,
        but it still persists with MPICH. Graphs attached.
        I'm going to run with OpenMPI for now (but I guess I really still need
        to figure out what is wrong with MPICH and MPI_Waitall;
        I'll try Barry's suggestion of
        --download-mpich-configure-arguments="--enable-error-messages=all --enable-g"
        later today and report back).

        Regarding MPI_Barrier, it was put in due to a problem where some processes
        were finishing up sending and receiving and exiting the subroutine
        before the receiving processes had completed (which resulted in data
        loss, as the buffers are freed after the call to the routine).
        MPI_Barrier was the solution proposed to us.  I don't think I can
        dispense with it, but I will think about it some more.

    After MPI_Send(), or after MPI_Isend(..,req) and MPI_Wait(req),
    you can safely free the send buffer without worry that the
    receive has not completed. MPI guarantees the receiver can get
    the data, for example, through internal buffering.


        I'm not so sure about using MPI_Irecv, as it will require a bit of
        rewriting since right now I process the received data sequentially
        after each blocking MPI_Recv -- clearly slower but easier to code.

        Thanks again for the help.

        -sanjay

        On 5/30/19 4:48 AM, Lawrence Mitchell wrote:
        > Hi Sanjay,
        >
        >> On 30 May 2019, at 08:58, Sanjay Govindjee via petsc-users <petsc-users@mcs.anl.gov> wrote:
        >>
        >> The problem seems to persist but with a different signature.  Graphs attached as before.
        >>
        >> Totals with MPICH (NB: single run)
        >>
        >> For the CG/Jacobi data_exchange_total = 41,385,984; kspsolve_total = 38,289,408
        >> For the GMRES/BJACOBI data_exchange_total = 41,324,544; kspsolve_total = 41,324,544
        >>
        >> Just reading the MPI docs I am wondering if I need some sort of MPI_Wait/MPI_Waitall before my MPI_Barrier in the data exchange routine?
        >> I would have thought that with the blocking receives and the MPI_Barrier that everything will have fully completed and cleaned up before
        >> all processes exited the routine, but perhaps I am wrong on that.
        >
        > Skimming the Fortran code you sent, you do:
        >
        > for i in ...:
        >     call MPI_Isend(..., req, ierr)
        >
        > for i in ...:
        >     call MPI_Recv(..., ierr)
        >
        > But you never call MPI_Wait on the request you got back from the Isend. So the MPI library will never free the data structures it created.
        >
        > The usual pattern for these non-blocking communications is to allocate an array for the requests of length nsend+nrecv and then do:
        >
        > for i in nsend:
        >     call MPI_Isend(..., req[i], ierr)
        > for j in nrecv:
        >     call MPI_Irecv(..., req[nsend+j], ierr)
        >
        > call MPI_Waitall(req, ..., ierr)
        >
        > I note also there's no need for the Barrier at the end of the routine: this kind of communication does neighbourwise synchronisation, so there is no need to add (unnecessary) global synchronisation too.
        >
        > As an aside, is there a reason you don't use PETSc's VecScatter to manage this global to local exchange?
        >
        > Cheers,
        >
        > Lawrence




--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/


