On Fri, Jun 5, 2020 at 3:39 PM Hudson, Stephen Tobias P via petsc-users <petsc-users@mcs.anl.gov> wrote:

> It seems I do have to bypass Python's multiprocessing somewhat limited
> interface. E.g.
>
> self.process._popen._send_signal(signal.SIGINT)
>
> which works, but I am by-passing the API.
>
> I would support allowing the user to configure at run-time the signal
> handling for SIGTERM to exit without MPI_ABORT. I think I understand
> MPI_ABORT being the default, I've experienced hangs due to errors on single
> processes.
>

Regarding "hangs due to errors on single processes": if the single processes call exit(), then there will be no hang.

> ------------------------------
> *From:* Hudson, Stephen Tobias P <shud...@anl.gov>
> *Sent:* Friday, June 5, 2020 2:41 PM
> *To:* Lisandro Dalcin <dalc...@gmail.com>
> *Cc:* Balay, Satish <ba...@mcs.anl.gov>; petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>
> *Subject:* Re: Terminating a process running petsc via petsc4py without mpi_abort
>
> Thanks, I will experiment with this.
>
> I am working through the multiprocessing interface, but I can see that the
> routines provided there are pretty much wrappers to the process signal
> functions.
>
> I guess the alternative is SIGKILL.
>
> Steve
> ------------------------------
> *From:* Lisandro Dalcin <dalc...@gmail.com>
> *Sent:* Thursday, June 4, 2020 4:54 PM
> *To:* Hudson, Stephen Tobias P <shud...@anl.gov>
> *Cc:* Balay, Satish <ba...@mcs.anl.gov>; petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>
> *Subject:* Re: Terminating a process running petsc via petsc4py without mpi_abort
>
> (1) You can use PETSc.Sys.pushErrorHandler("abort"), but it will not help
> you. What you really need is to override PETSc's default signal handling.
>
> (2) While it is true that PETSc overrides the signal handler, you can
> override it again from Python after from petsc4py import PETSc.
>
> For implementing (2), maybe you should try sending SIGINT and not SIGTERM,
> such that you can do the following.
>
> from petsc4py import PETSc
>
> import signal
> signal.signal(signal.SIGINT, signal.default_int_handler)
>
> ...
>
> if __name__ == "__main__":
>     try:
>         main()
>     except KeyboardInterrupt:  # Triggered if Ctrl+C or signaled with SIGINT
>         ...  # do cleanup if needed
>
> Otherwise, you just need signal.signal(signal.SIGINT, signal.SIG_DFL)
>
>
> PS: I'm not in favor of changing PETSc's current signal handling behavior.
> This particular issue is fixable with two lines of Python code:
>
> from signal import signal, SIGINT, SIG_DFL
> signal(SIGINT, SIG_DFL)
>
>
>
> On Thu, 4 Jun 2020 at 23:39, Hudson, Stephen Tobias P <shud...@anl.gov> wrote:
>
> Lisandro,
>
> I don't see an interface to set this through petsc4py. Is it possible?
>
> Thanks,
> Steve
> ------------------------------
> *From:* Hudson, Stephen Tobias P <shud...@anl.gov>
> *Sent:* Thursday, June 4, 2020 2:47 PM
> *To:* Balay, Satish <ba...@mcs.anl.gov>
> *Cc:* petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>; Lisandro Dalcin <dalc...@gmail.com>
> *Subject:* Re: Terminating a process running petsc via petsc4py without mpi_abort
>
> Sounds good. I will have a look at how to set this through petsc4py.
>
> Thanks
> Steve
> ------------------------------
> *From:* Satish Balay <ba...@mcs.anl.gov>
> *Sent:* Thursday, June 4, 2020 2:32 PM
> *To:* Hudson, Stephen Tobias P <shud...@anl.gov>
> *Cc:* petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>; Lisandro Dalcin <dalc...@gmail.com>
> *Subject:* Re: Terminating a process running petsc via petsc4py without mpi_abort
>
> I don't completely understand the issue here. How is a sequential run
> different from a parallel run?
>
> In both cases - a PetscErrorHandler is likely getting invoked.
> One can change this behavior with:
>
> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscPushErrorHandler.html
>
> And there are a few default error handlers to choose from:
>
> PETSC_EXTERN PetscErrorCode PetscTraceBackErrorHandler(MPI_Comm,int,const char*,const char*,PetscErrorCode,PetscErrorType,const char*,void*);
> PETSC_EXTERN PetscErrorCode PetscIgnoreErrorHandler(MPI_Comm,int,const char*,const char*,PetscErrorCode,PetscErrorType,const char*,void*);
> PETSC_EXTERN PetscErrorCode PetscEmacsClientErrorHandler(MPI_Comm,int,const char*,const char*,PetscErrorCode,PetscErrorType,const char*,void*);
> PETSC_EXTERN PetscErrorCode PetscMPIAbortErrorHandler(MPI_Comm,int,const char*,const char*,PetscErrorCode,PetscErrorType,const char*,void*);
> PETSC_EXTERN PetscErrorCode PetscAbortErrorHandler(MPI_Comm,int,const char*,const char*,PetscErrorCode,PetscErrorType,const char*,void*);
> PETSC_EXTERN PetscErrorCode PetscAttachDebuggerErrorHandler(MPI_Comm,int,const char*,const char*,PetscErrorCode,PetscErrorType,const char*,void*);
> PETSC_EXTERN PetscErrorCode PetscReturnErrorHandler(MPI_Comm,int,const char*,const char*,PetscErrorCode,PetscErrorType,const char*,void*);
>
> Some of them are accessible via command line options, for example:
> -on_error_abort or -on_error_mpiabort
>
> Or perhaps you want to completely disable the signal handler with:
> -no_signal_handler
>
> cc: petsc-users
>
> Satish
>
> On Thu, 4 Jun 2020, Hudson, Stephen Tobias P wrote:
>
> > Satish,
> >
> > We are having issues caused by MPI_abort getting called when we try to
> > terminate a sub-process running petsc4py. Ideally we would always use a
> > serial build of petsc/petsc4py in this mode, but many users will have a
> > parallel build. We need to be able to send a terminate signal that just
> > kills the process.
> >
> > Is there a way to turn off the mpi_abort?
> >
> > Thanks,
> >
> > Steve
> >
>
> --
> Lisandro Dalcin
> ============
> Research Scientist
> Extreme Computing Research Center (ECRC)
> King Abdullah University of Science and Technology (KAUST)
> http://ecrc.kaust.edu.sa/
>
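
For anyone finding this thread later, here is a minimal sketch of suggestion (2) combined with a documented way to deliver the signal. It is not taken from the thread: the names worker and run_and_interrupt are made up for illustration, the sleep loop is a stand-in for the real PETSc work, and the petsc4py import appears only as a comment so the sketch runs without PETSc installed. It assumes a POSIX system and the "fork" start method. In place of the private self.process._popen._send_signal(...), it uses os.kill(proc.pid, signal.SIGINT) from the standard library.

```python
import multiprocessing as mp
import os
import signal
import time


def worker(ready, cleaned):
    """Child process: restore Python's SIGINT handling, then do the work."""
    # In a real application `from petsc4py import PETSc` would come first,
    # so that the next line overrides the handler PETSc installed.
    # (Assumption: left out here so the sketch runs without PETSc.)
    signal.signal(signal.SIGINT, signal.default_int_handler)
    try:
        ready.set()              # tell the parent the handler is in place
        while True:              # hypothetical stand-in for the PETSc solve
            time.sleep(0.05)
    except KeyboardInterrupt:    # raised on Ctrl+C or an external SIGINT
        cleaned.set()            # do cleanup here if needed


def run_and_interrupt(timeout=10.0):
    """Start the worker, send it SIGINT, and report how it ended."""
    ctx = mp.get_context("fork")  # POSIX only
    ready, cleaned = ctx.Event(), ctx.Event()
    proc = ctx.Process(target=worker, args=(ready, cleaned))
    proc.start()
    if not ready.wait(timeout):
        proc.kill()
        proc.join()
        raise RuntimeError("worker did not start in time")
    os.kill(proc.pid, signal.SIGINT)  # documented call, no private attributes
    proc.join(timeout)
    return proc.exitcode, cleaned.is_set()


if __name__ == "__main__":
    print(run_and_interrupt())
```

If no cleanup is needed, signal.signal(signal.SIGINT, signal.SIG_DFL) in the child is enough, as noted above. In that case the child is terminated by the signal itself, which multiprocessing reports as a negative exitcode rather than 0.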