Dear Jeff, Ralf, and Manuel,

There is some good news:
I added -pthread to both the compile and link commands for
az_tutorial_with_MPI.f, and I also compiled Aztec with -pthread.
Now the code runs OK for np = 1, 2.
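
For reference, these are the compile and link commands from my
previous mail, with -pthread added (everything else unchanged):

mpif77 -pthread -O -I../lib -DMAX_MEM_SIZE=16731136 -DCOMM_BUFF_SIZE=200000 -DMAX_CHUNK_SIZE=200000 -c -o az_tutorial_with_MPI.o az_tutorial_with_MPI.f
mpif77 -pthread az_tutorial_with_MPI.o -O -L../lib -laztec -o sample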

Now the bad news: when I try running with 3, 4, or more processes I get a similar error message:

mpirun -np 3 sample

[cluster:25805] *** Process received signal ***
[cluster:25805] Signal: Segmentation fault (11)
[cluster:25805] Signal code:  (128)
[cluster:25805] Failing at address: (nil)
[cluster:25805] [ 0] /lib/libpthread.so.0 [0x7fbe20cb5a80]
[cluster:25805] [ 1] /shared/lib/libmpi.so.0 [0x7fbe221325f7]
[cluster:25805] [ 2] /shared/lib/libmpi.so.0(PMPI_Wait+0x38) [0x7fbe22160a48]
[cluster:25805] [ 3] sample(md_wrap_wait+0x17) [0x41ccba]
[cluster:25805] [ 4] sample(AZ_find_procs_for_externs+0x5bf) [0x4177e7]
[cluster:25805] [ 5] sample(AZ_transform+0x1c3) [0x418372]
[cluster:25805] [ 6] sample(az_transform_+0x84) [0x407943]
[cluster:25805] [ 7] sample(MAIN__+0x19a) [0x407708]
[cluster:25805] [ 8] sample(main+0x2c) [0x44e00c]
[cluster:25805] [ 9] /lib/libc.so.6(__libc_start_main+0xe6) [0x7fbe209721a6]
[cluster:25805] [10] sample [0x4073b9]
[cluster:25805] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 25805 on node cluster exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

When I try running on 4 processors I get the same message twice (from 2 processes):

mpirun -np 4 sample

[cluster:25946] *** Process received signal ***
[cluster:25946] Signal: Segmentation fault (11)
[cluster:25946] Signal code:  (128)
[cluster:25946] Failing at address: (nil)
[cluster:25947] *** Process received signal ***
[cluster:25947] Signal: Segmentation fault (11)
[cluster:25947] Signal code:  (128)
[cluster:25947] Failing at address: (nil)
[cluster:25946] [ 0] /lib/libpthread.so.0 [0x7f4ae4c6ba80]
[cluster:25946] [ 1] /shared/lib/libmpi.so.0 [0x7f4ae60e85f7]
[cluster:25946] [ 2] /shared/lib/libmpi.so.0(PMPI_Wait+0x38) [0x7f4ae6116a48]
[cluster:25946] [ 3] sample(md_wrap_wait+0x17) [0x41ccba]
[cluster:25946] [ 4] sample(AZ_find_procs_for_externs+0x5bf) [0x4177e7]
[cluster:25947] [ 0] /lib/libpthread.so.0 [0x7f7dc5350a80]
[cluster:25946] [ 5] sample(AZ_transform+0x1c3) [0x418372]
[cluster:25946] [ 6] sample(az_transform_+0x84) [0x407943]
[cluster:25946] [ 7] sample(MAIN__+0x19a) [0x407708]
[cluster:25946] [ 8] sample(main+0x2c) [0x44e00c]
[cluster:25946] [ 9] /lib/libc.so.6(__libc_start_main+0xe6) [0x7f4ae49281a6]
[cluster:25946] [10] sample [0x4073b9]
[cluster:25946] *** End of error message ***
[cluster:25947] [ 1] /shared/lib/libmpi.so.0 [0x7f7dc67cd5f7]
[cluster:25947] [ 2] /shared/lib/libmpi.so.0(PMPI_Wait+0x38) [0x7f7dc67fba48]
[cluster:25947] [ 3] sample(md_wrap_wait+0x17) [0x41ccba]
[cluster:25947] [ 4] sample(AZ_find_procs_for_externs+0x5bf) [0x4177e7]
[cluster:25947] [ 5] sample(AZ_transform+0x1c3) [0x418372]
[cluster:25947] [ 6] sample(az_transform_+0x84) [0x407943]
[cluster:25947] [ 7] sample(MAIN__+0x19a) [0x407708]
[cluster:25947] [ 8] sample(main+0x2c) [0x44e00c]
[cluster:25947] [ 9] /lib/libc.so.6(__libc_start_main+0xe6) [0x7f7dc500d1a6]
[cluster:25947] [10] sample [0x4073b9]
[cluster:25947] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 25946 on node cluster exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------




Attached is the Aztec source file md_wrap_mpi_c.c.
It might give you some further hints.



Rachel

  Dr.  Rachel Gordon
  Senior Research Fellow                Phone: +972-4-8293811
  Dept. of Aerospace Eng.               Fax:   +972-4-8292030
  The Technion, Haifa 32000, Israel     email: rgor...@tx.technion.ac.il


On Thu, 2 Sep 2010, Ralf Wildenhues wrote:

Hello Rachel, Jeff,

* Rachel Gordon wrote on Thu, Sep 02, 2010 at 01:35:37PM CEST:
The cluster I am trying to run on has only the openmpi MPI version.
So, mpif77 is equivalent to mpif77.openmpi and mpicc is equivalent
to mpicc.openmpi

I changed the Makefile, replacing gfortran by mpif77 and gcc by mpicc.
The compilation and linkage stage ran with no problem:

mpif77 -O   -I../lib -DMAX_MEM_SIZE=16731136 -DCOMM_BUFF_SIZE=200000
-DMAX_CHUNK_SIZE=200000  -c -o az_tutorial_with_MPI.o
az_tutorial_with_MPI.f
mpif77 az_tutorial_with_MPI.o -O -L../lib -laztec      -o sample

Can you retry, but this time add -pthread to both the compile and link
commands?

There were other reports on the OpenMPI devel list that some pthread
flags have gone missing somewhere.  It might well be that this caused
the libraries themselves to be built wrongly, or just the application;
I'm not sure.  But the segfault inside libpthread is suspicious.

Thanks,
Ralf

But again when I try to run 'sample' I get:

mpirun -np 1 sample


[cluster:24989] *** Process received signal ***
[cluster:24989] Signal: Segmentation fault (11)
[cluster:24989] Signal code: Address not mapped (1)
[cluster:24989] Failing at address: 0x100000098
[cluster:24989] [ 0] /lib/libpthread.so.0 [0x7f5058036a80]
[cluster:24989] [ 1] /shared/lib/libmpi.so.0(MPI_Comm_size+0x6e)
[0x7f50594ce34e]
[cluster:24989] [ 2] sample(parallel_info+0x24) [0x41d2ba]
[cluster:24989] [ 3] sample(AZ_set_proc_config+0x2d) [0x408417]
[cluster:24989] [ 4] sample(az_set_proc_config_+0xc) [0x407b85]
[cluster:24989] [ 5] sample(MAIN__+0x54) [0x407662]
[cluster:24989] [ 6] sample(main+0x2c) [0x44e8ec]
[cluster:24989] [ 7] /lib/libc.so.6(__libc_start_main+0xe6)
[0x7f5057cf31a6]
[cluster:24989] [ 8] sample [0x407459]
[cluster:24989] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 24989 on node cluster
exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

/*====================================================================
 * ------------------------
 * | CVS File Information |
 * ------------------------
 *
 * $RCSfile: md_wrap_mpi_c.c,v $
 *
 * $Author: tuminaro $
 *
 * $Date: 1998/12/21 19:36:24 $
 *
 * $Revision: 5.3 $
 *
 * $Name:  $
 *====================================================================*/
#ifndef lint
static char *cvs_wrapmpi_id =
  "$Id: md_wrap_mpi_c.c,v 5.3 1998/12/21 19:36:24 tuminaro Exp $";
#endif


/*******************************************************************************
 * Copyright 1995, Sandia Corporation.  The United States Government retains a *
 * nonexclusive license in this software as prescribed in AL 88-1 and AL 91-7. *
 * Export of this program may require a license from the United States         *
 * Government.                                                                 *
 ******************************************************************************/


#include <stdlib.h>
#include <stdio.h>
#include <mpi.h>

int gl_rbuf = 3;
int gl_sbuf = 3;
/******************************************************************************/
/******************************************************************************/
/******************************************************************************/
int the_proc_name = -1;

void get_parallel_info(int *proc, int *nprocs, int *dim)

{

  MPI_Comm_size(MPI_COMM_WORLD, nprocs);
  MPI_Comm_rank(MPI_COMM_WORLD, proc);
  *dim = 0;
  the_proc_name = *proc;

} /* get_parallel_info */
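
/* Usage sketch (not part of the original file): querying rank and size
   through the wrapper; variable names are hypothetical.

   int proc, nprocs, dim;
   get_parallel_info(&proc, &nprocs, &dim);
*/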

/******************************************************************************/
/******************************************************************************/
/******************************************************************************/

/******************************************************************************/
/******************************************************************************/
/******************************************************************************/

int md_read(char *buf, int bytes, int *source, int *type, int *flag)

{

  int        err, buffer = 1;
  MPI_Status status;

  if (*type   == -1) *type   = MPI_ANY_TAG;
  if (*source == -1) *source = MPI_ANY_SOURCE;

  /* A zero-length request is exchanged as a single dummy byte
     (gl_rbuf on the receiving side, gl_sbuf on the sending side) so
     that every send still has a matching receive. */
  if (bytes == 0) {
    err = MPI_Recv(&gl_rbuf, 1, MPI_BYTE, *source, *type, MPI_COMM_WORLD,
                   &status);
  }
  else {
    err = MPI_Recv(buf, bytes, MPI_BYTE, *source, *type, MPI_COMM_WORLD,
                   &status);
  }

  if (err != 0) (void) fprintf(stderr, "MPI_Recv error = %d\n", err);
  MPI_Get_count(&status,MPI_BYTE,&buffer);
  *source = status.MPI_SOURCE;
  *type   = status.MPI_TAG;
  if (bytes != 0) bytes = buffer;

  return bytes;

} /* md_read */


/******************************************************************************/
/******************************************************************************/
/******************************************************************************/

int md_write(char *buf, int bytes, int dest, int type, int *flag)

{

  int err;

  if (bytes == 0) {
    err = MPI_Send(&gl_sbuf, 1, MPI_BYTE, dest, type, MPI_COMM_WORLD);
  }
  else {
    err = MPI_Send(buf, bytes, MPI_BYTE, dest, type, MPI_COMM_WORLD);
  }

  if (err != 0) (void) fprintf(stderr, "MPI_Send error = %d\n", err);

  return 0;

} /* md_write */



/******************************************************************************/
/******************************************************************************/
/******************************************************************************/

int md_wrap_iread(void *buf, int bytes, int *source, int *type,
                  MPI_Request *request)


/*******************************************************************************

  Machine dependent wrapped message-reading communication routine for MPI.

  Author:          Scott A. Hutchinson, SNL, 9221
  =======

  Return code:     int
  ============

  Parameter list:
  ===============

  buf:             Beginning address of the receive buffer.

  bytes:           Length of message in bytes.

  source:          Source processor number.

  type:            Message type

*******************************************************************************/

{

  int err = 0;

  if (*type   == -1) *type   = MPI_ANY_TAG;
  if (*source == -1) *source = MPI_ANY_SOURCE;

  if (bytes == 0) {
    err = MPI_Irecv(&gl_rbuf, 1, MPI_BYTE, *source, *type, MPI_COMM_WORLD,
                    request);
  }
  else {
    err = MPI_Irecv(buf, bytes, MPI_BYTE, *source, *type, MPI_COMM_WORLD,
                    request);
  }

  return err;

} /* md_wrap_iread */


/******************************************************************************/
/******************************************************************************/
/******************************************************************************/

int md_wrap_write(void *buf, int bytes, int dest, int type, int *flag)

/*******************************************************************************

  Machine dependent wrapped message-sending communication routine for MPI.

  Author:          Scott A. Hutchinson, SNL, 9221
  =======

  Return code:     int
  ============

  Parameter list:
  ===============

  buf:             Beginning address of data to be sent.

  bytes:           Length of message in bytes.

  dest:            Destination processor number.

  type:            Message type

  flag:

*******************************************************************************/

{

  int err = 0;

  if (bytes == 0) {
    err = MPI_Send(&gl_sbuf, 1, MPI_BYTE, dest, type, MPI_COMM_WORLD);
  }
  else {
    err = MPI_Send(buf, bytes, MPI_BYTE, dest, type, MPI_COMM_WORLD);
  }

  return err;

} /* md_wrap_write */



/******************************************************************************/
/******************************************************************************/
/******************************************************************************/

int md_wrap_wait(void *buf, int bytes, int *source, int *type, int *flag,
                 MPI_Request *request)

/*******************************************************************************

  Machine dependent wrapped message-wait communication routine for MPI.

  Author:          Scott A. Hutchinson, SNL, 9221
  =======

  Return code:     int
  ============

  Parameter list:
  ===============

  buf:             Beginning address of the message buffer.

  bytes:           Length of message in bytes.

  source:          Source processor number.

  type:            Message type

  flag:

*******************************************************************************/

{

  int        count;
  MPI_Status status;

  if ( MPI_Wait(request, &status) ) {
    (void) fprintf(stderr, "MPI_Wait error\n");
    exit(-1);
  }

  MPI_Get_count(&status, MPI_BYTE, &count);
  *source = status.MPI_SOURCE;
  *type   = status.MPI_TAG;

  /* return the count, which is in bytes */

  return count;

} /* md_wrap_wait */
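
/* Usage sketch (not part of the original file): the usual pairing of
   md_wrap_iread() and md_wrap_wait() for a nonblocking receive.  The
   buffer name, size, and tag are hypothetical; source = -1 means
   "receive from any source".

   char        buf[256];
   int         source = -1, type = 123, flag, nbytes;
   MPI_Request request;

   md_wrap_iread(buf, sizeof(buf), &source, &type, &request);
   nbytes = md_wrap_wait(buf, sizeof(buf), &source, &type, &flag, &request);
*/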

/******************************************************************************/
/******************************************************************************/
/******************************************************************************/

int md_wrap_iwrite(void *buf, int bytes, int dest, int type, int *flag,
                  MPI_Request *request)

/*******************************************************************************

  Machine dependent wrapped message-sending (nonblocking) communication 
  routine for MPI.

  Author:          Scott A. Hutchinson, SNL, 9221
  =======

  Return code:     int
  ============

  Parameter list:
  ===============

  buf:             Beginning address of data to be sent.

  bytes:           Length of message in bytes.

  dest:            Destination processor number.

  type:            Message type

  flag:

*******************************************************************************/

{

  int err = 0;

  if (bytes == 0) {
    err = MPI_Isend(&gl_sbuf, 1, MPI_BYTE, dest, type, MPI_COMM_WORLD,
                  request);
  }
  else {
    err = MPI_Isend(buf, bytes, MPI_BYTE, dest, type, MPI_COMM_WORLD,
                  request);
  }

  return err;

} /* md_wrap_iwrite */


/********************************************************************/
/*     NEW WRAPPERS to handle MPI Communicators                     */
/********************************************************************/

void parallel_info(int *proc,int *nprocs,int *dim, MPI_Comm comm)
{

  MPI_Comm_size(comm, nprocs);
  MPI_Comm_rank(comm, proc);
  *dim = 0;
  the_proc_name = *proc;

} /* parallel_info */
/******************************************************************************/
/******************************************************************************/
/******************************************************************************/

int md_mpi_iread(void *buf, int bytes, int *source, int *type,
                  MPI_Request *request, int *icomm)


/*******************************************************************************

  Machine dependent wrapped message-reading communication routine for MPI.

  Author:          Scott A. Hutchinson, SNL, 9221
  =======

  Return code:     int
  ============

  Parameter list:
  ===============

  buf:             Beginning address of the receive buffer.

  bytes:           Length of message in bytes.

  source:          Source processor number.

  type:            Message type

  icomm:           MPI Communicator
*******************************************************************************/

{

  int err = 0;
  MPI_Comm *comm;

  /* The int* argument actually carries the address of an MPI_Comm;
     reinterpret it to recover the communicator. */
  comm = (MPI_Comm *) icomm;

  if (*type   == -1) *type   = MPI_ANY_TAG;
  if (*source == -1) *source = MPI_ANY_SOURCE;

  if (bytes == 0) {
    err = MPI_Irecv(&gl_rbuf, 1, MPI_BYTE, *source, *type, *comm,
                    request);
  }
  else {
    err = MPI_Irecv(buf, bytes, MPI_BYTE, *source, *type, *comm,
                    request);
  }

  return err;

} /* md_mpi_iread */


/******************************************************************************/
/******************************************************************************/
/******************************************************************************/

int md_mpi_write(void *buf, int bytes, int dest, int type, int *flag,
                  int *icomm)

/*******************************************************************************

  Machine dependent wrapped message-sending communication routine for MPI.

  Author:          Scott A. Hutchinson, SNL, 9221
  =======

  Return code:     int
  ============

  Parameter list:
  ===============

  buf:             Beginning address of data to be sent.

  bytes:           Length of message in bytes.

  dest:            Destination processor number.

  type:            Message type

  flag:

  icomm:           MPI Communicator

*******************************************************************************/

{

  int err = 0;
  MPI_Comm *comm;

  comm = (MPI_Comm *) icomm;

  if (bytes == 0) {
    err = MPI_Send(&gl_sbuf, 1, MPI_BYTE, dest, type, *comm);
  }
  else {
    err = MPI_Send(buf, bytes, MPI_BYTE, dest, type, *comm);
  }

  return err;

} /* md_mpi_write */

/******************************************************************************/
/******************************************************************************/
/******************************************************************************/

int md_mpi_wait(void *buf, int bytes, int *source, int *type, int *flag,
                 MPI_Request *request, int *icomm)

/*******************************************************************************

  Machine dependent wrapped message-wait communication routine for MPI.

  Author:          Scott A. Hutchinson, SNL, 9221
  =======

  Return code:     int
  ============

  Parameter list:
  ===============

  buf:             Beginning address of the message buffer.

  bytes:           Length of message in bytes.

  source:          Source processor number.

  type:            Message type

  flag:

  icomm:           MPI Communicator

*******************************************************************************/

{

  int        count;
  MPI_Status status;

  if ( MPI_Wait(request, &status) ) {
    (void) fprintf(stderr, "MPI_Wait error\n");
    exit(-1);
  }

  MPI_Get_count(&status, MPI_BYTE, &count);
  *source = status.MPI_SOURCE;
  *type   = status.MPI_TAG;

  /* return the count, which is in bytes */

  return count;

} /* md_mpi_wait */

/******************************************************************************/
/******************************************************************************/
/******************************************************************************/

int md_mpi_iwrite(void *buf, int bytes, int dest, int type, int *flag,
                  MPI_Request *request, int *icomm)

/*******************************************************************************

  Machine dependent wrapped message-sending (nonblocking) communication
  routine for MPI.

  Author:          Scott A. Hutchinson, SNL, 9221
  =======

  Return code:     int
  ============

  Parameter list:
  ===============

  buf:             Beginning address of data to be sent.

  bytes:           Length of message in bytes.

  dest:            Destination processor number.

  type:            Message type

  flag:

  icomm:           MPI Communicator

*******************************************************************************/
{

  int err = 0;
  MPI_Comm *comm;

  comm = (MPI_Comm *) icomm;
  if (bytes == 0)
    err = MPI_Isend(&gl_sbuf, 1, MPI_BYTE, dest, type, *comm, request);
  else
    err = MPI_Isend(buf, bytes, MPI_BYTE, dest, type, *comm, request);

  return err;

} /* md_mpi_iwrite */
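
/* Usage sketch (not part of the original file): the md_mpi_* wrappers
   reinterpret their int* "icomm" argument as MPI_Comm*, so a caller is
   expected to pass the address of a real communicator cast to int*.
   Names below are hypothetical.

   MPI_Comm    comm = MPI_COMM_WORLD;
   char        buf[256];
   int         source = -1, type = 123, flag, nbytes;
   MPI_Request request;

   md_mpi_iread(buf, sizeof(buf), &source, &type, &request, (int *) &comm);
   nbytes = md_mpi_wait(buf, sizeof(buf), &source, &type, &flag, &request,
                        (int *) &comm);
*/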
