On 3/24/20 12:49 PM, Barrett, Richard F via Chapel-users wrote:
Greetings, Chaps,

I have some questions regarding parallel IO, in the context of some basic (1d 
block decomposed) arrays. Apologies in advance if I'm missing this information 
in the documentation or examples:

1) Can order be enforced for writing? When writing within a forall loop, each 
core writes its part of an array, apparently simultaneously, so the output is 
interleaved. I tried this:

        const IOHINT_PARALLEL = QIO_HINT_PARALLEL;

        var MatrixOutput = open ( output_matrix_filename, iomode.cw, IOHINT_PARALLEL );

        var AdjMatChannel = MatrixOutput.writer();

        forall i in AdjMatrix.dom_nnz with ( ref AdjMatrix.rowidx ) {
          // AdjMatrix is a record; can't "ref" a field? Guess it doesn't really matter.
          AdjMatChannel.writeln ( AdjMatrix.rowidx[i], " ", AdjMatrix.colidx[i] );
        }
I'd recommend using start/end offsets when creating each task's output channel, so that each task in the forall writes to a different portion of the file. This kind of thing is easier to do with binary I/O, but it can be done with text I/O by first computing the number of bytes each task would write, doing a scan to compute the starting offsets, and then actually writing in a second pass. It would be nice if the I/O system supported "appending", and issue #9992 covers some future work in this area (but I'm not so sure it would do what you want in that setting either).
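For concreteness, here's a minimal, untested sketch of that two-pass text approach. Identifiers like rowidx/colidx and the filename are illustrative, and for simplicity it opens a channel per line rather than one per task's whole chunk:

```chapel
use IO;

var rowidx = [1, 2, 3], colidx = [4, 5, 6];
const n = rowidx.size;

// Pass 1: compute the byte length of each line as it would be written.
var lineLens: [0..#n] int;
forall i in 0..#n do
  lineLens[i] = "%i %i\n".format(rowidx[i], colidx[i]).numBytes;

// Scan to get each line's starting offset in the file.
const offsets = (+ scan lineLens) - lineLens;

// Pass 2: write each line at its precomputed offset; the channels
// cover disjoint regions, so they can proceed in parallel.
var f = open("matrix.txt", iomode.cw);
forall i in 0..#n {
  var ch = f.writer(start=offsets[i], end=offsets[i]+lineLens[i]);
  ch.writef("%i %i\n", rowidx[i], colidx[i]);
  ch.close();
}
f.close();
```

In a real program you would do the same scan over per-task (rather than per-line) byte counts and open one region-restricted channel per task.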

2) So far I'm only running on-node. Any expectations/tips for multinode, in 
particular any useful means you've found for controlling writing, reading, and 
otherwise managing I/O?

The I/O system currently does support writing from a remote node to a channel/file on another node, but it is slow. I have long had a TODO to add "remote file, local channel" support, where you could buffer locally but operate on a remote file; that would help a lot with performance in such a case. Until we have that, you need to create a file per locale (and probably a channel per task).
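As a rough illustration of the file-per-locale, channel-per-task pattern (an untested sketch; the filenames, fixed line width, and per-task partitioning are all made up for the example):

```chapel
use IO;

config const linesPerTask = 4;

coforall loc in Locales do on loc {
  // One file per locale, named by locale id.
  var f = open("out.%i.txt".format(here.id), iomode.cw);
  const lineBytes = 16;  // fixed-width lines keep offsets easy to compute
  coforall tid in 0..#here.maxTaskPar {
    // One channel per task, restricted to a non-overlapping file region.
    const myStart = tid * linesPerTask * lineBytes;
    var ch = f.writer(start=myStart,
                      end=myStart + linesPerTask*lineBytes);
    for l in 0..#linesPerTask do
      ch.writef("%-15s\n", "t%i-l%i".format(tid, l));
    ch.close();
  }
  f.close();
}
```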

3) For multinode, is it possible to configure to write to N files, where N=one 
per node, one per subset of nodes, or one global file?

        a) I do intend to read the file back in and operate on it; if reading 
back on the same number of locales, I expect N = one per node to work, but 
otherwise I expect N = 1 to be necessary. Correct?

In addition to using the built-in IO functionality, the HDF5 module (https://chapel-lang.org/docs/latest/modules/packages/HDF5.html) provides an interface for reading/writing arrays from/to HDF5 files in parallel. It can read/write a block-distributed array into one file in parallel using https://chapel-lang.org/docs/master/modules/packages/HDF5/IOusingMPI.html. This requires both HDF5 and MPI, as it uses MPI I/O to do the parallel/distributed operations.

4) At this point I’m writing text, but will switch to binary once confident 
things are working. Any tips in this regard?

If you use the HDF5 interface, it will write the arrays in binary by default. If you're using the built-in I/O system, switching to binary should make things easier, since the array elements will all be the same size.
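For example, here's an untested sketch of a fixed-size binary write where no scan is needed, because each task's file offsets follow directly from the element size (the array, filename, and chunking are illustrative):

```chapel
use IO;

config const n = 1000;
var A: [0..#n] int;
forall i in A.domain do A[i] = i;

var f = open("A.bin", iomode.cw);
const bpe = numBytes(int);       // fixed bytes per element
const nTasks = here.maxTaskPar;

coforall tid in 0..#nTasks {
  // Each task takes a contiguous chunk of indices; its file offset
  // is simply (first index) * (bytes per element).
  const lo = tid*n/nTasks,
        hi = (tid+1)*n/nTasks - 1;
  var ch = f.writer(kind=iokind.native,
                    start=lo*bpe, end=(hi+1)*bpe);
  for i in lo..hi do ch.write(A[i]);
  ch.close();
}
f.close();
```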

David

Suggestions, experiences, etc much appreciated.

Richard


_______________________________________________
Chapel-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-users
