http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56981



--- Comment #3 from Tobias Burnus <burnus at gcc dot gnu.org> 2013-04-17 09:39:58 UTC ---

(In reply to comment #2)

> There is a seek inside next_record_w_unf. That function is used for DIRECT 
> I/O.

> Looks conceptually wrong to me for sequential unformatted.  I won't have time

> for a few days to look at this further.



Well, what gfortran does is:



* write a placeholder record length in the leading record marker

* write the actual data

* write the trailing record marker (1st call to write_us_marker in
next_record_w_unf)

* write the actual length of this record, i.e. seek back + write_us_marker +
seek to past the trailing record marker (all in next_record_w_unf)





I think what other compilers do is to make use of the following item in the

Fortran standard:



"The value of the RECL= specifier shall be positive. It specifies the length of

each record in a file being connected for direct access, or specifies the

maximum length of a record in a file being connected for sequential access."

(F2008, "9.5.6.15 RECL= specifier in the OPEN statement")





I tried the following program:

-------------------------------

integer, allocatable :: array(:)

integer :: rl, i

open(99,file="/dev/null",form="unformatted")

inquire(99,recl=rl)

allocate(array(1024*1024*100))

array = 0

print *,rl, size(array)/4

write(99) (array, i=1,1000)

close(99)

end

-------------------------------



With gfortran, it takes only 0.203s, and the system call counts are:

     19 mmap

     26 open

    392 lseek

   1784 write



The question is why there are so many seeks. There should be only a single
record!





With pathf95, it fails after 0.099s with the error:

 "This request exceeds the maximum record size."



And with g95, it takes 4.946s (!) until it fails with "Writing more data than

the record size (RECL)":

     11 close

     17 fstat

     20 mprotect

     21 stat

     25 mmap

     30 write

     47 open

where the mmap+munmap pairs seem to take the lion's share of the time.





However, one can do better: NAG f95 only needs 0.007s and does:

      5 read

      6 lseek

      8 mprotect

     10 fstat

     23 stat

     29 mmap

     40 open

   2003 write





Maybe something like the following would work:

* Create a reasonably sized buffer

* Use it to buffer the writes, and if everything fits, write the length, the
buffer, the length.

* If the argument is a (too) big array, write the length of the data in the
buffer plus the array byte size, then the data - and only if another item
comes, seek to the beginning and update the length.



That should take care of:

  write(99) i, j, k

  write(99) i, j, k, small_array

  write(99) big_array

and even

  write(99) i, j, k, big_array

but it will not help for

  write(99) big_array1, big_array2



I think that covers the most important cases. One question is how large the
buffer should be initially, whether it should be resizable - and how long it
should remain allocated. Even a small buffer of 1024 bytes (= 128 real(8)
values) will help when writing small data like in the example of comment 0.



If it is larger, the issue of freeing the data and/or resizing becomes more
important - and one needs to be careful not to require a huge amount of memory
and/or do very frequent memory allocation+freeing, which causes the problems
with g95.



 * * *



Closer look at NAG: It does the following (allocate moved before open, inquire

removed):



open("/dev/null", O_RDWR)               = 3

mmap(NULL, 3856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =

0x2aaaaab0e000

mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =

0x2aaaaab22000

fstat(3, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0

ioctl(3, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,

0x7fff645b1c90) = -1 ENOTTY (Inappropriate ioctl for device)

fstat(3, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0

ioctl(3, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,

0x7fff645b2200) = -1 ENOTTY (Inappropriate ioctl for device)

mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =

0x2aaaaac22000

lseek(3, 0, SEEK_CUR)                   = 0



write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,

4096) = 4096

write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,

419426304) = 419426304



(and 999 further write lines)



lseek(3, 0, SEEK_SET)                   = 0

read(3, "", 4096)                       = 0

lseek(3, 12, SEEK_CUR)                  = 0

write(3, "\0\0\0\250", 4)               = 4

lseek(3, 18446744072233156608, SEEK_SET) = 0

read(3, "", 4096)                       = 0

lseek(3, 20, SEEK_CUR)                  = 0

lseek(3, 0, SEEK_CUR)                   = 0

ftruncate(3, 0)                         = -1 EINVAL (Invalid argument)

close(3)                                = 0

munmap(0x2aaaaac22000, 4096)            = 0

munmap(0x2aaaab7a3000, 419430400)       = 0
