I would guess your first example doesn't show what you expect
because you truncated the file.
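
One way to check that guess is to compare the number of decompressed
bytes with what the grid should need. This is only a rough sketch: it
assumes 4-byte floats on the 3600 x 1200 grid from your script, the name
decompressed_size is made up for illustration, and the "with" form on
gzip files needs Python 2.7 or later. A damaged archive may also raise an
error on read instead of returning short data.

```python
import gzip

def decompressed_size(path):
    """Return the number of bytes in the decompressed gzip stream."""
    with gzip.open(path, 'rb') as gz:
        return len(gz.read())

# Assumption: 4-byte floats on the 3600 x 1200 grid from the quoted script.
expected_bytes = 3600 * 1200 * 4

# Example call, using the file name from the question:
# size = decompressed_size('GSMaP_MVK+.20050101.00.0.1deg.hourly.v484.gz')
# print(size == expected_bytes)
```

If the two numbers differ, the problem is in the archive or in the assumed
grid size rather than in the reading code.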

You can use the glob module to get all the .gz files and iterate through
them, something like the code below (adjust for your Windows paths). I
think you have to .read() the entire file into memory and use numpy's
fromstring(), since fromfile() doesn't recognize the gzip file handle.


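import gzip
import glob
import numpy as np

# placeholder grid size; the files in the question are 3600 x 1200
num_lon = 10
num_lat = 10

def read_bin(gz, num_lon, num_lat):
    # decompress everything into memory, then interpret it as 32-bit floats;
    # rows are latitude and columns longitude, as in the quoted script
    arr = np.fromstring(gz.read(), dtype='f').reshape(num_lat, num_lon)
    return arr

for path in glob.glob("/tmp/*.gz"):
    gz = gzip.open(path, 'rb')
    try:
        arr = read_bin(gz, num_lon, num_lat)
    finally:
        gz.close()  # close each handle before moving to the next file
    print(arr[5, 3])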
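
For a newer setup, the same idea can be sketched with frombuffer(), which
current NumPy releases prefer over fromstring() for binary data. Treat this
as a sketch under assumptions rather than a tested recipe for GSMaP: the
function names read_gz_grid and read_all are made up, dtype 'f' (native
32-bit float) is carried over from the code above, and the "with" form on
gzip files needs Python 2.7 or later.

```python
import glob
import gzip

import numpy as np

def read_gz_grid(path, num_lat, num_lon, dtype='f'):
    """Decompress one .gz file in memory and return it as a 2-D array."""
    with gzip.open(path, 'rb') as gz:
        raw = gz.read()
    # frombuffer interprets the raw bytes without copying them;
    # the resulting array is read-only
    arr = np.frombuffer(raw, dtype=dtype)
    # fail early if the file does not hold exactly one grid
    if arr.size != num_lat * num_lon:
        raise ValueError('%s holds %d values, expected %d'
                         % (path, arr.size, num_lat * num_lon))
    return arr.reshape(num_lat, num_lon)

def read_all(pattern, num_lat, num_lon):
    """Yield (path, array) for every file matching the glob pattern."""
    for path in sorted(glob.glob(pattern)):
        yield path, read_gz_grid(path, num_lat, num_lon)

# Example call; the directory is hypothetical, the grid size and index
# come from the quoted script:
# for path, arr in read_all(r'C:\gsmap\*.gz', 1200, 3600):
#     print(path, arr[584, 30])
```

The size check is there so that a file with an unexpected length fails
with a clear message instead of a confusing reshape error.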



On Fri, Jul 29, 2011 at 4:45 AM, Hanlie Pretorius wrote:
> Hi,
>
> I'm working on Windows XP with Python 2.6.
>
> I need to read and process hundreds of GSMaP binary files that are in the
> .gz archive format.
>
> I found a site (http://www.doughellmann.com/PyMOTW/gzip/) and tried
> their code with two files: one of the hundreds of files that I need to
> process (f1 below) and one that I created with 7-Zip from a text file
> that contains the text 'Text to test gzip module.' (f2 below). The
> code and the output follow:
>
> [code]
> import gzip
>
> f1 = 'GSMaP_MVK+.20050101.00.0.1deg.hourly.v484.gz'
> f2 = 'text.txt.gz'
> if1 = gzip.open(f1, 'rb')
> if2 = gzip.open(f2,'rb')
> try:
>  print if1.read()
>  print 'done with f1'
>  print if2.read()
>  print 'done with f2'
> finally:
>  if1.close()
>  if2.close()
> [/code]
>
> [output]
> done with f1
> Text to test gzip module.
> done with f2
> [/output]
>
> This seems to indicate that something is wrong with f1 (the GSMaP
> file), but I can unzip the file manually and read it with a python
> script (code and output pasted after signature). I have hundreds of
> GSMAP files that have unique archived file names, but they all unzip
> to the same binary file, so I have to process the archived files in
> the python script.
>
> I would be grateful if someone could help me achieve this.
>
> Regards
> Hanlie
>
> [code]
> import numpy as np
> import array
> import gzip
>
> # define longitude and latitude grid resolution
> num_lon = 3600
> num_lat = 1200
>
> # read binary file and store its data in an array
> fp = 'C:\\out.00'
> tmpfile = open(fp, 'rb')
> binvalues = array.array('f')
> binvalues.read(tmpfile, num_lon * num_lat)
> data = np.array(binvalues)
> data = np.reshape(data, (num_lat, num_lon))
>
> tmpfile.close()
>
> # read one value from array
> rows=[584]
> cols=[30]
> print data[rows,cols]
> [/code]
>
> [output]
>  0.12709916
> [/output]
>
