memory error

2011-09-13 Thread questions anon
Hello All,
I keep running into a memory error when processing many netcdf files. I
assume it has something to do with how I loop over things, and that I may
need to close things properly.
In the code below I loop through a bunch of netcdf files (each file is
hourly data for one month), and within each netcdf file I output a
*.png file every three hours.
This works for one netcdf file, but when it begins to process the next netcdf
file I receive this memory error:

Traceback (most recent call last):
  File "d:/plot_netcdf_merc_multiplot_across_multifolders_mkdirs_memoryerror.py", line 44, in <module>
    TSFC=ncfile.variables['T_SFC'][:]
  File "netCDF4.pyx", line 2473, in netCDF4.Variable.__getitem__ (netCDF4.c:23094)
MemoryError

To reduce processing requirements I tried indexing LAT and LON with [0]
only, but that raises a different error:

Traceback (most recent call last):
  File "d:/plot_netcdf_merc_multiplot_across_multifolders_mkdirs_memoryerror.py", line 75, in <module>
    x,y=map(*N.meshgrid(LON,LAT))
  File "C:\Python27\lib\site-packages\numpy\lib\function_base.py", line 3256, in meshgrid
    numRows, numCols = len(y), len(x)  # yes, reversed
TypeError: len() of unsized object
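
The second traceback is consistent with how NumPy indexing works: indexing a 1-D array with [0] returns a single scalar rather than an array, and a scalar has no length. The coordinate arrays are also small next to the 3-D T_SFC field, so shrinking them is unlikely to save much memory. A minimal sketch with made-up latitude values (the numbers and the name `lat` are illustrative, not taken from the files above):

```python
import numpy as np

# Hypothetical 1-D coordinate array standing in for the latitude variable.
lat = np.array([-40.0, -39.5, -39.0])

first_value = lat[0]    # integer index: a scalar, which has no length
first_slice = lat[:1]   # slice: still a 1-D array, of length 1

try:
    len(first_value)
    scalar_has_len = True
except TypeError:
    # Same kind of failure as the len() call in the traceback above.
    scalar_has_len = False

print(scalar_has_len)
print(len(first_slice))
```

If a reduced grid is wanted, a slice such as [:1] or [::2] keeps the result an array, so len() still works on it.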

Finally, I have added gc.collect() in a couple of places, but that doesn't
seem to help.
I am using: Python 2.7.2 |EPD 7.1-2 (32-bit)| (default, Jul  3 2011,
15:13:59) [MSC v.1500 32 bit (Intel)] on win32
Any feedback will be greatly appreciated!


from netCDF4 import Dataset
import numpy
import numpy as N
import matplotlib.pyplot as plt
from numpy import ma as MA
from mpl_toolkits.basemap import Basemap
from netcdftime import utime
from datetime import datetime
import os
import gc

print "start processing"

inputpath=r'E:/GriddedData/Input/'
outputpath=r'E:/GriddedData/Validation/'
shapefile1="E:/test_GIS/DSE_REGIONS"
for (path, dirs, files) in os.walk(inputpath):
    for dir in dirs:
        print dir
        sourcepath=os.path.join(path,dir)
        relativepath=os.path.relpath(sourcepath,inputpath)
        newdir=os.path.join(outputpath,relativepath)
        if not os.path.exists(newdir):
            os.makedirs(newdir)

        for ncfile in files:
            if ncfile[-3:]=='.nc':
                print "dealing with ncfiles:", ncfile
                ncfile=os.path.join(sourcepath,ncfile)
                #print ncfile
                ncfile=Dataset(ncfile, 'r+', 'NETCDF4')
                TSFC=ncfile.variables['T_SFC'][:,:,:]
                TIME=ncfile.variables['time'][:]
                LAT=ncfile.variables['latitude'][:]
                LON=ncfile.variables['longitude'][:]
                fillvalue=ncfile.variables['T_SFC']._FillValue
                TSFC=MA.masked_values(TSFC, fillvalue)
                ncfile.close()
                gc.collect()
                print "garbage collected"

                for TSFC, TIME in zip((TSFC[1::3]),(TIME[1::3])):
                    print TSFC, TIME
                    #convert time from numbers to date and prepare it to have no symbols for saving to filename
                    cdftime=utime('seconds since 1970-01-01 00:00:00')
                    ncfiletime=cdftime.num2date(TIME)
                    print ncfiletime
                    timestr=str(ncfiletime)
                    d = datetime.strptime(timestr, '%Y-%m-%d %H:%M:%S')
                    date_string = d.strftime('%Y%m%d_%H%M')

                    #Set up basemap using mercator projection
                    #http://matplotlib.sourceforge.net/basemap/doc/html/users/merc.html
                    map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33,
                                  llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i')

                    # compute map projection coordinates for lat/lon grid.
                    x,y=map(*N.meshgrid(LON,LAT))
                    map.drawcoastlines(linewidth=0.5)
                    map.readshapefile(shapefile1, 'DSE_REGIONS')
                    map.drawstates()

                    plt.title('Surface temperature at %s UTC'%ncfiletime)
                    ticks=[-5,0,5,10,15,20,25,30,35,40,45,50]
                    CS = map.contourf(x,y,TSFC, ticks, cmap=plt.cm.jet)
                    l,b,w,h =0.1,0.1,0.8,0.8
                    cax = plt.axes([l+w+0.025, b, 0.025, h], )
                    cbar=plt.colorbar(CS, cax=cax, drawedges=True)

                    #save map as *.png and plot netcdf file
                    plt.savefig((os.path.join(newdir,'TSFC'+date_string+'UTC.png')))
                    plt.close()
                    gc.collect()
                    print "garbage collected again"
print "end of processing"
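
One possible way to lower peak memory, offered as a sketch rather than a tested fix for this script: netCDF4 variables accept NumPy-style indexing, so a single time step can be read with variable[i] instead of loading the whole T_SFC array with [:,:,:]. This matters more in a 32-bit process, whose address space is limited (typically about 2 GB on Windows). The helper name `iter_time_steps` and the stand-in array below are hypothetical. In real use, ncfile.variables['T_SFC'] would be passed in place of the NumPy array, and the file would have to stay open until the loop has finished.

```python
import numpy as np

def iter_time_steps(variable, times, fill_value, start=1, step=3):
    """Yield (time, masked 2-D field) pairs, one time step per iteration.

    `variable` is anything that supports integer indexing on its first
    (time) axis, such as a netCDF4 variable or a NumPy array.
    """
    for i in range(start, len(times), step):
        # Only the slice for time step i is read and masked here.
        field = np.ma.masked_values(variable[i], fill_value)
        yield times[i], field

# Stand-in data: 6 hourly steps on a 2x2 grid, one fill value at step 4.
data = np.arange(24, dtype=float).reshape(6, 2, 2)
data[4, 0, 0] = -999.0
times = np.arange(6) * 3600

steps = list(iter_time_steps(data, times, fill_value=-999.0))
print([int(t) for t, _ in steps])
```

The loop variables get their own names here, so the full arrays are not rebound inside the loop the way TSFC and TIME are in the script above.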
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: memory error

2011-09-28 Thread questions anon
efile1, 'DSE_REGIONS')
map.drawstates()

plt.title(Title+' %s UTC'%ncfiletime)

CS = map.contourf(x,y,variable, ticks, cmap=cmap)
l,b,w,h =0.1,0.1,0.8,0.8
cax = plt.axes([l+w+0.025, b, 0.025, h], )
cbar=plt.colorbar(CS, cax=cax, drawedges=True)

#save map as *.png and plot netcdf file
plt.savefig((os.path.join(OutputFolder, ncvariablename+date_string+'UTC.png')))
#plt.show()
plt.close()


On Wed, Sep 14, 2011 at 4:08 PM, questions anon wrote:

> Hello All,
> I keep coming across a memory error when processing many netcdf files. I
> assume it has something to do with how I loop things and maybe need to close
> things off properly.

mask one array using another array

2011-11-21 Thread questions anon
I am trying to mask one array using another array.

I have created a masked array using
mask=MA.masked_equal(myarray,0),
that looks something like:
[1  -  -  1,
 1  1  -  1,
 1  1  1  1,
 -   1  -  1]

I have an array of values that I want to mask wherever my mask has a '-'.
How do I do this?
I have looked at
http://www.cawcr.gov.au/bmrc/climdyn/staff/lih/pubs/docs/masks.pdf but the
command:

d = array(a, mask=c.mask())

results in this error:
TypeError: 'numpy.ndarray' object is not callable

I basically want to do exactly what that article does in that equation.

Any feedback will be greatly appreciated.


Re: mask one array using another array

2011-11-21 Thread questions anon
Thank you, that makes sense.
I should have posted this on another list (which I have now done). The
change required is:

If your new array is x, you can use:

numpy.ma.masked_array(x, mask=mask.mask)
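
A short runnable sketch of that one-liner, using made-up data shaped like the example in the question (the values in `myarray` and `x` are illustrative). In numpy.ma the mask is an attribute, not a method, which is why calling c.mask() raises "'numpy.ndarray' object is not callable". The sketch reads the mask through numpy.ma.getmaskarray, which returns a full boolean array even when nothing happens to be masked.

```python
import numpy as np

# Stand-in for the array that defines the mask: zeros mark cells to hide.
myarray = np.array([[1, 0, 0, 1],
                    [1, 1, 0, 1],
                    [1, 1, 1, 1],
                    [0, 1, 0, 1]])
mask = np.ma.masked_equal(myarray, 0)

# Hypothetical values to be masked in the same places.
x = np.arange(16, dtype=float).reshape(4, 4)

# The mask is read as an attribute (no parentheses) and applied to x.
masked_x = np.ma.masked_array(x, mask=np.ma.getmaskarray(mask))

print(masked_x)
print(masked_x.count())  # number of cells left unmasked
```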

On Tue, Nov 22, 2011 at 11:48 AM, MRAB  wrote:

> On 21/11/2011 21:42, questions anon wrote:
>
>> I am trying to mask one array using another array.
>>
>> I have created a masked array using
>> mask=MA.masked_equal(myarray,0),
>> that looks something like:
>> [1  -  -  1,
>>  1  1  -  1,
>>  1  1  1  1,
>>  -   1  -  1]
>>
>> I have an array of values that I want to mask wherever my mask has a
>> '-'.
>> How do I do this?
>> I have looked at
>> http://www.cawcr.gov.au/bmrc/climdyn/staff/lih/pubs/docs/masks.pdf but
>> the command:
>>
>> d = array(a, mask=c.mask())
>>
>> results in this error:
>> TypeError: 'numpy.ndarray' object is not callable
>>
>> I basically want to do exactly what that article does in that equation.
>>
>> Any feedback will be greatly appreciated.
>>
> The article is using the Numeric module, but your error says that you're
> using the numpy module. They're not the same.


loop through arrays and find maximum

2011-12-05 Thread questions anon
I would like to calculate the max and min across many netcdf files.
I know how to concatenate everything into one big array and then take
numpy.max, but when I run this on thousands of arrays I get a memory error.
What I would prefer is to loop through the arrays and produce the maximum
without having to make a big array.
Does anyone have any ideas as to how I could achieve this?
My idea goes something like:

netCDF_list=[]
maxarray=[]

for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder + '*/02/')+ glob.glob(MainFolder + '*/12/'):
    for ncfile in glob.glob(dir + '*.nc'):
        netCDF_list.append(ncfile)
for filename in netCDF_list:
    ncfile=netCDF4.Dataset(filename)
    TSFC=ncfile.variables['T_SFC'][:]
    fillvalue=ncfile.variables['T_SFC']._FillValue
    TSFC=MA.masked_values(TSFC, fillvalue)
    for i in TSFC:
        if i == N.max(TSFC, axis=0):
            maxarray.append(i)
        else:
            pass

print maxarray
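
One way to get the maximum without building one big array is to keep a single running 2-D maximum and update it after each file. The sketch below assumes every array has time as its first axis and the same grid shape, and that the data are floating point. The function name `running_max` and the small test arrays are hypothetical; in practice the input would be a generator that opens one netcdf file at a time and yields its masked T_SFC array.

```python
import numpy as np

def running_max(arrays):
    """Element-wise maximum over an iterable of masked (time, y, x) arrays."""
    running = None
    for arr in arrays:
        # Collapse the time axis first, so only a 2-D array is kept per file.
        file_max = np.ma.max(arr, axis=0)
        # Cells masked at every time step become -inf so they never win.
        filled = np.ma.filled(file_max.astype(float), -np.inf)
        running = filled if running is None else np.maximum(running, filled)
    if running is None:
        return None
    # Cells that were masked in every file are masked again in the result.
    return np.ma.masked_where(np.isneginf(running), running)

# Two small stand-in "files" with -999 as the fill value.
a = np.ma.masked_values(
    np.array([[[1.0, -999.0], [3.0, 4.0]],
              [[5.0, -999.0], [0.0, 2.0]]]), -999.0)
b = np.ma.masked_values(
    np.array([[[2.0, -999.0], [9.0, 1.0]]]), -999.0)

result = running_max([a, b])
print(result)
```

The minimum can be tracked the same way by filling with +inf and using numpy.minimum.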