Memory Problem

2007-09-18 Thread Christoph Scheit
Hi,

I have a short script to read binary files produced by a numerical
simulation. These binary files still need some post-processing: summing up
results from different CPUs, filtering out invalid entries, and bringing
the data into a particular order.

Reading the binary data with the struct module works fine - I read
one chunk of data into a tuple and append this tuple to a list.
At the end of reading, I return the list.
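
For illustration, here is a minimal sketch of reading fixed-size records with the struct module. It is not the poster's BDBReader: the record layout (four ints and one float, little-endian, as mentioned later in the thread), the name RECORD, and the function read_columns are assumptions. It appends the unpacked values straight into typed arrays instead of building a list of tuples first, which may keep the intermediate memory lower.

```python
import io
import struct
from array import array

# Assumed record layout: 4 ints followed by 1 float, little-endian (20 bytes).
RECORD = struct.Struct("<4if")

def read_columns(stream):
    """Read fixed-size records from a binary stream into one array per field."""
    cols = [array("i") for _ in range(4)] + [array("f")]
    while True:
        chunk = stream.read(RECORD.size)
        if len(chunk) < RECORD.size:
            break  # end of data (or a truncated trailing record)
        for col, value in zip(cols, RECORD.unpack(chunk)):
            col.append(value)
    return cols

# Usage with an in-memory stream standing in for a simulation file.
demo = io.BytesIO(RECORD.pack(1, 2, 3, 4, 0.5) + RECORD.pack(5, 6, 7, 8, 1.5))
cols = read_columns(demo)
print(list(cols[0]), list(cols[4]))
```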

Then the data is added to a table, which I use for the actual post-processing.
The table is actually a class with several "columns", each column internally
being represented by an array.
Adding all the data from the simulation results to the table makes the
memory usage explode, so I would like to know where exactly the memory
is wasted.

Here is the code to add the data of one file (in total I have to add the data
of several files to the same table):

# create reader
breader = BDBReader("", "", "#")
 
# read data
bData = breader.readDB(dbFileList[0])

# create table
dTab = DBTable(breader.headings, breader.converters, [1,2])
addRows(bData, dTab)

Before I add a new entry to the table, I check if there is already an entry 
like this. To do so, I store keys for all the entries with row-number in a
dictionary. What about the memory consumption of the dictionary?

Here is the code for adding a new row to the table:

# check if data already exists
if (self.keyDict.has_key(key)):
    rowIdx = self.keyDict[key]
    for i in self.mutableCols:
        self.cols[i][rowIdx] += rowData[i]
    return

# key is still available - insert row to table
self.keyDict[key] = self.nRows

# insert data to the columns
for i in range(0, self.nCols):
    self.cols[i].add(rowData[i])

# add row i and increment number of rows
self.rows.append(DBRow(self, self.nRows))
self.nRows += 1
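
The add-or-accumulate logic above can be sketched in a self-contained form. MiniTable and its attribute names are hypothetical stand-ins, not the poster's DBTable; the sketch assumes the key is a tuple built from selected columns and that each column is a typed array from the standard array module.

```python
from array import array

class MiniTable:
    """Hypothetical stand-in for DBTable: typed columns plus a key -> row index dict."""

    def __init__(self, typecodes, key_cols, mutable_cols):
        self.cols = [array(tc) for tc in typecodes]
        self.key_cols = key_cols
        self.mutable_cols = mutable_cols
        self.key_dict = {}
        self.n_rows = 0

    def add_row(self, row):
        key = tuple(row[i] for i in self.key_cols)
        row_idx = self.key_dict.get(key)
        if row_idx is not None:
            # Key already present: accumulate into the mutable columns only.
            for i in self.mutable_cols:
                self.cols[i][row_idx] += row[i]
            return
        # New key: remember its row index and append one value per column.
        self.key_dict[key] = self.n_rows
        for col, value in zip(self.cols, row):
            col.append(value)
        self.n_rows += 1

table = MiniTable("iiiif", key_cols=[0, 1, 2, 3], mutable_cols=[4])
table.add_row((1, 2, 3, 4, 0.5))
table.add_row((1, 2, 3, 4, 0.25))  # same key: summed into row 0
table.add_row((9, 9, 9, 9, 1.0))
print(table.n_rows, table.cols[4][0])
```

This sketch deliberately keeps no per-row objects. If a sortable row order is needed, a plain list of integer indices would serve the same purpose without one object per row.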

Maybe somebody can help me. If you need, I can give more implementation
details.

Thanks in advance,

Christoph
-- 

====
M.Sc. Christoph Scheit
Institute of Fluid Mechanics
FAU Erlangen-Nuremberg
Cauerstrasse 4
D-91058 Erlangen
Phone: +49 9131 85 29508

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Memory Problem

2007-09-18 Thread Christoph Scheit
On Tuesday 18 September 2007 15:10, Marc 'BlackJack' Rintsch wrote:
> On Tue, 18 Sep 2007 14:06:22 +0200, Christoph Scheit wrote:
> > Then the data is added to a table, which I use for the actual
> > Post-Processing. The table is actually a Class with several "Columns",
> > each column internally being represented by array.
>
> Array or list?

array

More details:
class DBTable:
    # the class DBTable has a list, each list entry referencing a DBColumn object
    self.cols  = []

    self.dict = {, -1} # the dictionary is used to look up if an entry
                       # already exists

class DBColumn:
    # has a name (string) and a datatype (e.g. int, float) as attributes, plus
    self.data = array('f')  # an array of type float

I have to deal with several million entries; at the moment I'm trying an
example with 360 grid points and 1 time steps, i.e. 3 600 000 entries (and
each row consists of 4 ints and one float).

Of course, the more keys, the bigger the dictionary, but is there a way to
evaluate the actual size of the dictionary?
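
One way to get a rough figure in Python 2.6 and later is sys.getsizeof, which reports the size of a single object without following references. The sketch below is written for Python 3. The helper approx_dict_bytes and the example keys (a tuple of four ints mapped to a row index) are assumptions for illustration, and the result is only an estimate.

```python
import sys

def approx_dict_bytes(d):
    """Rough byte count: the dict itself plus each key and value object.

    Tuple keys are descended one level. Objects shared between entries
    (e.g. small cached ints) are counted once per use, so this overestimates.
    """
    total = sys.getsizeof(d)
    for key, value in d.items():
        total += sys.getsizeof(key) + sys.getsizeof(value)
        if isinstance(key, tuple):
            total += sum(sys.getsizeof(part) for part in key)
    return total

# Hypothetical keys: a tuple of four ints per row, mapped to a row index.
key_dict = {(i, i + 1, i + 2, i + 3): i for i in range(1000)}
print(approx_dict_bytes(key_dict), "bytes for", len(key_dict), "keys")
```

Dividing the result by the number of keys gives an approximate per-entry cost, which can then be scaled to the expected number of rows.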

Greets and Thanks,

Chris
>
> > # create reader
> > breader = BDBReader("", "", "#")
> >
> > # read data
> > bData = breader.readDB(dbFileList[0])
> >
> > # create table
> > dTab = DBTable(breader.headings, breader.converters, [1,2])
> > addRows(bData, dTab)
> >
> > Before I add a new entry to the table, I check if there is already an
> > entry like this. To do so, I store keys for all the entries with
> > row-number in a dictionary. What about the memory consumption of the
> > dictionary?
>
> The more items you put into the dictionary the more memory it uses.  ;-)
>
> > Here the code for adding a new row to the table:
> >
> > # check if data already exists
> > if (self.keyDict.has_key(key)):
> >     rowIdx = self.keyDict[key]
> >     for i in self.mutableCols:
> >         self.cols[i][rowIdx] += rowData[i]
> >     return
> >
> > # key is still available - insert row to table
> > self.keyDict[key] = self.nRows
> >
> > # insert data to the columns
> > for i in range(0, self.nCols):
> >     self.cols[i].add(rowData[i])
> >
> > # add row i and increment number of rows
> > self.rows.append(DBRow(self, self.nRows))
> > self.nRows += 1
> >
> > Maybe somebody can help me. If you need, I can give more implementation
> > details.
>
> IMHO That's not enough code and/or description of the data structure(s).
> And you also left out some information like the number of rows/columns and
> the size of the data.
>
> Have you already thought about using a database?
>
> Ciao,
>   Marc 'BlackJack' Rintsch



Re: Memory Problem

2007-09-18 Thread Christoph Scheit
Hi, Thank you all very much,

so I will consider using a database. Anyway, I would like to know
how to detect cycles, if there are any.

> >> >  # add row i and increment number of rows
> >> >  self.rows.append(DBRow(self, self.nRows))
> >> >  self.nRows += 1
>
> This looks suspicious, and may indicate that your structure contains
> cycles, and Python cannot always recall memory from those cycles, and you
> end using much more memory than needed.
>
> --
> Gabriel Genellina

How can I detect if there are cycles?

self.rows is a list containing DBRow objects,
each of which is an integer pointer (index) to the i-th row.
I'm using this list in order to sort the table by sorting the index list
instead of really sorting the entries (or to filter them).
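
One possible way to check for cycles is the standard gc module: gc.collect() returns the number of unreachable objects it found, and objects freed by plain reference counting never show up there. The sketch below is written for Python 3. Table and Row are hypothetical stand-ins for DBTable and DBRow, assuming each row keeps a back-reference to its table as in DBRow(self, self.nRows).

```python
import gc

class Table:
    def __init__(self):
        self.rows = []

class Row:
    def __init__(self, table, idx):
        self.table = table  # back-reference: Table -> rows list -> Row -> Table
        self.idx = idx

def build_and_drop(n):
    """Build a table with n rows, then let the only outside reference go."""
    t = Table()
    for i in range(n):
        t.rows.append(Row(t, i))
    # t goes out of scope here, but the cycle keeps every refcount above zero

gc.collect()   # start from a clean state
gc.disable()   # keep the automatic collector from running in between
build_and_drop(100)
unreachable = gc.collect()  # number of objects only reachable through cycles
gc.enable()
print("objects found in cycles:", unreachable)
```

A non-zero count after dropping the table suggests the structure contains cycles. Storing only the integer index in each row, or holding the back-reference through weakref.ref, are two ways to avoid them.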



Re: Python-list Digest, Vol 48, Issue 301

2007-09-20 Thread Christoph Scheit
Hello,

is there a way to do something like
for i,j in l1, l2:
    print i,j

?
Thanks in advance,
Chris
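
Assuming the intent is to walk both lists in parallel, the built-in zip does this. A short sketch in Python 3 syntax, with made-up example lists:

```python
l1 = [1, 2, 3]
l2 = ["a", "b", "c"]

# zip pairs up the i-th elements of both lists and stops at the shorter one.
for i, j in zip(l1, l2):
    print(i, j)

pairs = list(zip(l1, l2))
```

By contrast, `for i, j in l1, l2` iterates over the two-element tuple `(l1, l2)` and tries to unpack each whole list into `i, j`.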
