I think someone needs to sort out the distribution lists; I'm getting lots of unstructured emails under many different titles.

On 11/17/06, Francesc Altet <[EMAIL PROTECTED]> wrote:
> On Thursday 16 November 2006 22:28, Erin Sheldon wrote:
> > Hi Francesc -
> >
> > Unless I missed something, I think what you have
> > shown is that the combination of
> >   (getting data from database into python lists) +
> >   (converting to arrays)
> > is what is taking time. I would guess the first takes
> > significantly longer than the second.
>
> Seriously, I don't think I have demonstrated anything really solid in
> that regard with so little evidence. But we can try looking for more
> of it :)
>
> For example, I'd split the times into:
>
> t1 (getting data from database) +
> t2 (python lists) +
> t3 (converting to arrays) =
> tt (total time)
>
> We don't know t1, but we do know tt. Now, we can try to get a guess
> for t2 and t3. Perhaps I'm wrong, but the following could be good
> estimates.
>
> For t2 (creating the python list of tuples):
> In [44]: Timer("[(x,x) for x in np.arange(500000, dtype='float64')]", "import numpy as np").repeat(3,1)
> Out[44]: [0.55968594551086426, 0.48462891578674316, 0.4855189323425293]
>
> For t3 (converting to recarrays):
> In [49]: Timer("np.fromiter(lot, dtype=dtype)", "import numpy as np; lot=[(x,x) for x in np.arange(500000, dtype='float64')]; dtype=np.dtype([('x', 'float64'), ('y', 'float64')])").repeat(3,1)
> Out[49]: [0.50310707092285156, 0.50920987129211426, 0.50304579734802246]
>
> So, it seems that t2 and t3 are similar and they take approximately 0.5
> seconds each.
>
> Now, let me recall the timings for reading the databases on my
> laptop at work (a Pentium4 @ 2 GHz):
>
> setup SQLite took 23.5661110878 seconds
> retrieve SQLite took 3.26717996597 seconds
> setup PyTables took 0.139157056808 seconds
> retrieve PyTables took 0.13444685936 seconds
>
> So, in our case, tt for SQLite3 was 3.26 seconds. With that, we can
> derive its t1 (getting data from database):
>
> t1 = tt - t2 - t3 =~ 2.26 seconds
>
> However, this is still far more than tt for PyTables (~ 0.14 sec), so
> I'm not completely sure what's going on. Honestly, I don't think that
> HDF5 (the underlying library for doing I/O in PyTables) would be
> almost 20x faster than SQLite3 for reading purposes. So my guess is
> that there should be more factors contributing to tt for SQLite3 that
> I've not taken into account. Can anyone find them?
>
> Cheers,
>
> --
> >0,0<   Francesc Altet     http://www.carabos.com/
> V   V   Cárabos Coop. V.   Enjoy Data
>  "-"
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
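
One way to check where the SQLite time goes might be to time the fetch and the array conversion separately, instead of backing t1 out of the total. What follows is only a minimal sketch under stated assumptions: an in-memory table stands in for the on-disk database from the thread, and `time_sqlite_read`, `estimate_t1` and the `points` table are made-up names for illustration. Note that `sqlite3` builds the Python tuples while fetching, so t1 and t2 cannot be separated this way and are measured together.

```python
import sqlite3
import time

import numpy as np


def time_sqlite_read(n_rows=100_000):
    """Time the fetch and the array conversion separately (hypothetical helper)."""
    # Assumption: an in-memory table replaces the on-disk database of the thread.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE points (x REAL, y REAL)")
    conn.executemany(
        "INSERT INTO points VALUES (?, ?)",
        ((float(i), float(i)) for i in range(n_rows)),
    )
    conn.commit()

    # t1 + t2: sqlite3 runs the query and builds the list of tuples in one step.
    start = time.perf_counter()
    rows = conn.execute("SELECT x, y FROM points").fetchall()
    t_fetch = time.perf_counter() - start

    # t3: convert the list of tuples to a structured array.
    dtype = np.dtype([("x", "float64"), ("y", "float64")])
    start = time.perf_counter()
    arr = np.array(rows, dtype=dtype)
    t_convert = time.perf_counter() - start

    conn.close()
    return arr, t_fetch, t_convert


def estimate_t1(tt, t2, t3):
    """Back out the database time from the total: t1 = tt - t2 - t3."""
    return tt - t2 - t3


if __name__ == "__main__":
    arr, t_fetch, t_convert = time_sqlite_read()
    print(f"fetch (t1 + t2): {t_fetch:.3f} s, convert (t3): {t_convert:.3f} s")
    # Figures quoted in the thread: tt = 3.26 s, t2 and t3 about 0.5 s each.
    print(f"estimated t1: {estimate_t1(3.26, 0.5, 0.5):.2f} s")
```

If the fetch step dominates even for an in-memory table, that would suggest the cost is in building Python objects row by row rather than in disk I/O, but the numbers will depend on the machine and the SQLite build.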