On  2 Sep 2006, [EMAIL PROTECTED] wrote:

>
> Hello all. I post this here since is my first attempt to solve a  
> problem with python.
>
> I have fairly big log files that I'm doing some pre-processing to, to  
> cleanup the data before they go into a MySQL database. After  
> processing the files look something like this:
>
> 17, , 2006-8-21 12:04:29, 0, 3.0846, 25.105, 918, -0.12183, 0.20305,  
> 25.389, 25.254, 180
> 18, , 2006-8-21 12:05:20, 17, 3.1705, 23.62, 949, 0.015228, 0.040609,  
> 24.984, 110.2, 186
> 17, , 2006-8-21 12:07:30, 0, 3.0846, 25.353, 939, -0.1269, 0.20305,  
> 25.254, 25.254, 293
> 18, , 2006-8-21 12:08:23, 17, 3.1705, 23.538, 958, 0.015228,  
> 0.045685, 24.984, 110.2, 188
> 16, , 2006-8-21 12:09:21, 17, 3.0922, 24.691, 969, 0.26904, 0.10152,  
> 25.389, 25.389, 175
>
> then I have written another script to which I pass the filename as  
> argument to insert that data into the db. The problem I'm getting is  
> that some of the files contain 300000 records aprox. I have tried two  
> approaches to inset the data but both have failed

I would recommend a third approach: use  "LOAD DATA INFILE" from MySql.
You can find its syntax e.g here:
http://dev.mysql.com/doc/refman/5.0/en/load-data.html
That is very fast and you need only one statement.  You could also use
mysqlimport from the commandline to import the data.

If you want to insert the data with INSERT statements I would read the
data file line by line and insert that data. That is simple not as fast
as LOAD DATA INFILE but for most situations fast enough. Then you
shouldn't have the problem with max_allowed_packet (you can increase its
value if you need to; for MySql 5.0 its maximum value is 1GB IIRC.
If you need speed use mysqlimport from the commandline or LOAD DATA
INFILE; insert statements even with multiple values will never be as
fast as that.


   Karl
-- 
Please do *not* send copies of replies to me.
I read the list
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor

Reply via email to