Re: [Tutor] Load Entire File into memory

2013-11-05 Thread eryksun
On Mon, Nov 4, 2013 at 11:26 AM, Amal Thomas wrote: > @Dave: thanks.. By the way I am running my codes on a server with about > 100GB RAM but I can't afford my code to use 4-5 times the size of the text > file. Now I am using read() / readlines(); these seem to be more > efficient in memory usag

Re: [Tutor] Load Entire File into memory

2013-11-05 Thread William Ray Wing
On Nov 5, 2013, at 11:12 AM, Alan Gauld wrote: > On 05/11/13 02:02, Danny Yoo wrote: > >> To visualize the sheer scale of the problem, see: >> >> http://i.imgur.com/X1Hi1.gif >> >> which would normally be funny, except that it's not quite a joke. :P > > I think I'm missing something. All I s

Re: [Tutor] Load Entire File into memory

2013-11-05 Thread Steven D'Aprano
On Tue, Nov 05, 2013 at 04:12:51PM +, Alan Gauld wrote: > On 05/11/13 02:02, Danny Yoo wrote: > > >To visualize the sheer scale of the problem, see: > > > >http://i.imgur.com/X1Hi1.gif > > > >which would normally be funny, except that it's not quite a joke. :P > > I think I'm missing somethi

Re: [Tutor] Load Entire File into memory

2013-11-05 Thread Alan Gauld
On 05/11/13 02:02, Danny Yoo wrote: To visualize the sheer scale of the problem, see: http://i.imgur.com/X1Hi1.gif which would normally be funny, except that it's not quite a joke. :P I think I'm missing something. All I see in Firefox is a vertical red bar. And in Chrome I don't even get t

Re: [Tutor] Load Entire File into memory

2013-11-05 Thread Oscar Benjamin
On 5 November 2013 13:20, Amal Thomas wrote: > On Mon, Nov 4, 2013 at 10:00 PM, Steven D'Aprano > wrote: >> > >> >> import os >> filename = "YOUR FILE NAME HERE" >> print("File size:", os.stat(filename).st_size) >> f = open(filename) >> content = f.read() >> print("Length of content actually read

Re: [Tutor] Load Entire File into memory

2013-11-05 Thread Oscar Benjamin
On 4 November 2013 17:41, Amal Thomas wrote: > @Steven: Thank you...My input data is basically AUGC and newlines... I would > like to know about bytearray technique. Please suggest me some links or > reference.. I will go through the profiler and check whether the code > maintains linearity with t

Re: [Tutor] Load Entire File into memory

2013-11-05 Thread Amal Thomas
On Mon, Nov 4, 2013 at 10:00 PM, Steven D'Aprano wrote: > > > import os > filename = "YOUR FILE NAME HERE" > print("File size:", os.stat(filename).st_size) > f = open(filename) > content = f.read() > print("Length of content actually read:", len(content)) > print("Current file position:", f.tell(

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Stefan Behnel
Amal Thomas, 04.11.2013 14:55: > I have checked the execution time manually as well as I found it through my > code. During execution of my code, at start, I stored my initial time(start > time) to a variable and at the end calculated time taken to run the code = > end time - start time. There was

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Steven D'Aprano
On Mon, Nov 04, 2013 at 06:02:47PM -0800, Danny Yoo wrote: > To visualize the sheer scale of the problem, see: > > http://i.imgur.com/X1Hi1.gif > > which would normally be funny, except that it's not quite a joke. :P Nice visualisation! Was that yours? > So you want to minimize hard disk

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Steven D'Aprano
I mostly agree with Alan, but a couple of little quibbles: On Tue, Nov 05, 2013 at 01:10:39AM +, ALAN GAULD wrote: > >@Alan: Thanks.. I have checked the both ways( reading line by line by not > >loading into ram , > > other loading entire file to ram and then reading line by line) for file

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Danny Yoo
> > You _must_ avoid swap at all costs here. You may not understand the > point, so a little more explanation: touching swap is several orders of > magnitude more expensive than anything else you are doing in your program. > > CPU operations are on the order of nanoseconds. (10^-9) > > Dis

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Danny Yoo
> > > > Also as I have mentioned I cant afford to run my code using 4-5 times > memory. > > Total resource available in my server is about 180 GB memory (approx 64 > GB RAM + 128GB swap). > > OK, There is a huge difference between having 100G of RAM and having > 64G+128G swap. > swap is basically d

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread ALAN GAULD
Forwarding to tutor list. Please use Reply All in responses. From: Amal Thomas >To: Alan Gauld >Sent: Monday, 4 November 2013, 17:26 >Subject: Re: [Tutor] Load Entire File into memory > > > >@Alan: Thanks.. I have checked the both ways( reading line by line by n

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Danny Yoo
On Mon, Nov 4, 2013 at 9:41 AM, Amal Thomas wrote: > @Steven: Thank you...My input data is basically AUGC and newlines... I > would like to know about bytearray technique. Please suggest me some links > or reference.. I will go through the profiler and check whether the code > maintains linearity

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Dave Angel
On 4/11/2013 11:26, Amal Thomas wrote: > @Dave: thanks.. By the way I am running my codes on a server with about > 100GB RAM but I can't afford my code to use 4-5 times the size of the text > file. Now I am using read() / readlines(); these seem to be more > efficient in memory usage than io.Str

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Dave Angel
On Tue, 5 Nov 2013 02:53:41 +1100, Steven D'Aprano wrote: Dave, do you have a reference for that? As far as I can tell, read() will read to EOF unless you open the file in non-blocking mode. No. I must be just remembering something from another language. Sorry. -- DaveA

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Amal Thomas
@Steven: Thank you... My input data is basically AUGC and newlines... I would like to know about the bytearray technique. Please suggest some links or references. I will go through the profiler and check whether the code maintains linearity with the input files. > > It's probably worth putting so
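The previews here don't show what the "bytearray technique" refers to, but since the data is plain ASCII (AUGC plus newlines), one plausible sketch is to read the file in binary mode so each character costs a single byte; the file name and the per-line work below are placeholders, not from the thread:

    # Hypothetical sketch: read the AUGC data as bytes instead of text.
    with open("output.txt", "rb") as f:
        data = f.read()                  # one bytes object holding the whole file
    print("A count:", data.count(b"A"))  # counting works directly on bytes
    for line in data.split(b"\n"):       # each line is itself a bytes object
        pass                             # real per-line processing would go here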

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Steven D'Aprano
On Mon, Nov 04, 2013 at 04:54:16PM +, Alan Gauld wrote: > On 04/11/13 16:34, Amal Thomas wrote: > >@Joel: The code runs for weeks..input file which I have to process in > >very huge(in 50 gbs). So its not a matter of hours.its matter of days > >and weeks.. > > OK, but that's not down to readin

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Alan Gauld
On 04/11/13 16:34, Amal Thomas wrote: @Joel: The code runs for weeks..input file which I have to process in very huge(in 50 gbs). So its not a matter of hours.its matter of days and weeks.. OK, but that's not down to reading the file from disk. Reading a 50G file will only take a few minutes if

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Steven D'Aprano
On Mon, Nov 04, 2013 at 11:27:52AM -0500, Joel Goldstick wrote: > If you are new to python why are you so concerned about the speed of > your code. Amal is new to Python but he's not new to biology; he's a 4th-year student. With a 50GB file, I expect he is analysing something to do with DNA seq

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Amal Thomas
@Steven: Thanks... Right now I can't access the files. I will send you the output when I can. -- Please try this little bit of code, replacing the file name with the actual name of your 50GB data file: import os filename = "YOUR FILE NAME HERE" print("File size:", os.stat(filename).st_size) f
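Pieced together from the fragments quoted across this thread, the diagnostic Steven is asking Amal to run looks roughly like the following (the final f.close() is an assumption):

    import os

    filename = "YOUR FILE NAME HERE"
    print("File size:", os.stat(filename).st_size)   # size on disk, in bytes
    f = open(filename)
    content = f.read()                               # read the whole file as text
    print("Length of content actually read:", len(content))
    print("Current file position:", f.tell())
    f.close()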

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Amal Thomas
@Joel: The code runs for weeks. The input file which I have to process is very huge (around 50 GB), so it's not a matter of hours; it's a matter of days and weeks. I was using C++; recently I switched over to Python. I am trying to optimize my code to get the outputs in less time and more memory-efficiently. On Mo

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Steven D'Aprano
On Mon, Nov 04, 2013 at 07:00:29PM +0530, Amal Thomas wrote: > Yes I have found that after loading to RAM and then reading lines by lines > saves a huge amount of time since my text files are very huge. This is remarkable, and quite frankly incredible. I wonder whether you are misinterpreting wha

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Joel Goldstick
"I am new to python. I am working in computational biology and I have to deal with text files of huge size. I know how to read line by line from a text file. I want to know the best method in python3 to load the enire file into ram and do the operations.(since this saves time)" If you are new t

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Amal Thomas
@Dave: thanks.. By the way I am running my codes on a server with about 100GB RAM, but I can't afford my code to use 4-5 times the size of the text file. Now I am using read() / readlines(); these seem to be more efficient in memory usage than io.StringIO(f.read()). On Mon, Nov 4, 2013 at 9:23 P
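One way to check the "4-5 times the size of the text file" figure on a Linux server is to look at the process's peak resident set size after the read. This is a sketch, not from the thread: the resource module is Unix-only, ru_maxrss is reported in kilobytes on Linux, and the file name is assumed.

    import resource

    with open("output.txt") as f:        # assumed file name
        content = f.read()

    peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print("Peak memory use: about", peak_kb // 1024, "MB")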

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Steven D'Aprano
On Mon, Nov 04, 2013 at 02:48:11PM +, Dave Angel wrote: > Now I understand. Processing line by line is slower because it actually > reads the whole file. The code you showed earlier: > > >I am currently using this method to load my text file: > > *f = open("output.txt") > > content=io.S

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Dave Angel
On 4/11/2013 09:04, Amal Thomas wrote: > @William: > Thanks, > > My Line size varies from 40 to 550 characters. Please note that text file > which I have to process is in gigabytes ( approx 50 GB ) . This was the > code which i used to process line by line without loading into memory. Now I under

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Amal Thomas
@William: Thanks, my line size varies from 40 to 550 characters. Please note that the text file which I have to process is in gigabytes (approx 50 GB). This was the code which I used to process line by line without loading into memory:

    for lines in open('uniqname.txt'):
        ...

On Mon, Nov 4, 201

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Amal Thomas
Hi, @Peter: I have checked the execution time manually as well as found it through my code. During execution of my code, at the start, I stored my initial time (start time) in a variable and at the end calculated the time taken to run the code = end time - start time. There was a significant difference
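A minimal sketch of the manual timing Amal describes (the file name and per-line work are placeholders; time.time() is assumed as the clock):

    import time

    start = time.time()                  # start time
    with open("output.txt") as f:
        for line in f:                   # the line-by-line variant being timed
            pass                         # real processing would go here
    end = time.time()                    # end time
    print("Time taken:", end - start, "seconds")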

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread William Ray Wing
On Nov 4, 2013, at 8:30 AM, Amal Thomas wrote: > Yes I have found that after loading to RAM and then reading lines by lines > saves a huge amount of time since my text files are very huge. > [huge snip] > -- > AMAL THOMAS > Fourth Year Undergraduate Student > Department of Biotechnology > II

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Peter Otten
Amal Thomas wrote: > Yes I have found that after loading to RAM and then reading lines by lines > saves a huge amount of time since my text files are very huge. How exactly did you find out? You should only see a speed-up if you iterate over the data at least twice.
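To make Peter's point concrete with a hypothetical example (the file name and both passes are placeholders): keeping the whole file in memory only pays off once the data is traversed more than once, e.g.

    with open("output.txt") as f:
        lines = f.read().splitlines()    # read once, keep everything in RAM

    total_chars = sum(len(line) for line in lines)   # first pass over the data
    longest = max(lines, key=len)                    # second pass reuses the cached list
    print(total_chars, len(longest))                 # assumes a non-empty file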

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Amal Thomas
Yes I have found that after loading to RAM and then reading lines by lines saves a huge amount of time since my text files are very huge. On Mon, Nov 4, 2013 at 6:46 PM, Alan Gauld wrote: > On 04/11/13 13:06, Amal Thomas wrote: > > Present code: >> >> >> *f = open("output.txt") >> content=f.rea

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Alan Gauld
On 04/11/13 13:06, Amal Thomas wrote: Present code:

    f = open("output.txt")
    content = f.read().split('\n')
    f.close()

If your objective is to save time, then you should replace this with f.readlines(), which will save you reprocessing the entire file to remove the newlines. for lines in con
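A short sketch of the readlines() variant Alan suggests (file name from the thread; the rstrip() is only needed where the trailing newline actually matters):

    f = open("output.txt")
    lines = f.readlines()                # one pass; each element still ends in '\n'
    f.close()

    for line in lines:
        line = line.rstrip('\n')         # strip the newline per line, on demand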

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Amal Thomas
Hi, Thanks Alan. Now I have made changes in code:

Present code:

    f = open("output.txt")
    content = f.read().split('\n')
    f.close()
    for lines in content:
        ...
    content.clear()

Previous code:

    f = open("output.txt")
    content = io.StringIO(f.read())
    f.close()
    for lines in content:
        ...
    content.

Re: [Tutor] Load Entire File into memory

2013-11-04 Thread Alan Gauld
On 04/11/13 11:07, Amal Thomas wrote: I am currently using this method to load my text file:

    f = open("output.txt")
    content = io.StringIO(f.read())
    f.close()

But I have found that this method uses 4 times the size of text file. So why not use

    f = open("output.txt")
    content = f.read()
    f.clo

[Tutor] Load Entire File into memory

2013-11-04 Thread Amal Thomas
Hi, I am new to Python. I am working in computational biology and I have to deal with text files of huge size. I know how to read line by line from a text file. I want to know the best method in Python 3 to load the entire file into RAM and do the operations (since this saves time). I am curren
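For readers landing on this question: a minimal sketch of what "load the entire file into RAM" looks like in Python 3 (file name assumed; whether this is actually faster than iterating over the open file is exactly what the rest of the thread debates):

    with open("output.txt") as f:        # assumed file name
        content = f.read()               # the whole file as one str in RAM

    for line in content.splitlines():
        pass                             # per-line processing would go here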