Re: [Rd] Memory allocation in read.table

2013-08-28 Thread Simon Urbanek
On Aug 28, 2013, at 2:24 PM, Hadley Wickham wrote:

>> Yup - parsing is the most expensive part. That's why for high-throughput
>> data you don't want to use ASCII representation. It's amazing that the disk
>> speeds are now so high that CPUs are the bottlenecks now, not vice versa.
>
> Do you h

Re: [Rd] Memory allocation in read.table

2013-08-28 Thread Hadley Wickham
> Yup - parsing is the most expensive part. That's why for high-throughput data
> you don't want to use ASCII representation. It's amazing that the disk speeds
> are now so high that CPUs are the bottlenecks now, not vice versa.

Do you have any recommendations for binary formats? For R, is there
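
[Editor's sketch, not part of the original message.] To make the "parsing is the expensive part" point concrete, here is a minimal comparison of reading the same numeric vector from a text file and from raw binary doubles. It uses only base R (scan, writeBin, readBin). The file names and the vector size are assumptions chosen for illustration, and this is not offered as the format the thread goes on to recommend.

```r
# Sketch only: text parsing versus reading raw binary doubles.
# Paths come from tempfile(); nothing here is taken from the thread.
set.seed(1)
x <- runif(1e5)

txt_file <- tempfile(fileext = ".txt")
bin_file <- tempfile(fileext = ".bin")

# Text (ASCII) representation: every number has to be parsed on read.
writeLines(sprintf("%.17g", x), txt_file)
t_txt <- system.time(
  x_txt <- scan(txt_file, what = double(), quiet = TRUE)
)

# Binary representation: bytes are copied into a double vector, no parsing.
con <- file(bin_file, "wb")
writeBin(x, con)
close(con)

con <- file(bin_file, "rb")
t_bin <- system.time(
  x_bin <- readBin(con, what = "double", n = length(x))
)
close(con)

print(rbind(text = t_txt, binary = t_bin)[, "elapsed"])
print(identical(x, x_bin))  # the binary round trip preserves every bit
```

Timings depend on the machine and on file caching, so the elapsed values are only indicative. Raw writeBin output also carries no metadata (type, length, byte order), which a real format would need to record.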

Re: [Rd] Memory allocation in read.table

2013-08-28 Thread Simon Urbanek
On Aug 28, 2013, at 1:59 PM, Hadley Wickham wrote:

>>> Why do those lines need any allocations? I thought class<- and attr<-
>>> were primitives, and hence would modify in place.
>>>
>>
>> .. but only if there is no other reference to the data (i.e. NAMED < 2). If
>> there are two references,

Re: [Rd] Memory allocation in read.table

2013-08-28 Thread Hadley Wickham
>> Why do those lines need any allocations? I thought class<- and attr<-
>> were primitives, and hence would modify in place.
>>
>
> .. but only if there is no other reference to the data (i.e. NAMED < 2). If
> there are two references, they have to copy, because it would change the
> other copy.
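
[Editor's sketch, not part of the original message.] The copy-on-modify rule quoted above can be observed from R itself. The example below assumes nothing beyond base R. The variable names and the "units" attribute are made up for illustration, and tracemem() is only called when the running R build reports memory profiling support.

```r
# Sketch: attr<- on an object that has a second reference.
x <- c(1, 2, 3)
y <- x                       # x and y now refer to the same vector

# tracemem() needs an R build with memory profiling support.
has_profmem <- capabilities("profmem")[[1]]
if (has_profmem) tracemem(x)

# Two references exist, so R must duplicate before modifying;
# when tracing is active, the duplication is typically reported here.
attr(y, "units") <- "cm"

if (has_profmem) untracemem(x)

print(attributes(x))         # NULL: the original vector was not touched
print(attr(y, "units"))      # "cm": only the copy carries the attribute
```

Whether a modification with a single reference really happens in place depends on the R version and on how the object was created, so the sketch only demonstrates the two-reference case described in the reply.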

Re: [Rd] Memory allocation in read.table

2013-08-28 Thread Simon Urbanek
On Aug 28, 2013, at 12:17 PM, Hadley Wickham wrote:

> Hi all,
>
> I've been trying to learn more about memory profiling in R and I've
> been trying memory profiling out on read.table. I'm getting a bit of a
> strange result, and I hope that someone might be able to explain why.
>
> After running

[Rd] Memory allocation in read.table

2013-08-28 Thread Hadley Wickham
Hi all,

I've been trying to learn more about memory profiling in R and I've been trying memory profiling out on read.table. I'm getting a bit of a strange result, and I hope that someone might be able to explain why.

After running Rprof("read-table.prof", memory.profiling = TRUE, line.profiling
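
[Editor's sketch, not part of the original message.] The message is cut off by the archive, so the exact call and input file are not known. The following is a self-contained guess at the general workflow: profile a read.table() call with memory profiling enabled and summarise the result. The generated CSV, its size, its column names and both temporary paths are assumptions, not the data used in the thread.

```r
# Sketch: profile read.table() with memory profiling switched on.
# Both files are temporary; they are not the files from the thread.
csv_file  <- tempfile(fileext = ".csv")
prof_file <- tempfile(fileext = ".prof")

n <- 2e5
write.csv(data.frame(a = runif(n),
                     b = sample(letters, n, replace = TRUE)),
          csv_file, row.names = FALSE)

Rprof(prof_file,
      memory.profiling = capabilities("profmem")[[1]],  # needs build support
      line.profiling = TRUE)
df <- read.table(csv_file, header = TRUE, sep = ",")
Rprof(NULL)  # stop profiling

# summaryRprof() raises an error when the profile cannot be summarised
# (for example if no samples were recorded), so the call is guarded.
res <- tryCatch(summaryRprof(prof_file, memory = "both"),
                error = function(e) conditionMessage(e))
print(res)
```

Rprof() is a sampling profiler, so a very fast call may record few or no samples. Increasing n, or lowering the interval argument of Rprof(), gives the sampler more to work with.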