Hello, I have recently had cause to compare the performance of running R on my 10+-year-old PC running Debian Buster (Intel Core i7-920 CPU) and in the cloud on AWS. I got a surprising result, and I am wondering whether the R packages on Debian have been built with any flags that could account for the difference.
My PC was a mean machine when it was built, but that was in 2009, and I would expect it to be outperformed by up-to-date hardware. I have an R script I wrote which performs a moderately involved calculation column by column on a 4000-row by 10000-column matrix. On my Buster PC, performing the calculation on a single column takes 9.5 seconds. The code does not use any multi-CPU capabilities, so it uses just one of the 8 available virtual CPUs in my PC (4 cores with hyperthreading = 8 virtual CPUs).

Running the same code on the same data on a fairly high-spec AWS EC2 server in the cloud (the r5a.4xlarge variety, for those who know AWS), the same calculation takes 2 minutes and 6 seconds. Obviously there is virtualisation involved here, but at low load, with just one instance running and the machine not being asked to do anything else, I would have expected the AWS machine to be much closer to local performance, if not better, given the age of my PC. In the past I have run highly parallel Java programs in both environments and have seen much better results from AWS in Java-land, which led me to wonder whether it is something about how R is configured.

I am not getting anywhere in the AWS forums (unless you pay a lot of money you basically don't get much attention), so I was wondering whether anyone familiar with how the R packages are configured in Debian might know of anything done to optimise performance that could explain why it is so much faster on my local Debian machine. Is it purely local hardware versus virtualised? I am struggling to believe that, because I don't see the same phenomenon in Java programs.

Thanks for any ideas,

Mark
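
P.S. In case a concrete example helps: the sketch below is roughly the shape of what I am timing, not my actual code. The matrix is filled with random data, and per_column_calc is just a placeholder for my real, more involved per-column calculation.

    # 4000 x 10000 matrix of random data, standing in for my real data set
    m <- matrix(rnorm(4000 * 10000), nrow = 4000, ncol = 10000)

    # placeholder for the real per-column calculation (mine is more involved)
    per_column_calc <- function(x) {
      sum(sort(x)^2) / length(x)
    }

    # time the calculation on a single column, as described above
    system.time(per_column_calc(m[, 1]))

P.P.S. I am assuming the place to start comparing how R is configured on the two machines is something like the following, run on both (I believe sessionInfo() on R 3.4 and later reports which BLAS and LAPACK libraries are in use), but please correct me if that is not the right place to look:

    sessionInfo()   # reports the BLAS/LAPACK libraries in use on recent R versions

and, at the shell on Debian, something like this (I think that is the right alternatives name on amd64, but I am not certain):

    update-alternatives --display libblas.so.3-x86_64-linux-gnu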