On Fri, Jun 26, 2015 at 2:34 PM, <to...@tuxteam.de> wrote:
>
> On Fri, Jun 26, 2015 at 11:58:40AM +0200, Dan wrote:
>> On Fri, Jun 26, 2015 at 11:46 AM, <to...@tuxteam.de> wrote:
>
> [...]
>
>> No, I do not know that. I am a scientist, and I use computers as a
>> tool to run simulations that I write in C++ (with Threading Building
>> Blocks). I have limited knowledge of computer architecture. Would
>> that mean that a calculation that fits in 64GB will run slower with
>> 256GB? Or that the calculation will be slower when I increase its
>> size?
>
> Note that I'm deep in hand-waving territory here.
>
> Suppose you could reduce your problem size to one-fourth its current
> size, so that it fits in your 64 GB. Let's say it then needs a time
> T_0.
>
> Now consider your (real) four-fold problem. Leaving out all effects
> of swapping and the like (and only considering the "pure" algorithm),
> your run time will be T_1, which most probably is (depending on
> the algorithm's properties) bigger than T_0. For a linear algorithm
> T_1 >= 4 * T_0 (you're lucky!), for a quadratic one T_1 >= 16 * T_0,
> and so on (if you're _very_ lucky, you have a sublinear algorithm,
> but given all you've written before I wouldn't bet on that).
>
> Taking into account the effects of RAM, with 64 GB and your problem
> at its "real" size you get some time T_1_64 which is most probably
> significantly bigger than T_1: T_1_64 >> T_1, due to all the caching
> overhead. How much bigger will depend on how cache-friendly the
> algorithm is: if it is random-accessing data from all over the
> place, the slowdown will be horrible (in the limit, on the order of
> magnitude of the ratio of the SSD speed to the RAM speed; yeah,
> latency, bandwidth, some combination of both, pick the worst of
> them ;-)
>
> Now to the 256 GB case. Ideally, the thing fits in there, so ideally
> the time would be T_1_256 =~ T_1, since there is no swapping
> overhead, etc.
>
> What I was saying is that you might quite well get T_1_256 > T_1,
> because there are other factors (the CPU has a whole hierarchy
> of caches between itself and the RAM, because the RAM is horribly
> slow from the POV of the CPU). Those caches might be more
> overwhelmed by the bigger addressable memory.
>
> Now how much bigger, that is a tough question. Most probably you
> get
>
> T_1_64 >> T_1_256 > T_1
>
> so the extra RAM will help, but in some cases the slowdown from
> the "ideal" T_1 to the "real" T_1_256 might prove disappointing.
>
> Sometimes, partitioning the problem might give you more speed
> than throwing RAM at it. Sometimes!
>
> I think you have no choice but to try it out.
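[Replying inline to this last point before my main question below: if
I do try it out, I would probably start with a tiny probe along these
lines to see how cache-unfriendly my access pattern really is. This is
only an illustrative sketch with made-up sizes (array length, RNG
seed) and no TBB, not code from my actual simulation.]

// cache_probe.cpp -- rough sketch, not my real simulation code.
// Compares a sequential vs. a random traversal of a large array to
// get a feel for how much a cache-unfriendly access pattern costs.
// Build: g++ -O2 -std=c++11 cache_probe.cpp -o cache_probe
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <iostream>
#include <numeric>
#include <random>
#include <vector>

int main() {
    const std::size_t n = 1ull << 26;   // 64M doubles, ~512 MiB; adjust to taste
    std::vector<double> data(n, 1.0);

    // Two visit orders over the same data: one in sequence, one shuffled.
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), 0);
    std::vector<std::size_t> shuffled(order);
    std::shuffle(shuffled.begin(), shuffled.end(), std::mt19937_64{42});

    auto walk = [&](const std::vector<std::size_t>& idx, const char* label) {
        auto t0 = std::chrono::steady_clock::now();
        double sum = 0.0;
        for (std::size_t i : idx) sum += data[i];
        auto t1 = std::chrono::steady_clock::now();
        std::cout << label << ": sum=" << sum << ", "
                  << std::chrono::duration<double>(t1 - t0).count() << " s\n";
    };

    walk(order, "sequential");   // cache- and prefetch-friendly
    walk(shuffled, "random");    // roughly the worst case described above
    return 0;
}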
Hi,

I took a closer look into this. I found out that it is important to
have at least one DIMM per channel, but if you have several DIMMs per
channel there can be a performance hit: many servers clock the memory
down to a lower speed when you add a second or third DIMM per channel
(DPC).

https://marchamilton.wordpress.com/2012/02/07/optimizing-hpc-server-memory-configurations/
http://frankdenneman.nl/2015/02/20/memory-deep-dive/

From what I could find, in general there is no drop with 2 DIMMs per
channel, but there is a drop with 3 DIMMs per channel.

I can buy 8 x 32 GB or 16 x 16 GB. The first option is more expensive
than the second one, but then I would have only one DIMM per channel.

Any suggestions or experience with this?

Thanks,
Dan
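P.S. Once the machine is set up, a quick way to see what speed the
DIMMs are actually running at (as opposed to their rated speed) should
be something like:

  sudo dmidecode --type 17

and comparing the "Speed" and "Configured Clock Speed" fields for each
module (I think the exact field name varies a bit between dmidecode
versions). That would at least confirm whether the BIOS has clocked
the memory down in a 2 DPC configuration.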