sizing / capacity planning tips related to expected requests or transactions per second
Hi,

are there any nice rules of thumb about capacity planning in relation to the expected number of transactions or requests per second?

For example, if I have around 100,000 transactions per second on a 5 TB database, with what amount of memory and CPUs/cores and which settings would you basically start to evaluate the performance?

Or are there any other recommendations or experiences here?

Thanks and best regards

Dirk
Re: sizing / capacity planning tips related to expected requests or transactions per second
Hi Dirk,

There are a bunch of other things to consider besides just TPS and the size of the database. Since PG is process-bound, I would consider connection activity: how many active connections are there at any one time? This greatly affects your CPUs. The SQL workload is another big factor: a lot of complex queries may use up, or want to use up, large amounts of work_mem, which greatly affects your memory capacity. There is a bunch of other stuff, but these are my top 2.

Regards,
Michael Vitale
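For the record, a quick way to eyeball the two factors above is to sample pg_stat_activity and the current work_mem setting. This is only a minimal sketch in Python/psycopg2; the connection string is a placeholder, and counting only backends in state 'active' is an assumption about what "active connections" means here:

    import psycopg2

    # Placeholder DSN -- adjust for your environment (assumption, not from the thread).
    conn = psycopg2.connect("dbname=mydb")

    with conn, conn.cursor() as cur:
        # Backends currently executing a statement; idle connections cost far less CPU.
        cur.execute("SELECT count(*) FROM pg_stat_activity WHERE state = 'active'")
        active = cur.fetchone()[0]

        cur.execute("SHOW max_connections")
        max_connections = int(cur.fetchone()[0])

        # work_mem is granted per sort/hash node, so complex queries can multiply it.
        cur.execute("SHOW work_mem")
        work_mem = cur.fetchone()[0]

    print(f"active backends: {active} of max_connections {max_connections}")
    print(f"work_mem per sort/hash operation: {work_mem}")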
Re: sizing / capacity planning tips related to expected requests or transactions per second
Hi,

On Mon, Aug 24, 2020 at 18:40, Dirk Krautschick <[email protected]> wrote:

> Hi,
>
> are there any nice rules of thumb about capacity planning in relation to the expected number of transactions or requests per second?
>
> For example, if I have around 100,000 transactions per second on a 5 TB database, with what amount of memory and CPUs/cores and which settings would you basically start to evaluate the performance?

You have to know the duration of a typical query: if it is 1 ms, then one CPU can do 1,000 tps and you need 100 CPUs. If the duration is 10 ms, then you need 1,000 CPUs.

A reasonable minimum of RAM for OLTP is 10% of the database size, in your case 500 GB of RAM.

Any time I see a requirement higher than 20-30K tps, it is good to think about horizontal scaling or about sharding.

Regards

Pavel

> Or are there any other recommendations or experiences here?
>
> Thanks and best regards
>
> Dirk
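To make Pavel's arithmetic concrete, here is a tiny back-of-the-envelope sketch. It is an illustration of the rule of thumb, not a substitute for benchmarking; the one-query-per-core model and the 10% RAM ratio are exactly the assumptions stated above:

    # Back-of-the-envelope sizing from target TPS, typical query duration, and DB size.
    # Assumptions (from the rule of thumb above): one core runs one query at a time,
    # and OLTP RAM should be at least ~10% of the database size.

    def estimate_capacity(target_tps: float, avg_query_ms: float, db_size_gb: float):
        queries_per_core = 1000.0 / avg_query_ms      # e.g. 1 ms -> 1,000 tps per core
        cores = target_tps / queries_per_core
        ram_gb = 0.10 * db_size_gb                    # 10% of database size
        return cores, ram_gb

    cores, ram_gb = estimate_capacity(target_tps=100_000, avg_query_ms=1.0, db_size_gb=5_000)
    print(f"~{cores:.0f} cores, ~{ram_gb:.0f} GB RAM")  # ~100 cores, ~500 GB RAM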
Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED
On Tue, Aug 18, 2020 at 8:22 PM Jim Jarvie wrote:

> I've tuned the LIMIT value both up and down. As I move the limit up, the problem becomes substantially worse; 300 swamps it and the selects take over 1 hour to complete; at 600 they just all lock everything up and it stops processing. I did try 1,000 but it basically resulted in nothing being processed.

You've only described what happens when you turn the LIMIT up. What happens when you turn it down? Why did you pick 250 in the first place? I don't see the rationale for having 250*256 rows locked simultaneously. I can see reasons you might want a LIMIT as high as 250, or for having 256 processes. I just don't see why you would want to do both in the same system.

> Fewer processes do not give the throughput required because the queue sends data elsewhere, which has a long round-trip time but does permit over 1K concurrent connections as their workaround for throughput. I'm stuck having to scale up my concurrent processes in order to compensate for the long processing time of an individual queue item.

You've tied the database concurrency to the external process concurrency. While this might be convenient, there is no reason to think it will be optimal. If you achieve concurrency by having 256 processes, why does each process need to lock 250 rows at a time? Having 64,000 rows locked to obtain 256-fold concurrency seems like a poor design. With modern tools it should not be too hard to have just one process obtain 1,000 rows and launch 1,000 concurrent external tasks, either with threads (making sure only one thread deals with the database) or with asynchronous operations. (Then the problem would be how to harvest the results; it couldn't unlock the rows until all external tasks have finished, which would be a problem if some took much longer than others.)

It is easy to reproduce scaling problems when you have a large number of processes trying to do ORDER BY id LIMIT 250 FOR UPDATE SKIP LOCKED without all the partitioning and stuff. I don't know if the problems are as severe as you describe with your very elaborate setup, or even if they have the same bottleneck. But in the simple case, there seems to be a lot of spin-lock contention, as every selecting query needs to figure out whether every marked-as-locked row is truly locked, by asking if the apparently-locking transaction is still valid.

Cheers,

Jeff
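To illustrate the "one claimer, many workers" idea above, here is a minimal sketch: a single connection claims one batch with FOR UPDATE SKIP LOCKED and fans the rows out to a thread pool, so only one backend holds row locks and only the main thread talks to the database. The table and column names (job_queue, id, payload) and process_item() are hypothetical, and the locks stay held until the whole batch finishes, which is exactly the caveat noted above:

    # Sketch: one process claims a batch of queue rows and fans the work out to
    # threads; only the main thread touches the database connection.
    # Table/column names (job_queue, id, payload) and process_item() are assumptions.
    from concurrent.futures import ThreadPoolExecutor

    import psycopg2

    def process_item(item_id, payload):
        # Placeholder for the slow external round trip described in the thread.
        pass

    def run_batch(conn, batch_size=1000, workers=256):
        with conn.cursor() as cur:
            cur.execute(
                """
                SELECT id, payload
                FROM job_queue
                ORDER BY id
                LIMIT %s
                FOR UPDATE SKIP LOCKED
                """,
                (batch_size,),
            )
            rows = cur.fetchall()

            with ThreadPoolExecutor(max_workers=workers) as pool:
                futures = [pool.submit(process_item, item_id, payload)
                           for item_id, payload in rows]
                for f in futures:
                    f.result()  # surface worker errors before committing

            cur.execute("DELETE FROM job_queue WHERE id = ANY(%s)",
                        ([r[0] for r in rows],))
        conn.commit()  # locks released here, only after all external tasks finish

    conn = psycopg2.connect("dbname=mydb")  # placeholder DSN
    run_batch(conn)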
