Low performance between datacenters
hi All,
I am new on the list.
I hope someone can give me an adequate answer or good advice about my
problem.
I have a client in GCP (normally a web service; for testing I use the psql
client). There is a PSQL server in another DC. The ping response time is 20ms.
I measured the bandwidth via scp and it is more than 1Gb/s which is more
than enough IMO.
The psql connection between the DCs was unexpectedly slow for me.
I would expect a query without data ('select now()') to be a bit slower due to
the increased latency, and data transfer to run at a somewhat similar speed.
What I see is that
select now() increased from 0.7ms to 20ms which is OK.
And 'select *' on a table with 3082 rows (so it's a small table) increased
from 10ms to 800ms.
Is this normal? Can I improve it somehow?
Thank you,
Re: Low performance between datacenters
On Sun, Jul 7, 2024 at 3:17 PM, Tamás PAPP ([email protected]) wrote:
> hi All,
>
> I am new on the list.
> I hope someone can give me an adequate answer or good advice about my
> problem.
>
> I have a client (normally a web service, for testing the psql client) in
> GCP. There is a PSQL server in another DC. The ping response time is 20ms.
> I measured the bandwidth via scp and it is more than 1Gb/s which is more
> than enough IMO.

Hi Tamás,

Bandwidth and latency are two different concepts. Latency is the
round-trip time of a data packet travelling over the network, while
bandwidth is the amount of data you can move concurrently.

Latency adds up over every exchange between the client and the server.
For example, establishing a connection involves a bit of back-and-forth
dialogue, and each of those exchanges is affected by latency, since either
the client or the server has to wait for the other party to receive the
message and then respond to it. Latency has an impact even when
transmitting data, as in a plain SELECT * FROM table. Bear in mind that
psql uses TCP, which sends ACK packets at regular intervals to assert the
integrity of the transmitted data.

> The psql connection between the DCs for me was unexpectedly slow.
> I would expect a bit slower query without data ('select now()') due to the
> increased latency and somewhat similar speed of data transfer.
> What I see is that
>
> select now() increased from 0.7ms to 20ms which is OK.
> And 'select *' on a table with 3082 rows (so it's a small table) increased
> from 10ms to 800ms.

Since you aren't providing much evidence, I can only speculate; a lot
might be going on to explain this kind of increase in query delay: a busy
database, a busy network, locks, latency, network unreliability, etc.

> Is this normal? Can I improve it somehow?

Move your web service physically as close as possible to the database
server.
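One way to see how much of the 800ms is spent in the server versus on the
wire is to compare the server-reported execution time with the client-side
timing. A minimal psql sketch, assuming a placeholder table name my_table:

    \timing on
    -- "Execution Time" in the output below is measured on the server only:
    EXPLAIN (ANALYZE) SELECT * FROM my_table;
    -- the "Time:" line printed by \timing also includes network transfer
    -- and client-side rendering:
    SELECT * FROM my_table;

If EXPLAIN ANALYZE reports only a few milliseconds while \timing reports
hundreds, the extra time is being spent between client and server rather
than in the query itself.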
Re: Low performance between datacenters
This often happens because of a low batch size when fetching data, which
makes the client wait unnecessarily while reading rows. I am not sure which
client you are using, but in Java this can be controlled at the
per-statement level, see
https://jdbc.postgresql.org/documentation/query/
I believe there is also a connection parameter to set the default value,
but I don't remember it off the top of my head. You can definitely set it
on the connection, see
https://jdbc.postgresql.org/documentation/publicapi/org/postgresql/jdbc/PgConnection.html#setDefaultFetchSize-int-
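For illustration only, here is a SQL-level sketch of what a driver with a
small fetch size does over a 20ms link (the table name is a placeholder;
JDBC uses a portal rather than an explicit cursor, but the round-trip
pattern is similar):

    BEGIN;
    DECLARE c CURSOR FOR SELECT * FROM my_table;
    FETCH 100 FROM c;  -- each FETCH is one network round trip, i.e. ~20ms
    FETCH 100 FROM c;  -- 3082 rows / 100 per batch ≈ 31 round trips ≈ 620ms of latency alone
    -- ...and so on until no rows remain
    CLOSE c;
    COMMIT;

Raising the fetch size (or letting the driver pull the whole result set in
one go) reduces the number of round trips accordingly.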
On Sun, Jul 7, 2024 at 10:17 Tamás PAPP wrote:
> hi All,
>
> I am new on the list.
> I hope someone can give me an adequate answer or good advice about my
> problem.
>
> I have a client (normally a web service, for testing the psql client) in
> GCP. There is a PSQL server in another DC. The ping response time is 20ms.
> I measured the bandwidth via scp and it is more than 1Gb/s which is more
> than enough IMO.
>
>
> The psql connection between the DCs for me was unexpectedly slow.
> I would expect a bit slower query without data ('select now()') due to the
> increased latency and somewhat similar speed of data transfer.
> What I see is that
>
> select now() increased from 0.7ms to 20ms which is OK.
> And 'select *' on a table with 3082 rows (so it's a small table) increased
> from 10ms to 800ms.
>
>
> Is this normal? Can I improve it somehow?
>
>
> Thank you,
>
>
Re: Hash Right join and seq scan
Sorry for the confusion, those numbers are from the attached explain output
of the SQL, please check the attachment. My question is: for a nested loop
of two partitioned tables that use the same partition key and an equality
join on that key, could the cost be estimated as "outer table's estimated
rows" * (average index scan cost of only one partition of the inner table),
instead of "outer table's estimated rows" * (index scans of all partitions)?
Is that possible, or does it still need a run-time partition pruning
enhancement?

random_page_cost = 1.1, seq_page_cost = 1.0, effective_cache_size = 0.75 *
physical memory size. Setting random_page_cost = 0.9 makes the optimizer
choose an index scan instead of a seq scan.

Thanks,

James

David Rowley wrote on Sat, Jul 6, 2024 at 8:33 AM:
> On Sat, 6 Jul 2024 at 02:43, James Pang wrote:
> > for nest loop path, since the first one estimated only "8" rows, and
> > they use partitionkeyid as join key and all are hash partitions, is it
> > better to estimate the cost as 8 (loop times) * 1600 = 12800 (each loop
> > maps to only 1 hash partition bitmap scan, avg one partition cost)?
> > That's much less than 398917.29 of all partitions.
>
> I'm not really sure where you're getting the numbers from here. The
> outer side of the deepest nested loop has an 8 row estimate, not the
> nested loop itself. I'm unsure where the 1600 is from. I only see
> 1669.
>
> As of now, we don't do a great job of costing for partition pruning
> that will happen during execution. We won't be inventing anything to
> fix that in existing releases of PostgreSQL, so you'll need to either
> adjust the code yourself, or find a workaround.
>
> You've not shown us your schema, but perhaps enable_partitionwise_join
> = on might help you. Other things that might help are further lowering
> random_page_cost or raising effective_cache_size artificially high.
> It's hard to tell from here how much random I/O is being costed into
> the index scans. You could determine this by checking if the nested
> loop plan costs change as a result of doing further increases to
> effective_cache_size. You could maybe nudge it up enough for it to win
> over the hash join plan. It is possible that this won't work, however.
>
> > for the secondary Nest Loop Anti Join, could the rows be 299118 rows *
> > 15.78 (avg index scan cost of one partition) = 4,720,082? That is still
> > much less than 132168227.57.
> > For the Hash Right Join, is it possible to estimate by 8 seq partition
> > scans instead of all 32 hash partitions, since the first query estimated
> > only 8 rows?
> > Extended statistics may help estimate count(partitionkeyid) based on the
> > other columns' bind variables, but that did not seem to help the table
> > join case.
>
> I can't quite follow this. You'll need to better explain where you're
> getting these numbers for me to be able to understand.
>
> David

--SQL text
prepare slowsql10(bigint,bigint,timestamp,timestamp) AS
SELECT m.partitionkeyid, COUNT(w.ctinfo) AS __count
  FROM testaa.wmdata m
  LEFT JOIN testaa.wmvtee w ON m.partitionkeyid = w.partitionkeyid
 WHERE m.sstype = 'lltype'
   AND m.llid = $1
   AND m.hhid = $2
   AND m.stime BETWEEN $3 AND $4
   AND NOT EXISTS (SELECT 1 FROM testaa.wmpct P
                    WHERE P.partitionkeyid = m.partitionkeyid
                      AND LOWER(P.ctinfo) = LOWER(w.ctinfo))
 GROUP BY m.partitionkeyid

Generally speaking it runs in less than 2 seconds, but sometimes it suddenly
goes to a bad plan.

Schema: all three tables are hash-partitioned (the partitionkeyid column is
also the leading index key). Generally, with index scans and partition
pruning on partitionkeyid, it is a good plan. But sometimes the optimizer
suddenly switches to a "Hash Right Join" and a seq scan on the large table
testaa.wmpct, which makes a very bad plan. From the cost, it looks like the
optimizer calculates the cost of the bitmap index scans across all
partitions, which shows a huge cost, even though only a small part of the
partitions is actually needed and the others are "(never executed)". The
"rows" estimate for the left outer join is similarly off, which also makes a
huge difference. Is there any way to make the optimizer consider partition
pruning for this SQL?

--the bad plan explain output; with analyze mode it hangs and consumes huge
temp space, so I cancelled it:
test=> explain execute slowsql10(1234567,678901234,'2024-06-15 03:00:00','2024-07-02 03:00:00');
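For what it's worth, the settings mentioned in the quoted reply can be tried
at the session level before changing anything globally. A rough sketch using
the bind values from the EXPLAIN above (the effective_cache_size value is
only an example; random_page_cost = 0.9 is the value reported to flip the
plan to index scans):

    SET enable_partitionwise_join = on;
    SET random_page_cost = 0.9;          -- value reported above to switch to index scans
    SET effective_cache_size = '96GB';   -- example value only; raise it to test plan sensitivity
    EXPLAIN EXECUTE slowsql10(1234567, 678901234, '2024-06-15 03:00:00', '2024-07-02 03:00:00');
    RESET ALL;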
Re: Hash Right join and seq scan
Is the query fast with some bind parameters but slow with others?
If so, it'd be better to show an explain with 'fast' and 'slow' bind
params, rather than the same bind params with enable_*=off.
Or is the change because autoanalyze runs on some table and changes the
statistics enough to change the plan ? Investigate by setting
log_autovacuum_min_duration=0 or by checking
pg_stat_all_tables.last_{auto,}{vacuum,analyze}.
Maybe your llid/hhid are correlated, and you should CREATE STATISTICS.
Or maybe the answer will be to increase the stats target.
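A minimal sketch of those two suggestions, assuming llid and hhid are the
correlated columns of testaa.wmdata from the query above (the statistics
name and the target value 1000 are arbitrary):

    CREATE STATISTICS wmdata_llid_hhid (dependencies, ndistinct)
      ON llid, hhid FROM testaa.wmdata;
    -- or, alternatively, raise the per-column statistics target:
    ALTER TABLE testaa.wmdata ALTER COLUMN llid SET STATISTICS 1000;
    ALTER TABLE testaa.wmdata ALTER COLUMN hhid SET STATISTICS 1000;
    ANALYZE testaa.wmdata;   -- the new statistics only take effect after ANALYZE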
