Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade
On Thu, Dec 7, 2017 at 2:31 PM, Laurenz Albe wrote:
> Gunther wrote:
>> Something is wrong with the dump thing. And no, it's not SSL or whatever,
>> I am doing it on a local system with local connections. Version 9.5
>> something.
>
> That's a lot of useful information.
>
> Try to profile where the time is spent, using "perf" or similar.
>
> Do you connect via the network, TCP localhost or UNIX sockets?
> The last option should be the fastest.

You can use SSL over a local TCP connection; whether that is happening here is the question. In my experience SSL itself isn't a problem, but compression *is*. With a modern-enough openssl, enabling compression is difficult: it is forcibly disabled by default because of the vulnerabilities recently discovered around its use. So chances are that, no matter what you configured, compression isn't being used.

I never measured it against earlier versions, but pg_dump is indeed quite slow, and the biggest offender is formatting the COPY data to be transmitted over the wire. That's why parallel dump is so useful: you can use all your cores and get almost perfect multicore scaling.

Compression of the archive is also a big overhead. If you want compression but want to keep the overhead to a minimum, set the lowest compression level (1). Something like:

pg_dump -Fd -j 8 -Z 1 -f target_dir yourdb
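One caveat: parallel dump works at table granularity, so it helps most when the data is spread over several large tables. A rough sketch to check whether that's the case (run in psql against your db):

-- Largest tables by total size; if a single table dominates,
-- pg_dump -j won't speed up that table's data much.
SELECT relname,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relkind = 'r'
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 10;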
Re: Need Help on wal_compression
On Tue, Jan 9, 2018 at 3:53 AM, Rambabu V wrote:
> Hi Team,
>
> Daily 4000 Archive files are generating and these are occupying more
> space, we are trying to compress wall files with using wal_compression
> parameter, but we are not seeing any change in wal files count, could you
> please help us on this.
>
That's very little information to go on.
You'll probably want to inspect WAL record stats before and after enabling
wal_compression to see whether it makes sense to do so. Take a look at
pg_xlogdump --stats
For example:
$ pg_xlogdump --stats -p /path/to/pg_xlog 00010002C36400F0 00010002C36400FA
Type                        N      (%)   Record size      (%)    FPI size      (%)   Combined size      (%)
----                        -      ---   -----------      ---    --------      ---   -------------      ---
XLOG                        0 (  0.00)             0 (  0.00)           0 (  0.00)               0 (  0.00)
Transaction                11 (  0.00)           352 (  0.00)           0 (  0.00)             352 (  0.00)
Storage                     0 (  0.00)             0 (  0.00)           0 (  0.00)               0 (  0.00)
CLOG                        0 (  0.00)             0 (  0.00)           0 (  0.00)               0 (  0.00)
Database                    0 (  0.00)             0 (  0.00)           0 (  0.00)               0 (  0.00)
Tablespace                  0 (  0.00)             0 (  0.00)           0 (  0.00)               0 (  0.00)
MultiXact                   4 (  0.00)           208 (  0.00)           0 (  0.00)             208 (  0.00)
RelMap                      0 (  0.00)             0 (  0.00)           0 (  0.00)               0 (  0.00)
Standby                     2 (  0.00)           116 (  0.00)           0 (  0.00)             116 (  0.00)
Heap2                    2504 (  0.18)         78468 (  0.20)     1385576 (  3.55)         1464044 (  1.89)
Heap                   667619 ( 48.23)      19432159 ( 50.47)    28641357 ( 73.35)        48073516 ( 61.99)
Btree                  712093 ( 51.45)      18643846 ( 48.42)     9021270 ( 23.10)        27665116 ( 35.67)
Hash                        0 (  0.00)             0 (  0.00)           0 (  0.00)               0 (  0.00)
Gin                         0 (  0.00)             0 (  0.00)           0 (  0.00)               0 (  0.00)
Gist                        0 (  0.00)             0 (  0.00)           0 (  0.00)               0 (  0.00)
Sequence                 1918 (  0.14)        349076 (  0.91)           0 (  0.00)          349076 (  0.45)
SPGist                      0 (  0.00)             0 (  0.00)           0 (  0.00)               0 (  0.00)
BRIN                        0 (  0.00)             0 (  0.00)           0 (  0.00)               0 (  0.00)
CommitTs                    0 (  0.00)             0 (  0.00)           0 (  0.00)               0 (  0.00)
ReplicationOrigin           0 (  0.00)             0 (  0.00)           0 (  0.00)               0 (  0.00)
Total                 1384151               38504225 [49.65%]    39048203 [50.35%]        77552428 [100%]
That shows that about 50% of the WAL volume is full page images, and that is with compression already enabled. WAL compression only helps full page writes, so if you don't have a large volume of FPW, or they don't compress well, you won't benefit much.
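For reference, switching it on only needs a reload, not a restart. Something like this (assuming a 9.5+ server, which you need anyway for wal_compression to exist):

ALTER SYSTEM SET wal_compression = on;
SELECT pg_reload_conf();
-- then let it run for a while and compare a fresh
-- pg_xlogdump --stats sample against the numbers above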
Re: Batch insert heavily affecting query performance.
On Wed, Dec 27, 2017 at 2:10 PM, Mike Sofen wrote:
> In my experience, that 77ms will stay quite constant even if your db grew
> to > 1TB. Postgres IS amazing. BTW, for a db, you should always have
> provisioned IOPS or else your performance can vary wildly, since the SSDs
> are shared.
>
> Re Lambda: another team is working on a new web app using Lambda calls
> and they were also experiencing horrific performance, just like yours (2
> seconds per call). They discovered it was the Lambda connection/spin-up
> time causing the problem. They solved it by keeping several Lambda’s
> “hot”, for an instant connection…solved the problem, the last I heard.
> Google for that topic, you’ll find solutions.

You should try to implement an internal connection pool in your lambda. Lambda functions are reused: you have no guarantee as to how long those processes will live, but they will live for more than one request. So if you keep a persistent connection in your lambda code, the first invocation may be slow, but further invocations will be fast.

Lambda will also try to batch several calls at once; in fact, you can usually configure batching in the event source to maximize this effect.

In my experience, your lambda will most probably be network-bound. Increase the lambda's memory allocation to get a bigger chunk of the available network bandwidth (why they decided to call that "memory", nobody will ever be able to tell).
Re: 8.2 Autovacuum BUG ?
On Tue, Jan 23, 2018 at 11:39 AM, Pavan Teja wrote:
> Yes so many wals are continuing to be produced.
>
> Deleting the wals after a backup of the database.
>
> Yes archiving mode is on. And the warning message in log file is
>
> "checkpoints are frequently occurring (1 second apart). Consider
> increasing checkpoint_segments parameter".
>
> My doubt is previously the same are the parameters which are reflected as
> of now. Then what is the point in considering altering those values.
> Correct me if I am wrong.

You can use pg_xlogdump to inspect those logs and see which relations/transactions are generating so much WAL. Then you can hunt within your apps for the code responsible for that traffic, or determine whether it is in fact autovacuum.
Re: 8.2 Autovacuum BUG ?
On Tue, Jan 23, 2018 at 1:16 PM, Pavan Teja wrote:
> On Jan 23, 2018 9:37 PM, "Claudio Freire" wrote:
>
> On Tue, Jan 23, 2018 at 11:39 AM, Pavan Teja wrote:
>
>> Yes so many wals are continuing to be produced.
>>
>> Deleting the wals after a backup of the database.
>>
>> Yes archiving mode is on. And the warning message in log file is
>>
>> "checkpoints are frequently occurring (1 second apart). Consider
>> increasing checkpoint_segments parameter".
>>
>> My doubt is previously the same are the parameters which are reflected as
>> of now. Then what is the point in considering altering those values.
>> Correct me if I am wrong.
>
> You can use pg_xlogdump to inspect those logs and see which
> relations/transactions are generating so much WAL.
>
> Then you can hunt within your apps which code is responsible for that
> traffic, or whether it in fact is autovacuum.
>
> Hi Claudio,
>
> Is pg_xlogdump available for postgres 9.1, as my current production is
> postgres 9.1.

Right, it was added in 9.3. I'm unsure whether it can parse pre-9.3 WAL. Technically speaking, WAL doesn't have to stay compatible across versions, but it might be for the limited purposes of xlogdump.

> Yes investigated in that area, found DML's and also autovacuum statements
> for some relations. And the DML's are the same before this huge WAL traffic
> and normal WAL traffic.
>
> Anyways, thanks for your timely response 😊

While looking at current query activity makes sense, if you can't identify a culprit that way, inspecting the WAL directly will tell you with precision what is causing all that WAL. Hence the suggestion.

If xlogdump doesn't work on 9.1, I'm not sure what you can do. One idea that pops to mind, though there's probably a better one: you may want to consider attaching strace to a recovery process on a replica, preferably one you're not worried about slowing down. Analyzing the output from that is much harder, but it may give you some insight. You'll have to correlate file handles to file names to relations manually, which can be quite a chore.
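If xlogdump is out of reach, one crude cross-check you can do from SQL alone (just a sketch) is to sample the WAL insert position over time and see when it jumps:

-- Run periodically (e.g. from cron) and log the output with a timestamp.
-- On 9.1 the function is pg_current_xlog_location(); big jumps between
-- samples show exactly when the WAL bursts happen.
SELECT now(), pg_current_xlog_location();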
Re: 8.2 Autovacuum BUG ?
On Wed, Jan 24, 2018 at 3:54 AM, pavan95 wrote:
> Hi Claudio,
>
> We didn't configure any replication to our production server. Which strace
> are you talking about?

This one: https://linux.die.net/man/1/strace

You can attach it to a process (assuming you have the necessary permissions) and it will report all the syscalls the process does. That does slow down the process, though.

Then lsof ( https://linux.die.net/man/8/lsof ) can be used to map file descriptor numbers to file paths. You have to do that as soon as you read the output, because files get closed and file descriptors get reused, so it's better to have a script that reads directly from /proc/pid/fd or fdinfo, but that takes some programming. It is nontrivial, but sometimes it's the only tool in your belt. You may want to try something else first, though.

> We made a keen observation that only at the 9th minute of the hour and the
> 39th minute of the hour are the so-called archive logs generated, even
> when nobody is connecting from the application (off business hours).

Well, if you don't know what happens at those times (and only at those times), that's not that useful. Since you don't know for certain what is causing this, the first thing you have to do is ascertain that.

Try increasing logging as much as you can, especially around those times, and see what turns on then and not at other times. You can also monitor autovacuum processes in pg_stat_activity, so make sure you check that as well, since autovacuum only logs once it's done. You know autovacuum is running at those times; you have to check whether it is absent when WAL isn't being generated, and whether it is vacuuming the same tables over and over, or what.

Your earlier mails show autoanalyze runs, not vacuum. Those shouldn't cause so much WAL, but if they run very often and you have lots of stats, then maybe.

You can also try pg_stat_statements: https://www.postgresql.org/docs/9.1/static/pgstatstatements.html

Again, concentrate on the differential: what happens at those times that doesn't happen at other times.

Another idea would be to check for freeze runs in autovacuum, i.e. what's described here: https://wiki.postgresql.org/wiki/VacuumHeadaches#FREEZE

There's a nice blog post with some queries to help with that here: http://www.databasesoup.com/2012/09/freezing-your-tuples-off-part-1.html (and its continuation here: http://www.databasesoup.com/2012/10/freezing-your-tuples-off-part-2.html ).

I'm not saying you should tune those parameters (what you were showing was autoanalyze activity, not vacuum freeze), but you should check whether you need to anyway.
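For the pg_stat_activity part, something along these lines is what I mean (just a sketch; on 9.1 the columns are procpid/current_query, later versions renamed them to pid/query):

-- Sample this every few seconds around the 9th/39th minute and log it;
-- autovacuum workers show up with current_query like 'autovacuum: ...'.
SELECT now(), procpid, xact_start, current_query
FROM pg_stat_activity
WHERE current_query LIKE 'autovacuum:%';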
Re: 8.2 Autovacuum BUG ?
On Wed, Jan 24, 2018 at 8:50 AM, pavan95 wrote:
> Hello all,
>
> One more interesting observation made by me.
>
> I have run the below query(s) on production:
>
> SELECT
>     relname,
>     age(relfrozenxid) as xid_age,
>     pg_size_pretty(pg_table_size(oid)) as table_size
> FROM pg_class
> WHERE relkind = 'r' and pg_table_size(oid) > 1073741824
> ORDER BY age(relfrozenxid) DESC;
>
>   relname  | xid_age | table_size
> -----------+---------+------------
>  *hxx*     | 7798262 | 3245 MB
>  hrx       | 7797554 | 4917 MB
>  irxx      | 7796771 | 2841 MB
>  hr_       | 7744262 | 4778 MB
>  reimbxxx  | 6767712 | 1110 MB
>
> show autovacuum_freeze_max_age;
>  autovacuum_freeze_max_age
> ---------------------------
>  2
> (1 row)

You seem to be rather far from the freeze_max_age. Unless you're consuming txids at a very high rate, I don't think that's your problem.
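If you want to check the txid consumption rate, a quick sketch: sample txid_current() a couple of times, some minutes apart, and divide the difference by the elapsed time.

-- Run twice, e.g. an hour apart; the difference in txid_current()
-- is roughly the number of transaction ids consumed in that interval.
SELECT now(), txid_current();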
Re: 8.2 Autovacuum BUG ?
On Tue, Jan 30, 2018 at 10:55 AM, pavan95 wrote:
> Hello all,
>
> Will a sudden restart (stop/start) of a postgres database generate this
> huge WAL?

It shouldn't.
Re: effective_io_concurrency on EBS/gp2
On Wed, Jan 31, 2018 at 1:57 PM, Vitaliy Garnashevich wrote:
> More tests:
>
> io1, 100 GB:
>
> effective_io_concurrency=0
> Execution time: 40333.626 ms
> effective_io_concurrency=1
> Execution time: 163840.500 ms

In my experience playing with prefetch, e_i_c > 0 interferes with kernel read-ahead. What you've got there would make sense if what postgres thinks will be random I/O ends up being sequential. With e_i_c = 0, the kernel will optimize the hell out of it, because it's a predictable pattern. But with e_i_c = 1, the kernel's optimization gets disabled while postgres isn't reading much ahead either, so you get the worst possible case.
Re: effective_io_concurrency on EBS/gp2
On Wed, Jan 31, 2018 at 11:21 PM, hzzhangjiazhi wrote:
> HI
>
> I think this parameter will be useful when the storage uses RAID
> stripes; otherwise turning up this parameter is meaningless when there is
> only one device.

Not at all. Especially on EBS, where keeping a relatively full queue is necessary to get max throughput out of the drive.

Problem is, if you're scanning a highly correlated index, the mechanism is counterproductive. I had worked on some POC patches for correcting that; I guess I could work something out, but it's low priority for me. Especially since it's actually a kernel "bug" (or shortcoming) that could be fixed in the kernel rather than worked around by postgres.
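You can check whether you're in that highly-correlated situation from the planner stats. Rough sketch (table/column names are placeholders, substitute your own):

-- correlation near 1 or -1 means index order matches heap order, so
-- bitmap heap scans will be mostly sequential; near 0 means random
-- heap access. 'pgbench_accounts'/'aid' are just example names.
SELECT tablename, attname, correlation
FROM pg_stats
WHERE tablename = 'pgbench_accounts' AND attname = 'aid';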
Re: effective_io_concurrency on EBS/gp2
On Sat, Feb 3, 2018 at 8:05 PM, Vitaliy Garnashevich wrote:
> Looks like this behavior is not caused by, and does not depend on:
> - variable performance in the cloud
> - order of rows in the table
> - whether the disk is EBS (backed by SSD or HDD), or ordinary SSD
> - kernel version
>
> Does this mean that the default setting for eic on Linux is just inadequate
> for how the modern kernels behave? Or am I missing something else in the
> tests?
>
> Regards,
> Vitaliy

I have analyzed this issue quite extensively in the past, and I can say with high confidence that your analysis on point 2 is most likely wrong.

Now, I don't have all the information to make that a categorical assertion (you might have a point), but I believe you're misinterpreting the data.

I mean that the issue is indeed affected by the order of rows in the table. Random heap access patterns result in sparse bitmap heap scans, whereas less random heap access patterns result in denser bitmap heap scans. Dense scans have large portions of contiguous fetches, a pattern that is quite adversely affected by the current prefetch mechanism in linux.

This analysis does point to the fact that I should probably revisit this issue. There's a rather simple workaround: postgres should just avoid issuing prefetch requests for sequential block patterns, since those are already much better handled by the kernel itself.
Re: effective_io_concurrency on EBS/gp2
On Mon, Feb 5, 2018 at 8:26 AM, Vitaliy Garnashevich wrote:
>> I mean, that the issue is indeed affected by the order of rows in the
>> table. Random heap access patterns result in sparse bitmap heap scans,
>> whereas less random heap access patterns result in denser bitmap heap
>> scans. Dense scans have large portions of contiguous fetches, a
>> pattern that is quite adversely affected by the current prefetch
>> mechanism in linux.
>
> Thanks for your input.
>
> How can I test a sparse bitmap scan? Can you think of any SQL commands which
> would generate data and run such scans?
>
> Would a bitmap scan over expression index ((aid%1000)=0) do a sparse bitmap
> scan?

If you have a minimally correlated index (i.e. one in totally random order), and you have N tuples per page, you need to select less (much less) than 1/Nth of the table to get a sparse bitmap heap scan.
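To generate such a case from scratch, something like this should do (untested sketch, names are made up; with ~100-byte rows you get on the order of 50-60 tuples per page, so selecting about 0.5% of the rows through an uncorrelated index should come out sparse):

-- 'r' is in random order relative to heap order, so an index on it
-- is minimally correlated.
CREATE TABLE sparse_test AS
SELECT i AS id,
       (random() * 100000)::int AS r,
       repeat('x', 100) AS pad
FROM generate_series(1, 1000000) i;

CREATE INDEX ON sparse_test (r);
VACUUM ANALYZE sparse_test;

-- Selects roughly 0.5% of the table through the uncorrelated index,
-- which should show up as a sparse bitmap heap scan.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(pad) FROM sparse_test WHERE r BETWEEN 0 AND 500;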
Re: blending fast and temp space volumes
On Wed, Feb 21, 2018 at 4:50 PM, Peter Geoghegan wrote:
> On Wed, Feb 21, 2018 at 7:53 AM, Rick Otten wrote:
>> side note: The disadvantage of local SSD is that it won't survive "hitting
>> the virtual power button" on an instance, nor can it migrate automatically
>> to other hardware. (We have to hit the power button to add memory/cpu to
>> the system, and sometimes the power button might get hit by accident.) This
>> is OK for temp space. I never have my database come up automatically on
>> boot, and I have scripted the entire setup of the temp space volume and data
>> structures. I can run that script before starting the database. I've done
>> some tests and it seems to work great. I don't mind rolling back any
>> transaction that might be in play during a power failure.
>
> It sounds like you're treating a temp_tablespaces tablespace as
> ephemeral, which IIRC can have problems that an ephemeral
> stats_temp_directory does not have.

For instance? I've been doing that for years without issue. If you're careful to restore the skeleton directory structure at server boot, I haven't had any issues.

On Wed, Feb 21, 2018 at 4:22 PM, Craig James wrote:
> On Wed, Feb 21, 2018 at 7:53 AM, Rick Otten wrote:
>> I was wondering if anyone had ideas for how to make that possible.
>> I don't think I want to add the SAN disk to the same LVM volume group as the
>> local disk, but maybe that would work, since I'm already building it with a
>> script anyhow ... Is LVM smart enough to optimize radically different disk
>> performances?
>
> Couldn't you configure both devices into a single 6T device via RAID0 using
> md?

That would probably perform as slowly as the slowest disk.
Re: blending fast and temp space volumes
On Wed, Feb 21, 2018 at 5:09 PM, Peter Geoghegan wrote:
> On Wed, Feb 21, 2018 at 12:07 PM, Claudio Freire wrote:
>> On Wed, Feb 21, 2018 at 4:50 PM, Peter Geoghegan wrote:
>>> On Wed, Feb 21, 2018 at 7:53 AM, Rick Otten wrote:
>>>> side note: The disadvantage of local SSD is that it won't survive "hitting
>>>> the virtual power button" on an instance, nor can it migrate automatically
>>>> to other hardware. (We have to hit the power button to add memory/cpu to
>>>> the system, and sometimes the power button might get hit by accident.) This
>>>> is OK for temp space. I never have my database come up automatically on
>>>> boot, and I have scripted the entire setup of the temp space volume and data
>>>> structures. I can run that script before starting the database. I've done
>>>> some tests and it seems to work great. I don't mind rolling back any
>>>> transaction that might be in play during a power failure.
>>>
>>> It sounds like you're treating a temp_tablespaces tablespace as
>>> ephemeral, which IIRC can have problems that an ephemeral
>>> stats_temp_directory does not have.
>>
>> For instance?
>>
>> I've been doing that for years without issue. If you're careful to
>> restore the skeleton directory structure at server boot up, I haven't
>> had any issues.
>
> Then you clearly know what I mean already. That's not documented as
> either required or safe anywhere.

Ah, ok. But the OP did mention he was doing that already, so it should be safe.
