Re: Why could different data in a table be processed with different performance?
On 25/09/18 00:28, Vladimir Ryabtsev wrote:
>
> > it is not unusual to have 1GB cache or more... and do not forget to drop
> > the cache between tests + do a sync
> I conducted several long runs of dd, so I am sure that this numbers are
> fairly correct. However, what worries me is that I test sequential read speed
> while during my experiments Postgres might need to read from random places
> thus reducing real read speed dramatically. I have a feeling that this can be
> the reason.
> I also reviewed import scripts and found the import was done in DESCENDING
> order of IDs. It was so to get most recent records sooner, may be it caused
> some inefficiency in the storage... But again, it was so for both ranges.
>
> > - how big is your index?
> pg_table_size('articles_pkey') = 1561 MB
>
> > - how big is the table?
> pg_table_size('articles') = 427 GB
> pg_table_size('pg_toast.pg_toast_221558') = 359 GB
>
Since you have a very big toast table, given you are using spinning disks, I
think that increasing the block size will bring benefits. (Also partitioning is
not a bad idea.)
If my understanding of TOAST is correct, if data will fit blocks of let's say
16 or 24 KB then one block retrieval from Postgres will result in less seeks on
the disk and less possibility data gets sparse on your disk. (a very quick and
dirty calculation, shows your average block size is 17KB)
One thing you might want to have a look at, is again the RAID controller and
your OS. You might want to have all of them aligned in block size, or maybe
have Postgres ones a multiple of what OS and RAID controller have.
> > - given the size of shared_buffers, almost 2M blocks should fit, but you
> > say 2 consecutive runs still are hitting the disk. That's strange indeed
> > since you are using way more than 2M blocks.
> TBH, I cannot say I understand your calculations with number of blocks...
shared_buffers = 15GB IIRC (justpaste link is gone)
15 * 1024 *1024 = 15728640 KB
using 8KB blocks = 1966080 total blocks
if you query shared_buffers you should get the same number of total available
blocks
> But to clarify: consecutive runs with SAME parameters do NOT hit the disk,
> only the first one does, consequent ones read only from buffer cache.
>
I m a bit confused.. every query you pasted contains 'read':
Buffers: shared hit=50 read=2378
and 'read' means you are reading from disk (or OS cache). Or not?
> > - As Laurenz suggested (VACUUM FULL), you might want to move data around.
> > You can try also a dump + restore to narrow the problem to data or disk
> I launched VACUUM FULL, but it ran very slowly, according to my calculation
> it might take 17 hours. I will try to do copy data into another table with
> the same structure or spin up another server, and let you know.
>
cool, that should also clarify if the reverse order matters or not
> > - You might also want to try to see the disk graph of Windows, while you
> > are running your tests. It can show you if data (and good to know how much)
> > is actually fetching from disk or not.
> I wanted to do so but I don't have access to Hyper-V server, will try to
> request credentials from admins.
>
> Couple more observations:
> 1) The result of my experiment is almost not affected by other server load.
> Another user was running a query (over this table) with read speed ~130 MB/s,
> while with my query read at 1.8-2 MB/s.
> 2) iotop show higher IO % (~93-94%) with slower read speed (though it is not
> quite clear what this field is). A process from example above had ~55% IO
> with 130 MB/s while my process had ~93% with ~2MB/s.
>
I think because you are looking at 'IO' column which indicates (from manual)
'..the percentage of time the thread/process spent [..] while waiting on
I/O.'
> Regards,
> Vlad
>
regards,
fabio pardi
Re: link to Slow_Query_Questions from wiki/Main Page
I asked few weeks ago [0] but didn't get a response on -docs so resending here for wider review/discussion/. [0] https://www.postgresql.org/message-id/flat/20180908012957.GA15350%40telsasoft.com On Fri, Sep 07, 2018 at 08:29:57PM -0500, Justin Pryzby wrote: > Hi, > > I'm suggesting to link to: > https://wiki.postgresql.org/wiki/Slow_Query_Questions > > From either: > https://wiki.postgresql.org/wiki/Main%20Page > or: > https://wiki.postgresql.org/wiki/Performance_Optimization > > I know it's a wiki, but it looks like I'm not allowed to edit the 'Main' page, > so I'm asking here, which is prolly for the best anyway. Feel free to forward > to or ask for opinion on the -perform list. > > Thanks, > Justin
Re: link to Slow_Query_Questions from wiki/Main Page
On 2018-Sep-25, Justin Pryzby wrote: > I asked few weeks ago [0] but didn't get a response on -docs so resending here > for wider review/discussion/. I support the idea of adding a link to "Performance Optimization". That's not a protected page, so you should be able to do it. > [0] > https://www.postgresql.org/message-id/flat/20180908012957.GA15350%40telsasoft.com > > On Fri, Sep 07, 2018 at 08:29:57PM -0500, Justin Pryzby wrote: > > Hi, > > > > I'm suggesting to link to: > > https://wiki.postgresql.org/wiki/Slow_Query_Questions > > > > From either: > > https://wiki.postgresql.org/wiki/Main%20Page > > or: > > https://wiki.postgresql.org/wiki/Performance_Optimization > > > > I know it's a wiki, but it looks like I'm not allowed to edit the 'Main' > > page, > > so I'm asking here, which is prolly for the best anyway. Feel free to > > forward > > to or ask for opinion on the -perform list. > > > > Thanks, > > Justin > -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Why could different data in a table be processed with different performance?
> 1) Which file system are you using? >From Linux's view it's ext4. Real vmdx file on Hyper-V is stored on NTFS, as far as I know. > 2) What is the segment layout of the LVM PVs and LVs? I am a bit lost with it. Is that what you are asking about? master: # pvs --segments PV VGFmt Attr PSize PFree Start SSize /dev/sda5 ubuntu-vg lvm2 a-- 19.76g 20.00m 0 4926 /dev/sda5 ubuntu-vg lvm2 a-- 19.76g 20.00m 4926 127 /dev/sda5 ubuntu-vg lvm2 a-- 19.76g 20.00m 5053 5 # lvs --segments LV VGAttr #Str Type SSize root ubuntu-vg -wi-ao---1 linear 19.24g swap_1 ubuntu-vg -wi-ao---1 linear 508.00m slave: # pvs --segments PV VG Fmt Attr PSize PFree Start SSize /dev/sda3 postgresnlpslave-vg lvm2 a-- 429.77g0 0 110021 /dev/sda5 postgresnlpslave-vg lvm2 a-- 169.52g0 0 28392 /dev/sda5 postgresnlpslave-vg lvm2 a-- 169.52g0 28392 2199 /dev/sda5 postgresnlpslave-vg lvm2 a-- 169.52g0 30591 2560 /dev/sda5 postgresnlpslave-vg lvm2 a-- 169.52g0 33151 10246 /dev/sdb1 postgresnlpslave-vg lvm2 a-- 512.00g0 0 131071 # lvs --segments LV VG Attr #Str Type SSize root postgresnlpslave-vg -wi-ao1 linear 110.91g root postgresnlpslave-vg -wi-ao1 linear 40.02g root postgresnlpslave-vg -wi-ao1 linear 10.00g root postgresnlpslave-vg -wi-ao1 linear 429.77g root postgresnlpslave-vg -wi-ao1 linear 512.00g swap_1 postgresnlpslave-vg -wi-ao1 linear 8.59g > 3) Do you use LVM for any "extra" features, such as snapshots? I don't think so, but how to check? vgs gives #SN = 0, is that it? > 4) You can try using seekwatcher to see where on the disk the slowness is occurring. You get a chart similar to this http://kernel.dk/dd-md0-xfs-pdflush.png > 5) BCC is a collection of tools that might shed a light on what is happening. https://github.com/iovisor/bcc Will look into it. Regards, Vlad
Re: To keep indexes in memory, is large enough effective_cache_size enough?
On Tue, 25 Sep 2018 at 18:36, Sam R. wrote: > Regarding double buffering: I do not know how much double buffering would > slow down operations. > It could also be possible to turn off kernel page cache on our DB server, to > avoid double buffering. Although, we may still keep it in use. I think you've misunderstood double buffering. The double buffering itself does not slow anything down. If the buffer is in shared buffers already then it does not need to look any further for it. Double buffering only becomes an issue when buffers existing 2 times in memory causes other useful buffers to appear 0 times. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
