Question on what Duration in the log

2025-04-09 Thread Vitale, Anthony, Sony Music
Hello All

In my PG version 14 log I have set it to log the duration of SQL statements 
(example of one log row below).

2025-04-08 07:31:57 UTC:XXX(55994):XXX:[8949]:LOG:  duration: 6555.756 ms  
execute S_381: {Actual Parameterized Sql executed}


This shows the execution of a JDBC server-side prepared statement (i.e. S_381), 
which took 6.5 seconds.

When I run the SQL using EXPLAIN ANALYZE, it shows an execution time of 73 
ms:

Planning Time: 0.773 ms
Execution Time: 73.578 ms

I am trying to understand what the 6.5-second duration represents: is it only 
the time to execute the SQL itself (since the plan already exists), or the full 
time to execute the SQL plus the time for the last row to be fetched by the 
program?

The result set of the query does return 50,000+ rows

Is there anywhere that explains what exactly the duration in the log represents?
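The distinction being asked about, query execution finishing versus the last row actually reaching the client, can be sketched outside PostgreSQL. This toy uses Python's built-in sqlite3 purely as an analogy (it is not how PostgreSQL's logged duration is measured); it only illustrates that "the query ran" and "the program received all the rows" are separate phases with separate costs:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", ((i,) for i in range(50_000)))

cur = conn.cursor()
t0 = time.perf_counter()
cur.execute("SELECT x FROM t")   # starts the query; rows not yet delivered
t1 = time.perf_counter()
rows = cur.fetchall()            # actually pulls all 50,000 rows to the client
t2 = time.perf_counter()

print(f"execute: {(t1 - t0) * 1000:.1f} ms, fetch: {(t2 - t1) * 1000:.1f} ms")
assert len(rows) == 50_000
```

For a 50,000+ row result set, the fetch phase can dominate the total time the client observes, even when the query itself is fast.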

Thanks
Anthony Vitale




RE: Question on what Duration in the log

2025-04-09 Thread Vitale, Anthony, Sony Music
Thanks Tom

Perfect.

Anthony Vitale
 

-Original Message-
From: Tom Lane  
Sent: Wednesday, April 9, 2025 11:02 AM
To: Vitale, Anthony, Sony Music 
Cc: [email protected]
Subject: Re: Question on what Duration in the log

"Vitale, Anthony, Sony Music"  writes:
> In my PG Version 14 log I have set to log the duration of sql (Example of one 
> log row below).
> 2025-04-08 07:31:57 UTC:XXX(55994):XXX:[8949]:LOG:  duration: 6555.756 
> ms  execute S_381: {Actual Parameterized Sql executed} This Shows the 
> Execution of a JDBC Server Side Prepared statement (i.e S_381) which took 6.5 
> seconds.

> When I run the Sql using explain analyze it runs shows an execution 
> time of 73 ms Planning Time: 0.773 ms Execution Time: 73.578 ms

They're not measuring the same thing really, because EXPLAIN isn't counting the 
time needed to transmit data to the client.

This is not just a matter of network overhead (though that can be
substantial) but also the time required to format the values into text.  If you 
are talking about large values, which I bet you are, there is also the time 
needed to fetch them from out-of-line TOAST storage, which is another thing 
that doesn't happen in EXPLAIN.

Since v17, EXPLAIN has grown a SERIALIZE option that causes it to do everything 
except the network transmission, which might be of interest in analyzing this 
further.

regards, tom lane






Re: Question on what Duration in the log

2025-04-09 Thread Tom Lane
"Vitale, Anthony, Sony Music"  writes:
> In my PG version 14 log I have set it to log the duration of SQL statements 
> (example of one log row below).
> 2025-04-08 07:31:57 UTC:XXX(55994):XXX:[8949]:LOG:  duration: 6555.756 ms  
> execute S_381: {Actual Parameterized Sql executed}
> This shows the execution of a JDBC server-side prepared statement (i.e. 
> S_381), which took 6.5 seconds.

> When I run the SQL using EXPLAIN ANALYZE, it shows an execution time of 73 ms:
> Planning Time: 0.773 ms
> Execution Time: 73.578 ms

They're not measuring the same thing really, because EXPLAIN isn't
counting the time needed to transmit data to the client.

This is not just a matter of network overhead (though that can be
substantial) but also the time required to format the values into
text.  If you are talking about large values, which I bet you are,
there is also the time needed to fetch them from out-of-line TOAST
storage, which is another thing that doesn't happen in EXPLAIN.

Since v17, EXPLAIN has grown a SERIALIZE option that causes it to
do everything except the network transmission, which might be
of interest in analyzing this further.
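On v17 or later that might look like the following (the table and column names here are placeholders, not from the original question):

```sql
-- SERIALIZE makes EXPLAIN also convert every result value to output
-- format (detoasting included); only the network send is skipped.
EXPLAIN (ANALYZE, SERIALIZE) SELECT big_text_col FROM my_table;

-- SERIALIZE BINARY measures binary output format instead of text:
EXPLAIN (ANALYZE, SERIALIZE BINARY) SELECT big_text_col FROM my_table;
```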

regards, tom lane




Re: parallel pg_restore blocks on heavy random read I/O on all children processes

2025-04-09 Thread Hannu Krosing
You may be interested in a patch "Adding pg_dump flag for parallel
export to pipes"[1] which allows using pipes in directory-format
parallel dump and restore.
There the offsets are implicitly taken care of by the file system.

[1] 
https://www.postgresql.org/message-id/CAH5HC97p4kkpikar%2BswuC0Lx4YTVkE30sTsFX94tyzih7Cc_%3Dw%40mail.gmail.com

On Sun, Mar 23, 2025 at 4:46 PM Tom Lane  wrote:
>
> Dimitrios Apostolou  writes:
> > On Thu, 20 Mar 2025, Tom Lane wrote:
> >> I am betting that the problem is that the dump's TOC (table of
> >> contents) lacks offsets to the actual data of the database objects,
> >> and thus the readers have to reconstruct that information by scanning
> >> the dump file.  Normally, pg_dump will back-fill offset data in the
> >> TOC at completion of the dump, but if it's told to write to an
> >> un-seekable output file then it cannot do that.
>
> > Further questions:
>
> > * Does the same happen in an uncompressed dump? Or maybe the offsets are
> >pre-filled because they are predictable without compression?
>
> Yes; no.  We don't know the size of a table's data as-dumped until
> we've dumped it.
>
> > * Should pg_dump print some warning for generating a lower quality format?
>
> I don't think so.  In many use-cases this is irrelevant and the
> warning would just be an annoyance.
>
> > * The seeking pattern in pg_restore seems non-sensical to me: reading 4K,
> >jumping 8-12K, repeat for the whole file? Consuming 15K IOPS for an
> >hour. /Maybe/ something to improve there... Where can I read more about
> >the format?
>
> It's reading data blocks (or at least the headers thereof), which have
> a limited size.  I don't think that size has changed since circa 1999,
> so maybe we could consider increasing it; but I doubt we could move
> the needle very far that way.
>
> > * Why doesn't it happen in single-process pg_restore?
>
> A single-process restore is going to restore all the data in the order
> it appears in the archive file, so no seeking is required.  Of course,
> as soon as you ask for parallelism, that doesn't work too well.
>
> Hypothetically, maybe the algorithm for handing out tables-to-restore
> to parallel workers could pay attention to the distance to the data
> ... except that in the problematic case we don't have that
> information.  I don't recall for sure, but I think that the order of
> the TOC entries is not necessarily a usable proxy for the order of the
> data entries.  It's unclear to me that overriding the existing
> heuristic (biggest tables first, I think) would be a win anyway.
>
> regards, tom lane
>
>
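The offset problem discussed in this thread can be sketched as a toy length-prefixed archive (an illustration only, not pg_dump's actual custom-format layout): when the writer can seek back and back-fill an offset table, a reader jumps straight to any entry; when it cannot (e.g. output piped to stdout), every reader must scan all preceding block headers, which is the random-read pattern described above.

```python
import io
import struct

def write_archive(blobs, with_offsets):
    """Write length-prefixed blobs; optionally back-fill an offset table,
    as pg_dump does when its output file is seekable."""
    buf = io.BytesIO()
    n = len(blobs)
    buf.write(struct.pack("<I", n))
    table_pos = buf.tell()
    buf.write(b"\x00" * 8 * n)          # placeholder offset table
    offsets = []
    for blob in blobs:
        offsets.append(buf.tell())
        buf.write(struct.pack("<I", len(blob)))
        buf.write(blob)
    if with_offsets:                    # back-fill requires a seekable file
        buf.seek(table_pos)
        for off in offsets:
            buf.write(struct.pack("<Q", off))
    return buf.getvalue()

def read_blob(data, index):
    """Fetch blob `index`, counting how many block headers must be read:
    one when offsets exist, index + 1 when they were never back-filled."""
    n, = struct.unpack_from("<I", data, 0)
    off, = struct.unpack_from("<Q", data, 4 + 8 * index)
    headers_scanned = 0
    if off == 0:                        # no offsets: scan from the start
        off = 4 + 8 * n
        for _ in range(index):
            length, = struct.unpack_from("<I", data, off)
            off += 4 + length
            headers_scanned += 1
    length, = struct.unpack_from("<I", data, off)
    headers_scanned += 1
    return data[off + 4:off + 4 + length], headers_scanned
```

With offsets present, fetching the last of three blobs reads one header; without them it reads three, and in a real dump with thousands of large compressed blocks that scan becomes the sustained small-read I/O seen in every parallel worker.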