Re: [Beginner Question] A question about yacc & lex

2023-06-02 Thread Tom Lane
"=?ISO-8859-1?B?V2VuIFlp?=" <896634...@qq.com> writes:
> I don't know why there are many extra spaces in the output, and why the error 
> message 'syntax error' is shown.

You didn't say exactly what you typed at it, but:

Your flex lexer lacks actions for many possible input characters,
notably spaces.  I seem to recall that the default action in
such cases is to print the character on stdout.
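
For illustration, a minimal sketch of that default behaviour (the demo.l
scanner below is invented, not taken from the posted grammar):

cat > demo.l <<'EOF'
%%
[0-9]+   { printf("NUM "); }
%%
int yywrap(void) { return 1; }
int main(void)   { yylex(); return 0; }
EOF

flex demo.l && cc lex.yy.c -o demo
echo "12 34" | ./demo
# prints "NUM  NUM " plus a newline: the space and the newline match no
# rule, so flex's default action copies them to stdout unchanged.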

The grammar lacks any way to deal with any input after the
first newline.  Anything except EOF will draw a syntax error.

regards, tom lane




Re: Pg 16: will pg_dump & pg_restore be faster?

2023-06-02 Thread Jonathan S. Katz

On 5/30/23 10:05 PM, David Rowley wrote:


My understanding had been that concurrency was required, but I see the
commit message for 00d1e02be mentions:


Even single threaded
COPY is measurably faster, primarily due to not dirtying pages while
extending, if supported by the operating system (see commit 4d330a61bb1).


If that's the case then maybe the beta release notes could be edited
slightly to reflect this. Maybe something like:

"Relation extensions have been improved allowing faster bulk loading
of data using COPY. These improvements are more significant when
multiple processes are concurrently loading data into the same table."

The current text of "PostgreSQL 16 can also improve the performance of
concurrent bulk loading of data using COPY up to 300%." does lead me
to believe that nothing has been done to improve things when only a
single backend is involved.


Typically once a release announcement is out, we'll only edit it if it's 
inaccurate. I don't think the statement in the release announcement is 
inaccurate, as it specifies that concurrent bulk loading is faster.


I had based the description on what Andres described in the original 
discussion and through reading [1], which showed a "measurable" 
improvement, as the commit message says, but not to the same degree as 
concurrent loading. It does still seem impactful -- the results show up 
to 20% improvement on a single backend -- but the bigger story was 
around the concurrency.


I'm -0.5 for revising the announcement, but I also don't want people to 
miss out on testing this. I'd be OK with this:


"PostgreSQL 16 can also improve the performance of bulk loading of data, 
with some tests showing using up to 300% improvement when concurrently 
executing `COPY` commands."


Thanks,

Jonathan

[1] 
https://www.postgresql.org/message-id/20221029025420.eplyow6k7tgu6...@awork3.anarazel.de






Interconnected views

2023-06-02 Thread Oliver Kohll
Hi,

Just wondering, does anyone else create apps which might not have 'big'
data, but quite complex arrangements of views joining to each other?

If so, do you have scripts to aid refactoring them e.g. drop/recreate/test
them in the right order etc.?

I'm really impressed with the way Postgres handles these multi-layered
views. Some of our explain analyze outputs could fill a book each! Tools
like Depesz' can be very useful. Sometimes a little tweaking or a judicious
index is necessary, but the performance is very reliable and scalable.

Blog post about it here:
https://blog.agilebase.co.uk/2023/05/21/refactoring-sql-views/

Oliver

-- 
See us at the Bath Digital Festival, 12th July


Re: Interconnected views

2023-06-02 Thread Chris Travers
On Fri, Jun 2, 2023, 09:36 Oliver Kohll  wrote:

> Hi,
>
> Just wondering, does anyone else create apps which might not have 'big'
> data, but quite complex arrangements of views joining to each other?
>
> If so, do you have scripts to aid refactoring them e.g. drop/recreate/test
> them in the right order etc.?
>

In LedgerSMB we did this, and the same with functions.  We used versioned
scripts, database schema change management tools, and reload utilities, as
well as pgTAP for testing.
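
For views specifically, the reload step can be as simple as replaying numbered
scripts in dependency order. A sketch only (database name, file layout, and
view names below are made up):

#!/bin/sh
set -e
DB=myapp

# Drop the base of the stack; CASCADE takes the dependent views with it.
psql -v ON_ERROR_STOP=1 -d "$DB" -c 'DROP VIEW IF EXISTS base_orders CASCADE'

# Recreate in dependency order: views/010_base.sql, views/020_reporting.sql, ...
for f in views/*.sql; do
    psql -v ON_ERROR_STOP=1 -d "$DB" -f "$f"
done

# Run pgTAP tests with pg_prove, if installed.
pg_prove -d "$DB" t/*.sql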

>
> I'm really impressed with the way Postgres handles these multi-layered
> views. Some of our explain analyze outputs could fill a book each! Tools
> like Depesz' can be very useful. Sometimes a little tweaking or a judicious
> index is necessary, but the performance is very reliable and scalable.
>
> Blog post about it here:
> https://blog.agilebase.co.uk/2023/05/21/refactoring-sql-views/
>

Yeah, I have always been impressed by Postgres here too.

>
> Oliver
>
> --
> See us at the Bath Digital Festival, 12th July
>


custom background worker task for \copy?

2023-06-02 Thread Joe Carlson
Hi Folks,

I’m writing a little utility for dumping parts of tables into files which will 
later be slurped into another primary instance. The primary source is our 
referential data (big, and subject to heavy loads when adding data); the second 
is a smaller version used to drive our web app (portable, and less prone to lag).

Yes, a replication strategy can work but since the web app version is so much 
smaller (10% of the size) I thought the partial snapshot would be easier to 
manage.

I have SQL that does it with \copy (select * from  where…) … And that is 
fine. But it would be nice to be able to run the \copy commands in parallel. So 
I was thinking of writing a background worker. 

Never having done that before, I’m curious:
1) Is a background worker that I can execute in parallel appropriate for this 
job?
2) Are there non-trivial examples of background workers out there to copy and 
learn from?
3) Will doing multiple \copy’s in parallel just be of no benefit? Since pg_dump 
and pg_restore have the option of running multiple jobs in parallel, I 
thought the answer was that it should help.

Thanks

Joe

Re: Interconnected views

2023-06-02 Thread Guillermo González de Agüero
Have you taken a look at dbt (https://www.getdbt.com/)? It's aimed at
analytics but can be used for any purpose.


Regards,

Guillermo González de Agüero

On Fri, 2 Jun 2023 at 15:36, Oliver Kohll wrote:

> Hi,
>
> Just wondering, does anyone else create apps which might not have 'big'
> data, but quite complex arrangements of views joining to each other?
>
> If so, do you have scripts to aid refactoring them e.g. drop/recreate/test
> them in the right order etc.?
>
> I'm really impressed with the way Postgres handles these multi-layered
> views. Some of our explain analyze outputs could fill a book each! Tools
> like Depesz' can be very useful. Sometimes a little tweaking or a judicious
> index is necessary, but the performance is very reliable and scalable.
>
> Blog post about it here:
> https://blog.agilebase.co.uk/2023/05/21/refactoring-sql-views/
>
> Oliver
>
> --
> See us at the Bath Digital Festival, 12th July
>


pg_upgrade and schema complexity...

2023-06-02 Thread Ron

Ran into this when upgrading from 13.11 to 15.3...

The pg_restore phase failed with "ERROR: out of shared memory", and 
recommended that I increase max_locks_per_transaction.  Doing so let the 
process run to completion.
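
For anyone hitting the same error, a sketch of one way to do that: pass the
setting to both servers for the duration of the upgrade (bindir/datadir
arguments elided, and 256 is just an example value):

pg_upgrade -b ... -B ... -d ... -D ... \
    -o '-c max_locks_per_transaction=256' \
    -O '-c max_locks_per_transaction=256'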


It took 12.5 minutes to upgrade a 13GB instance.  Soon after, I upgraded a 
78GB cluster, and it only took 3.1 minutes.


(Both are VMs (same number of CPUs and RAM) connected to the same SAN.)

A "pg_dump --schema-only" of the two systems shows that the small-but-slow 
schema is 5.9M lines.


Is this to be expected of such a huge schema?

--
Born in Arizona, moved to Babylonia.




Re: pg_upgrade and schema complexity...

2023-06-02 Thread Adrian Klaver

On 6/2/23 17:44, Ron wrote:

Ran into this when upgrading from 13.11 to 15.3...

The pg_restore phase failed with "ERROR: out of shared memory", and 
recommended that I increase max_locks_per_transaction.  Doing so let the 
process run to completion.


It took 12.5 minutes to upgrade a 13GB instance.  Soon after, I upgraded 
a 78GB cluster, and it only took 3.1 minutes.


Where/how did you measure those sizes?



(Both are VMs (same number of CPUs and RAM) connected to the same SAN.)

A "pg_dump --schema-only" of the two systems shows that the 
small-but-slow schema is 5.9M lines.


Anything special you are doing in this cluster to create all those lines?

What is the line count for the other instance?



Is this to be expected of such a huge schema?



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: pg_upgrade and schema complexity...

2023-06-02 Thread Ron

On 6/2/23 19:58, Adrian Klaver wrote:

On 6/2/23 17:44, Ron wrote:

Ran into this when upgrading from 13.11 to 15.3...

The pg_restore phase failed with "ERROR: out of shared memory", and 
recommended that I increase max_locks_per_transaction. Doing so let the 
process run to completion.


It took 12.5 minutes to upgrade a 13GB instance.  Soon after, I upgraded 
a 78GB cluster, and it only took 3.1 minutes.


Where/how did you measure  those sizes?


Does it really matter?

START_SECS=$(date +"%s")
pg_upgrade ...
FINISH_SECS=$(date +"%s")
ET=`echo "scale=2;(${FINISH_SECS} - ${START_SECS})/60" | bc`
date +"%F %T pg_upgrade finished.  Elapsed time: ${ET} minutes."

(Text copied between air-gapped computers, so there might be errors.)



(Both are VMs (same number of CPUs and RAM) connected to the same SAN.)

A "pg_dump --schema-only" of the two systems shows that the 
small-but-slow schema is 5.9M lines.


Anything special you are doing in this cluster to create all those lines?


I do nothing; the schema is provided by the vendor.



What is the line count for the other instance?


227K rows.



Is this to be expected of such a huge schema?



--
Born in Arizona, moved to Babylonia.




Re: pg_upgrade and schema complexity...

2023-06-02 Thread Christophe Pettus



> On Jun 2, 2023, at 17:44, Ron  wrote:
> Is this to be expected of such a huge schema?

pg_upgrade time is pretty much proportional to the number of database objects 
in the schema, so a much larger schema taking much longer is to be expected.
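
A rough way to compare the two clusters on that basis is to count catalog
entries rather than gigabytes; a sketch, with mydb standing in for the real
database name:

psql -d mydb -Atc "SELECT count(*) FROM pg_class"
psql -d mydb -Atc "SELECT count(*) FROM pg_attribute"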



Re: pg_upgrade and schema complexity...

2023-06-02 Thread Adrian Klaver

On 6/2/23 18:06, Ron wrote:

On 6/2/23 19:58, Adrian Klaver wrote:

On 6/2/23 17:44, Ron wrote:

Ran into this when upgrading from 13.11 to 15.3...

The pg_restore phase failed with "ERROR: out of shared memory", and 
recommended that I increase max_locks_per_transaction. Doing so let 
the process run to completion.


It took 12.5 minutes to upgrade a 13GB instance.  Soon after, I 
upgraded a 78GB cluster, and it only took 3.1 minutes.


Where/how did you measure those sizes?


Does it really matter?

START_SECS=$(date +"%s")
pg_upgrade ...
FINISH_SECS=$(date +"%s")
ET=`echo "scale=2;(${FINISH_SECS} - ${START_SECS})/60" | bc`
date +"%F %T pg_upgrade finished.  Elapsed time: ${ET} minutes."


Unless I'm mistaken, the above is how the elapsed time was measured. 
I was looking for the procedure for determining the size.
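
For instance, something along these lines (mydb and the path are placeholders):

psql -d mydb -Atc "SELECT pg_size_pretty(pg_database_size(current_database()))"
du -sh /path/to/data/directory

The two answer different questions (one database vs. the whole cluster on
disk), which matters when comparing a 13GB figure to a 78GB one.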




(Text copied between air-gapped computers, so there might be errors.)



(Both are VMs (same number of CPUs and RAM) connected to the same SAN.)

A "pg_dump --schema-only" of the two systems shows that the 
small-but-slow schema is 5.9M lines.


Anything special you are doing in this cluster to create all those lines?


I do nothing; the schema is provided by the vendor.


Alright, so it is not your design, but you do have an idea of what is in 
the database, correct?







What is the line count for the other instance?


227K rows.



Is this to be expected of such a huge schema?





--
Adrian Klaver
adrian.kla...@aklaver.com





Re: custom background worker task for \copy?

2023-06-02 Thread Laurenz Albe
On Fri, 2023-06-02 at 11:36 -0700, Joe Carlson wrote:
> I have SQL that does it with \copy (select * from  where…) … And that 
> is fine.
> But it would be nice to be able to run the \copy commands in parallel.
> So I was thinking of writing a background worker.

Why don't you go the easy way of using several database connections to run the
parallel COPY statements?
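
For example, a minimal sketch of that approach (table names, WHERE clauses,
and file names are made up):

psql -d src -c "\copy (SELECT * FROM big_t WHERE region = 1) TO 'big_t_1.csv' CSV" &
psql -d src -c "\copy (SELECT * FROM big_t WHERE region = 2) TO 'big_t_2.csv' CSV" &
psql -d src -c "\copy (SELECT * FROM other_t) TO 'other_t.csv' CSV" &
wait    # each psql is its own connection, so the COPYs run in parallel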

Yours,
Laurenz Albe