Re: algo for canceling a deadlocked transaction

2018-04-09 Thread Christophe Pettus

> On Apr 9, 2018, at 07:33, Thomas Poty  wrote:
> 
> ok, and long  answer ? is it random?

It's not literally random, but from the application point of view, it's not 
predictable.  For example, it's not always the one that opened first, or any 
other consistent measure.
--
-- Christophe Pettus
   x...@thebuild.com




.ready files being created on secondaries

2018-05-08 Thread Christophe Pettus
We've encountered another system that has a situation very much like the one 
described here:

https://www.postgresql.org/message-id/20140904175036.310c6466%40erg

In particular, a secondary is restoring each WAL file using restore_command, 
creating a .ready file for it, and then never cleaning it up.  This is 
PostgreSQL 9.6.6 on Windows using the EDB installer.  We can't seem to see 
anything unusual about the configuration of the server (archive_mode = on, but 
it's otherwise properly operating as a secondary).

--
-- Christophe Pettus
   x...@thebuild.com




Re: Code of Conduct plan

2018-06-05 Thread Christophe Pettus


> On Jun 5, 2018, at 08:49, Benjamin Scherrey  wrote:
> I thought the same thing as a member of the Django community. It adopted a 
> CoC that I vocally warned was dangerous and far more likely to be abused than 
> provide any benefit. I was shocked when the very first time it was ever 
> invoked it was by one of the founders of the project (whom I previously 
> personally respected) and it was absolutely used in the manner that I had 
> feared which was to shut someone up whose opinion he did not like rather than 
> any legitimate concern.

Speaking as someone who has served on the board of the Django Software 
Foundation:

1. The Django Code of Conduct is considered a success and a valuable asset to 
the growth and health of the community.
2. Others involved in the event mentioned above would not describe it in the 
same terms

--
-- Christophe Pettus
   x...@thebuild.com




Re: Code of Conduct plan

2018-06-05 Thread Christophe Pettus


> On Jun 5, 2018, at 12:06, Benjamin Scherrey  wrote:
> Doesn't that 20 years of results pretty clearly demonstrate that this 
> community does not gain an advantage for adopting a CoC?

Not at all.  The need for a CoC is not theoretical.  Real people, recently, 
have left the community due to harassment, and there was no system within the 
community to report and deal with that harassment.

What we do have is 20 years of people demonstrating reasonable good judgment, 
which we can conclude will apply to a CoC committee as well.
--
-- Christophe Pettus
   x...@thebuild.com




Re: Code of Conduct plan

2018-06-05 Thread Christophe Pettus


> On Jun 5, 2018, at 12:20, Benjamin Scherrey  wrote:
> I'm not trying to harp on you personally, it's just that you're the unlucky 
> umpteenth time I've seen this claim made with zero satisfaction.

Given that we are talking about human beings here, who (unlike code commits) 
have careers and a reasonable expectation of privacy, it's possible that the 
reason you have heard this upteen times is that there are issues in the 
community that you are not aware of.  I would say that it more likely that bad 
faith and conniving on the part of upteen people.

--
-- Christophe Pettus
   x...@thebuild.com




Re: Code of Conduct plan

2018-06-05 Thread Christophe Pettus


> On Jun 5, 2018, at 15:20, Peter Geoghegan  wrote:
> I don't follow. Practically any organized group has rules around
> conduct, with varying degrees of formality, means of enforcement, etc.

I believe the objection is to setting up a separate CoC committee, rather than 
using the core team as the enforcement mechanism.

This is more important than may be obvious.  Having a separation of the CoC 
committee and the organization that sets up and supervises the CoC committee is 
very important to prevent the perception, or the fact, that the CoC enforcement 
mechanism is a Star Chamber that is answerable only to itself.  It also allows 
for an appeal mechanism.

--
-- Christophe Pettus
   x...@thebuild.com




Re: Code of Conduct plan

2018-06-05 Thread Christophe Pettus


> On Jun 5, 2018, at 17:07, Jan Claeys  wrote:
> 
> For example: having some people who have a background in something like
> psychology, sociology, education, law, human resources, marketing, etc.
> (in addition to the likely much easier to find developers, DBAs and IT
> managers) would be valuable too.

While it's good for the CoC committee to reach out for professional expertise 
if they need it, it should be on an engagement basis (if the CoC committee 
needs a lawyer, they find and retain a lawyer).  The damage that someone smart 
who thinks they know another profession can do is substantial.

--
-- Christophe Pettus
   x...@thebuild.com




Re: Code of Conduct plan

2018-06-07 Thread Christophe Pettus


> On Jun 7, 2018, at 02:55, Gavin Flower  wrote:
> The Americans often seem to act as though most people lived in the USA, 
> therefore we should all be bound by what they think is correct!

I have to say that this seems like a red herring to me.

1. The CoC committee handles actual incidents involving real people.  It's not 
their job to boil the ocean and create a new world; they deal with the matters 
brought before them.  I have no reason to believe that they will not apply good 
sense and judgement to the handling of the specific cases.

2. I don't think that there is a country where someone being driven out of a 
technical community by harassment is an acceptable local value.

3. The only actual real-life example of a culture clash that I've seen offered 
up here is the ability to say "c*nt" on a technical mailing list about 
databases.  That seems a very strange and specific hill to choose to die on in 
this discussion.

--
-- Christophe Pettus
   x...@thebuild.com




Re: Code of Conduct plan

2018-06-07 Thread Christophe Pettus


> On Jun 7, 2018, at 21:00, Gavin Flower  wrote:
> 
>> I have to say that this seems like a red herring to me.
> Not entirely.  American web sites tend to insist on weird date format, and 
> insist on the archaic imperial units rather than the metric system that most 
> people in the world use. 

Then you will be pleased to know that neither writing dates day-of-month first, 
nor using meters, will be Code of Conduct violations. :)

> For example try defining something simple, like what is a car!
[...]
> 
> Try defining success at university

It is equally unlikely that the Code of Conduct committee will need to decide 
what a car is, or whether or not someone has succeeded at university.

I'm not trying to be snide, but this does seem to be exactly what I was talking 
about: When asked for examples of cultural differences that might run afoul of 
the CoC, the examples don't seem to be either relevant (i.e., they are not 
things the CoC committee will have to address), or are clearly contextual in a 
way that a human will have no trouble understanding.

> I've called a friend of mine a bastard, but he took it as a mark of respect 
> in the context of our discussion.

This is why we have human beings, rather than a regex, forming the Code of 
Conduct committee.  It's important to remember that the CoC committee is not 
going to be going around policing the community for potential violations; their 
job is to resolve actual situations between real people.  It's not their job to 
define values; it's their job to resolve situations.  In my experience in 
dealing with CoC issues, the situations (while often complex) are rarely of the 
form, "This word does not mean anything bad where I come from."

--
-- Christophe Pettus
   x...@thebuild.com




Re: Do we need yet another IDE (SQL development assistant) for PostgreSQL?

2018-07-15 Thread Christophe Pettus


> On Jul 15, 2018, at 16:06, Dmitry Igrishin  wrote:
> 
> The cross-platform GUI toolkit will be the challenge.
> This is why I've consider GUI for the Windows only. And if I'll not find an 
> adequate GUI
> toolkit (at reasonable price and/or license), there is an option to make the 
> GUI available
> on Windows only and provide the Linux version without a GUI (at least at the 
> first time).

I'm not sure I quite understand an PostgreSQL IDE without a GUI.  Isn't that 
psql, to a first approximation?

I'm also curious how you see this IDE comparing to, say, pgAdmin4.  There's no 
reason we can't have multiple IDEs, of course, but when I think of an 
"integrated development environment," I think of something (along the lines of 
the JetBrains family) that handles the full stack, including debugging.

--
-- Christophe Pettus
   x...@thebuild.com




Re: width_bucket issue

2018-07-24 Thread Christophe Pettus


> On Jul 24, 2018, at 13:02, Raphaël Berbain  wrote:
> I'd expect b1 = b2 = 2. What am I missing?

The problem appears to be due to rounding during the intermediate calculations. 
 In compute_bucket() in numeric.c:

div_var(&operand_var, &bound1_var, result_var,
select_div_scale(&operand_var, &bound1_var), 
true);

... produces 0. for that particular value, instead of 1, 
and the subsequent +1 and FLOOR() result in 1 instead of 2.
--
-- Christophe Pettus
   x...@thebuild.com




Re: Publication/Subscription Questions

2018-07-28 Thread Christophe Pettus


> On Jul 27, 2018, at 01:34, xOChilpili  wrote:
> 
> Why ? If i remove rows, from Server B and refresh publication, why data is 
> not re-sync ?

ALTER SUBSCRIPTION ... REFRESH PUBLICATION doesn't do another initial copy of 
the data for existing tables in the publication.  Its function is to add tables 
that were added to the publication after the subscription was last created or 
refreshed.  It does (by default) copy the data from newly-added tables, but it 
does not resync the data from the existing tables.

--
-- Christophe Pettus
   x...@thebuild.com




Re: Pg_rewind cannot load history wal

2018-08-04 Thread Christophe Pettus


> On Aug 4, 2018, at 06:13, Michael Paquier  wrote:
> 
> Well, since its creation we have the tool behave this way.  I am not
> sure either that we can have pg_rewind create a checkpoint on the source
> node each time a rewind is done, as it may not be necessary, and it
> would enforce WAL segment recycling more than necessary, so if we were 
> to back-patch something like that I am pretty much convinced that we
> would get complains from people already using the tool, with existing
> failover flows which are broken. 

Would having pg_rewind do a checkpoint on the source actually cause anything to 
break, as opposed to a delay while the checkpoint completes?  The current 
situation can create a corrupted target, which seems far worse than just 
slowing down pg_rewind.

--
-- Christophe Pettus
   x...@thebuild.com




Re: Pg_rewind cannot load history wal

2018-08-04 Thread Christophe Pettus


> On Aug 4, 2018, at 13:50, Michael Paquier  wrote:
> 
> Hm? 

The specific situation is if pg_rewind is attached to the target before the 
forced post-recovery checkpoint completes, the target can be corrupted:


https://www.postgresql.org/message-id/ece3b665-e9dd-43ff-b6a6-734e74352...@thebuild.com

--
-- Christophe Pettus
   x...@thebuild.com




Re: Immutable function WAY slower than Stable function?

2018-08-07 Thread Christophe Pettus


> On Aug 7, 2018, at 11:42, Ken Tanzer  wrote:
> I assume that's "for all users and all sessions," but either in theory or in 
> practice is there a limit to how long a stale value might persist?  And, if 
> you were to drop and recreate a function with the same name & parameters, 
> would it start fresh at that point?  And is there a way to flush any caching? 
>  (It's surely best to just declare Stable, but I'm wondering about cases that 
> might have _very_ infrequently-changed values.)

Well, the extreme case is an IMMUTABLE function used to create an expression 
index; then, the value lasts as long as the index does.  The best way to think 
of an IMMUTABLE is that it is a pure function, unchanged by system state.  
(This is one of the reasons that datetime-related functions are often STABLE 
rather than IMMUTABLE, due to time zone changes.)

--
-- Christophe Pettus
   x...@thebuild.com




Re: Postgresql

2018-08-19 Thread Christophe Pettus


> On Aug 19, 2018, at 10:56, Sonam Sharma  wrote:
> 
> Thank you! Can you please help me with any advantages/disadvantages.. my db 
> size is less than 10gb. I am very new to this.

That's a topic far too broad for a simple mailing list thread.  PostgreSQL is 
extremely feature-complete, but how it performs on any particular database will 
vary considerably.

The good news is that for a database of that size, it's easy to install it and 
test it against your workload.  That's really the only practical way of telling 
if it will be suitable for you.

--
-- Christophe Pettus
   x...@thebuild.com




Re: Slow shutdowns sometimes on RDS Postgres

2018-09-14 Thread Christophe Pettus


> On Sep 14, 2018, at 08:43, Jeremy Schneider  wrote:
> So yeah, it's not common...

In our experience, it's actually quite common that an RDS shutdown (or even 
just applying parameter changes) can take a while.  What's particularly 
concerning is that it's not predictable, and that can make it hard to schedule 
and manage maintenance windows.  What we were told previously is that RDS 
queues the operations, and it can take a variable amount of time for the 
operation to be worked on from the queue.  Is that not the case?

--
-- Christophe Pettus
   x...@thebuild.com




Re: How to watch for schema changes

2018-09-17 Thread Christophe Pettus


> On Sep 17, 2018, at 07:09, Igor Korot  wrote:
> 
> Is there a way to query a server for a place where the log file is?

SHOW log_directory;

It's either relative to the PGDATA directory, or an absolute path.

--
-- Christophe Pettus
   x...@thebuild.com




Re: Upgrade PostgreSQL 9.6 to 10.6

2020-01-08 Thread Christophe Pettus



> On Jan 8, 2020, at 15:56, github kran  wrote:
> 
> Great I see its supported based on the link but the problem is we are locked 
> into a Aurora RDS and we can't use logical replication on that engine. 

You can use pglogical on RDS PostgreSQL 9.6.

--
-- Christophe Pettus
   x...@thebuild.com





Re: POLL: Adding transaction status to default psql prompt

2020-02-05 Thread Christophe Pettus



> On Feb 5, 2020, at 18:54, Vik Fearing  wrote:
> Please answer +1 if you want or don't mind seeing transaction status by
> default in psql or -1 if you would prefer to keep the current default.

+1.





Re: Can we have multiple tablespaces with in a database.

2020-02-20 Thread Christophe Pettus



> On Feb 20, 2020, at 22:23, Daulat Ram  wrote:
> 
> That will be great if you  share any doc where it’s mentioned that we can’t 
> use multiple tablespace for a single database. I have to assist my Dev team 
> regarding tablespaces.

A single PostgreSQL database can have any number of tablespaces.  Each table 
has to be in one specific tablespace, although a table can be in one tablespace 
and its indexes in a different one.

If a PostgreSQL table is partitioned, each partition can be in a different 
tablespace.

Oracle "style" tends to involve a lot of tablespaces in a database; this is 
much less commonly done in PostgreSQL.  In general, you only need to create 
tablespace in a small number of circumstances:

(a) You need more space than the current database volume allows, and moving the 
database to a larger volume is inconvenient;
(b) You have multiple volumes with significantly different access 
characteristics (like an HDD array and some SSDs), and you want to distribute 
database objects to take advantage of that (for example, put commonly-used 
large indexes on the SSDs).

PostgreSQL tablespaces do increase the administrative overhead of the database, 
and shouldn't be created unless there is a compelling need for them./

--
-- Christophe Pettus
   x...@thebuild.com





Re: Can we have multiple tablespaces with in a database.

2020-02-20 Thread Christophe Pettus



> On Feb 20, 2020, at 22:34, Daulat Ram  wrote:
> 
> You mean we can have only single default tablespace for a database but the 
> database objects can be created on different-2 tablespaces?

Yes.

> Can you please share the Doc URL for your suggestions given in trail mail.

https://www.postgresql.org/docs/current/manage-ag-tablespaces.html

--
-- Christophe Pettus
   x...@thebuild.com





Re: Can I trigger an action from a coalesce ?

2020-02-22 Thread Christophe Pettus



> On Feb 22, 2020, at 13:05, Adrian Klaver  wrote:
> 
> On 2/22/20 1:02 PM, stan wrote:
>> I have a case where if a value does not exist, I am going to use a default,
>> which is easy with coalesce. But I would like to warn the user that a
>> default has been supplied. The default value is reasonable, and could
>> actually come from the source table, so I can't just check the value.
>> I'd like to do a raise NOTICE, if the default portion of the coalesce fires.
>> Anyone have a good way to accomplish this?
> 
> No.

You can, of course, create a PL/pgSQL function and use that as the default.

--
-- Christophe Pettus
   x...@thebuild.com





Re: Can I trigger an action from a coalesce ?

2020-02-22 Thread Christophe Pettus



> On Feb 22, 2020, at 13:33, stan  wrote:
> I suppose you are suggesting that the function try the original SELECT, and
> if it returns a NULL then retun the default AND do the raise NOTICE?

Something like this:

create function supply_default() returns int as $$
begin
   raise notice 'Supplied default';
   return 1;
end;
$$ immutable language plpgsql;

xof=# create table t ( i integer default supply_default(), t text );
CREATE TABLE
xof=# insert into t(i, t) values (2, 'text');
INSERT 0 1
xof=# insert into t(t) values ('text');
NOTICE:  Supplied default
INSERT 0 1

--
-- Christophe Pettus
   x...@thebuild.com





Re: Can I trigger an action from a coalesce ?

2020-02-22 Thread Christophe Pettus



> On Feb 22, 2020, at 14:02, Tom Lane  wrote:
> It's a really bad idea to mark a function that has side-effects
> (i.e., emitting a NOTICE) as immutable, especially if the occurrence
> of the side-effect at well-defined times is exactly what you're
> desirous of.

True, and it doesn't actually need to be immutable here; just cut and pasted 
from the wrong example.

(That being said, I'm not coming up with a specific bad thing that a RAISE 
NOTICE in an immutable function will cause.  Is there one?)

--
-- Christophe Pettus
   x...@thebuild.com





Re: Can I trigger an action from a coalesce ?

2020-02-22 Thread Christophe Pettus



> On Feb 22, 2020, at 14:36, Tom Lane  wrote:
> The problem that I'm worried about is premature evaluation of the
> "immutable" function, causing the NOTICE to come out once during
> query planning, independently of whether/how many times it should
> come out during execution.

Ah, good point.  My solution also does assume that a DEFAULT expression is only 
evaluated if the default is required, and that behavior isn't (afaik) a promise.

--
-- Christophe Pettus
   x...@thebuild.com





Re: Examing cotets of NEW & OLD in a function programed in perl

2020-03-03 Thread Christophe Pettus



> On Mar 3, 2020, at 13:26, stan  wrote:
> So, they should just be visible as OLD, and NEW as hasshes?

They're documented here:

https://www.postgresql.org/docs/current/plperl-triggers.html

A global hash variable $_TD is available in the trigger function with all sorts 
of trigger-related info.

--
-- Christophe Pettus
   x...@thebuild.com





Re: How does pg_basebackup manage to create a snapshot of the filesystem?

2020-03-19 Thread Christophe Pettus



> On Mar 19, 2020, at 15:19, Dennis Jacobfeuerborn  
> wrote:
> I'm currently trying to understand how backups work. In the
> documentation in section "25.2. File System Level Backup" it says that
> filesystem level backups can only be made when the database if offline
> yet pg_basebackup seems to do just that but works while the database is
> online. Am I misunderstanding something here or does pg_basebackup use
> some particular features of Postgres to accomplish this?

pg_basebackup does, indeed, take an inconsistent copy of the file system while 
it is running; what allows it to bring the database back up to consistency is 
the write-ahead log segments that are created while pg_basebackup is running.  
That's why it is important to have all of the WAL segments created during the 
run (which is what --wal-method=stream provides you).
--
-- Christophe Pettus
   x...@thebuild.com





Re: keeping images in a bytea field on AWS RDS

2020-03-31 Thread Christophe Pettus



> On Mar 31, 2020, at 12:49, Richard Bernstein  wrote:
> 
> I am using postgresql on RDS. I need to upload an image to the table. I 
> understand that I need to set the PGDATA directory and place the image file 
> in it, before setting the path in the bytea field. But how do I set PGDATA if 
> I don't have the ability to set an environment variable, and don't have 
> access to the following on an AWS controlled installation?

No, that's not how you upload images to a bytea field; you use the standard 
client library and send the bytea data over that way.  It works the same on 
community PostgreSQL and RDS.  PostgreSQL handles managing the bytea data for 
you.  Note that for very large binary objects, storing them in the database is 
not going to be very efficient versus keeping them in the filesystem (noting, 
of course, that you don't have access to the filesystem of an RDS server).

--
-- Christophe Pettus
   x...@thebuild.com





Re: keeping images in a bytea field on AWS RDS

2020-03-31 Thread Christophe Pettus


> On Mar 31, 2020, at 13:52, Ron  wrote:
> initdb is definitely not what you want to do just to load an image into an 
> exiting database.

I think there may be some confusion on the OP's part because many sources give 
out *advice* to put a filesystem path, rather than the entire actual binary 
object, into the database... and that's still good advice, even on RDS!  It 
just means that path needs to be a URI or some other piece of metadata that 
points to a different server, rather than the RDS server.

--
-- Christophe Pettus
   x...@thebuild.com





Re: Best way to use trigger to email a report ?

2020-05-08 Thread Christophe Pettus



> On May 8, 2020, at 09:26, David Gauthier  wrote:
> 
> psql (9.6.0, server 11.3) on linux
> 
> Looking for ideas.  I want a trigger to...
> 1) compose an html report based on DB content
> 2) email the report to a dist list (dl = value of a table column)

You probably *don't* want to actually send an email within a trigger; doing an 
operation that can block on an external service (DNS, SMTP) within a trigger is 
asking for hard-to-diagnose trouble.  You probably don't even want to create 
the HTML; that's adding a lot of time to the operation that fires the trigger.

I'd probably set up the trigger to store the minimal data required to produce 
the HTML into a separate table, and then have a background job query the table, 
create the HTML, and mail out the report.  If you don't want to periodically 
poll the table, you can use NOTIFY within the trigger to wake up a process that 
is waiting on NOTIFY.

--
-- Christophe Pettus
   x...@thebuild.com





Re: Inherited an 18TB DB & need to backup

2020-05-15 Thread Christophe Pettus



> On May 15, 2020, at 12:01, Scottix  wrote:
> 
> Also when you get in the multi TB data storage the bill gets a little harder 
> to digest in S3.

Indeed.  Right now, just buying off of Amazon, a 12TB Seagate IronWolf drive is 
$0.03/GB.  S3 infrequent access is $0.0125/GB/month, so the drive pays for 
itself (storage only) in 2.4 months.  Even with an annualized failure rate of 
1.4% (per Backblaze), and with the required host, rack, etc., etc., it be 
significantly more economical to run your own backup server.  If you regularly 
do restores off of the backups (for example, to prime staging environments or 
developer systems), the outbound transfer can add up fast, too.
--
-- Christophe Pettus
   x...@thebuild.com





Can't remove default permissions entry

2020-05-27 Thread Christophe Pettus
On RDS (thus, no superuser) we are trying to drop a user.  The only remaining 
item that the user owns is an "empty" default permissions entry, but we can't 
seem to get rid of it so that the user can be dropped:

I'm sure I'm missing something obvious!

Logged in as xyuser:

db=> \ddp+
Default access privileges
   Owner|Schema |   Type   |Access privileges 
+---+--+--
 xyuser |   | table| 

db=> ALTER DEFAULT PRIVILEGES FOR USER xyuser REVOKE ALL ON TABLES FROM xyuser;
ALTER DEFAULT PRIVILEGES
db=> \ddp+
Default access privileges
   Owner|Schema |   Type   |Access privileges 
+---+--+--
 xyuser |   | table| 

db=> 

--
-- Christophe Pettus
   x...@thebuild.com





Re: pg_dump of database with numerous objects

2020-05-31 Thread Christophe Pettus



> On May 31, 2020, at 08:05, t...@exquisiteimages.com wrote:
> 
> My pg_class table contains 9,000,000 entries and I have 9004 schema.

Which version of pg_dump are you running?  Older versions (don't have the 
precise major version in front of me) have N^2 behavior on the number of 
database objects being dumped.

--
-- Christophe Pettus
   x...@thebuild.com





Re: pg_dump of database with numerous objects

2020-05-31 Thread Christophe Pettus



> On May 31, 2020, at 13:10, t...@exquisiteimages.com wrote:
> 
> On 2020-05-31 13:08, Christophe Pettus wrote:
>>> On May 31, 2020, at 08:05, t...@exquisiteimages.com wrote:
>>> My pg_class table contains 9,000,000 entries and I have 9004 schema.
>> Which version of pg_dump are you running?  Older versions (don't have
>> the precise major version in front of me) have N^2 behavior on the
>> number of database objects being dumped.
> 
> I am upgrading from 9.3

To which version?  You might try the dump with the version of pg_dump 
corresponding to the PostgreSQL version you are upgrading *to* (which is 
recommended practice, anyway) to see if that improves matters.

--
-- Christophe Pettus
   x...@thebuild.com





Re: pg_dump of database with numerous objects

2020-05-31 Thread Christophe Pettus



> On May 31, 2020, at 13:37, Adrian Klaver  wrote:
> 
> Just a reminder that the OP's original issue was with using pg_upgrade.

True, although IIRC pg_ugprade uses pg_dump under the hood to do the schema 
migration.

--
-- Christophe Pettus
   x...@thebuild.com





Re: Oracle vs. PostgreSQL - a comment

2020-06-02 Thread Christophe Pettus



> On Jun 2, 2020, at 13:30, Stephen Frost  wrote:
> 
> Eh, that's something that I think we should be looking at supporting, by
> using FDWs, but I haven't tried to figure out how hard it'd be.

Being able to access a FDW that way would rock.

--
-- Christophe Pettus
   x...@thebuild.com





Re: Postgres12 - Confusion with pg_restore

2020-06-05 Thread Christophe Pettus



> On Jun 5, 2020, at 11:20, Laura Smith  
> wrote:
> sudo -u postgres pg_restore -v -C -d foobar 4_foobar_pgdump_Fc

You need to connect to a database that already exists (such as "postgres"); it 
then creates the database you are restoring and switches to it.  The relevant 
manual line is:

"When (-C / --create) is used, the database named with -d is used only 
to issue the initial DROP DATABASE and CREATE DATABASE commands. All data is 
restored into the database name that appears in the archive."

--
-- Christophe Pettus
   x...@thebuild.com





Re: psql: FATAL: database "postgres" does not exist or ERROR: 23505: duplicate key value violates unique constraint "pg_namespace_nspname_index"

2020-07-16 Thread Christophe Pettus



> On Jul 16, 2020, at 18:32, Naresh Kumar  wrote:
> 
> Can some one help us on this please.

To be clear, what you almost certainly have here is serious data corruption.  
You will need to find a (paid, commercial) specialist to help you with the 
recovery.  You are unlikely to get the level of support you need on this list, 
for free.

--
-- Christophe Pettus
   x...@thebuild.com





Re: psql: FATAL: database "postgres" does not exist or ERROR: 23505: duplicate key value violates unique constraint "pg_namespace_nspname_index"

2020-07-16 Thread Christophe Pettus



> On Jul 16, 2020, at 19:08, Naresh Kumar  wrote:
> 
> Thanks Christopher, if you any such contacts can you share with us.

The community maintains this page; I'm sure you can find someone who can help 
you there:

https://www.postgresql.org/support/professional_support/

--
-- Christophe Pettus
   x...@thebuild.com





Re: How to restore a dump containing CASTs into a database with a new user?

2020-07-19 Thread Christophe Pettus



> On Jul 19, 2020, at 14:25, Thorsten Schöning  wrote:
> Would I need to restore the whole dump as super user? But how do I own
> all those restored contents to some other database user afterwards?

In this case, you may need to change the ownership of the various objects 
directly in the database, rather than using dump/restore as a way of changing 
ownership all at once.  This is not infrequent when you have an existing 
database in which a superuser owns everything
--
-- Christophe Pettus
   x...@thebuild.com





Re: How to restore a dump containing CASTs into a database with a new user?

2020-07-19 Thread Christophe Pettus



> On Jul 19, 2020, at 22:13, Thorsten Schöning  wrote:
> Does Postgres support that in an easy way, without the need to reverse
> engineer an otherwise unknown the schema?

It is straight-forward enough to determine the user-created objects in the 
schema, and then alter their ownership.  For new objects, you can set default 
permissions appropriately.

> That seems very complicated when one simply wants to restore a
> backup into a newly created database.

The complication is arising because you are trying to do two things at the same 
time: Restore the backup, and use that to alter the permissions as a batch.  
That's not straight-forward in the case where you have user-defined CASTs.  You 
should alter the ownership of the user-defined objects, and that will allow you 
to dump and restore the database, if you still need to.

> Additionally, who owns types on which level in the end? To
> successfully restore, I needed to change ownership of type "inet" to
> one new user.

No, you don't, and you (probably) can't change the ownership of "inet".  "inet" 
is a built-in type.  The issue is that you have user-defined objects which are 
owned by the user "postgres"; you should change those to the user that you 
want, leaving the CASTs owned by "postgres".

--
-- Christophe Pettus
   x...@thebuild.com





Re: How to restore a dump containing CASTs into a database with a new user?

2020-07-20 Thread Christophe Pettus



> On Jul 20, 2020, at 02:28, Thorsten Schöning  wrote:
> 
> Would be far easier if Postgres would do that automatically like it
> seems to do for most other objects. The important point is that owning
> those types seems to be per database, so things should be safe to do
> automatically.

I'm not sure I understand exactly how this "feature" would work.  It seems to 
be "in the case that I am using CASTs that include internal types and restoring 
to a different, non-superuser user than the original one in the database that 
was dumped from, change the owner of internal types to make sure that my CAST 
restores work."  That strikes me as a *very* ad hoc feature indeed.

--
-- Christophe Pettus
   x...@thebuild.com





Re: How to restore a dump containing CASTs into a database with a new user?

2020-07-20 Thread Christophe Pettus



> On Jul 20, 2020, at 08:10, Thorsten Schöning  wrote:
> Make internal types used in CASTs owned by the restoring user, like
> all other objects are owned automatically as well.

I don't think that we want to do that, or that we even have to.

Having a restore tool make automatic changes to the ownership of objects in the 
database it is restoring into seems like a bad idea, especially when those 
ownership changes are not part of the backup itself.  On a database with 
multiple users, you can't just get away with changing the ownership of the 
types; you have to make sure that the USAGE is granted appropriately to other 
users.

Again, this is to support a very specific use-case:

* A database has user-defined objects in it that only a superuser can create, 
and,
* The rest of the database objects are owned by that superuser, and,
* You want to change the ownership of the database objects that can be changed, 
and,
* You want to have a single backup that you can restore multiple times, 
changing the ownership in a different way each time, and,
* You want to use pg_restore to do it.

This would require a fair amount of surgery to pg_restore.  Right now, 
pg_restore doesn't really have a "remap these users" functionality.  --no-owner 
*looks* like it does that, and can be used for that in certain cases, but the 
user-remapping functionality of it is really a side-effect.  It happens to 
change the user because instead of altering the user to what it is in the 
backup, it just accepts the default ownership based on the user it is connected 
as.

You can accomplish the same thing by restoring as the superuser, not having to 
alter the ownership of any internal type, and then changing the ownership of 
the user-created objects in the new database once it is restored.  This can be 
done entirely with existing tools, and doesn't need any changes to pg_restore, 
or even having to do ownership changes of internal types (which I strongly 
suspect will bite you later).

--
-- Christophe Pettus
   x...@thebuild.com





Row estimates for empty tables

2020-07-23 Thread Christophe Pettus
I realize I've never quite known this; where does the planner get the row 
estimates for an empty table?  Example:

psql (11.8)
Type "help" for help.

xof=# CREATE TABLE t (i integer, t text, j integer);
CREATE TABLE
xof=# VACUUM ANALYZE t;
VACUUM
xof=# EXPLAIN ANALYZE SELECT * FROM t;
   QUERY PLAN   


 Seq Scan on t  (cost=0.00..22.00 rows=1200 width=40) (actual time=0.015..0.015 
rows=0 loops=1)
 Planning Time: 5.014 ms
 Execution Time: 0.094 ms
(3 rows)

xof=# INSERT INTO t values(1, 'this', 2);
INSERT 0 1
xof=# EXPLAIN ANALYZE SELECT * FROM t;
   QUERY PLAN   


 Seq Scan on t  (cost=0.00..22.00 rows=1200 width=40) (actual time=0.010..0.011 
rows=1 loops=1)
 Planning Time: 0.039 ms
 Execution Time: 0.021 ms
(3 rows)

xof=# VACUUM ANALYZE t;
VACUUM
xof=# EXPLAIN ANALYZE SELECT * FROM t;
 QUERY PLAN 


 Seq Scan on t  (cost=0.00..1.01 rows=1 width=13) (actual time=0.008..0.008 
rows=1 loops=1)
 Planning Time: 0.069 ms
 Execution Time: 0.019 ms
(3 rows)

xof=# DELETE FROM t;
DELETE 0
xof=# VACUUM ANALYZE t;
VACUUM
xof=# EXPLAIN ANALYZE SELECT * FROM t;
   QUERY PLAN   


 Seq Scan on t  (cost=0.00..29.90 rows=1990 width=13) (actual time=0.004..0.004 
rows=0 loops=1)
 Planning Time: 0.034 ms
 Execution Time: 0.015 ms
(3 rows)


--
-- Christophe Pettus
   x...@thebuild.com





Re: Row estimates for empty tables

2020-07-24 Thread Christophe Pettus



> On Jul 24, 2020, at 06:48, Tom Lane  wrote:
> 
> There's certainly not a lot besides tradition to justify the exact
> numbers used in this case. 

Since we already special-case parent tables for partition sets, would a storage 
parameter that lets you either tell the planner "no, really, zero is reasonable 
here" or sets a minimum number of rows to plan for be reasonable?  I happened 
to get bit by this tracking down an issue where several tables in a large query 
had zero rows, and the planner's assumption of a few pages worth caused some 
sub-optimal plans.  The performance hit wasn't huge, but they were being joined 
to some *very* large tables, and the differences added up.
--
-- Christophe Pettus
   x...@thebuild.com





Re: Row estimates for empty tables

2020-07-24 Thread Christophe Pettus


> On Jul 24, 2020, at 12:14, Pavel Stehule  wrote:
> 
> this application stores some results in tables (as guard against repeated 
> calculations). Lot of these tables have zero or one row. 

Yes, that's the situation we encountered, too.  It's not very common (and even 
less common, I would assume, that it results in a bad plan), but it did in this 
case.

--
-- Christophe Pettus
   x...@thebuild.com





Re: Row estimates for empty tables

2020-07-24 Thread Christophe Pettus



> On Jul 24, 2020, at 14:09, Tom Lane  wrote:
> Rather than adding another pg_class column, I'm tempted to say that
> vacuum/analyze should set relpages to a minimum of 1, even if the
> relation has zero pages. 

If there's not an issue about relpages != actual pages on disk, that certain 
seems straight-forward, and no *more* hacky than the current situation.

--
-- Christophe Pettus
   x...@thebuild.com





Re: How to rebuild index efficiently

2020-08-03 Thread Christophe Pettus



> On Aug 3, 2020, at 10:20, Konireddy Rajashekar  wrote:
> Could you please suggest any ideal approach to tackle this ?

You can do CREATE INDEX CONCURRENTLY to build a new index with the same 
definition, and when that is complete, drop the old index.  The locking that is 
required here is modest: CREATE INDEX CONCURRENTLY needs to lock the table 
briefly at a couple of points in the operation, and dropping the old index 
requires a brief lock on the table.  It is, however, much less overall lock 
time than REINDEX would be.

--
-- Christophe Pettus
   x...@thebuild.com





Advancing the archiver position safely

2020-08-06 Thread Christophe Pettus
I've encountered a rather unusual situation (PostgreSQL 9.6).  On a particular 
server, for reasons I've not fully diagnosed, the archiver thinks that the 
current WAL segment to be archived is 00023B680062.  This is 
unfortunate, because the oldest WAL segment that actually exists on disk is 
00023F110004, so the archive script is failing repeatedly because 
of the missing segment.

The system is not actually missing important (for recovery) WAL segments, at 
least:

Latest checkpoint's REDO WAL file:000241760029

I'd like to "catch up" the archiver such that it is operating on files that 
actually exist; besides setting archive_command to '/bin/true' and letting it 
chew through the old ones, is there a way of safely advancing the position of 
the archiver?
--
-- Christophe Pettus
   x...@thebuild.com





Re: Advancing the archiver position safely

2020-08-06 Thread Christophe Pettus



> On Aug 6, 2020, at 18:45, Jerry Sievers  wrote:
> Deleting the .ready file should allow the archiver to get past the
> missing file.

Ah, excellent, yes.

--
-- Christophe Pettus
   x...@thebuild.com





Re: serial + db key, or guid?

2020-08-10 Thread Christophe Pettus



> On Aug 10, 2020, at 15:19, Mark Phillips  wrote:
> Advice, cautionary tales, suggestions and such will be warmly received.

Here's one solution a company found for this; it seems to work very well:

https://instagram-engineering.com/sharding-ids-at-instagram-1cf5a71e5a5c

--
-- Christophe Pettus
   x...@thebuild.com





Re: serial + db key, or guid?

2020-08-11 Thread Christophe Pettus



> On Aug 11, 2020, at 09:37, Mark Phillips  wrote:
> 
> I posed the question on the chance things had evolved since 2012, 
> specifically as it relates to postgres.

The essentials haven't changed.  Keys (such as UUIDs, especially UUID v4) that 
have most of their randomness in the most significant bits can cause 
significant cache hit problems on large indexes.  128 bit keys are usually 
overkill for most applications, unless you need actual *global* uniqueness 
across more than a single database or installation; 64 bit keys are usually 
sufficient.

UUIDs (and similar very large random keys) do have the advantage that they are 
somewhat self-secure: You can expose them to outsiders without having to worry 
about other keys being guessable.
--
-- Christophe Pettus
   x...@thebuild.com





Re: Why SELECT COUNT(*) takes so long?

2020-09-13 Thread Christophe Pettus



> On Sep 13, 2020, at 23:09, Matthias Apitz  wrote:
> Why a SELECT COUNT(*) of the
> full table takes around 1 minute:

There's an explanation here:

https://wiki.postgresql.org/wiki/Slow_Counting

--
-- Christophe Pettus
   x...@thebuild.com





Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

2020-10-01 Thread Christophe Pettus



> On Oct 1, 2020, at 16:08, tutilu...@tutanota.com wrote:
> But of course I should be grateful no matter what because it doesn't cost 
> money.

No one is asking you to be grateful.  However, you are asking for other people 
to do things that important to you, but not them.  They are not required to do 
so.  If you cannot persuade them, and are not in a position to pay them, then 
that's a reality you'll just have to accept.

--
-- Christophe Pettus
   x...@thebuild.com





Re: UUID with variable length

2020-10-15 Thread Christophe Pettus



> On Oct 15, 2020, at 13:49, Dirk Krautschick  
> wrote:
> Or do you have some other ideas how to use a primary key datatype like UUID 
> but with variable length?

You're probably best off storing it as a VARCHAR() with a check constraint or 
constraint trigger that validates it.

--
-- Christophe Pettus
   x...@thebuild.com





Re: Christopher Browne

2020-11-04 Thread Christophe Pettus



> On Nov 4, 2020, at 15:29, Steve Singer  wrote:
> 
> It is with much sadness that I am letting the community know that Chris 
> Browne passed away recently.

I'm so sorry to hear this.  I did not know him well, but he was always a kind 
and friendly face anytime I saw him at events.

--
-- Christophe Pettus
   x...@thebuild.com





Re: JSONB order?

2020-11-05 Thread Christophe Pettus



> On Nov 5, 2020, at 07:34, Tony Shelver  wrote:
> But...  seen above, the order gets mixed up.
> 
> Any ideas?

JSON objects, like Python dicts, are not automatically ordered by key.  Once 
you move from the column space to the JSON object space, you can't rely on the 
object keys being in a consistent order.

You'll want to have a step when ingesting the JSON object into a report that 
lines up the key values appropriately with the right presentation in the report.
--
-- Christophe Pettus
   x...@thebuild.com





Re: JSONB order?

2020-11-05 Thread Christophe Pettus



> On Nov 5, 2020, at 07:45, Tony Shelver  wrote:
> Thanks Christophe, that's what I thought.  
> Just seemed weird that they were 'disordered' in exactly the same way every 
> time.
> 
> FYI, as of Python 3.7, dicts are ordered.
> 
> The problem is that we are possibly going to have many versions of these 
> forms with slightly differing keys, which will be a pain to order in some 
> hard coded way.

As Magnus noted, you can use JSON instead of JSONB.  JSON is basically a text 
blob with a syntax check wrapper around it, so it will be order-stable once 
created.  (If you run it through a JSONB-expecting function, then the ordering 
may change again.)  It's less efficient to operate on than JSONB, but that 
might be OK for your purposes.

--
-- Christophe Pettus
   x...@thebuild.com





Re: psql backward compatibility

2020-11-18 Thread Christophe Pettus



> On Nov 18, 2020, at 08:05, Stephen Haddock  wrote:
> When upgrading an older version of postgres, version 8.4 for example, to a 
> newer version such as 9.6, does the data have to be migrated immediately?

Yes.  You cannot run binaries from a newer major version of PostgreSQL on a 
cluster that was initialized with an older major version.  You'll need to do a 
pg_dump/pg_restore, or use  pg_upgrade to create a new cluster.

--
-- Christophe Pettus
   x...@thebuild.com





Re: copy command - something not found

2020-12-29 Thread Christophe Pettus



> On Dec 29, 2020, at 11:12, Susan Hurst  wrote:
> 
> ##-- shell script command
> psql -d ${DBNAME} -U ${DBNAME} -h ${HOSTNAME} -c < ${CSVPATH}copycmd.z

The -c argument there specifies a command to run, so it needs an argument of 
some kind.  It looks like ultimately the .csv file gets handed to the shell to 
execute, which of course doesn't work very well.

The file ${CSVPATH}copycmd.z contains the COPY command to run, yes?

The -i argument specifies a file that contains a command to run, so you might 
give this a go:

psql -d ${DBNAME} -U ${DBNAME} -h ${HOSTNAME} -i "${CSVPATH}copycmd.z"
--
-- Christophe Pettus
   x...@thebuild.com





Re: Getting "could not read block" error when creating an index on a function.

2020-12-30 Thread Christophe Pettus



> On Dec 30, 2020, at 11:37, Demitri Muna  wrote:
> I want to index the results of these repeated, unchanging calculations to 
> speed up other queries. Which mechanism would be best to do this? Create 
> additional columns? Create another table?

This might be a good use for a generated column.

https://www.postgresql.org/docs/current/ddl-generated-columns.html
--
-- Christophe Pettus
   x...@thebuild.com





Re: Getting "could not read block" error when creating an index on a function.

2020-12-30 Thread Christophe Pettus



> On Dec 30, 2020, at 11:48, Christophe Pettus  wrote:
> 
> This might be a good use for a generated column.
> 
>   https://www.postgresql.org/docs/current/ddl-generated-columns.html

I take that back; the generation formula has to be immutable as well.  Perhaps 
a column populated by a trigger?
--
-- Christophe Pettus
   x...@thebuild.com





Re: Trigger with conditional predicates

2021-01-01 Thread Christophe Pettus



> On Jan 1, 2021, at 07:56, Dirk Mika  wrote:
> In particular, columns are populated with values if they are not specified in 
> the update statement which is used.
> Usually with an expression like this:
> 
>  IF NOT UPDATING('IS_CANCELED')
>  THEN
> :new.is_canceled := ...;
>  END IF;
> 
> I have not found anything similar in PostgreSQL. What is the common approach 
> to this problem?

PostgreSQL doesn't have an exact equivalent.  Typically, the OLD and NEW values 
are compared and then action is taken based on that.  For example, in PL/pgSQL:

IF NEW.is_canceled IS NOT DISTINCT FROM OLD.is_canceled THEN
NEW.is_canceled := etc etc ;
ENDIF;

There's currently no way to detect if the column was simply not mentioned at 
all in the UPDATE statement.

--
-- Christophe Pettus
   x...@thebuild.com





Re: Trigger with conditional predicates

2021-01-04 Thread Christophe Pettus



> On Jan 4, 2021, at 11:06, Dirk Mika  wrote:
> 
> See thread below:
> 
> https://www.postgresql.org/message-id/VisenaEmail.26.7cbf2947c8d23ceb.1769a2755ff%40tc7-visena
> 
> I found that thread already, but It doesn't not provide a solution to my 
> problem.

One possibility, which is admittedly very hacky, is:

-- Create a new column which is a flag (or bitmap) of other columns that need 
to be managed in this way, with a default of 0.
-- Have two EACH ROW triggers:

* The first is ON UPDATE OF the actual column to managed, and sets the 
appropriate flag or bitmap in the flag column when run.  This flags that the 
application has updated the column.

* The second, which runs always, checks that flag, and if it is set, clears it; 
otherwise, it sets the column to the value desired if the application didn't 
change it.

Of course, the order of execution of these triggers matters; PostgreSQL 
executes triggers at the same level alphabetically.

Now, this is a pretty high-overhead way of handling it, and it is probably 
better to see if there is an application logic change that can happen here.

Best,
--
-- Christophe Pettus
   x...@thebuild.com





Re: Max# of tablespaces

2021-01-05 Thread Christophe Pettus



> On Jan 5, 2021, at 13:55, Thomas Flatley  wrote:
> 
> As far as I can tell, each tablespace is a partition, and I assume they felt 
> this was the best way to perform partition maintenance - again, I don’t know 
> , 

It's a very common Oracle-ism to have a lot of tablespaces, in part because 
(IIRC) Oracle makes it an incredible pain in the neck to add tablespaces once 
the DB is in use.  For sharding purposes, you probably want schemas in 
PostgreSQL instead of tablespaces, although having that many schemas is going 
to not be optimal, either.

--
-- Christophe Pettus
   x...@thebuild.com





Re: FTS and tri-grams

2021-01-05 Thread Christophe Pettus



> On Jan 5, 2021, at 13:26, Mark Phillips  wrote:
> 1. Is FTS required for tri-gram to work?
> 2. Are these independent of each other?
> 3. Is tri-gram alone sufficient for a “full text search” feature?

The answers are, kind of in order:

2. Yes.
1. No.
3. It depends on what you mean by "full text search."

Trigrams are mostly for fuzzy matching on a single or small number of words.  
There are things that the tsvector machinery can do that trigrams can't, such 
as proximity searches between words, prefix and stemmed searches, and things of 
that type.  If you just want fuzzy searching on a small number of words, 
trigrams are probably fine; for more sophisticated kinds of searching, you want 
tsvector.

They're completely different sets of functionality in PostgreSQL.

--
-- Christophe Pettus
   x...@thebuild.com





Re: How to keep format of views source code as entered?

2021-01-07 Thread Christophe Pettus



> On Jan 7, 2021, at 08:19, Markhof, Ingolf  
> wrote:
> I want the SLQ code of my views stored as I entered it. Is there any way to 
> achieve this? Or will I be forced to maintain my views SQL code outside of 
> PostgreSQL views?

The text that you get back from the PostgreSQL system catalogs is based on the 
parsed version of the view definition, rather than the literal text you 
entered.  Generally, you maintain your view definition separately in a source 
code control system in its original form.

--
-- Christophe Pettus
   x...@thebuild.com





Re: How to keep format of views source code as entered?

2021-01-07 Thread Christophe Pettus
Hello,

> On Jan 7, 2021, at 09:33, Markhof, Ingolf  
> wrote:
> 
> So, it looks like PostgreSQL does support saving the original source code of 
> a view.

To be clear, PostgreSQL itself does not.  The suggestion is to use an external 
source code repository, such as GitHub, GitLab, or one of (many!) other tools 
or products to store the view definition.

This has benefits besides just retaining the original source code, as you 
mention below: Version control, tracking, issue management and commit merging, 
etc.
--
-- Christophe Pettus
   x...@thebuild.com





Re: How to keep format of views source code as entered?

2021-01-09 Thread Christophe Pettus



> On Jan 9, 2021, at 06:22, Markhof, Ingolf  
> wrote:
> What I would like to have is something that would automatically update the 
> SQL code in the software repository when I run a CREATE OR REPLACE VIEW.

I think you are approaching this backwards.  The SQL in the repository should 
be the definitive version.  If you wish to change the view, you change the 
CREATE OR REPLACE VIEW command that you have stored in the repository, and then 
apply that to the database so it now has the new view definition.

You may not, in a small independent project, feel the need for a source code 
repository, but it becomes very useful very quickly.

--
-- Christophe Pettus
   x...@thebuild.com





Re: Best tools to monitor and fine tune postgres

2021-01-15 Thread Christophe Pettus
This conversation doesn't really have anything to do with monitoring and 
fine-tuning PostgreSQL, at this point, does it?  It might be appropriate to let 
go the meta-discussion when each individual person thinks it is appropriate to 
answer a question.

--
-- Christophe Pettus
   x...@thebuild.com





Re: Do we need a way to moderate mailing lists?

2021-01-15 Thread Christophe Pettus



> On Jan 15, 2021, at 21:51, Hemil Ruparel  wrote:
> 
> This is a meta discussion. I couldn't find a meta mailing list so I am 
> posting it here. This discussion sparked from this message. 

If you feel someone is being disruptive, or abusive, there is a Code of Conduct 
process:

https://www.postgresql.org/about/policies/coc/
--
-- Christophe Pettus
   x...@thebuild.com





Re: Do we need a way to moderate mailing lists?

2021-01-15 Thread Christophe Pettus



> On Jan 15, 2021, at 21:51, Hemil Ruparel  wrote:
> It's my personal opinion. But i personally do not want to deal with entitled 
> people who cannot do basic google searches.

Well, I have to mention that you don't have to deal with them.  It might be 
slightly irritating to read questions like that, but they really aren't choking 
the list, and you can just move on to a different question that you feel like 
answering.

There is also a virtue in people asking very basic questions on the list, even 
one that could be revealed by a search, because it gives people who are not 
deep experts a chance to answer the question.
--
-- Christophe Pettus
   x...@thebuild.com





Re: Do we need a way to moderate mailing lists?

2021-01-15 Thread Christophe Pettus



> On Jan 15, 2021, at 22:10, Hemil Ruparel  wrote:
> 
> I fear that mailing lists become like quora.

pgsql-general is almost old enough to drink. :)  The very first message I could 
find with a reliable date is from 31 May 2000.  I think after nearly 21 years, 
we don't have much to worry about in that regard.

That message also includes the statements:

> What you are asking for is replication, which is not easy to implement, and 
> almost damn impossible to get it RIGHT.

and

> Now, what is WAL? When is it scheduled for implementation?

... which is pretty wild all by itself.
--
-- Christophe Pettus
   x...@thebuild.com





Re: Do we need a way to moderate mailing lists?

2021-01-15 Thread Christophe Pettus



> On Jan 15, 2021, at 22:19, Hemil Ruparel  wrote:
> 
> I have no problems if there are one or two questions which are exactly the 
> same. I give them the benefit of doubt. What I won't tolerate are entitled 
> people who think we work for them for free and that they are entitled to 
> receive and answer.

I suppose it would be rude to point out that PostgreSQL list style is to not 
top-post?  I have to say, if you are going to be firm with people about 
etiquette...

If someone gets abusive about not receiving help (and it does happen, sadly), 
that's exactly the kind of thing the Code of Conduct was designed for.  If they 
are seriously spamming the list, likewise.

For a lot of people, though, they just aren't familiar with list etiquette, do 
not have English as their first language and are not clear what is being asked 
of them, or just don't know the resources out there.

I would assume they are acting in good faith.  If you politely point out 
resources to them and they get snappish, then it can become a CoC issue.  
Otherwise, I think that being generous in what we receive and accurate in what 
we reply, as with any protocol, is the right answer.
--
-- Christophe Pettus
   x...@thebuild.com





Re: interval data type

2021-01-21 Thread Christophe Pettus



> On Jan 21, 2021, at 13:22, James B. Byrne  wrote:
> 
> What is the difference between interval(3)[] and simply interval(3)?  Where in
> the documentation is the [] syntax discussed?

The [] syntax means an array.  For example, float[] means an array of floating 
point numbers, so interval[] means an array of intervals.


> I got this to work with: ADD COLUMN lead_time interval day;  and also 
> with:
>ADD COLUMN lead_time interval(3); but I do not understand what these mean
> frankly. Does the form 'interval(3) imply a field value of SECOND?

No.  An interval in PostgreSQL has multiple components: the year, month, and 
day intervals are all stored separately.  For example, if months were always 
converted to seconds (or days), this wouldn't work properly:

xof=# SELECT '2021-01-01'::date + '1 month'::interval;
  ?column?   
-
 2021-02-01 00:00:00
(1 row)

xof=# SELECT '2021-02-01'::date + '1 month'::interval;
  ?column?   
-
 2021-03-01 00:00:00
(1 row)

The value in parenthesis is the number of decimal places to store fractional 
seconds:

xof=# select '0.33312312'::interval;
interval 
-
 00:00:00.333123
(1 row)

xof=# select '0.33312312'::interval(3);
   interval   
--
 00:00:00.333
(1 row)

> Are there other types of 'fields' that may be used with interval that are not
> given?

No, that list is exhaustive.  The "fields" in the discussion of interval are 
not the same as the columns in a table; the documentation is talking about the 
components of an interval.

> I could not find a definition of 'sectored fields' in the manual.  What is its
> meaning?

I don't believe that's a thing in PostgreSQL, and I didn't see the word 
"sectored" in the documentation.  Can you quote where you saw it?

> Also I do not understand under what circumstance one would use the interval
> type in place of a simple integer.

Interval represents more than just a count of seconds or milliseconds, or some 
other unit; it also includes intervals that are not a fixed number of seconds, 
such as months and years.

--
-- Christophe Pettus
   x...@thebuild.com





Re: Can I use Postgres rules to reset session variables before/after queries?

2021-01-24 Thread Christophe Pettus



> On Jan 24, 2021, at 21:00, Andrew Stuart  wrote:
> Can anyone suggest if session variables can be SET/RESET using Postgres 
> rules, or optionally perhaps there is a better way to do so?

PostgreSQL poolers generally use the RESET ALL command when reassigning a 
session to clear the session state:

https://www.postgresql.org/docs/current/sql-reset.html

You probably want to do this rather than try to intercept every single 
operation in order to the reset at the end.

--
-- Christophe Pettus
   x...@thebuild.com





MultiXactMemberControlLock contention on a replica

2021-02-12 Thread Christophe Pettus
On a whole fleet of load-balanced replicas, we saw an incident where one 
particular query started backing up on MultiXactMemberControlLock and 
multixact_member.  There was no sign of this backup on the primary.  Under what 
conditions would there be enough multixact members on a replica (where you 
can't do UPDATE / SELECT FOR UPDATE / FOR SHARE) to start spilling to disk?
--
-- Christophe Pettus
   x...@thebuild.com





Re: MultiXactMemberControlLock contention on a replica

2021-02-15 Thread Christophe Pettus



> On Feb 15, 2021, at 07:47, Laurenz Albe  wrote:
> So my guess would be that the difference between primary and standby is not 
> that a
> different number of multixacts are created, but that you need to read them on
> the standby and not on the primary.

Why does the secondary need to read them?  Visibility?
--
-- Christophe Pettus
   x...@thebuild.com





Re: MultiXactMemberControlLock contention on a replica

2021-02-15 Thread Christophe Pettus



> On Feb 15, 2021, at 08:15, Laurenz Albe  wrote:
> Right.  I cannot think of any other reason, given that the standby only
> allows reading.  It's just an "xmax", and PostgreSQL needs to read the
> multixact to figure out if it can see the row or not.

OK, I think I see the scenario: A very large number of sessions on the primary 
all touch or create rows which refer to a particular row in another table by 
foreign key, but they don't modify that row.  A lot of sessions on the 
secondary all read the row in the referred-to table, so it has to get all the 
members of the multixact, and if the multixact structure has spilled to disk, 
that gets very expensive.

--
-- Christophe Pettus
   x...@thebuild.com





Re: "A block containing an EXCEPTION clause is significantly more expensive to enter and exit than a block without one"

2022-06-13 Thread Christophe Pettus



> On Jun 12, 2022, at 23:07, Pavel Stehule  wrote:
> The lazy implementation theoretically can be possible, but why?

Isn't one of the reasons for the savepoint (in fact, the principal reason) to 
reset the connection back to non-error state so that execution can continue?  
In that case, it really does need to create the savepoint at the start of the 
block, regardless of what's in it, since any statement can raise an error.



Re: User's responsibility when using a chain of "immutable" functions?

2022-06-28 Thread Christophe Pettus



> On Jun 28, 2022, at 18:41, Bryn Llewellyn  wrote:
> Should I simply understand that when I have such a dynamic dependency chain 
> of "immutable" functions, and should I drop and re-create the function at the 
> start of the chain, then all bets are off until I drop and re-create every 
> function along the rest of the chain?

Yes.

You don't have to drop and recreate the functions, though.  DISCARD PLANS 
handles it as well:

xof=# create function f1() returns text as $$ begin return 'cat'; end $$ 
language plpgsql immutable;
CREATE FUNCTION
xof=# create function f2() returns text as $$ begin return f1(); end $$ 
language plpgsql immutable;
CREATE FUNCTION
xof=# create function f3() returns text as $$ begin return f2(); end $$ 
language plpgsql immutable;
CREATE FUNCTION
xof=# select f1(), f2(), f3();
 f1  | f2  | f3  
-+-+-
 cat | cat | cat
(1 row)

xof=# drop function f1();
DROP FUNCTION
xof=# create function f1() returns text as $$ begin return 'dog'; end $$ 
language plpgsql immutable;
CREATE FUNCTION
xof=# select f1(), f2(), f3();
 f1  | f2  | f3  
-+-+-
 dog | dog | cat
(1 row)

xof=# discard plans;
DISCARD PLANS
xof=# select f1(), f2(), f3();
 f1  | f2  | f3  
-+-+-
 dog | dog | dog
(1 row)

xof=# 

The contract on an immutable function is that it returns the same return value 
for particular input values regardless of database or system state: that is, 
it's a pure function.  Changing the definition in such a way breaks the 
contract, so I don't think PostgreSQL needs to do heroics to accommodate that 
situation.  (For example, changing the definition of an immutable function 
that's used in an expression index could corrupt the index.)  If one's fixing a 
bug, then rolling out the change in a controlled way is a reasonable 
requirement.



Re: User's responsibility when using a chain of "immutable" functions?

2022-06-29 Thread Christophe Pettus



> On Jun 28, 2022, at 23:42, Laurenz Albe  wrote:
> That is not enough in the general case.  You are not allowed to redefine
> an IMMUTABLE function in a way that changes its behavior [...]

I think "not allowed" is putting it too strongly.  It would be a bit much to 
ask that every single user-written immutable function be 100% perfect when it 
is rolled out, and never have to fix any bugs in them.  However, you definitely 
*do* have to understand that there are administrative consequences for doing 
so, like rebuilding indexes and invalidating session caches.  I think that the 
OP's statement that you can't ever use user-defined functions from an immutable 
function is too strong, too; you need to be aware of the consequences if you 
change an immutable function in a way that alters the return result for a 
previously-valid set of arguments.



Re: Seems to be impossible to set a NULL search_path

2022-07-05 Thread Christophe Pettus



> On Jul 5, 2022, at 11:12, Bryn Llewellyn  wrote:
> Finally, what do you think of a possible future enhancement to allow setting 
> a null search_path?

You use the empty string, rather than NULL, but it works right now:

xof=# show search_path;
   search_path   
-
 "$user", public
(1 row)

xof=# select * from t;
 i | d1 | d2 
---++
(0 rows)

xof=# set search_path='';
SET
xof=# show search_path;
 search_path 
-
 ""
(1 row)

xof=# select * from t;
ERROR:  relation "t" does not exist
LINE 1: select * from t;
  ^
xof=# 






Re: lifetime of the old CTID

2022-07-05 Thread Christophe Pettus



> On Jul 5, 2022, at 22:35, Matthias Apitz  wrote:
> Internally, in the DB layer, the read_where() builds the row list matching
> the WHERE clause as a SCROLLED CURSOR of
> 
>SELECT ctid, * FROM d01buch WHERE ...
> 
> and each fetch() delivers the next row from this cursor. The functions
> start_transaction() and end_transaction() do what their names suggest and
> rewrite_actual_row() does a new SELECT based on the ctid of the actual row
> 
>SELECT * FROM d01buch WHERE ctid = ... FOR UPDATE
>...
>UPDATE ...

On first glance, it appears that you are using the ctid as a primary key for a 
row, and that's highly not-recommended.  The ctid is never intended to be 
stable in the database, as you have discovered.  There are really no particular 
guarantees about ctid values being retained.

I'd suggest having a proper primary key column on the table, and using that 
instead.



Re: Seems to be impossible to set a NULL search_path

2022-07-06 Thread Christophe Pettus



> On Jul 6, 2022, at 09:48, Bryn Llewellyn  wrote:
> Neither causes an error. The "show", in each case, prints the bare value with 
> no quotes. It never struck me try try double quotes around the timezone 
> argument. I'm shocked that they are silently accepted here and seem to have 
> the same effect (in this syntax setting) as single quotes.

It's really a lot easier than it is being made out to be.

GUCs accept a string as a value, which might then be converted to other values 
based on the semantics of the GUC.  PostgreSQL generously accepts three 
different ways of delimiting this string:

1. Single quotes (as in SQL).
2. Double quotes (which in SQL, yes, are used to delimit names).
3. No delimiter at all if there are no embedded characters that would terminate 
the value prematurely.

That's pretty much it.  That's the rule.  In the case of search_path, the 
argument is "a list of SQL names, including possibly one or none".  Now, one 
could perhaps argue that PostgreSQL is being overly-accepting by allowing #2, 
since it does create odd situations like you describe with search_path, quotes, 
and so forth (there might be others, but search_path is the only one that jumps 
to mind).  That being said, it does, it has for decades, and there's really no 
compelling reason to change it.  Of all the things that might be described as 
"shocking" about PostgreSQL, "GUCs allow double-quotes to delimit values for 
convenience and historic reasons" is not quite the one I would pick.



Re: lifetime of the old CTID

2022-07-06 Thread Christophe Pettus



> On Jul 6, 2022, at 12:51, Matthias Apitz  wrote:
> it is uniqu to identify a row in a table once
> known.

I think the point that we are trying to make here is that a ctid *isn't* that.  
There is no guarantee, at all, at any level, that the ctid of a row will remain 
stable, not even between two SELECT statements.  (Although it doesn't right 
now, I could easily image some kind of repack logic in PostgreSQL that runs on 
read operations, not just write.)  It shouldn't be considered an API.  I 
understand that it might be painful to change to a generated primary key, but I 
think that will be less painful in the long run than having to stay ahead of 
PostgreSQL's internals.



Re: equivalent thing of mtr in mysql

2022-07-18 Thread Christophe Pettus



> On Jul 18, 2022, at 20:03, merryok  wrote:
> 
> I've read the doc, and it doesn't help too much.
> Finally I've found START_CRIT_SECTION and END_CRIT_SECTION. It's like 
> mtr.start(), mtr.commit() in mysql. May I ask why many places are wrapped 
> into START_CRIT_SECTION/END_CRIT_SECTION during a single dml operation ?

A PostgreSQL critical section is not the equivalent of MySQL InnoDB 
mini-transaction.

A critical section in PostgreSQL is a section of code that needs to run without 
interruption to avoid corruption of internal in-memory data structures.

PostgreSQL doesn't have a direct equivalent of a MySQL mini-transaction.  When 
WAL information is created by a statement, it's stored in the WAL buffers, and 
then flushed to disk by the WAL writer (to a first approximation).  There's no 
special operation that groups pages together for atomic writes; that's done by 
the underlying file system flush operation.

> And if Assert(CritSectionCount > 0) isn't satisfied (CritSectionCount need't 
> be protected ?), PG server will panic and exit ? If so, what's the 
> probability of that ?

If it occurs, it indicates a bug in PostgreSQL.  It is *extremely* infrequent 
(as in, you can go years without seeing one; I can't remember the last time I 
did).



Re: « The PL/pgSQL interpreter parses the function's source text and produces an internal binary instruction tree... »

2022-07-28 Thread Christophe Pettus



> On Jul 28, 2022, at 18:04, Bryn Llewellyn  wrote:
> Is this expected?

Yes.  This isn't a bug.

> In other words, is there a careful explanation of what "parse" means in the 
> context of "create or replace" for a subprogram that predicts all of the 
> outcomes that I reported here?


Database objects (such as tables and columns) are left as identifiers until 
they are executed, because that is the point at which a plan for those 
statements is created.  The other components of PL/pgSQL are translated to 
internal form (and thus checked for existence) as compile time.



Re: « The PL/pgSQL interpreter parses the function's source text and produces an internal binary instruction tree... »

2022-07-28 Thread Christophe Pettus



> On Jul 28, 2022, at 18:49, Bryn Llewellyn  wrote:
> It's this that surprises me. And it's this, and only this, that I'm asking 
> about: might _just_ this be a fixable bug?

It might be surprising, but it's not a bug.  You can demonstrate it with a very 
small test case:

CREATE FUNCTION f() RETURNS VOID AS $$
DECLARE
   x int not null := 0;
BEGIN
   x := y;
END;
$$ language plpgsql;

But gets an error on execution:

xof=# SELECT f();
ERROR:  column "y" does not exist
LINE 1: x := y
 ^
QUERY:  x := y
CONTEXT:  PL/pgSQL function f() line 5 at assignment

The clue is that it is complaining about a missing "column."  Assignment in 
PL/pgSQL is essentially syntactic sugar around a SELECT ... INTO.  The 
assignment there is processed pretty much as if it were written:

SELECT y INTO x;

Note, however, that this does *not* compile:

CREATE OR REPLACE FUNCTION f() RETURNS VOID AS $$
DECLARE
   x int not null := 0;
BEGIN
   y := x;
END;
$$ language plpgsql;

ERROR:  "y" is not a known variable
LINE 5:y := x;

Unquestionably, this is surprising!  The reasons, such as they are, are based 
in how PL/pgSQL processes SQL statements.  (For example, if you look at the 
grammar, it literally takes "SELECT x INTO y;" turns it into "SELECT x   
;", and passes that to the SPI.  This has the virtue that it doesn't have to 
have a complete PostgreSQL SQL grammar replicated in it (what a nightmare), but 
it does result in some of the implementation poking through.



Re: Upgrading from 12.3 to 12.11

2022-08-03 Thread Christophe Pettus



> On Aug 3, 2022, at 10:16, zaphod61  wrote:
> 
> 
> I've inherited a postgresql 12.3 installation. It has 1 database in it.  I 
> need to upgrade it to the newest version, which appears to be 12.11. Can I 
> just download the installer for 12
> 11 and run that to upgrade the product and still maintain access to the 
> existing connections?

Any upgrade will require a server restart.  If it's a minor version upgrade, 
you just need to replace the binaries, but *always* read the upgrade notes for 
post-upgrade housekeeping.



Re: Setting up streaming replication on large database (20+ TB) for the first time

2022-08-17 Thread Christophe Pettus



> On Aug 17, 2022, at 13:06, Ivan N. Ivanov  wrote:
> 
> How to speed up recovering of WAL files?

Since you are running on your own hardware, you might take a look at:

https://github.com/TritonDataCenter/pg_prefaulter




Re: Unable to Create or Drop Index Concurrently

2022-08-17 Thread Christophe Pettus



> On Aug 17, 2022, at 22:57, Abdul Qoyyuum  wrote:
> Question is, do we have to shutdown traffic and close all existing open 
> connections in order to drop and properly recreate the index?

No, you don't.

On the CREATE INDEX CONCURRENTLY command, what is likely going on is that when 
the connection drops, the session terminates, which will terminate the CREATE 
INDEX CONCURRENTLY command and leave the index in an INVALID state.  The 
problem to solve is preventing the session from disconnecting, either by 
finding a way to avoid a timeout, connecting via screen or tmux, etc.

On the DROP INDEX, what is likely going on is that the DROP INDEX is waiting 
for other transactions which are accessing that table to finish, since it needs 
to take an exclusive lock on the table.  If the session drops, the command 
isn't run, so the index hasn't been dropped.  The solution is the same as 
above.  If you are on a version that supports it,  you can use the DROP INDEX 
CONCURRENTLY command to avoid locking issues with the table, since even before 
the DROP INDEX happens, new transactions attempting to access that table will 
queue up behind the DROP INDEX.



SIReadLock vs hot_standby_feedback

2022-08-18 Thread Christophe Pettus
I am reasonably sure the answer to this is "no", but can the oldest xmin that 
hot_standby_feedback sends to the primary also delay SIReadLocks cleanup?  
Going through the code, it looks like they're independent, but this isn't a 
part of the system I know particularly well.



Re: With Recursive / Recursive View question

2022-08-20 Thread Christophe Pettus



> On Aug 20, 2022, at 15:42, Perry Smith  wrote:
> 
> To rephrase, is it possible to write a view that would work from the child 
> terms out towards the ancestors?

Assuming that the concern is that you want to parameterize this predicate:

WHERE basename = '10732.emlx'

... you might consider an SQL function taking basename as a parameter.



Re: Two questions about "pg_constraint"

2022-08-25 Thread Christophe Pettus



> On Aug 25, 2022, at 21:43, Bryn Llewellyn  wrote:
> [...]

I've read this a few times, and I am having trouble understanding what behavior 
you were expecting out of PostgreSQL, and what behavior you received that you 
didn't think was correct.  If it is "pg_constraint has a column connamespace, 
and that appears to be a denormalization since a constraint is always in the 
same schema as the table it is owned by," I believe Tom explained the reason 
for that.

If that's not what is concerning you, can you summarize it in a sentence two?





Re: Two questions about "pg_constraint"

2022-08-26 Thread Christophe Pettus



> On Aug 26, 2022, at 15:33, Bryn Llewellyn  wrote:
> [...]

I'm still not clear on what you are proposing.  Are you proposing a change to 
PostgreSQL to remove the "connamespace" column from the "pg_constraint" table, 
since it can be derived from other tables?



Re: Two questions about "pg_constraint"

2022-08-26 Thread Christophe Pettus



> On Aug 26, 2022, at 18:47, Bryn Llewellyn  wrote:
> No, I’m not proposing any code change.

Thanks for clarifying.




Re: Diffs in PG output vs WAL

2022-09-01 Thread Christophe Pettus



> On Sep 1, 2022, at 10:14, V  wrote:
> I want new/old tuples with pgoutput. how?

I assume here you are reading the pgoutput protocol directly.

Logical decoding sends out two tuple structures:

1. The replica identity of the row (in the case of update and delete).
2. The new row data (in the case of update and insert).

If you want to get the entire content of the old row, you'll need to set the 
table as REPLICA IDENTITY FULL.  This is something of a hack, and *greatly* 
increases the data volume, so you may want to use the subscribing side for the 
old tuple (perhaps captured with a trigger).



Re: Unable to archive logs in standby server

2022-09-02 Thread Christophe Pettus



> On Sep 1, 2022, at 21:41, Meera Nair  wrote:
> Archival hangs. Is this expected?
> postgres=# select pg_start_backup('test', true, false);
> pg_start_backup
> -
> 1/F960
> (1 row)
>  
> postgres=# select pg_stop_backup('f');
> NOTICE:  base backup done, waiting for required WAL segments to be archived
> WARNING:  still waiting for all required WAL segments to be archived (60 
> seconds elapsed)
> HINT:  Check that your archive_command is executing properly.  You can safely 
> cancel this backup, but the database backup will not be usable without all 
> the WAL segments.
> WARNING:  still waiting for all required WAL segments to be archived (120 
> seconds elapsed)
> HINT:  Check that your archive_command is executing properly.  You can safely 
> cancel this backup, but the database backup will not be usable without all 
> the WAL segments.

This generally means the command being run by archive_command is failing.  
Check the PostgreSQL logs (if you are using CSV logs, check the *.log file 
rather than the *.csv file).



  1   2   3   4   >