RE: Load data from a csv file without using COPY

2018-06-25 Thread Kevin Brannen
From: Ravi Krishna [mailto:srkris...@yahoo.com]
Sent: Tuesday, June 19, 2018 4:15 PM
To: Steve Atkins 
Cc: PG mailing List 
Subject: Re: Load data from a csv file without using COPY

>
> If performance is relevant then your app should probably be using COPY
> protocol, not line by line inserts. It's supported by most postgresql
> access libraries. If your app does that then using "\copy" from psql would be 
> an appropriate benchmark.

Actually, the reluctance to use COPY is to keep the benchmark the same across
two different RDBMSes in two different environments.

---

Seems like a ~10-line Perl program could handle this very easily. Use the
Text::CSV module to make handling the input easier. Prepare your insert
statement, then execute the insert for each line of input you pull in.
For slightly better performance, I'd probably use transactions, add a counter,
and commit every 10,000 rows (or something like that).

Once you have that working with 1 DB, you can copy your program, change the DBD 
driver and the connect statement to the other DB and try the other one. Unless 
you want to be really clever and make the same program do both and pick the DB 
by a command line switch. :)
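
Something like this sketch is what I have in mind (the file name, table, and
column names are made up for illustration; adjust them for the real benchmark):

#!/usr/bin/perl
# sketch: load a CSV file with prepared INSERTs, committing every 10,000 rows
use strict;
use warnings;
use DBI;
use Text::CSV;

my $dbh = DBI->connect('dbi:Pg:dbname=bench', 'bench_user', 'secret',
                       { AutoCommit => 0, RaiseError => 1 });
# the placeholders must match the CSV's column count
my $ins = $dbh->prepare('INSERT INTO target_table (col1, col2, col3) VALUES (?, ?, ?)');

my $csv = Text::CSV->new({ binary => 1 });
open my $fh, '<', 'data.csv' or die "data.csv: $!";

my $count = 0;
while (my $row = $csv->getline($fh)) {
    $ins->execute(@$row);
    $dbh->commit if ++$count % 10_000 == 0;   # commit every 10K rows
}
$dbh->commit;                                  # pick up the last partial batch
close $fh;
$dbh->disconnect;

Switching RDBMSes is then mostly a matter of changing the DSN in the connect()
call (and the INSERT if the SQL dialect differs), or picking the DSN from that
command line switch.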

HTH,
Kevin



dumping only table definitions

2018-06-29 Thread Kevin Brannen
I'm trying to figure out how to dump only the table definitions, well those and 
things they need directly, like sequences & types. What I do NOT want are all 
the millions (not literally but it feels like it :)) of functions we have. 
Triggers would be all right if I must, as we only have a few of those.

I've tried various combinations of args to pg_dump, with -s being what I'd like 
to work, but I still get all the functions.

I suppose I could filter it with Perl or Awk, but that could be tricky. In 
fact, the best my searching could find is to do:

 pg_dump -s databasename | awk 'RS="";/CREATE TABLE[^;]*;/'

which fails in some interesting ways.
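
For what it's worth, a rough Perl filter along those lines might look like the
sketch below. It reads the pg_dump -s output one statement at a time and keeps
only CREATE TABLE / SEQUENCE / TYPE and ALTER TABLE / SEQUENCE statements; it's
only a sketch with obvious edge cases (dollar-quoted function bodies, for one),
not a replacement for a proper tool:

#!/usr/bin/perl
# usage: pg_dump -s databasename | perl keep_tables.pl > tables.sql
use strict;
use warnings;

$/ = ";\n";    # treat each terminated statement as one "record"
while (my $stmt = <STDIN>) {
    print $stmt if $stmt =~ /^\s*CREATE\s+(TABLE|SEQUENCE|TYPE)\b/m
                || $stmt =~ /^\s*ALTER\s+(TABLE|SEQUENCE)\b/m;
}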

I could dump the schema (with functions), load it into another DB,
programmatically drop all the functions, and then dump that with pg_dump, but
again, why should I have to?

Is there any Pg tool that gives me just the table defs or am I going to have to 
write my own?

Thanks,
Kevin


RE: dumping only table definitions

2018-06-29 Thread Kevin Brannen
On Fri, Jun 29, 2018 at 6:30 PM, Kevin Brannen <kbran...@efji.com> wrote:
I’m trying to figure out how to dump only the table definitions, well those and 
things they need directly, like sequences & types. What I do NOT want are all 
the millions (not literally but it feels like it ☺) of functions we have. 
Triggers would be all right if I must, as we only have a few of those.

Melvin wrote:

> > What I do NOT want are all the millions (not literally but it feels like it 
> > ☺) of functions we have.

> It sounds like you would best be served by installing pg_extractor. In 
> essence, it is a super flexible version of pg_dump. You have your choice of 
> exactly what you want to dump.

> https://github.com/omniti-labs/pg_extractor
Thanks Melvin, that looks like it would do what I want. If it doesn’t, I’ll try 
Alvaro’s and Tom’s suggestions.

Kevin


RE: Improving pg_dump performance

2018-07-23 Thread Kevin Brannen
-Original Message-
From: Adrian Klaver [mailto:adrian.kla...@aklaver.com]
Sent: Monday, July 23, 2018 8:56 AM
To: Ron 
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Improving pg_dump performance

On 07/23/2018 06:47 AM, Ron wrote:
> On 07/23/2018 08:46 AM, Stephen Frost wrote:
>> Greetings,
>>
>> * Ron (ronljohnso...@gmail.com) wrote:
>>> An interesting idea.  To clarify: it's possible to parallel backup a
>>> running
>>> 8.4 cluster remotely from a 9.6 system?
>> Yes, you can do a parallel backup, but you won't be able to get a
>> consistent snapshot.  You'll need to pause all changes to the
>> database while the pg_dump processes connect and start their
>> transactions to have the backup be consistent.
>
> I can do that!!!

Assuming you can get this setup, have you tested some subset of your data on 
9.6.9?:

---

+1 on that!  Case in point...

When we upgraded from 9.5 to 9.6 (only 1 version, so it doesn't sound all that
bad, does it?), our application failed in 2 different places, which we traced
down to SQL failing. Both instances were something along the lines of:

select fields from table1 join table2 on (key) where conditionA and conditionB;

What happened was that in 9.5, the planner reordered the WHERE and did
conditionB first, which always failed (at least when it mattered). In 9.6 the
planner did conditionA first. The problem came from conditionA needing a type
conversion that didn't automatically exist, hence the failure. A simple casting 
fixed the issue and we really should have had that in the original version, but 
the data we tested with never had the characteristics that would have triggered 
the problem (in our defense, the data that caused the failure had never shown 
itself in over 3 years of real usage, so I think I can call that rare).

The mistakes were ours, but the new version "tightened" some things and they 
caught us. The fixes were quite simple to make, but it was a real surprise to 
us.

So be aware that while Pg has been very good about being backward compatible, 
or it has been for us, you can get bit by upgrades. Reading the release notes
looking for changes is good, but in the end, running your code against the new
version is the only way to find out.

HTH,
Kevin



RE: pg_dump backup utility is taking more time around 24hrs to take the backup of 28GB

2018-10-19 Thread Kevin Brannen
From: Raghavendra Rao J S V [mailto:raghavendra...@gmail.com]
Sent: Thursday, October 18, 2018 7:26 AM
To: pgsql-general@lists.postgresql.org
Subject: pg_dump backup utility is taking more time around 24hrs to take the 
backup of 28GB

Hi All,

We are using the pg_dump backup utility in order to take the backup of the
database. Unfortunately, it is taking around 24 hrs to take the backup of a
28 GB database. Please guide me on how to reduce the time, and whether there is
any parameter that needs to be modified to help reduce the backup time. We are
using Postgres 9.2.

Note: Kindly suggest options using pg_dump only.

--
Regards,
Raghavendra Rao
===

I agree with Ozz: be sure you're dumping to a different drive than the one your
DB sits on. Investigate the "-Fd" (directory format) option, and the "-j"
(parallel jobs) option can be useful too. I'll also point out that hardware
makes a big difference. :)

We can dump a 180 GB DB in about 15 min. Note, that is to an SSD RAID array and
we do "-Fd -j $num_cpus_minus_2", and there are 32 CPUs on the server (so -j 30).
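
For reference, that kind of invocation wrapped in a bit of Perl might look like
this sketch (the paths and DB name are placeholders, not a real setup):

#!/usr/bin/perl
# sketch: directory-format dump using most of the CPUs
use strict;
use warnings;

chomp(my $ncpu = `nproc`);
my $jobs = $ncpu > 2 ? $ncpu - 2 : 1;
my @cmd  = ('pg_dump', '--format=directory', "--jobs=$jobs",
            '--file=/backup/mydb.dump', '--dbname=mydb');
system(@cmd) == 0 or die "pg_dump failed: exit " . ($? >> 8) . "\n";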

HTH,
Kevin



SELECT INTO question

2019-07-25 Thread Kevin Brannen
Hi,

We're trying to understand what happened with a SELECT INTO. The problem can be 
seen with this example:

# create table t1 (id int, v int);
CREATE TABLE

# insert into t1 (select x, x from generate_series(1, 5) as g(x));
INSERT 0 5

# select * from t1;
 id | v
----+---
  1 |  1
  2 |  2
  3 |  3
  4 |  4
  5 |  5
(5 rows)

nms=# select into t2 from t1;
SELECT 5

# select * from t2;
--
(5 rows)

# select * into t3 from t1;
SELECT 5

# select * from t3;
 id | v
----+---
  1 |  1
  2 |  2
  3 |  3
  4 |  4
  5 |  5
(5 rows)

As you can see on the first select into, the result in t2 is ... missing, no 
"data" at all, unlike t3 which was the expected answer. Upon closer inspection, 
it was realized that the "expression" in the statement was left out (oops!), 
but instead of getting a syntax error, it worked.

So why did it work and why was nothing stored?

The only answer I've been able to come up with is that the expression was 
evaluated as a "null expression" for each row, so it gave us 5 null rows. A 
small part of my brain understands that, but most of my brain goes "what?!"

I've noticed that I can also do:

# select from t1;
--
(5 rows)

That also doesn't make sense and yet it does in a weird way. I suspect the 
answer revolves around some corner case in the SQL Standard.

So, what's going on here?

Thanks,
Kevin


RE: SELECT INTO question

2019-07-25 Thread Kevin Brannen
-Original Message-
From: Tom Lane 
Sent: Thursday, July 25, 2019 2:47 PM
To: Adrian Klaver 
Cc: Kevin Brannen ; pgsql-general@lists.postgresql.org

Subject: Re: SELECT INTO question



Adrian Klaver <adrian.kla...@aklaver.com> writes:
> On 7/25/19 12:23 PM, Kevin Brannen wrote:
>> We're trying to understand what happened with a SELECT INTO. The
>> problem can be seen with this example:
>>
>> nms=# select into t2 from t1;
>> SELECT 5
>> # select * from t2;
>> --
>> (5 rows)

> The list of output expressions after SELECT can be empty, producing a
> zero-column result table. This is not valid syntax according to the
> SQL standard. PostgreSQL allows it to be consistent with allowing
> zero-column tables. However, an empty list is not allowed when
> DISTINCT is used.

Right, you selected no columns from t1, so t2 has no columns (and yet five
rows).  Worth noting here is that psql is a bit squirrely about displaying
zero-column results --- it drops the column-names header line, and it doesn't
emit a blank-line-per-row as one might expect.

Perhaps somebody ought to fix that, but it's such a corner case that no one has
bothered yet.

Hmm, I don't particularly like that answer as I'd have preferred a "syntax 
error", but I do understand it.

Thanks for the answer, Adrian; and thanks for the expansion, Tom.

Kevin





pg_restore issues with intarray

2019-09-05 Thread Kevin Brannen
I think I need some help to understand what's going on here because I can't figure 
it out and google isn’t helping…

This is for Pg 9.6.5. (Yes, we are upgrading to 12.x in a few months, or so the 
plan is.) Pg code came from the community and we compiled it with no changes. 
This is on Centos 6.7, though I think the OS doesn’t matter.

I’ve found that when we do a pg_restore, that sometimes we get “errors”. I 
quote that because it turns out they’re really only warnings we can ignore, but 
when you check the return code ($?) after pg_restore, you get a non-zero value.

We’re calling pg_restore like:

$PGPATH/pg_restore -jobs=$NCPU --dbname=x .

FWIW, the backup was created with:

$PGPATH/pg_dump --clean --create --format=d --jobs=$NCPU --file=$EXP --dbname=x

The issue I’m seeing is:

pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 3605; 2753 18784 OPERATOR 
FAMILY gin__int_ops postgres
pg_restore: [archiver (db)] could not execute query: ERROR:  operator family 
"gin__int_ops" for access method "gin" already exists
Command was: CREATE OPERATOR FAMILY gin__int_ops USING gin;

pg_restore: [archiver (db)] Error from TOC entry 3606; 2753 18806 OPERATOR 
FAMILY gist__int_ops postgres
pg_restore: [archiver (db)] could not execute query: ERROR:  operator family 
"gist__int_ops" for access method "gist" already exists
Command was: CREATE OPERATOR FAMILY gist__int_ops USING gist;

pg_restore: [archiver (db)] Error from TOC entry 3607; 2753 18829 OPERATOR 
FAMILY gist__intbig_ops postgres
pg_restore: [archiver (db)] could not execute query: ERROR:  operator family 
"gist__intbig_ops" for access method "gist" already exists
Command was: CREATE OPERATOR FAMILY gist__intbig_ops USING gist;

Those operators come from the intarray extension. Looking into the toc.dat, I 
can see entries like:

^@3079^@^E^@^@^@16441^@^H^@^@^@intarray^@   
^@^@^@EXTENSION^@^B^@^@^@^@<^@^@^@CREATE EXTENSION IF NOT EXISTS intarray WITH 
SCHEMA common;

^A^A^@^@^@^@^F^@^@^@public^A^A^@^@^@^@^G^@^@^@nmsroot^@^E^@^@^@false^@^B^@^@^@15^@^A^@^@^@3^A^A^@^@^@^@^@^@^@^@^@^U^N^@^@^@^@^@^@^@^@^D^@^@^@2753^@^E^@^@^@18784^@^L^@^@^@gin__int_ops^@^O^@^@^@OPERATOR
 FAMILY^@^B^@^@^@^@/^@^@^@CREATE OPERATOR FAMILY gin__int_ops USING gin;
…same sort of thing for the other 2…

Those seem reasonable to me. It seems the problem is with the CREATE OPERATOR
FAMILY, as there is no "IF NOT EXISTS" for it. Considering we created a new DB
for the restore and went with --create to pg_restore, there should be nothing in
the DB to create a conflict, yet somehow it already exists!

Interestingly, this doesn't happen all the time. It seems that once we can get
a restore in, it never happens again in any subsequent restores. My theory
for this is that the databases (or really installs) where this happens started
out as an earlier version that was pg_upgrade'd (probably from 9.5 or even from
9.3). But I can take the shell script that runs this, turn off checking for the 
return code from pg_restore and we’ve found no issues with the DB (hence I can 
treat them like warnings). Of course, the downside there is that if I always 
ignore the return code from pg_restore, how do I catch real errors? 😊

If it matters, here’s the version we’re dealing with for intarray:

# \dx
                                 List of installed extensions
   Name   | Version | Schema |                            Description
----------+---------+--------+---------------------------------------------------------------------
 intarray | 1.2     | common | functions, operators, and index support for 1-D arrays of integers

If there is a logical reason for this, I’d appreciate an explanation so I can 
deal with it properly. If more research is needed, I can do that and pointers 
as to what to look for would be great. I have a VM and a backup I can restore & 
test as many times as needed.

Last minute discovery and thought. It just occurred to me to look in template1 
and intarray is in there. Is it possible that gives me a pre-existing extension 
& operators and therefore the source of the conflict? If so, is the solution as 
simple as deleting that extension from template1? In fact, should I delete all 
the extensions from template1 that it shows as being in the public schema?

Thanks!
Kevin



RE: pg_restore issues with intarray

2019-09-05 Thread Kevin Brannen
> From: Adrian Klaver 
>
> On 9/5/19 2:57 PM, Kevin Brannen wrote:
> > I think I need some help to understand what’s going here because I
> > can’t figure it out and google isn’t helping.
> >
> > This is for Pg 9.6.5. (Yes, we are upgrading to 12.x in a few months,
> > or so the plan is.) Pg code came from the community and we compiled it
> > with no changes. This is on Centos 6.7, though I think the OS doesn’t 
> > matter.
> >
> > We’re calling pg_restore like:
> >
> > $PGPATH/pg_restore -jobs=$NCPU --dbname=x .
> >
> > FWIW, the backup was created with:
> >
> > $PGPATH/pg_dump --clean --create --format=d --jobs=$NCPU --file=$EXP
> > --dbname=x
>
> The options you are adding for --clean, --create only have meaning for plain 
> text dumps. If you want those actions to occur on the restore then add them 
> to the pg_restore line. Though if you are going to create a new database it 
> will inherit objects from template1(as you found below), assuming you have 
> not set WITH TEMPLATE to something else.
>

Good point that I'm not doing plain text dumps.

Are you saying that my problem is that I need "--clean" on the pg_restore?
I can try that. The fact that this only happens on a few DBs and not all still
mystifies me. See below on the template...

> >
> > The issue I’m seeing is:
> >
> > pg_restore: [archiver (db)] Error while PROCESSING TOC:
> >
> > pg_restore: [archiver (db)] Error from TOC entry 3605; 2753 18784
> > OPERATOR FAMILY gin__int_ops postgres
> >
> > pg_restore: [archiver (db)] could not execute query: ERROR:  operator
> > family "gin__int_ops" for access method "gin" already exists
> >
> >  Command was: CREATE OPERATOR FAMILY gin__int_ops USING gin;
> >
...
> >
> > If there is a logical reason for this, I’d appreciate an explanation
> > so I can deal with it properly. If more research is needed, I can do
> > that and pointers as to what to look for would be great. I have a VM
> > and a backup I can restore & test as many times as needed.
> >
> > Last minute discovery and thought. It just occurred to me to look in
> > template1 and intarray is in there. Is it possible that gives me a
> > pre-existing extension & operators and therefore the source of the
> > conflict? If so, is the solution as simple as deleting that extension
> > from template1? In fact, should I delete all the extensions from
> > template1 that it shows as being in the public schema?

I've had time since I wrote the post to drop the intarray extension from
template1 and try the restore. Sadly, same errors happen, so it's not
template1 that's doing this to me, or at least not in a way I understand.

Kevin


RE: pg_restore issues with intarray

2019-09-05 Thread Kevin Brannen
> On 9/5/19 4:24 PM, Adrian Klaver wrote:
> > On 9/5/19 4:06 PM, Kevin Brannen wrote:
> >>> From: Adrian Klaver 
> >>>
> >>> On 9/5/19 2:57 PM, Kevin Brannen wrote:
> >>>> I think I need some help to understand what’s going here because I
> >>>> can’t figure it out and google isn’t helping.
> >>>>
> >>>> This is for Pg 9.6.5. (Yes, we are upgrading to 12.x in a few
> >>>> months, or so the plan is.) Pg code came from the community and we
> >>>> compiled it with no changes. This is on Centos 6.7, though I think
> >>>> the OS doesn’t matter.
> >>>>
> >>>> We’re calling pg_restore like:
> >>>>
> >>>> $PGPATH/pg_restore -jobs=$NCPU --dbname=x .
> >>>>
> >>>> FWIW, the backup was created with:
> >>>>
> >>>> $PGPATH/pg_dump --clean --create --format=d --jobs=$NCPU
> >>>> --file=$EXP --dbname=x
> >>>
> >>> The options you are adding for --clean, --create only have meaning
> >>> for plain text dumps. If you want those actions to occur on the
> >>> restore then add them to the pg_restore line. Though if you are
> >>> going to create a new database it will inherit objects from
> >>> template1(as you found below), assuming you have not set WITH
> >>> TEMPLATE to something else.
> >>>
> >>
> >> Good point that I'm not doing plain text dumps.
> >>
> >> Are you saying that my problem is that I need "--clean" on the
> >> pg_restore?

Not the issue, that made various things worse. :)

> No, just that if you were expecting the clean to happen on the
> restore you would be disappointed.

To be crystal clear, on restore I do this from a bash script:

# move old to the side in case we need this on failure
$PGPATH/psql -d template1 -c "DROP DATABASE IF EXISTS save$db;"
$PGPATH/psql -d template1 -c "ALTER DATABASE $db RENAME TO save$db;"
# restore
$PGPATH/createdb -e -O $dbowner -T template0 $db
$PGPATH/pg_restore $VERBOSE --jobs=$NCPU --dbname=$db .

So by using template0, I'm expecting nothing to be there and the restore
to put everything in there I need to get back to the point where the
backup/dump happened. This is why I'm surprised I'm getting this error.

It feels like the restore is adding the intarray extension, which does a
CREATE OPERATOR FAMILY on its own, then later the restore does CREATE OPERATOR
FAMILY on again causing the problem. Yet this doesn't happen on most of our
databases, just a few. It's maddening to me.

> >> I can try that. The fact that this only happens on a few DBs and not
> >> all still mystifies me. See below on the template..
>
> My guess is you where restoring into a database with preexisting
> objects because neither create or clean was being done.

Shouldn't be happening with that createdb command. Hmm, I wonder what
I'd see if I put a "psql" command with "\dx" after the createdb and before
the restore...

Nope, the only extension is plpgsql, so the problem is coming from the
restore. Maybe I gave a bad option to pg_dump, but pg_restore seems to be
the issue. It really makes me want to modify the toc.dat file and hack
out those CREATE OPERATOR FAMILY lines and see what happens.

K.


RE: SQL equivalent of #include directive?

2019-09-06 Thread Kevin Brannen
> From: stan 
>
> I thought this would be common. But a quick Google only revealed what look to 
> be workarounds.
>
> I am defining a bunch of functions, and I would prefer to store them in a 
> separate file, which then gets "source" by the main DB init file.
>
> Is there a standard way to do this?

Besides what the others have said, not if you stick to SQL. This is really
more of an application level question where you have something that
will do your init/load process.

If you're willing to use other tools, then there are solutions. IIRC, your
other thread had "#include" in the title. You could build a small file with:

#include "function1.sql"
#include "function2.sql"
...

Then run it thru "cpp" and use the output of that. Might need some
post-processing if you're unlucky; I'd have to try it to see if there are
extraneous lines.

Personally, I'd probably just have a shell script that goes into that dir
and does something like:

cat *.sql | psql -d DB # and other args

and call it done...translate if you're not on a Unix-like system.

I could write a Perl program to do it because Perl can do almost anything. ;)
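
For instance, a minimal sketch (the directory and database names are made up):

#!/usr/bin/perl
# sketch: feed every *.sql file in a directory to one psql session, in order
use strict;
use warnings;

my $dir = shift || 'sql';                  # made-up default directory
open my $psql, '|-', 'psql', '-d', 'mydb'  # made-up database name
    or die "cannot start psql: $!";

for my $file (sort glob("$dir/*.sql")) {
    open my $in, '<', $file or die "$file: $!";
    print {$psql} "\\echo loading $file\n";
    print {$psql} $_ while <$in>;
    close $in;
}
close $psql or die "psql exited with status $?\n";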

Your imagination is the limit.

HTH,
Kevin




RE: pg_restore issues with intarray

2019-09-06 Thread Kevin Brannen
>From: Jerry Sievers 
>
>>Kevin Brannen  writes:
>>
>> It feels like the restore is adding the intarray extension, which does
>> a CREATE OPERATOR FAMILY on its own, then later the restore does
>> CREATE OPERATOR FAMILY on again causing the problem. Yet this doesn't
>> happen on most of our databases, just a few. It's maddening to me.
>>
>
>I've seen this sort of problem before.
>
>It was due to some legacy DBs where I work having a few missing extension 
>membership registrations.
>
>pg_dump wants to include any such things in the output  which may run afoul  
>of same having been already created by the extension load.

Ugh! That would explain why it's only on some installs and only on the
"older" ones. Thanks for the info!

Did you ever find any work around? Or is it just a matter of recognizing
what's going on, ignore those "warnings", and moving to a later version to
get past this so it doesn't happen again?

Kevin


RE: pg_restore issues with intarray

2019-09-06 Thread Kevin Brannen
>From: Adrian Klaver 
>On 9/5/19 5:05 PM, Kevin Brannen wrote:
>>
>> It feels like the restore is adding the intarray extension, which does
>> a CREATE OPERATOR FAMILY on its own, then later the restore does
>> CREATE OPERATOR FAMILY on again causing the problem. Yet this doesn't
>> happen on most of our databases, just a few. It's maddening to me.
>>
>What does \dx show in the database you taking the dump from?

Sadly, I don't have access to that system.

>What if you do a restore to a file only the schema e.g.:
>
>pg_restore -s -f some_file.sql
>
>This will create a plain text version of only the schema objects in 
>some_file.sql instead of restoring to the database. It might help shed some 
>light.

No CREATE EXTENSION or CREATE OPERATOR FAMILY statements.

Jerry's post indicates this is something that just happens with some older
versions and it seems I got unlucky. I do have a work around (ignore) but
I'd rather be proactive in knowing I'm ignoring something I should be and
not ignoring meaningful errors.

Thanks for the help Adrian, I really appreciate it!


RE: pg_restore issues with intarray

2019-09-06 Thread Kevin Brannen
>From: Jerry Sievers 
>
>
>Try running \dx+ for intarray on one of your deviant systems.  You may find 
>the item pg_dump is trying to be explicit about *missing* from the extension 
>member list.
>
>In such a case, see the ALTER EXTENSION ADD... which can be run manually to 
>register whatever is missing.
>
>You will do this on the system that is *origin* for the pg_dump.
>
>This is what we've done.

Hmm, I was really hoping for something on the restore side instead of the
dump side as that makes it harder, but we don't always get what we want. :)

Still, at least we have something to pursue and this is now documented
on the mailing list.
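
For the archives, Jerry's fix boils down to something like this sketch, run
against the *source* database before the pg_dump (the operator family names
here are just the ones from the errors earlier in the thread; check \dx+ first
to confirm what is actually missing from the extension on a given install):

#!/usr/bin/perl
# sketch: re-register operator families as members of the intarray extension
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:Pg:dbname=x', 'postgres', '',   # placeholder DSN
                       { RaiseError => 1 });
for my $fam (['gin__int_ops',     'gin' ],
             ['gist__int_ops',    'gist'],
             ['gist__intbig_ops', 'gist']) {
    my ($name, $am) = @$fam;
    $dbh->do("ALTER EXTENSION intarray ADD OPERATOR FAMILY $name USING $am");
}
$dbh->disconnect;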

Thanks!
Kevin




can't install pg 12 beta on centos 6

2019-09-23 Thread Kevin Brannen
I thought I’d get a jump on testing this since we’re a little slow sometimes. 😊

I’ve spun up a new VM with Centos 6.10 (the latest). I found 
https://yum.postgresql.org/testing/12/redhat/rhel-6-x86_64/repoview/ to get the 
packages (BTW, that was kind of hard to find). I ended up with:

postgresql12-12beta4-1PGDG.rhel6.x86_64.rpm
postgresql12-contrib-12beta4-1PGDG.rhel6.x86_64.rpm
postgresql12-libs-12beta4-1PGDG.rhel6.x86_64.rpm
postgresql12-plperl-12beta4-1PGDG.rhel6.x86_64.rpm
postgresql12-server-12beta4-1PGDG.rhel6.x86_64.rpm

Then I did an “rpm -ivh *” and found I was missing a few things. I downloaded 
and installed most of what was missing, but I’ve got 2 dependencies that I 
can’t resolve.

error: Failed dependencies:
python2-libs is needed by postgresql12-12beta4-1PGDG.rhel6.x86_64
openssl-libs >= 1.0.2k is needed by 
postgresql12-libs-12beta4-1PGDG.rhel6.x86_64

Openssl is at “1.0.1e” from CentOS. So where does this “1.0.2k” come from?

Centos 6.10 has nothing called python2-libs, though I do have:

$ rpm -qa | grep python | grep libs
python-libs-2.6.6-66.el6_8.x86_64

It *feels* like the packager of the PG rpms took a dependency list from RH 7.x 
and gave it to RH 6.x.

Since this is only a test on my part, I wanted to use the community packages 
instead of building from source. Am I going to have to build from source at 
this point in the game until we get to at least the RC’s?

Thanks,
Kevin


RE: citext, actually probably using extensions

2019-09-23 Thread Kevin Brannen
> From: Morris de Oryx 
>
> Not sure about best practices, but what I'm going is like this:
>
> * Create a schema named extensions.
> * Install extensions in this special schema only. I don't put anything else 
> in there.
> * Put the extensions schema early (left) in the search_path for each role.
> * Grant execute access permissively on the functions in that schema.
>
> If there's something deeply flawed about this strategy, I'd be keen to hear 
> about it. On the positive side, I find it simple to understand, maintain, and 
> explain to other people. YMMV
>

We do approximately the same thing if it makes you feel better, although we 
call our schema "common".


RE: can't install pg 12 beta on centos 6

2019-09-23 Thread Kevin Brannen
> Adrian Klaver wrote:
> On 9/23/19 12:04 PM, Kevin Brannen wrote:
> > I thought I’d get a jump on testing this since we’re a little slow
> > sometimes. ??
> >
> > I’ve spun up a new VM with Centos 6.10 (the latest). I found
> > https://yum.postgresql.org/testing/12/redhat/rhel-6-x86_64/repoview/
> > to get the packages (BTW, that was kind of hard to find). I ended up with:
>
> You might want to take a look at:
>
> https://people.planetpostgresql.org/devrim/index.php?/archives/101-Installing-PostgreSQL-12-betaRC-on-RHELCentOSFedora.html
>
> and change to RH/CentOS 6.


Thanks for the idea; however, that didn't work.


$ sudo yum -y install 
https://dl.fedoraproject.org/pub/epel/epel-release-latest-6.noarch.rpm
Loaded plugins: fastestmirror, security
Setting up Install Process
epel-release-latest-6.noarch.rpm
|  14 kB 00:00
Examining /var/tmp/yum-root-aZqbqq/epel-release-latest-6.noarch.rpm: 
epel-release-6-8.noarch
/var/tmp/yum-root-aZqbqq/epel-release-latest-6.noarch.rpm: does not update 
installed package.
Error: Nothing to do


It looks like they're not ready for Centos 6 yet and I'm not ready for 8.0 -- 
well, it's not even released until tomorrow and I'm sure that's a whole 'nother 
battle. I'll just grab the source and give that a try.

Kevin


RE: can't install pg 12 beta on centos 6

2019-09-24 Thread Kevin Brannen
>From: Adrian Klaver 
>On 9/23/19 2:00 PM, Kevin Brannen wrote:
>>> Adrian Klaver wrote:
>>> On 9/23/19 12:04 PM, Kevin Brannen wrote:
>>>>
>>>> I’ve spun up a new VM with Centos 6.10 (the latest). I found
>>>> https://yum.postgresql.org/testing/12/redhat/rhel-6-x86_64/repoview/
>>>> to get the packages (BTW, that was kind of hard to find). I ended up with:
>>>
>...
>Well the below is just one part of the above.
>
>Did you:
>
>/etc/yum.repos.d/pgdg-redhat-all.repo
>
>[pgdg12-updates-testing]
>
>enabled=1
>
>and then:
>
>yum -y install postgresql12-server postgresql12-contrib
>
> From your original post I gathered that you went here:
>
>https://yum.postgresql.org/testing/12/redhat/rhel-6-x86_64/repoview/
>
>and grabbed individual packages. Is that correct?
>

Hmm, no, I didn't add the pgdg-redhat-all.repo.

Yes, I did grab the RPMs directly as I thought that should have been good enough.
And really, it is as far as the Pg code goes. The issue is the non-Pg
dependencies not found in the standard CentOS packages.

For the moment, I do have the source and I've compiled it successfully.
That will hold me for now as I change our environment and install scripts
so I can test v12 against our code.

I'll revisit this again when the RC's come out and I'll try your suggestions 
then.

Thanks for your help, Adrian!


RE: date function bug

2019-10-23 Thread Kevin Brannen
From: Ravi Krishna 

> postgres=# select to_date('2018150X','MMDD');
>   to_date
> ------------
>  2019-03-03
>
> postgres=# select to_date('20181501','MMDD');
>   to_date
> ------------
>  2019-03-03

is this a cut-n-paste mistake?


Surprisingly (to me), no….
db=# select to_date('20181501','MMDD');
  to_date
------------
 2019-03-03
(1 row)

Time: 0.497 ms
nms=# select version();
                                                 version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.6.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit
(1 row)

Time: 0.247 ms
db=#





RE: SQL pretty printer?

2019-10-28 Thread Kevin Brannen
>From: Adrian Klaver 
>On 10/27/19 3:21 PM, stan wrote:
>> I have a presentation later in the week, and I plan on printing out
>> some files containing SQL commands. I have used some "pretty printers"
>> in the past for things like Perl scripts. What I am thinking of is
>> something that bolds keywords, handles page breaks, and does some formatting.
>
>http://sqlformat.darold.net/
>
>https://sourceforge.net/projects/pgformatter/

If I had that task, I'd use a combo of tools. "pgFormatter" that Adrian shows
above is what you use to reformat to make it aligned nicely, plus it does some
colorization. The first link is the online tool; the 2nd link is if you want
to download the code to your own server.

Example:
    select count(*) from tablea where x is null;
Result:
    SELECT
        count(*)
    FROM
        tablea
    WHERE
        x IS NULL;

You can't see it here, but all the keywords are bold and colored, and "count" is
colored but not bolded.

If that colorization isn't good enough for you, then as others have said,
paste the result into "vim" to get the colorization, pasting that into
PowerPoint/Word or something that honors the colors...or just take a screen
shot of that...or whatever works for you.

HTH,
Kevin


RE: Upgrade procedure

2019-10-30 Thread Kevin Brannen
>From: rihad 

>Hi, all. Why is it normally suggested to stop the server, upgrade it, then 
>start it? Wouldn't it be easier & quicker to simply upgrade the package 
>in-place and restart the service? On OSen that allow modification of currently 
>running binaries, which is most Unix OS, M$ Windows being a notable exception )



That might be possible on a minor upgrade, but quite probably not on a major 
version upgrade. I'm reasonably sure I've read that a major upgrade *can* 
change underlying data/structures for tables and other things. I don't think 
you want version-X writing to the tables on disk while version-Y writes a new 
layout to the same files at the same time. 😊

As always, see the fine manual, especially the part on upgrades and release 
notes.

Kevin



RE: Upgrade procedure

2019-11-05 Thread Kevin Brannen
>> >From: rihad 
>>
>>> Hi, all. Why is it normally suggested to stop the server, upgrade it,
>>> then start it? Wouldn't it be easier & quicker to simply upgrade the
>>> package in-place and restart the service? On OSen that allow
>>> modification of currently running binaries, which is most Unix OS, M$
>>> Windows being a notable exception )
>>>
>> That might be possible on a minor upgrade, but quite probably not on a
>> major version upgrade. I'm reasonably sure I've read that a major
>> upgrade *can* change underlying data/structures for tables and other
>> things. I don't think you want version-X writing to the tables on disk
>> while version-Y writes a new layout to the same files at the same
>> time. ??
>>
>>
>
>Why would that matter if the server gets restarted after replacing the 
>binaries? Aren't previous version's binaries "hard-wired" into memory while 
>they are running? AFAIK on FreeBSD at least no attempt is made to stop the 
>corresponding server or restart it when a package is upgraded by pkg(8).

We may be talking past each other here a bit...

After you do an upgrade, of course you have to restart the *PG* server or
you won't be using the new code, will you? :)

The manual or others here are more knowledgeable than I, but I believe that
for a "minor" upgrade, you can just swap out the code and restart PG. For
major upgrades, the PG server is going to have to come down as the underlying
files might be changed/transformed during the upgrade, then you start the
PG server when that's done. Check out the -k option as it can significantly
speed up pg_upgrade. You might find it safer to do a "pg_upgrade -c" before
the real upgrade; something to look at. As always on things like this, test
on a non-production machine first.
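
To make that concrete, here is a sketch of that sequence: run the pre-flight
check, then the real upgrade with --link (-k). All paths are placeholders;
adjust to your layout, and run it as the postgres user with both clusters shut
down and the new one already initdb'ed.

#!/usr/bin/perl
# sketch: pg_upgrade pre-flight check, then the real thing with hard links
use strict;
use warnings;

my @pg_upgrade = ('/usr/pgsql-12/bin/pg_upgrade',
                  '-b', '/usr/pgsql-9.6/bin',      '-B', '/usr/pgsql-12/bin',
                  '-d', '/var/lib/pgsql/9.6/data', '-D', '/var/lib/pgsql/12/data');

system(@pg_upgrade, '--check') == 0 or die "pg_upgrade --check failed\n";
system(@pg_upgrade, '--link')  == 0 or die "pg_upgrade failed\n";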

For us, we always use pg_upgrade even for minor updates because it feels
safer to me. That being said, we rarely do minor updates and just do majors
because upgrading is just hard enough (lots of testing!) we tend to wait and
then jump further. Upgrading is known to take a maintenance window; we just
plan things that way. Your organization may have different needs.

HTH,
Kevin


RE: Upgrade procedure

2019-11-05 Thread Kevin Brannen
>> For us, we always use pg_upgrade even for minor updates because it
>> feels safer to me. That being said, we rarely do minor updates and
>> just do majors because upgrading is just hard enough (lots of
>> testing!) we tend to wait and then jump further. Upgrading is known to
>> take a maintenance window; we just plan things that way. Your organization 
>> may have different needs.
>
>Yeah, but that way you're almost guaranteed to run an unsupported & vulnerable 
>release for quite some time, until the next major one is ready )

If we ran into a true bug that affected us, we'd upgrade sooner. Thankfully, the
PG team is great about putting out quality software with a low bug rate. Running
a few minor versions back from the current is OK for us; we're always on a
supported major version (so we could upgrade to the current minor version if
really required). This is an organizational decision based on where it's better
to spend time and effort.

OTOH, the better partitioning of v12 is a feature that will get us to upgrade
sooner. :)

As for security, we run in a very protected environment. If we are compromised,
it'll be by an inside person and there's really no tech defense against that.

Upgrading the Pg software isn't all that hard, we even have it automated.
The upgrade process for our application is what's so hard -- again, lots of
testing/man-hours required.

HTH,
Kevin


RE: ERROR: COPY escape must be a single one-byte character (multi-delimiter appears to work on Postgres 9.0 but does not on Postgres 9.2)

2019-11-14 Thread Kevin Brannen
>Adrian Klaver wrote:
>On 11/13/19 4:40 PM, Brandon Ragland wrote:
>> Hello,
>>
>> I have a Talend enterprise job that loads data into a PostgreSQL
>> database via the COPY command. When migrating to a new server this
>> command fails with the following error message:
>> org.postgresql.util.PSQLException:ERROR: COPY escape must be a single
>> one-byte character
>
>Does the Postgres log offer anymore information?
>
>Can you crank up the error level to get more info?
>
>Are the encodings for the Postgres server/OS different ?


What is the COPY command you're using?

To extend Adrian's good questions, have you really examined the line in
question carefully with a tool that will show you the true characters being
used? You didn't show the COPY command, but to me, this sounds like you
have a "fancy" character happening. The place I most often see this is
with double quotes:

U+0022  normal ASCII " (1 byte)
U+201C  left double quote (multi-byte in UTF-8!)

Depending on the font used, those 2 can look very much alike and
there are far more characters that can be mixed up like this. The
insidious ones for me are the whitespace chars, e.g. 0xa0. I find this
happens frequently when the source of a copy-paste is from another
program, e.g. MS-Word; our ticket system did this for awhile too. :(

HTH,
Kevin


RE: Add column with default value in big table - splitting of updates can help?

2020-02-04 Thread Kevin Brannen
>From: Durumdara 
>
>a.)
>PG version is mainly 9.6, but some of the servers are 10.x or 11.x.
>b.)
>We have a semi-automatic tool which gathers the new modifications for the
>databases and executes them all at once, per database.
>So one SQL script per database, under one transaction: all or nothing.
>If something fails, we know where to start again by hand. It is impossible
>to execute only the first half, and we don't know which statements executed or not.

Unless you have some special requirement, you don't have to do it all or
nothing. As Despez points out, you can do it in multiple transactions just
fine. We do it that way here all the time. :)

What it means is that you have to write guards or checks in your upgrade script.
In some instances, it's very easy because some statements have IF NOT EXISTS to
help you. For those places where running the same statement twice would cause an
error, put a check around it. A plpgsql DO block allows for conditions, so you
only do the work when the check says it hasn't been done yet (the
information_schema and pg_catalog tables are your friends for this).

The point is that you can run your upgrade script as many times as needed, 
should
something happen and it stops. Rerunning the upgrade script should never
cause an error if you've coded it correctly.
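
As an illustration, here's that kind of guard written as a Perl/DBI sketch
rather than a DO block (the connection details, table, and column names are all
invented):

#!/usr/bin/perl
# sketch: a guarded ALTER that is harmless to run twice (names are invented)
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:Pg:dbname=customer_db', 'app', 'secret',
                       { RaiseError => 1, AutoCommit => 1 });

my ($have_col) = $dbh->selectrow_array(q{
    SELECT 1
      FROM information_schema.columns
     WHERE table_schema = 'public'
       AND table_name   = 'big_table'
       AND column_name  = 'new_flag'
});

# add the column without a default so the table is not rewritten (9.6 behavior);
# the backfill and SET DEFAULT come later
$dbh->do('ALTER TABLE public.big_table ADD COLUMN new_flag boolean')
    unless $have_col;
$dbh->disconnect;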

>
>The main problem is that sometimes we have to modify some tables which have too
>many records in some customer databases.

I'm going to go with everyone else here because it works. We tend to do updates
in blocks of 10K records at a time. Do some tests and figure out what works best
for your setup (maybe you have enough memory to do 100K chunks). Whatever you
do, make sure that the column you use to divide the work has an index on it! Use
the primary key if it's an INT. If you have no index, find a column you can 
create
an index on for this work then drop it at the end; that will be far faster than 
having
to do a bunch of table scans.
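
Continuing the sketch above, the chunked backfill itself looks something like
this (same invented names; tune the chunk size per your tests; with AutoCommit
on, each chunk is its own transaction):

#!/usr/bin/perl
# sketch: backfill the new column 10,000 ids at a time (names are invented)
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:Pg:dbname=customer_db', 'app', 'secret',
                       { RaiseError => 1, AutoCommit => 1 });

my ($min_id, $max_id) =
    $dbh->selectrow_array('SELECT min(id), max(id) FROM public.big_table');

my $chunk = 10_000;
for (my $lo = $min_id; defined $min_id && $lo <= $max_id; $lo += $chunk) {
    $dbh->do(q{UPDATE public.big_table
                  SET new_flag = false
                WHERE id >= ? AND id < ?
                  AND new_flag IS NULL},
             undef, $lo, $lo + $chunk);
}

# once the backfill is done, future rows can get the default
$dbh->do('ALTER TABLE public.big_table ALTER COLUMN new_flag SET DEFAULT false');
$dbh->disconnect;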

Yes, it's probably a very good idea to upgrade to a newer version if you can
as performance improvements come with each new version.

HTH,
Kevin


trouble making PG use my Perl

2020-02-27 Thread Kevin Brannen
We're trying to upgrade our Pg 9.6 installs up to 12.2. In a break from 
tradition where we grab source and compile our own, I've downloaded the 
community RPMs for Centos 6 and installed them (they default into /usr/pgsql-12 
it seems).

I can make Pg come up, initdb, that sort of stuff just fine. But we also use 
the Perl extension and we have references to Perl modules that are in *our* 
Perl and not the system one. Yes, we compile our own Perl like we provide our 
own Pg because Centos uses much older versions.

The issue is that I've not been able to make Pg use our Perl (in /opt/perl) 
instead of the system one (in /usr). I've tried changing the env-vars in 
multiple places, the most obvious being the /etc/init.d script, but none of 
that helps. When we compiled our own Pg, I could set this with "export 
PERL=/opt/perl/bin/perl" before the "configure --with-perl" command. Setting 
PERL in the init.d file doesn't help either.

Sadly, I can't find anything about this in the docs. Searching online doesn't 
turn up anything either as the search terms are too generic (or I can't find a 
way to make them specific enough to help). Haven't found any options for 
postgresql.conf either.

So how does one set the Perl binary/library at server start time?
Am I going to have to compile my own Pg again to make this work?

Thanks,
Kevin


RE: trouble making PG use my Perl

2020-02-27 Thread Kevin Brannen
 From: Tom Lane 
>Kevin Brannen  writes:
>> The issue is that I've not been able to make Pg use our Perl (in
>> /opt/perl) instead of the system one (in /usr).
>
>plperl.so will typically have a more or less hard-coded path to libperl.so, eg
>
>$ ldd ...installdir.../lib/plperl.so
>linux-vdso.so.1 =>  (0x7ffc613cf000)
>libperl.so => /usr/lib64/perl5/CORE/libperl.so (0x7fa315d48000)
>^^
>
>You might be able to override that with LD_LIBRARY_PATH, but it's a pain, and 
>it will certainly not work if your homebrew libperl isn't 100% ABI-compatible 
>with the system one.
>
>Personally I'd build plperl against the Perl you want to use it with.
>The rest of PG isn't dependent on Perl, so you could use the community install 
>for the rest of it if you like.
>
>regards, tom lane


Thanks Tom, I can see your point. With the right change to LD_LIBRARY_PATH,
I can make `ldd plperl.so` point to my Perl, but as you say, I'm probably
playing with fire to expect it all to be 100% compatible between Perl
5.10.1 (Centos 6 default) and 5.30.1 (mine).

It'd be nice if we could set Perl at server start time, but I can see how
it might not be possible.

I'll see about making this extension versus just recompiling the whole thing.
I don't recall building Pg to be all that hard, I was just hoping to avoid it.

-OR- I blast it from orbit and rewrite the 2 plperl functions so I don't need 
the
extension. Decisions ... decisions ...

Thanks!
Kevin




RE: trouble making PG use my Perl

2020-03-02 Thread Kevin Brannen
From: Steven Lembark 

>Funny thing is that both PG and Perl are easy enough to build from scratch and 
>the centos compile of Perl at least is both ancient and horrid enough (5.00503 
>compatibility, really?) that it's easier to just shell-script both builds and 
>run it overnight.

>Q: How un-optimized and ancient is the PG on centos?


I agree that it's not all that hard to compile my own Perl and Pg; I've done it 
in the past. That being said, I'd prefer to avoid it and now I can avoid 
compiling Pg.

On Centos 6.10, it ships with Perl 5.10.1, which is really ancient to me. 
Centos 8 ships with 5.14 (IIRC). Still pretty bad and it makes me like your 
conspiracy theory about Python folks ignoring it on purpose. 😊  They do compile 
with -O2 and MULTIPLICITY, so it's not too bad.

In the end, I found there were only 2 plperlu functions. Turns out 1 wasn't 
even used anymore (gotta love legacy code) and the other function was only 
called in 1 place, so it was moved into a module and adjusted (spi_* calls 
turned into DBI calls, etc). After that, there was no more reason for the 
plperlu extension so the problem no longer matters and I can load 1 less rpm.

I find it a shame we can't just swap Perl libraries, but I can understand why 
when I stop to really think about it.

Kevin


RE: trouble making PG use my Perl

2020-03-05 Thread Kevin Brannen
From: Alan Hodgson 

> On Mon, 2020-03-02 at 18:23 -0500, Tom Lane wrote:
>> Kevin Brannen <
>>> Centos 8 ships with 5.14 (IIRC).
>>
>> I don't have an actual Centos 8 machine handy to disprove that,
>> but the info I have says that RHEL8/Centos 8 branched off from
>> Fedora 28, and F28 most definitely shipped with Perl 5.26.
>> Looking at their git repo, the last few Fedora releases
>> shipped with ...
>
> I can confirm that CentOS 8 has perl 5.26.3.

{fires up the C8 VM…}

Yes, you're correct. My memory failed me there. ☹
I must have been thinking of Centos 7, which is 5.16.3 and feels old too -- 
though to be fair C7 came out quite some time ago.

Kevin



RE: Patterns to look for in the PostgreSQL server log

2020-03-11 Thread Kevin Brannen
>From: Mageshwaran Janarthanam 
>
>Hi Team...I am trying to setup some monitoring over the PostgreSQL server log. 
>I am not clear which error I should be most concerned about. Could you please 
>share your thoughts on what pattern I should search in the log file?

I'd treat it like any other log and apply negative filters; i.e. filter out
things that aren't a problem, which leaves things you do care about and
anything unexpected (which might cause you to add another filter to remove
it for next time if you find it's not really a problem).

Adrian is giving good advice too. Understand the log levels and set your config
appropriately for your situation and comfort level.


pg_restore restores out of order

2020-03-11 Thread Kevin Brannen
I'm working thru our system to upgrade from 9.6.5 to 12.2. One of the last 
things I'm verifying is backup/restore and it's failing for no reason I can 
figure out. So I'm looking for pointers on this.

If it matters, the code is from the version 12.2 from the Pg site, RPMs for 
Centos 6 (.10).

The backup is made like:

# $PGPATH/pg_dump --format=d --jobs=2 --file=$exp --dbname=nms --schema=public

As far as I can tell, it worked well as the result is about the right size, 
exit code was at least 0. I then go change a couple of things so I know if the 
restore works or not (i.e. they should disappear).

Restore looks something like:

# cd $exp
# echo "
ALTER SCHEMA public RENAME TO savepublic;
CREATE SCHEMA public AUTHORIZATION nmsroot;
" | $PGPATH/psql -d nms
# /usr/pgsql-12/bin/pg_restore --jobs=2 --dbname=nms --schema=public .

So we move the current schema to the side just in case something goes wrong and 
we need to move it back, create an empty schema for it, then restore into that 
schema. Then it goes bad...

pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 6356; 0 23653 TABLE DATA subscribers nmsroot
pg_restore: error: COPY failed for table "subscribers": ERROR:  relation 
"su_profiles" does not exist
LINE 1: SELECT srvc_data   FROM su_profiles WHERE su_profile...
^
QUERY:  SELECT srvc_data   FROM su_profiles WHERE su_profile_pk = 
p_profile_fk
CONTEXT:  PL/pgSQL function public.check_data_ip(integer,integer,character 
varying) line 7 at SQL statement
COPY subscribers, line 1: "61   1002\N  SU_1002 t   \N  4   
\N  1   250 0   2015-06-22 16:56:27.79333+00 nmsmgr  \N 
 \N  \N"

And it goes further downhill with missing FK's, etc in other tables later 
because of the missing data. The exit code is definitely not 0.

I've tried removing the "--jobs=2" so it'll go single threaded -- it still 
fails. I also did a new dump with the default plain text dump, and the restore 
with that also failed in the same way. Checking out the backup itself, I get:

# pg_restore -Fd . --list | egrep 'su_profile|subscriber' > k1

254; 1259 23653 TABLE public subscribers nmsroot
335; 1259 24222 TABLE public su_profiles nmsroot
5442; 2604 18650 DEFAULT public su_profiles su_profile_pk nmsroot
5303; 2604 18651 DEFAULT public subscribers subscriber_pk nmsroot
6437; 0 24222 TABLE DATA public su_profiles nmsroot
6356; 0 23653 TABLE DATA public subscribers nmsroot
... constraints, index, triggers, FK -- down here

I've removed FUNCTIONS, VIEWS, & SEQ objects from the list. That list is in the
order it is in the file. I suppose creating the tables in that order is fine,
but there is a FK constraint in subscribers pointing to su_profiles, so that 
order does worry me a little, except all the FK stuff is after the data load so 
I really should be fine. The data looks like it'll be in the right order (at 
least for --jobs=1, yet that failed too).

I don't see anything extra in the Pg error log either.

I don't think I'm running into any issues with "other DB objects being 
missing". Looking at the release notes, I see there were some changes for 12.1 
with parallel restores, but I can't tell if that's affecting me or not. I know 
that the public schema & search_path changed for dump & restore in 10.3, but 
this doesn't look like that's a problem here. I've reread the doc pages on 
pg_dump and pg_restore and don't see anything that looks like it'd be bad for me.

Can anyone see anything that looks wrong? Suggest something else to try or look 
at?

Kevin




RE: pg_restore restores out of order

2020-03-11 Thread Kevin Brannen
>Adrian Klaver wrote:
>On 3/11/20 2:46 PM, Kevin Brannen wrote:
>> I'm working thru our system to upgrade from 9.6.5 to 12.2. One of the last 
>> things I'm verifying is backup/restore and it's failing for no reason I can 
>> figure out. So I'm looking for pointers on this.
>>
>> If it matters, the code is from the version 12.2 from the Pg site, RPMs for 
>> Centos 6 (.10).
>>
>> The backup is made like:
>>
>> # $PGPATH/pg_dump --format=d --jobs=2 --file=$exp --dbname=nms
>> --schema=public
>
>Which version of pg_dump 9.6.5 or 12.2?

Both pg_dump and pg_restore are with 12.2, on an already converted 12.2 DB.
So I'm 12.2 all the way on my test system by this point. :)

>>...
>> So we move the current schema to the side just in case something goes wrong 
>> and we need to move it back, create an empty schema for it, then restore 
>> into that schema. Then it goes bad...
>>
>> pg_restore: while PROCESSING TOC:
>> pg_restore: from TOC entry 6356; 0 23653 TABLE DATA subscribers
>> nmsroot
>> pg_restore: error: COPY failed for table "subscribers": ERROR:  relation 
>> "su_profiles" does not exist
>> LINE 1: SELECT srvc_data   FROM su_profiles WHERE su_profile...
>>  ^
>> QUERY:  SELECT srvc_data   FROM su_profiles WHERE su_profile_pk = 
>> p_profile_fk
>> CONTEXT:  PL/pgSQL function public.check_data_ip(integer,integer,character 
>> varying) line 7 at SQL statement
>> COPY subscribers, line 1: "61   1002\N  SU_1002 t   \N  4
>>\N  1   250 0   2015-06-22 16:56:27.79333+00 nmsmgr  
>> \N  \N  \N"
>
>What is in public.check_data_ip(integer,integer,character varying)?
>
>Is it a trigger function on subscribers?

Not quite...

nms=# \d subscribers
   Table "public.subscribers"
...
Check constraints:
"chk_su_data_ip" CHECK (check_data_ip(profile_fk, unit_id, data_ip))


The first line of the check_data_ip() function is the offender:

SELECT srvc_data INTO data
FROM su_profiles
WHERE su_profile_pk = p_profile_fk;

Hmm, why is this a problem now and not before?

(Probably the usual reason of code "tightening" as we go forward, or so I'd 
guess.)

I don't see any option that mentions "CHECK" in the pg_restore doc in regards to
possibly turning it off...more research...

And oh, this hurts. :( From the docs on CHECK constraints:

PostgreSQL does not support CHECK constraints that reference table data other
than the new or updated row being checked. While a CHECK constraint that
violates this rule may appear to work in simple tests, it cannot guarantee
that the database will not reach a state in which the constraint condition
is false (due to subsequent changes of the other row(s) involved). This
would cause a database dump and reload to fail. ...

It goes on to say a trigger is the right way to do this. So yeah, a "tightening
of the code" seems to be hitting me because of bad code from the past.
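
For the record, the conversion I have in mind looks roughly like the sketch below,
reusing the table, constraint, and function names from the error above; the trigger
and trigger-function names are made up, and the real validation stays inside the
existing check_data_ip(). Since pg_dump puts triggers in the post-data section,
the COPY during a restore shouldn't trip over su_profiles the way the CHECK did.

ALTER TABLE subscribers DROP CONSTRAINT chk_su_data_ip;

CREATE FUNCTION subscribers_check_data_ip() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- same test the CHECK constraint ran, just at row level via a trigger
    IF NOT check_data_ip(NEW.profile_fk, NEW.unit_id, NEW.data_ip) THEN
        RAISE EXCEPTION 'invalid data_ip % for profile %', NEW.data_ip, NEW.profile_fk;
    END IF;
    RETURN NEW;
END;
$$;

CREATE TRIGGER trg_chk_su_data_ip
    BEFORE INSERT OR UPDATE ON subscribers
    FOR EACH ROW EXECUTE FUNCTION subscribers_check_data_ip();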

Problem identified and I'll solve it tomorrow.

Thank you so much Adrian for helping me to figure this out!

Kevin



RE: pg_restore restores out of order

2020-03-12 Thread Kevin Brannen
>Adrian Klaver wrote:
>> Kevin Brannen wrote:
>> It goes on to say a trigger is the right way to do this. So yeah, a
>> "tightening of the code" seems to be hitting me because of bad code from the 
>> past.
>>
>> Problem identified and I'll solve it tomorrow.
>
>Glad it was solved.
>
>However, looking a gift horse in the mouth, did this not error when moving 
>from the 9.6.5 instance to the first 12.2 instance?
>
>Or
>
>Was the constraint added in the first 12.2 instance?

This is the first instance. :) I've now got our entire code base working
on the latest version of Perl and Postgresql (they have to go at the same
time because of DBI/DBD) and I'm going thru and making sure everything
works before committing to our integration branch for further testing. I've
solved a few other upgrade issues like some names changing in pg_catalog
and moving to the community RPMs instead of compiling our own. The problem
in this thread has been the only hard one and fortunately for me it's
the last (or I think it is).

Unless we have a significant reason, upgrading is hard enough we only
upgrade about every other year, or that's my plan going forward. Waiting
3 years like we did this time makes for a larger jump than I'd like.

Thanks again for the help!
Kevin



RE: vacuum full doubled database size

2020-03-13 Thread Kevin Brannen
>Fabio Ugo Venchiarutti wrote:
>On 13/03/2020 15:15, Ron wrote:
>> This is why I'd VACUUM FULL in a planned manner, one or two tables at
>> a time, and *locally* from crontab.
>
>That's not really viable on any remotely busy system: VACUUM FULL claims 
>exclusive table locks, causing queries to hang 
>(https://www.postgresql.org/docs/current/sql-vacuum.html#NOTES mentions this 
>too).
>
>Tools like pg_repack can do some live shrinking.

To say "not really viable on any remotely busy system" is a pretty sweeping
statement. I think a better statement is that "for many busy systems, this could
be a real problem and to consider it carefully in light of your needs."

On our systems, we do this just fine, though the difference is probably the
level of busy. We have periods that are not as busy as others (1am). In 
addition,
most of our tables are fairly small'ish and a VACUUM FULL takes 30sec or
less, so it's not so bad. The vast majority of our data is in about a dozen
tables which are mostly used for reports by people in the daytime, so if they
lock for 5min each in the middle of the night on a weekend it's OK (and we only
do this once a quarter).

So this approach can work, but "it depends" and "YMMV" and all that jazz...which
I believe was what Ron was trying to point out with planning.

That being said, I've had "go check out pg_repack" on my to-do list for a while 
and
one day I will, but at the moment the above works for us.

Kevin



RE: Temporary tablespaces on a RAM disk

2020-03-17 Thread Kevin Brannen
Daniel Westermann wrote:
> is someone using temporary tablespaces on a RAM disk ? Any experiences with 
> that?
> I did some quick tests and checked the archives but could not find any 
> information that either confirmed it is a bad idea nor the opposite.


Well, we point our "stats_temp_directory" to a tmpfs partition, which is 
probably fairly common (or so I'd guess).
But a full tablespace with tables and everything? That sounds pretty risky for 
anything other than a place to store
data for transformation or summary of data that could be recreated by 
restarting a process if the server goes down
in the middle.
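
For reference, the stats_temp_directory bit is just a config change; a minimal
sketch, assuming a tmpfs is already mounted at the path shown (the path itself
is made up):

ALTER SYSTEM SET stats_temp_directory = '/run/postgresql/stats_tmp';
SELECT pg_reload_conf();   -- a reload is enough for this setting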

I think you'd be better off explaining what your goal is and then people could 
give you better comments.

HTH,
Kevin



RE: Loading 500m json files to database

2020-03-24 Thread Kevin Brannen
From: pinker 

> it's a cloud and no plpythonu extension avaiable unfortunately


You're misunderstanding him. See David's post for an example, but the point was 
that you can control all of this from an *external* Perl, Python, Bash, 
whatever program on the command line at the shell.

In pseudo-code, probably fed by a "find" command piping filenames to it:

while more files
do { read in a file name & add to list } while (list.length < 1000);
process entire list with \copy commands to 1 psql command

I've left all kinds of checks out of that, but that's the basic thing that you 
need, implement in whatever scripting language you're comfortable with.

HTH,
Kevin




RE: Loading 500m json files to database

2020-03-24 Thread Kevin Brannen
From: Rob Sargent 

> Sorry if I missed it, but have we seen the size range of these json files?

Not that I've seen, but that's an implementation detail for whoever is doing 
the work. As someone else pointed out, pick the value as needed, whether that's 
10, 100, 1000, or whatever. But issuing 1000 lines of "\copy file" sort of 
commands at a time isn't a big deal by itself. OTOH, you have a good point that 
1000 could be too much work for the server to handle, especially if the "-1" 
flag is also used. As always: test, test, test... 😊

Kevin


RE: Is PostgreSQL SQL Database Command Syntax Similar to MySQL/MariaDB?

2020-03-30 Thread Kevin Brannen
From: Turritopsis Dohrnii Teo En Ming 

> Is PostgreSQL SQL database command syntax similar to MySQL/MariaDB?


If you stuck to "standard SQL" in Mysql, you'll generally be fine; the
differences will be pretty minor. The further you strayed from the standard,
the more work it will cause you -- usually. That being said, moving from any DB
to another will always cause you to find DB-specific things. Of course, there
are always a few things that can only be done with DB-specific code, so you'll
be rewriting those parts with certainty (an example is the insert-or-update
concept).
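
To illustrate that last point, Postgres spells insert-or-update as INSERT ... ON
CONFLICT; a tiny made-up example (it assumes a unique constraint on
settings(name)):

INSERT INTO settings (name, value)
VALUES ('retention_days', '90')
ON CONFLICT (name) DO UPDATE
    SET value = EXCLUDED.value;   -- update the existing row instead of erroring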

[Side note: I forced myself to read a good part of the SQL standard some time 
back and was amazed by how many times the phrase "implementation defined" was 
used, which helps to explain why all the DBs do some things so differently.]

The admin commands are quite different as well, be prepared for that, but I 
can't think of anything that Mysql can do that Pg can't as well. (Although it's 
been awhile since I had to use Mysql, so I may have forgotten something.)

Last but not least, please remember that all of those "unusual" things you had 
to do to get Mysql to work, or work around its bugs, won't be needed and will 
more than likely get in your way with Postgresql. I'm not saying the Pg is 
perfect, but that Mysql's bugs and oddities don't apply to Pg.

Good luck with your conversion and welcome to Pg!
Kevin




how to slow down parts of Pg

2020-04-21 Thread Kevin Brannen
I have an unusual need: I need Pg to slow down. I know, we all want our DB to
go faster, but in this case its speed is working against me in 1 area.

We have systems that are geo-redundant for HA, with the redundancy being 
handled by DRBD to keep the disks in sync, which it does at the block level. 
For normal operations, it actually works out fairly well. That said, we 
recognize that what we really need to do is one of the forms of streaming (ch 
26 of the manual) which I believe would help this problem a lot if not solve it 
-- but we don't have the time to do that at the moment. I plan and hope to get 
there by the end of the year. The part that hurts so bad is when we do 
maintenance operations that are DB heavy, like deleting really old records out 
of archives (weekly), moving older records from current tables to archive 
tables plus an analyze (every night), running pg_backup (every night), other 
archiving (weekly), and vacuum full to remove bloat (once a quarter). All of 
this generates a lot of disk writes, to state the obvious.

The local server can handle it all just fine, but the network can't handle it 
as it tries to sync to the other server. Sometimes we can add network 
bandwidth, many times we can't as it depends on others. To borrow a phrase from 
the current times, we need to flatten the curve. 😊

A few parts of our maintenance process I've tamed by doing "nice -20" on the 
process (e.g. log rotation); but I can't really do that for Pg because the work 
gets handed off to a background process that's not a direct child process … and 
I don't want to slow the DB as a whole because other work is going on (like 
handling incoming data).

Part of the process I've slowed down by doing the work in chunks of 10K rows at 
a time with a pause between each chunk to allow the network to catch up 
(instead of an entire table in 1 statement). This sort of works, but some
work/SQL is somewhere between hard and next-to-impossible to break up like that.
That also produces some hard spikes, but that's better than the alternative (next
sentence). Still, large portions of the process are hard to control and just 
punch the network to full capacity and hold it there for far too long.
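
For what it's worth, the chunk-and-pause pattern looks roughly like the sketch
below on 11+; the table, column, batch size, and sleep are all placeholders, and
a procedure is used so each chunk can COMMIT before the pause:

CREATE PROCEDURE purge_old_archive()
LANGUAGE plpgsql AS $$
DECLARE
    deleted integer;
BEGIN
    LOOP
        DELETE FROM archive_log
        WHERE id IN (SELECT id FROM archive_log
                     WHERE logged_at < now() - interval '1 year'
                     LIMIT 10000);
        GET DIAGNOSTICS deleted = ROW_COUNT;
        COMMIT;                -- make the chunk durable before pausing
        EXIT WHEN deleted = 0;
        PERFORM pg_sleep(5);   -- give the network/DRBD time to catch up
    END LOOP;
END;
$$;

CALL purge_old_archive();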

So, do I have any other options to help slow down some of the Pg operations? Or 
maybe some other short-term mitigations we can do with Pg configurations? Or is 
this a case where we've already done all we can do and the only answer is move 
to WAL streaming as fast as possible?

If it matters, this is being run on Linux servers. Pg 12.2 is in final testing 
and will be rolled out to production soon -- so feel free to offer suggestions 
that only apply to 12.x.

Thanks,
Kevin


RE: how to slow down parts of Pg

2020-04-21 Thread Kevin Brannen
From: Michael Lewis 

> You say 12.2 is in testing but what are you using now? Have you tuned configs 
> much? Would you be able to implement partitioning such that your deletes 
> become truncates or simply a detaching of the old partition? Generally if you 
> are doing a vacuum full, you perhaps need to tune autovacuum to be more 
> aggressive. Consider pg_repack at least to avoid taking an exclusive lock for 
> the entire duration. If partitioning is not an option, could you delete old 
> records hourly rather than daily?

Good questions, it's always hard to know how much to include. 😊

Current production is 9.6, so things like partitioning aren't available there, 
but will be in the future.

We've tuned the configs some and don't having any issues with Pg at the moment. 
This does need to be relooked at; I have a few notes of things to revisit as 
our hardware changes.

Partitioning our larger tables by time is on the ToDo list. I hadn't thought 
about that helping with maintenance, so thanks for bringing that up. I'll 
increase the priority of this work as I can see this helping with the archiving 
part.
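
For anyone following along, the kind of thing I have in mind is declarative
range partitioning, where retiring old data becomes a detach/drop instead of a
giant DELETE; all of the names below are made up:

CREATE TABLE event_log (
    event_pk   bigserial,
    logged_at  timestamptz NOT NULL,
    payload    text
) PARTITION BY RANGE (logged_at);

CREATE TABLE event_log_2020_04 PARTITION OF event_log
    FOR VALUES FROM ('2020-04-01') TO ('2020-05-01');

-- retiring a month is now mostly metadata work, not millions of row deletes
ALTER TABLE event_log DETACH PARTITION event_log_2020_04;
DROP TABLE event_log_2020_04;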

I don't particularly like doing the vacuum full, but when it will release 
20-50% of disk space for a large table, then it's something we live with. As I 
understand, a normal vacuum won't release all the old pages that a "full" does, 
hence why we have to do that. It's painful enough I've restricted it to once 
quarter; I'd do it only once a year if I thought I could get away with it. 
Still this is something I'll put on the list to go research with practical 
trials. I don't think the lock for the vacuuming hurts us, but I've heard of 
pg_repack and I'll look into that too.

I have considered (like they say with vacuuming) that more often might be 
better. Of course that would mean doing some of this during the day when the DB 
is busier. Hmm, maybe 1000/minute wouldn't hurt and that would shorten the 
nightly run significantly. I may have to try that and see if it just adds to 
background noise or causes problems.

Thanks!
Kevin


RE: how to slow down parts of Pg

2020-04-21 Thread Kevin Brannen
From: Michael Loftis 

>>From: Kevin Brannen
>>I have an unusual need:  I need Pg to slow down. I know, we all want our DB 
>>to go faster, but in this case it's speed is working against me in 1 area.
>>
>>We have systems that are geo-redundant for HA, with the redundancy being 
>>handled by DRBD to keep the disks in sync...

> drbdsetup allows you to control the sync rates.

I was hoping not to have to do that, but the more I think about this I'm 
realizing that it won't hurt because the network cap is effectively limiting me 
anyway. :)

I can & will do this, maybe at 90% of our bandwidth, so thanks for the 
suggestion. Still, this is sort of a last resort thing, as I believe controlling
the DB itself is the ultimate need.

Thanks!
Kevin


RE: how to slow down parts of Pg

2020-04-21 Thread Kevin Brannen
From: Michael Loftis 

>>From: Kevin Brannen
>> I don't particularly like doing the vacuum full, but when it will release 
>> 20-50% of disk space for a large table, then it's something we live with. As 
>> I understand, a normal vacuum won't release all the old pages that a "full" 
>> does, hence why we have to do that. It's painful enough I've restricted it 
>> to once quarter; I'd do it only once a year if I thought I could get away 
>> with it. Still this is something I'll put on the list to go research with 
>> practical trials. I don't think the lock for the vacuuming hurts us, but 
>> I've heard of pg_repack and I'll look into that too.


> Why do vacuum full at all? A functional autovacuum will return the free pages 
> to be reused. You just won’t see the reduction in disk usage at the OS level. 
> Since the pages are clearly going to be used it doesn’t really make sense to 
> do a vacuum full at all. Let autovacuum do it’s job or if that’s not keeping 
> up a normal vacuum without the full. The on-disk sizes will stabilize and
> you’ll not be doing a ton of extra I/O to rewrite tables.


Sometimes I need the disk space back. It also makes me feel better. (OK, this
may not be a good reason, but there is a hint of truth in it.) What this probably
means is that I need to get a better understanding of vacuuming.

Thanks!
Kevin


RE: how to slow down parts of Pg

2020-04-21 Thread Kevin Brannen
From: Virendra Kumar 

>Autovacuum does takes care of dead tuples and return space to table's 
>allocated size and can be re-used by fresh incoming rows or any updates.
>
>Index bloat is still not being taken care of by autovacuum process. You should 
>use pg_repack to do index rebuild. Keep in mind that pg_repack requires double 
>the space of indexes, since there will be two indexes existing during rebuild 
>processes.


Ha! I knew there was a reason I was doing the full, I just couldn't remember that
indexes were why. Pg_repack needs to move higher on the ToDo list too. I need a
clone to do all of this. :)
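
(Noting for when we're on 12: REINDEX ... CONCURRENTLY can rebuild a bloated
index without holding the long exclusive lock; the index name here is made up.)

REINDEX INDEX CONCURRENTLY some_big_table_created_at_idx;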


From: David G. Johnston 
> Imagine you have an auto-expanding array and also that individual cells can 
> be reused if the data in them is removed first…

Yes, the concepts aren't that hard, the issue is how to apply them in the most 
effective manner. Still, nice explanation, I'll use that when explaining the 
work to the group so I can pass the info along.

Thanks!
Kevin


RE: how to slow down parts of Pg

2020-04-22 Thread Kevin Brannen
>From: Laurenz Albe 
>
>>On Tue, 2020-04-21 at 20:30 +, Kevin Brannen wrote:
>> I have an unusual need:  I need Pg to slow down. I know, we all want
>> our DB to go faster, but in this case it's speed is working against me in 1 
>> area.
>>
>> [...] The part that hurts so bad is when we do maintenance operations
>> that are DB heavy, like deleting really old records out of archives 
>> (weekly), moving older records from current tables
>> to archive tables plus an analyze (every night), running pg_backup (every 
>> night), other archiving (weekly), and vacuum full to remove bloat (once a 
>> quarter).
>> All of this generates a lot of disk writes, to state the obvious.
>>
>> The local server can handle it all just fine, but the network can't handle 
>> it as it tries to sync to the other server.
>
>The obvious and best answer is: get a faster network, or choose a different 
>storage solution.

I believe I mentioned originally that the network is controlled by others (the
customer). I've pointed out the results of their choice
repeatedly, but their reply is always "budgets", and I reply as politely as I 
can, "faster network or live with the slowness as I've
done all I can for now". It's a somewhat frustrating conversation as you can 
imagine.


>Other than that, you can try to make the maintainance operations less resource 
>intense:
>
>- partition the tables so that you can get rid of old data with DROP TABLE.
>  The ANALYZE won't hurt, if you treat only the required tables.
>- use "pg_basebackup" with the "--max-rate" option

Yes, this was the heart of the post, how to use less resources. I'd always 
thought of partitioning the larger tables as an optimization
for running reports, but my eyes have been opened that it has other benefits 
too.

I'm not sure changing the backup program will help when it's sitting on top of 
DRBD, but I can limit DRBD's rate to create the same effect.
Still, it doesn't hurt to spend a little time researching this. OTOH, you did 
just prompt an idea, so that's helpful too.


>About VACUUM, you may have a problem.  Avoid the need for VACUUM (FULL) at any 
>price.
>That usually requires tuning autovacuum to be faster, which means using more 
>I/O.

OK, I've never really liked doing a "full", but I perceived it as helpful to 
us. I'll see about making autovacuum more aggressive.
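
As a starting point, autovacuum can be made more aggressive per table; a sketch
with a made-up table name, and numbers that would need tuning rather than being
recommendations:

ALTER TABLE event_log SET (
    autovacuum_vacuum_scale_factor = 0.02,  -- vacuum after ~2% dead rows
    autovacuum_vacuum_cost_delay   = 2      -- let autovacuum do more I/O (ms)
);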

Thanks!
Kevin


RE: how to slow down parts of Pg

2020-04-23 Thread Kevin Brannen
>From: Peter J. Holzer 
>On 2020-04-21 21:16:57 +, Kevin Brannen wrote:
>> From: Michael Loftis 
>> > drbdsetup allows you to control the sync rates.
>>
>> I was hoping not to have to do that, but the more I think about this
>> I'm realizing that it won't hurt because the network cap is
>> effectively limiting me anyway. :)
>
>Alternatively you might consider traffic shaping. DRBD can only set a fixed 
>limit (because it knows only about its own traffic). Traffic shaping can 
>adjust the limit depending on other traffic (it can also prioritize traffic, 
>etc.). However, to be effective, it needs to run on a router as close to the 
>bottleneck as possible - typically that means either the border router or the 
>firewall. So it is something the customer's network guy should set up.


Traffic shaping was actually my very first thought. :) It has its upsides
and downsides like any other solution. The biggest downside is that it's not
up to us to control, so we have to find their network person (usually harder
than it should be) and then adjust ... probably multiple times and always
manually.


For any who are wondering what this thread has done for me, other than create
a list of things to research... :)

At this point in time, I think the plan is to (roughly in this order):

0. limit DRBD's rate (I think I can script this & I probably only need to do 
this during the maintenance work);
1. make autovac more aggressive on the larger logging tables;
2. change the "vacuum full" to just reindexing (either with pg_repack or 
"reindex concurrently");
3. partition the bigger logging tables.

I'm tempted to also do the archiving in very small amounts all thru the day
(sort of like how autovac works) to spread that load and not have such a huge
hit once per day. For the moment, this is going in my back pocket to pull out
only if the above doesn't do enough.

Then we move to WAL streaming which I believe will be the biggest
help of all -- or so I hope. It will also have the largest learning curve,
but it'll be good for me to learn that.

Thanks,
Kevin




RE: how to slow down parts of Pg

2020-04-23 Thread Kevin Brannen
>From: Ron 
>
>What you need is async replication instead of synchronous replication.


The only way I can think of to do that in our present situation would be to
buy DRBD-Proxy, which becomes a single-point-of-failure and goes against the
idea of HA (it seems like a good product for disaster recovery but that's not
the goal). In addition, since we're trying to move away from DRBD and go to
WAL streaming, that doesn't seem like the best use of time and money. :)

If you'd like to expound on other ways/tools to do that, I'd love to hear about 
it,
although this might be better off-list.

Thanks,
Kevin


RE: New "function tables" in V13 documentation

2020-11-13 Thread Kevin Brannen
>From: David G. Johnston 

>On Mon, Nov 9, 2020 at 1:41 PM Tom Lane  wrote:
>Alvaro Herrera  writes:
>> On 2020-Nov-08, Adrian Klaver wrote:
>>> Yeah, I would agree with the mobile first design comments. Then again that
>>> plague is hitting most sites these days. My 2 cents is it is a step
>>> backwards. You can cover more ground quickly and digest it faster in the old
>>> format.

>> If you have suggestion on how to improve the new format, I'm sure we can
>> discuss that.  It seems pretty clear to me that we're not going back to
>> the old format.

>>I think there's no question that the new format is better in any case
>>where a function needs more than a couple words of documentation.
>>I could see the argument for adopting a more compact format for tables
>>that contain no such functions.  I think you might find that the set of
>>such tables is nigh empty, though; even section 9.3 (mathematical
>>functions) has a lot of functions that need a sentence or two.  We used
>>to either omit important details for such functions or stick them in
>>footnotes, and neither of those options is very nice.

>My observation is that the new format reduces one's ability to quickly skim 
>the table to find out what is present since there is considerable extra 
>information in one's eyes during that process that needs to be skimmed over.


I'm slow to the party, but I'm going to go with the new format is horrible. I 
agree with David that you can't quickly scan down the new table to find things 
like the return type (for example) which is something I do frequently. Go to 
the string funcs/ops page in v13, and try to quickly find the ones that return 
an "int" (because your goal is to find the position of something in a string so 
you know the return value will have to be an "int"). The new version makes that 
hard while the old version is easy. In fact, I couldn't even find the return 
type at first when I looked because there is no label (the power of tables with 
headers was removed).

The description and example also run together under the new format, which sort 
of comes across as a "wall of text". If I really zoom in, I can see they're 
different fonts, but at my normal zoom level they look pretty much the same at 
first glance.

This makes me want to stay on v12.x for as long as possible -- really.

If the maintainers are dead-set on maintaining the new format despite its
drawbacks (and after reading all the other comments about the positives I'm
going to say I don't see any positives**), then it'd be extremely helpful to do 
some CSS magic to make them easier to read. Personally, I'd like to see there 
be a setting for if media is HTML that it show the old table format, but if 
not, here's some ideas to make the new format more palatable:

* The return type needs to stand out a LOT more, bold would be a great start, 
adding color would help too.

* Make the description and example look much more different, again color or 
perhaps color banding (background a light gray or something on the example).

* Make the example code and example results more different, the simple "->" 
between them gets lost easily to my eye so it's harder to read as is.

* Can you add a few more classes to identify the parts better in the tables?
For example, I see the CSS class "returnvalue" on the example result, but there 
is nothing on the example code itself identifying it as such. If you did this 
better labeling then perhaps those of us who are complaining could attempt to 
overcome some of our objections by restyling the tables with colors & fonts & 
such. Not a full solution as that would require manipulating the DOM, but even 
small changes with color could bring large relief.

[** I think it was Bruce who pointed out the old table format was troublesome 
in the PDF format. I concede that and anything else for when you're constrained 
to paper. But I suspect most users look at these docs on a computer monitor 
with HTML so those size constraints aren't an issue there. Designing pages to 
the smallest media just frustrates those users on larger media (cue the many 
examples on the web where the left/right margins are so wide half of your 
screen is wasted instead of letting the text flow and resize).]

Kevin


RE: New "function tables" in V13 documentation

2020-11-13 Thread Kevin Brannen
>From: David G. Johnston 

>>On Fri, Nov 13, 2020 at 12:20 PM Kevin Brannen <mailto:kbran...@efji.com> wrote:
>>Designing pages to the smallest media just frustrates those users on larger
>>media (cue the many examples on the web where the left/right margins are so
>>wide half of your screen is wasted instead of letting the text flow and
>>resize).]
>
>It is just as bad when it is so wide that one has to move their head instead of
>just moving their eyes. If anything our tables could probably be improved by
>enforcing a maximum width to the content area.

True, moving heads is harder, but we have the option of making the browser
narrower to compensate if we feel the need. When there are max width
constraints, the option to customize is taken out of the user's hands and
that's an issue. Let the user do what works best for them. Some flexibility
doesn't seem like too much to ask for...IMO.

I really don't expect the old tables to come back, as much as I'd like that,
because groups rarely backtrack, or so my experience has been. However, this is
also why I made the suggestions I did, especially the last one about adding
more CSS classes to let the users restyle if they feel strongly enough about it.

Maybe this works for most people:

upper ( text ) → text
Converts the string to all upper case, according to the rules of the
database's locale.
  upper('tom') → TOM

But why not let people do:

upper ( text ) → text
Converts the string to all upper case, according to the rules of the
database's locale.
  upper('tom') → TOM

[For those that don't receive HTML in email, the function is bold, the return
type is underlined, the example has a light gray background, and the example
result has a light blue background.]

I don't know that I'd really do it that way, but the CSS required for that
isn't hard, yet it makes the parts stand out a lot better so I know what is
what. The current docs are only missing 3 CSS classes to allow me to do that:
the description, the example code, and the example return (since it uses the
same class as the function return value). I can't imagine that would be so
hard to do.

I don't see myself contributing to the Pg code base, but this is something I
might be able to do and should look into.

Kevin



RE: New "function tables" in V13 documentation

2020-11-16 Thread Kevin Brannen
>From: Adrian Klaver 
>>On 11/15/20 9:00 AM, David G. Johnston wrote:
>
>> In hindsight it could have been handled better.  Waiting longer at
>> different points and making pronouncements on -announce to solicit
>> feedback from people who don't follow -docs.  That doesn't change the

> Yes and at least one post to --general. It would have motivated me to look at 
> the new docs earlier.

+1 on the post to -general.

FWIW... My attention to -announce is lazy but I'd see that eventually. I get a 
little behind on -general from time to time, but this is where I pay the most 
attention. I have no idea where most people are, perhaps the list maintainers 
could give some guidance on that.

HTH,
Kevin


RE: Alter the column data type of the large data volume table.

2020-12-04 Thread Kevin Brannen
>From: Olivier Gautherot 

>>5) If you're brave enough, convert your current table as a partition (rename 
>>it to something like table_hist), duplicate the table model under the same 
>>name as now (adjusting the primary key type) and set the INHERITS on the 
>>primary key range. The inheritance should take care of the type conversion 
>>(haven't tried it but it's worth a try). If it works, you will reach your 
>>goal without downtime or significant overhead.

>Sorry, just tried this one and it failed: type mismatch.

Seems like a sound idea in general. I'd probably rename the tables, let's call
them "big_hist" for the old big table and "big_split" for the new partitioned
table that will be used going forward -- assuming the original table was called
"big". Then create a view that looks at both of those but call it the same as
the old table, and let the view do a type cast on the old key, like
big_hist.id::bigint, so it matches the new type, because the view will probably
be a union and the types need to match. That way your application only has to
pause long enough to do a few meta-commands and then it all can resume, and as
Olivier pointed out, you can fix the data by moving it from big_hist to
big_split in the background as you have time.

I’d probably put it all in a transaction too:

CREATE TABLE … ;  -- all the commands to create your partitioned table big_split here
BEGIN;
ALTER TABLE big RENAME TO big_hist;
CREATE VIEW big AS
    SELECT * FROM big_split
    UNION ALL   -- UNION ALL avoids a needless de-duplication sort
    SELECT id::bigint, /* other cols */ FROM big_hist;
COMMIT;

Try it on a dev system and if it works you’re off and running. I’d expect the 
view to slow things down a little, but probably not too much if you have good 
indexes. But at least you could transition without major downtime and then 
rename “big_split” back to “big” and drop “big_hist” when you’ve finished the 
transition. I might even be tempted to add a trigger so that all new inserts 
into “big” really go into “big_split” so “big_hist” doesn’t grow any more. Your 
imagination is probably the limit. 😊
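
If you go the trigger route, a minimal sketch of the redirect, assuming the view
and tables above and that the column lists line up; the function and trigger
names are made up, and EXECUTE FUNCTION is the 11+ spelling:

CREATE FUNCTION big_insert_redirect() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- send new rows to the partitioned table instead of the old one
    INSERT INTO big_split VALUES (NEW.*);
    RETURN NEW;
END;
$$;

CREATE TRIGGER big_insert
    INSTEAD OF INSERT ON big
    FOR EACH ROW EXECUTE FUNCTION big_insert_redirect();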

HTH,
Kevin


RE: Is there a good discussion of optimizations?

2021-01-07 Thread Kevin Brannen
From: Guyren Howe 

>Most folks, in my experience, who use relational databases don’t really 
>understand the basic theory or even more important the why - the philosophy - 
>of what a relational database is and how to get the most out of them. I see a 
>lot of folks trying to use SQL in an imperative manner - make this temp table, 
>then update it some, then make this other temp table, etc...

>Anyway, I’d like to put together something that explains this. I would go into 
>Codd’s original insight and how simple yet powerful it is. I’d like to discuss 
>how if you stick to SQL, rather than forcing it into this imperative 
>straight-jacket, the database can work out all the details so you don’t have 
>to do the bad things.

Be sure you point out where SQL either fails or comes up short to give a
balanced view. To be fair, sometimes the reason people perceive SQL as failing
them (and why they go with the imperative process) is because they don't know
everything SQL does, or even perhaps their tool doesn't implement the whole
standard.

An example of this is that we have a report we're trying to write that I'd
like to think can be done in SQL, but I can't think of a way to do it. Yet,
if I do the base query and pull the data back into my application, I can do
the last bit with 3 lines of Perl very easily. The problem here revolves
around comparing a row of data to the previous row to know whether the data
changed "significantly enough" to keep the new row.

Another example is doing running totals. A couple of years ago I would have
said SQL can't do that. Now that I know about the OVER clause, something that I
would categorize as somewhat obscure, I can do it as needed.
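
For instance, a running total is just a window aggregate; a small sketch (the
table and column names are made up):

SELECT order_day,
       amount,
       sum(amount) OVER (ORDER BY order_day) AS running_total
  FROM daily_sales;     -- daily_sales/order_day/amount are invented names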

As Michael Lewis pointed out, large datasets can also cause you to choose not
to use SQL in 1 big statement for everything (i.e. advocating the use of temp
tables). In some ways, using a CTE is a type of temp table, or at least I
view it as such. That allows a person to solve a problem in bite-sized chunks.
I will agree that optimization can do it better at times, but the code also has
to be maintained as well – a balancing act.

To continue the above, the idea of using a sub-select in a FROM clause is
generally not thought of by new learners. So experience is also a factor.
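
To illustrate both of those, here is the same made-up aggregation written first
as a sub-select in the FROM clause and then as a CTE:

-- sub-select in FROM ("orders" and its columns are invented names)
SELECT t.customer_id, t.total
  FROM (SELECT customer_id, sum(amount) AS total
          FROM orders
         GROUP BY customer_id) AS t
 WHERE t.total > 1000;

-- the same thing as a CTE, i.e. solved in a named, bite-sized chunk
WITH totals AS (
    SELECT customer_id, sum(amount) AS total
      FROM orders
     GROUP BY customer_id
)
SELECT customer_id, total FROM totals WHERE total > 1000;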

I think your idea is a good one, but I do hope you present that SQL can't
solve everything ... else why do we have plpgsql. :) You’re correct though,
SQL isn’t used as much as it should be in many places.

Kevin
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.


RE: Is there a good discussion of optimizations?

2021-01-07 Thread Kevin Brannen
From Guyren Howe
>Actually, I’m mostly going to talk about the relational model, rather than 
>SQL. Our industry seems to always settle for third-best, and SQL is the worst 
>of all the examples of this. The world desperately needs a good relational 
>database based on a better query language — datalog, for example.

>I put up with SQL so I can use the relational model, and I think that 
>understanding SQL has to start with that.

I can understand that. :) Yet SQL is the interface/lens we most use to 
interact with an RDBMS.


>>An example of this is that we have a report we're trying to write that I'd
>>like to think can be done in SQL, but I can't think of a way to do it. Yet,
>>if I do the base query and pull the data back into my application, I can do
>>the last bit with 3 lines of Perl very easily. The problem here revolves
>>around comparing a row of data to the previous row to know whether the data
>>changed "significantly enough" to keep the new row.
>>
>>Another example is doing running totals. A couple of years ago I would have
>>said SQL can't do that. Now that I know about the OVER clause, something that
>>I would categorize as somewhat obscure, I can do it as needed.

>Actually, Window functions might be "advanced", but are certainly not obscure. 
>Your example sounds like it’s trivially solved with LAG().

Fair enough. Perhaps it would be better to say that many don't go to the depths 
to learn
window functions until forced to, and a lot of problems can be solved without 
them. I can
say that would be true for me and those I work with.

Thanks for the tip on LAG()! It's not a function I've used before (which makes 
it a great
example for the previous para).
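
For anyone following along, a sketch of the "keep a row only when it changed 
enough" idea with LAG(); the table, columns, and threshold are all made up:

SELECT *
  FROM (SELECT s.*,
               lag(reading) OVER (ORDER BY taken_at) AS prev_reading
          FROM samples s) x
 WHERE x.prev_reading IS NULL                  -- always keep the first row
    OR abs(x.reading - x.prev_reading) > 5;    -- "significantly enough"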


>>As Michael Lewis pointed out, large datasets can also cause you to choose not
>>to use SQL in 1 big statement for everything (i.e. advocating the use of temp
>>tables). In some ways, using a CTE is a type of temp table, or at least I
>>view it as such. That allows a person to solve a problem in bite-sized chunks.
>>I will agree that optimization can do it better at times, but the code also
>>has to be maintained as well – a balancing act.

>This appears to be good advice with SQL Server, which I’m coming to learn has 
>a fairly poor query optimizer. But I would have thought Postgres’s optimizer 
>would usually use a temporary table where appropriate.

I'm sure Pg's optimizer does its work better than I would manually, but that's 
not
always the point. For some I've talked to, thinking relationally and in sets is 
hard.
I'll even admit that while I think I do it pretty well, there are times I have 
to stop
and really consider what I'm doing. If you're used to thinking about how to 
solve a
problem with a computer language, most of them are imperative and that mindset 
can be
hard to get out of. So your "philosophy" approach is a well-aimed arrow in many 
ways, IMO.

Also, don't forget about "maintenance". Solving something in bite-sized chunks 
allows
for easier changes in the future, as well as understanding by newcomers -- 
generally speaking.

>Curious to hear if that’s wrong.

Depends on what version. In the earlier versions of Pg, CTE's were a fence and 
there was
no crossing over. That fence now has gates in some places in the latest 
versions from
what I read.
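
For what it's worth, PG 12 is where the gates showed up: a CTE that's only 
referenced once can be inlined, and you can pick the behavior yourself (the 
table here is a made-up name):

WITH big_orders AS MATERIALIZED (       -- force the old "fence" behavior
    SELECT * FROM orders WHERE total > 1000
)
SELECT count(*) FROM big_orders;
-- or AS NOT MATERIALIZED to ask the planner to treat it like a sub-select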


>>I think your idea is a good one, but I do hope you present that SQL can't
>>solve everything ... else why do we have plpgsql. :) You’re correct though,
>>SQL isn’t used as much as it should be in many places.

>An important consideration will be when relational is inappropriate. My 
>biggest goal, though, is to get folks to understand how much relations *can* 
>do — far too many devs in my experience don’t use the power of SQL because 
>they don’t understand it.

>Thanks for taking the time to give me this feedback.

I appreciate the efforts of teachers in all forms. I hope your project goes 
well and the feedback has some use.

Kevin
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.


RE: Syntax checking DO blocks and ALTER TABLE statements?

2021-02-18 Thread Kevin Brannen
>From: Ron 
>
>How does one go about syntax checking this?
>
>do $$
>begin if exists (select 1 from information_schema.table_constraints
>where constraint_name = 'error_to_web_service_error') then
> raise notice 'EXISTS error_to_web_service_error';
> else
> ALTER TABLE web_service_error
>ADD CONSTRAINT error_to_web_service_error FOREIGN KEY (error_id)
>REFERENCES error_code(error_id)
>ON DELETE NO ACTION NOT DEFERRABLE INITIALLY IMMEDIATE;
> end if
>end $$
>
>(There are 222 ALTER TABLE ADD FOREIGN KEY statements that I'm wrapping in 
>similar DO blocks, and want to make sure the statements are clean.)


I've always wondered why Pg doesn't have something like that built in, but I 
suppose the obvious answer is that no one has felt like scratching that itch.

Have you checked out:  https://github.com/okbob/plpgsql_check

I don't know if it'll do everything you want, but maybe it'd help at least 
some. It's on my to-do list to check out one day when I have time. :)
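
One low-tech alternative (a sketch, and not what plpgsql_check does): run the
whole script in a transaction that always rolls back, so syntax errors surface
without anything being committed. Note the plpgsql bodies only get parsed, so
statements in branches that don't execute get a syntax check but not much more.

BEGIN;
-- the file holding the 222 wrapped ALTER TABLE statements (made-up name):
\i add_foreign_keys.sql
ROLLBACK;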

HTH,
Kevin
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.


RE: Localizing stored functions by replacing placeholders in their body

2021-03-02 Thread Kevin Brannen
-Original Message-
From: Alvaro Herrera 
Sent: Tuesday, March 2, 2021 2:19 PM
To: Alexander Farber 
Cc: pgsql-general 
Subject: Re: Localizing stored functions by replacing placeholders in their body

On 2021-Mar-02, Alexander Farber wrote:

> CREATE OR REPLACE FUNCTION localize_hello()
> RETURNS text AS
> $func$
> SELECT '$(hello)';
> $func$ LANGUAGE sql IMMUTABLE;

I'm not sure this is a great approach to in-database translations: you have one 
function per string, which is cumbersome, bloated and probably slow...

---

I would agree with Alvaro and take it a step further. Perhaps you didn't do it 
this way, but many applications are split with back-end code to get & generate 
data while a set of templates is used to produce the result (HTML or whatever) 
the user sees. Many times these "template systems/toolkits" have I18N built 
into them in some way. If that's possible for you, I'd suggest investigating 
that. Overall, this seems more like an application problem and not a DB problem.

HTH,
Kevin
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.





syntax error with alter type

2018-12-07 Thread Kevin Brannen
I'm running Pg 9.6.5 if it matters...

I'm trying to drop a value from an ENUM (type) and it seems like I'm following 
the fine manual yet I still get an error. For example:

nms=# create type alphabet as enum ('a', 'b', 'c', 'd');
CREATE TYPE

nms=# alter type alphabet drop attribute if exists 'c';
ERROR:  42601: syntax error at or near "'c'"
LINE 1: alter type alphabet drop attribute if exists 'c';
 ^
LOCATION:  scanner_yyerror, scan.l:1086

What am I doing wrong? The goal is to get rid of the 'c' value from the enum. 
Yes, you can assume I've already removed all the 'c' values in the table 
where it's used.

Or does that statement not do what I think it does and I have to do the "create 
new type, change the table to use the new type, drop old type, rename new type 
to old type" routine?

Thanks,
Kevin
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.


RE: syntax error with alter type

2018-12-07 Thread Kevin Brannen
On 12/7/18 3:17 PM, Kevin Brannen wrote:
> I'm running Pg 9.6.5 if it matters...
>
> I'm trying to drop a value from an ENUM (type) and it seems like I'm
> following the fine manual yet I still get an error. For example:
>
> nms=# create type alphabet as enum ('a', 'b', 'c', 'd');
>
> CREATE TYPE
>
> nms=# alter type alphabet drop attribute if exists 'c';
>
> ERROR:  42601: syntax error at or near "'c'"
>
> LINE 1: alter type alphabet drop attribute if exists 'c';
>
>   ^
>
> LOCATION:  scanner_yyerror, scan.l:1086
>
> What am I doing wrong? The goal is to get rid of the 'c' value from
> the enum. Yes, you can assume I've already removed all the 'c'
> values in the table where it's used.

https://www.postgresql.org/docs/10/datatype-enum.html

"Although enum types are primarily intended for static sets of values, there is 
support for adding new values to an existing enum type, and for renaming values 
(see ALTER TYPE). Existing values cannot be removed from an enum type, nor can 
the sort ordering of such values be changed, short of dropping and re-creating 
the enum type."


>
> Or does that statement not do what I think it does and I have to do
> the "create new type, change the table to use the new type, drop old
> type, rename new type to old type" routine?

Adrian,

Thanks, I hadn't seen that "existing values cannot be removed from an enum 
type" since I was looking on the ALTER TYPE page.

So, can you (or anyone) help me understand what "alter type <type> drop 
attribute" is meant to do? I don't see "attribute" on the page you reference.

Thanks,
Kevin
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.



RE: syntax error with alter type

2018-12-10 Thread Kevin Brannen
Tom Lane wrote:

> Kevin Brannen  writes:
> > So, can you (or anyone) help me understand what "alter type <type> drop 
> > attribute" is meant to do? I don't see "attribute" on the page you 
> > reference.

> IIRC, that drops a column from a composite type; it's more or less a variant 
> spelling of ALTER TABLE DROP COLUMN.

I'm going to guess this is part of the "SQL Standard", but I'd like to comment 
that this seems really ... weird. All languages have their warts or things they 
could have done better in hindsight, but this strikes me as very weird.

That being said, I can sort of see why you say that. Looking at the examples, I 
can see:

ALTER TYPE compfoo ADD ATTRIBUTE f3 int;

which looks a lot like adding a column to a table. It would have been nice to 
know what "compfoo" is as that would have cleared up some confusion. I guess 
I'll make a doc change request.
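
As best I can tell, the docs' "compfoo" is a composite type, which is where 
ADD/DROP ATTRIBUTE applies; something like:

CREATE TYPE compfoo AS (f1 int, f2 text);
ALTER TYPE compfoo ADD ATTRIBUTE f3 int;
ALTER TYPE compfoo DROP ATTRIBUTE f3;
-- whereas DROP ATTRIBUTE means nothing for an enum like "alphabet" above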

Of course, the real answer is that someone here in the past should have made a 
small table of reference values instead of creating an ENUM, then I wouldn't 
be facing this. Again, hindsight...sigh.
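
For the record, the routine I was dreading looks roughly like this (a sketch, 
untested; "t.letter" is a made-up table/column that uses the type):

BEGIN;
ALTER TYPE alphabet RENAME TO alphabet_old;
CREATE TYPE alphabet AS ENUM ('a', 'b', 'd');
ALTER TABLE t ALTER COLUMN letter TYPE alphabet
    USING letter::text::alphabet;
DROP TYPE alphabet_old;
COMMIT;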

Thanks,
Kevin
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.



why use phpPgAdmin (was: RE: Importing tab delimited text file using phpPgAdmin 5.1 GUI)

2018-12-13 Thread Kevin Brannen
> From: Tony Shelver 
> Just a side comment: Why use phpPgAdmin when pgAdmin 4.6 is current, free and 
> readily available?
> It also has a graphical table-from-file loader as well.


I can’t speak for the original poster, but there are multiple reasons that 
might be:
* You came from the mysql world and are used to phpMysqlAdmin.
* You tried it long ago and it worked great, so you stuck with it.
* You can get phpPgAdmin to work.
* You find it hard to make pgAdmin work for you.

There may be other reasons, but all of the above are true for me.

The most vexing for me is the last one because I'd like it to work well. I just 
downloaded the latest and am trying to run it on Win7. I can only guess why it 
made me reboot (some MS library), but I did. Then when starting it, it popped 
up a tab in my browser and that looked promising … until I tried to tell it my 
server, but it wouldn’t connect (while phpPgAdmin was running just fine in 
another tab). After enough futzing around with it, I managed to make it connect.

After playing with pgAdmin for 5-10 minutes, I’ll say it’s OK but it’s not as 
easy to use as phpPgAdmin. In fact, telling it to look at new “objects” 
(tables/views/functions) seems to require multiple clicks and is hard to get 
what I want on the first try.

Then there's the issue that when I stop the server and restart, I can't use the 
bookmark I made to get to it because it's decided to change ports. What the 
heck?

Anyway, I’m not trying to bash pgAdmin. I’m sure that if I read the docs and 
used it more, then I’d find it useful. However, my first impression is that 
it’s harder to use and why switch when I have a perfectly working tool already 
here. I didn’t have to read any docs to use phpPgAdmin ... a few experimental 
clicks to see what the controls did and I was able to use it, plus my bookmark 
for it always works.

Now if, as someone said, phpPgAdmin won’t work with Pg v11, which I’ll find out 
in a few months, then I guess I’ll have to revisit this issue, but for the 
moment, I’m good with phpPgAdmin. Not that I expect it, but if someone from the 
pgAdmin team wants to contact me for more details, I'll happily explain as 
maybe it'll help create a better tool.

One user’s perspective…
Kevin
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.


RE: Watching for view changes

2018-12-20 Thread Kevin Brannen
From: Mitar 

> On Thu, Dec 20, 2018 at 12:07 PM Rob Sargent  wrote:
> > Are you hoping to see the difference in the returned values for successive 
> > calls to the same query?
>
> i would like to in as close to real-time as possible get notification when 
> results of a query would have changed if I would do the same query again, 
> without me having to do polling or without me having to do diffing.


Hmm, I guess I could see that as long as the DB wasn't too [write] busy, else 
you'd be flooded with notifications.

Some years ago when I was working on a web app in Perl (which uses the DBI 
module for all communication to the DB), I subclassed DBI in order to see all 
calls to the DB. For most statements, I just let all the calls go thru. But for 
inserts, I wrote to a file the data that was being inserted; for deletes I 
wrote what was being deleted; and for updates I wrote the before and after 
values. (For the last 2 I changed the delete/update into a select to get the 
data.) It made it much easier to see how data changed -- especially when I was 
new to the app.

You could do something like that, where you have an interceptor that reports on 
data changes, filtering/searching for just the parts you want as you see fit. 
Of course, that would be just for your app, it wouldn't catch changes made from 
psql and other tools.

Maybe it's a useful idea for you ... or maybe not. 😊

If you had to have all statements no matter what tool was used to change data, 
I'd probably change log_statement to "all", "tail -f" the Pg log, and "do the 
right thing" (which could be non-trivial).

Kevin
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.


RE: Relocatable Binaries (RPMs) : custom installation path for PostgreSQL

2019-01-02 Thread Kevin Brannen
From: chiru r 

> I have installed Community  PostgreSQL RPMs and are going into " 
> /usr/pgsql-11/" by default.
> Please let us know how to get the PostgreSQL-11 installed in above custom 
> paths using RPMs? .


I've never tried it, but look at the "--relocate" option for the rpm command. 
The "--prefix" option might do it too, but I think "--relocate" is what you 
need.

HTH,
Kevin


This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.


RE: Relocatable Binaries (RPMs) : custom installation path for PostgreSQL

2019-01-04 Thread Kevin Brannen
From: chiru r 

> I have tried to intall the RPMs with -relocate option,however it is not 
> working as expected and throwing below error.
>
> [root@Server1dev:/root/PG11]#
> #-> rpm -ivh --relocate /usr/pgsql-11/=/u01/postgres/pg11_relocate/ 
> postgresql11-server-11.1-1PGDG.rhel7.x86_64.rpm
> warning: postgresql11-server-11.1-1PGDG.rhel7.x86_64.rpm: Header V4 DSA/SHA1 
> Signature, key ID 442df0f8: NOKEY
> Preparing...  # [100%]
>path /usr/pgsql-11 in package 
> postgresql11-server-11.1-1PGDG.rhel7.x86_64 is not relocatable
> [root@server1dev:/root/PG11]#

Then you'll probably have to get the source from the Pg download area and 
compile it yourself and use "--prefix=/u01/postgres/pg11_relocate/" as an arg 
to the "configure" command. It's not hard, the process is well documented. We 
do this and 1 reason is to have it install into /opt, although we also compile 
it ourselves so we know what's going into the install.
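
The build itself is roughly (version and prefix taken from your example; check 
the docs for the full details):

tar xf postgresql-11.1.tar.gz
cd postgresql-11.1
./configure --prefix=/u01/postgres/pg11_relocate
make                 # or "make world" to include contrib and the docs
make install         # or "make install-world"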

It's either that or let it install where it wants to and then put a symlink in 
the place you want it, as someone else suggested.

HTH,
Kevin
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.


RE: Immutable way to cast timestamp TEXT to DATE? (for index)

2019-01-04 Thread Kevin Brannen
From: Ken Tanzer 

On Fri, Jan 4, 2019 at 2:54 PM Andrew Gierth <and...@tao11.riddles.org.uk> wrote:
> "Ken" == Ken Tanzer <ken.tan...@gmail.com> writes:

 Ken> Hi. I've got a text field in a table that holds this style of
 Ken> timestamp:

 Ken> 2014-10-23T00:00:00

You can't make this a field of type "timestamp" rather than text?

> I actually can't, or rather don't want to.  The underlying data this is drawn 
> from is actually a date field, but this particular table keeps a history of 
> what we actually transmitted to another organization, and I want to keep it 
> as an exact replication of what we sent.


If it’s not too painful, add another column to your table of type DATE, and on 
INSERT shove your “timestamp” into that, converting/casting as needed, then 
index that. So at the cost of an extra 4 bytes per row, you can have both your 
“transmission” value and an indexable value.
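
A sketch of that, with made-up table/column names ("sent_log" holding the text 
timestamp in "sent_ts"):

ALTER TABLE sent_log ADD COLUMN sent_date date;
UPDATE sent_log SET sent_date = sent_ts::timestamp::date;  -- backfill old rows
CREATE INDEX sent_log_sent_date_idx ON sent_log (sent_date);
-- then have the INSERT (or a small BEFORE INSERT trigger) fill sent_date too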

Kevin
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.


RE: How to always run UPDATE FROM despite missing records in the source table?

2019-01-11 Thread Kevin Brannen
From: Alexander Farber 


  *   The only workaround that I could think of is -

UPDATE users SET
visited = now(),
ip  = '20.20.20.20'::inet,
lat = (SELECT lat FROM geoip WHERE '20.20.20.20'::inet <<= block),
lng = (SELECT lng FROM geoip WHERE '20.20.20.20'::inet <<= block)
WHERE uid = 2;

But that would run the same subquery twice (correct?) and my geoip table is 
already slow with 3073410 records (and that is why I am trying to cache its lat 
and lng values in the users table on each user login event)

Have you considered using a WITH clause to get the data so the query is only 
run once?
See section 7.8.2 at https://www.postgresql.org/docs/9.6/queries-with.html
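One way to read that (a sketch, untested): do the expensive geoip lookup once 
in the CTE, then pull lat/lng out of it with cheap scalar sub-selects, so the 
UPDATE still runs (with NULLs) even when there is no match:

WITH g AS (
    SELECT lat, lng FROM geoip WHERE '20.20.20.20'::inet <<= block
)
UPDATE users
   SET visited = now(),
       ip      = '20.20.20.20'::inet,
       lat     = (SELECT lat FROM g),
       lng     = (SELECT lng FROM g)
 WHERE uid = 2;
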
Kevin
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.


RE: Key encryption and relational integrity

2019-03-26 Thread Kevin Brannen
-Original Message-
From: Moreno Andreo 
Sent: Tuesday, March 26, 2019 11:09 AM
To: Adrian Klaver ; PostgreSQL mailing lists 

Subject: Re: Key encryption and relational integrity

In a master-detail relation, I need to encrypt one of master table PK or detail 
table FK, in order to achieve pseudonymization, required by GDPR in Europe when 
managing particular data. Imagine I have

Table users
id    surname    last name
1     John       Doe
2     Jane       Doe
3     Foo        Bar

Table medications
id    user_id    med
1     1          Medication
2     1          Ear check
...
...

medications.user_id is FK on users.id

We should achieve

Table medications
id    user_id      med
1     sgkighs98    Medication
2     sghighs98    Ear check

or the opposite (users.id encryption and medications.user_id kept plain)

At a first glance, it IS breaking relational integrity, so is there a way to 
manage this encryption internally so RI is kept safe?

===

To me, this is really more of an application question, not a DB question. When 
you dump or share the data, do your randomization then. If you want to do it in 
the DB, then add an extra column to your user table, let's call it "pseudonym" 
and when you insert a new user, fill that in with your randomization string. 
Then never dump or use the real name, but use the pseudonym column. Better if 
you can do it, just don't store the real name (though your application may not 
let you do that).
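
A sketch of that extra column (gen_random_uuid() comes from the pgcrypto 
extension here; substitute whatever randomization you prefer):

ALTER TABLE users ADD COLUMN pseudonym text;
UPDATE users SET pseudonym = gen_random_uuid()::text;     -- fill existing rows
ALTER TABLE users ALTER COLUMN pseudonym SET DEFAULT gen_random_uuid()::text;
ALTER TABLE users ADD CONSTRAINT users_pseudonym_uniq UNIQUE (pseudonym);
-- then dump/share medications joined to users.pseudonym instead of users.id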

Honestly, dumping the user as an internal and meaningless number (user_id) 
seems pretty safe to me, but perhaps your GDPR doesn't allow for that.

HTH,
Kevin
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.



RE: Recommendation to run vacuum FULL in parallel

2019-04-04 Thread Kevin Brannen
From: Perumal Raj 

So conclude the requirement here , The only way to parallelism is multiple 
script. And no need to do REINDEX exclusively.
Question : Do we need to consider  Table dependencies while preparing script in 
order to avoid table locks during vacuum full ?

We have a small bash script (see below) that get the list of tables and their 
sizes, sorted smallest to largest, and do “vacuum full” one at a time because 
(as someone else pointed out) this is very I/O intensive. That order also helps 
to ensure we finish because some of our installs are at the edge of running out 
of space (an issue we’re dealing with). I probably wouldn’t have a problem 
doing 2 at a time, but we do this in the middle of the night when activity is 
lowest and it only takes 1-2 hours, so we’re good with it. It sounds like you 
have a lot more data though.

You might also consider putting the data into different tablespaces which are 
spread over multiple disks to help I/O. If you can, use SSD drives, they help 
with speed quite a bit. 😊

Don’t worry about table dependencies. This is a physical operation, not a data 
operation.

HTH,
Kevin

$PGPATH/psql -t -c "
WITH s AS (SELECT nspname || '.' || relname AS table_name,
                  pg_total_relation_size(c.oid) AS total_bytes
             FROM pg_class c
             LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
            WHERE relkind = 'r'
              AND nspname NOT IN ('pg_catalog', 'information_schema')
            ORDER BY 2)
SELECT table_name FROM s
" |
while read t ; do echo "" ; echo $t ; $PGPATH/vacuumdb -w -z -f -t $t ; done
###
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.


RE: Drive Architecture for new PostgreSQL Environment

2019-06-13 Thread Kevin Brannen
From: Hilbert, Karin 


Hello,



We're in the process of building a new PostgreSQL environment on Scientific 
Linux release 7.6.

The new environment will have a Primary & 2 Standby servers & have asynchronous 
replication.  It will use repmgr to manage failover/switchover events.



In the past, we've always had separate physical drives for data, 
pg_xlog & backups.

We did this as a precaution against disk failure.  If we lose one, we would 
still have the other two to recover from.

Is that really necessary anymore, with having a repmgr cluster?



My Linux Admin wants to do the following instead:

What I propose is to set this up as a single drive and isolate the three 
directories using the Linux logical volume manager.  As a result, each 
directory would be on a separate filesystem.  This would provide the isolation 
that you require but would give me the ability to modify the sizes of the 
volumes should you run out of space.  Also, since this is a VM and all drives 
are essentially “virtual”, the performance of this different drive structure 
would be essentially identical to one with three separate drives.


===

As with so many situations, “it depends”. 😊

I think the most important part you mentioned is that you’re in a VM, so it’s 
really up to your host server and you can do anything you like. I’d probably 
make 3 separate virtual disks so you can expand them as needed individually.

We use real/standalone hardware and create 1 large RAID6 array with LVM on top 
and then create partitions on top of LVM. Our tablespace is in 1 partition and 
the rest is in another partition, and backups are mirrored to another server.

I can probably come up with other ways to do things, like the tablespace on SSD 
while the logs & backups are on some slower but perhaps “more durable” storage 
(like a NAS/SAN/whatever). Our hardware can support 2 x 1TB M.2 drives in RAID1 
which makes me go “hmm, very fast access for the tablespace”. 😊 Probably can’t 
convince the “powers” to buy it though.

It really does depends on what’s important to you and what resources you have 
available (including budget).

HTH,
Kevin
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.


RE: [EXT EMAIL] Re: First Time Starting Up PostgreSQL and Having Problems

2019-06-24 Thread Kevin Brannen
> From: Tom Lane 
> Brent Bates  writes:
> >  I found the problem.  I cleared everything out and started from 
> > scratch, then slowly added my changes back to the configuration files.  The 
> > problem was in the postgresql.conf.  At the bottom of the file I had 
> > uncommented all the ‘include’ lines, so they looked like this:
> > include_dir = ''# include files ending in 
> > '.conf' from

> Ah-hah!  I wonder if we should disallow empty values for these GUCs?
> And/or put in some kind of test for recursive opening of the same config 
> file?  I don't think it'd occurred to anyone that it's this easy to get the 
> code to try to do that.


I would encourage this. 😊 I know on one of my early installs some time back I 
accidentally did:

data_directory = ''

and had a devil of a time figuring out why the postmaster wouldn't start (in 
fact it was you Tom that pointed me in the right direction to eventually find 
the misconfiguration). So I think it would be a great idea to add checks for 
empty strings in places where that's a problem. An unset value (as in the 
config is commented out) can be OK as any defaults will be used, but to set 
some values to the empty string just hurts and it would be a help to new users, 
or even those of us who make typos, 😊 to get better error messages so we can 
fix the problem faster on our own.

Kevin

This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.


RE: Need a DB layout gui

2019-06-25 Thread Kevin Brannen
From: David Gauthier 

> I've been poking around 
> https://wiki.postgresql.org/wiki/Community_Guide_to_PostgreSQL_GUI_Tools to 
> see if there is anything that can read PG metadata in and display it 
> graphically in a gui.  You know, the kind of picture that is 1 block per 
> table with 1->many arrows connecting up the primary/foreign keys of different 
> tables.  SQL Power Architect looked promising, but fails to do the linkage 
> for some PK-FK relationships (buggy?) . Any suggestions on what might do this?

I use and like dbWrench; it’s written in Java so can run on Linux/Windows/Mac. 
It’s commercial but not expensive.

Kevin
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.


RE: Check constraint failure messages

2021-04-08 Thread Kevin Brannen
From: Miles Elam 
Sent: Tuesday, April 6, 2021 4:19 PM

Sadly, this is a cloud-managed database without direct access to 5432 from 
outside the VPC and bastion instances are frowned upon by our security folks. 
Guess I'm stuck with bisecting. Thanks for the confirmation.



It'd slow you down a little, but you could install a before trigger to do the 
check and if it fails you'll have all the info you need and can call RAISE to 
communicate a record identifier, bad value, whatever.
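
A rough sketch of that, with made-up table/column names (the trigger simply 
mirrors whatever the check constraint tests):

CREATE OR REPLACE FUNCTION check_amount() RETURNS trigger AS $$
BEGIN
    IF NEW.amount < 0 THEN   -- same condition as the failing check constraint
        RAISE EXCEPTION 'bad amount % in record id %', NEW.amount, NEW.id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER check_amount_trg
    BEFORE INSERT OR UPDATE ON my_table
    FOR EACH ROW EXECUTE PROCEDURE check_amount();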

If performance is really a big deal, set up the trigger only after you get a 
failure so you only have to run it on the 'bad batches'.

Or else write a precheck program that looks at the data first and looks only 
for this and therefore can report the exact record before it feeds the good 
data to your real loader.

Or there's probably other ways I'm not thinking of. The question is what 
options does your process allow, I suppose.

HTH,
Kevin
.
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.


RE: Ways to "serialize" result set for later use?

2021-04-12 Thread Kevin Brannen
From: Adam Brusselback 
Sent: Saturday, April 10, 2021 9:06 PM
To: pgsql-general 
Subject: Ways to "serialize" result set for later use?

Hey there everyone,

I am going through the process of writing my first pgtap tests for my database, 
and I wanted to get some feedback on if my solution seems fine, is just dumb, 
or could be accomplished much easier another way.

So my main problem I was trying to work around, was my tests are written in 
functions and called using runtests(), so using psql and \copy to save test 
data to the filesystem isn't really part of the workflow, but I still needed a 
way to have my "expected" query resultset passed into results_eq 
[https://pgtap.org/documentation.html#results_eq]
 easily within a function body.

I originally manually dumped some "known good" data from a query to csv, and 
built some SELECT ... FROM VALUES (...) statements by hand to do this. That 
obviously sucks.

…

So I really just wanted to see if there is a better way to go about what i'm 
trying to do, does Postgres already support something similar I can harness 
instead of this hack? Or is this really an alright way to go?

This seems more like an application question, but I'll throw something out for 
you to consider…

IMO, you're trying to put pgtap into an area it wasn't really made for. If you 
can make it do what you want, good for you, but I wouldn't try that. Pgtap is 
great for things like:

  *   Does my DB/Schema/Tables/Views/Functions/… exist, have the correct owner, 
etc.
  *   Does each table have the right columns, defaults, constraints, etc.
  *   Testing of simple functions is possible, but probably only really for 
"immutable" stuff, like I insert "x" I better always get "y" out of it (see 
the small sketch after this list).
  *   More DDL stuff like this that's important because you want to know if 
your "alter" script after an upgrade did the right thing.

Checking data (DML), i.e. whether functions are doing the right things, is 
something we do in our code unit tests.

Of course, TMTOWTDI, YMMV, etc. 😊

HTH,
Kevin
.
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.


RE: Ways to "serialize" result set for later use?

2021-04-13 Thread Kevin Brannen
From: Adam Brusselback 
Sent: Monday, April 12, 2021 12:51 PM

>  Checking data (DML), if functions are doing the right things is something we 
> do in our code unit tests.

This is exactly what I am writing, unit tests for my code (which is pl/pgsql). 
This is an ELT pipeline for my customers to bulk update their data in my 
system, with detailed error reporting for any issues per-row/column. The code 
is all plpgsql, as are the few tests i've written so far. pgTAP is my unit 
testing framework for this process.

So unit testing my company's (vast) database code is something I am just trying 
to figure out and get into my workflow, and it didn't feel like I had to fight 
too hard with it at this point, other than $subject$. And even that isn't an 
issue with my hacky function in place, it just feels a little...dirty I guess? 
Was just wanting a gut check if there seemed to be an obviously better way to 
get the same results.

If there is something built into Pg that does what you want, I'm not aware of 
it, but there are a number of extensions out there. You could check out that 
list to see.

Otherwise, I'm not sure we can help you much. If that approach seems to be 
working, continue on and see where it takes you.

Our unit tests are in Perl based on a module we created that inherits from 
Test::More. From there, we can exercise any code we need, many of which 
contains calls to DB functions, injecting known values for all the use cases we 
can think of, and making sure we get back the expected results. That also means 
we can COPY data in for testing if required. It's pretty simplistic and 
straight-forward in some ways, but it works for us.

HTH,
Kevin
.
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.


RE: [Extern] Re: Advice on binary installation

2021-04-13 Thread Kevin Brannen
-Original Message-
From: Zwettler Markus (OIZ) 
Sent: Tuesday, April 13, 2021 8:08 AM


> -Original Message-
> From: Paul Förster <paul.foers...@gmail.com>
> Sent: Tuesday, April 13, 2021 15:02
> Subject: [Extern] Re: Advice on binary installation
>
> Hi Markus,
>
> On 13. Apr, 2021, at 14:43, Zwettler Markus (OIZ)
> <markus.zwett...@zuerich.ch> wrote:
> > We assume to get more than 100 Postgres clusters in the future.
> >
> > We will get a very heterogeneous binary installation basis if we install
> > needed extensions (e.g. ip4r) or software (e.g. patroni) on a per project
> > basis.
> >
> > There could be even more incompatibility problems otherwise if we install
> > all globally needed extensions or software with every project to get a
> > homogenous binary installation.
> >
> > Software installation is done with yum using PGDG downstream channels.
> >
> > Any recommendations?
>
> I don't know how you plan to do it, but I can only say how we do it:
>
> - Compile from source including all extensions needed (make install-world).
>   Create a binary-only directory this way, i.e. compile source to
>   /data/postgres/xx.x
> - Create a tar file of said directory.
> - Distribute that via Ansible, untarring it on the destination servers.
> - Have a standard postgresql.conf ready which includes a
>   shared_preload_libraries = 'pg_stat_statements'
>   (or whatever you need) for use with initdb.
>
> This way, we make sure that all servers get the same new software directory
> as needed in a separate directory including all necessary extensions, which
> means we can then delete the old directory if it is no longer needed (i.e.
> after all databases have been upgraded). Also, this makes sure everything is
> loaded properly.
>
> In individual and only a few cases, we then use "create extension", but only
> extensions which we deliver with the tar via Ansible. If there is going to be
> a new extension (which we avoid if we can), then we put it into the tar
> archive and nowhere else. So it's on all servers, but only a few databases
> use it then.
>
> Hope this helps.
>
> Paul

Our IT Sec requires us to do software patching at least every 3 months.

How do you do software patching with your method? Creating + deploying new 
tarballs every 3 months?

Thanks, Markus





We used to compile our own to do something very similar until we got rid of the 
need for 1 extension with an external dependency that was holding us back. 
After that, we just use the community packages and install the extensions 
package too. That allows us to do "create extension X" to load the ones we need 
in our application.



When we need to upgrade, we just get the newer community packages and install 
them with the OS package manager ("rpm" in our case). As long as you stay 
within the same major version it's all easy. If you cross major versions then 
"pg_upgrade" is required (be sure to use the -k option).



With a little effort, all of that is scriptable so you can run it on any 
servers as required when it's time for that server to upgrade. Ansible, Puppet, 
Chef, or your own distribution software can be useful to get it there and put 
into service.



To us, the above is the easy part. The hard part is certifying that our 
software works with the newer version of Pg. It's rare we have to change 
things, but Pg has "tightened up" a little over time, showing us where we were 
a little "loose and carefree" in our SQL. That hasn't hit us in the last couple 
of releases, so hopefully we're beyond that and now we're left with "what new 
features look useful to take advantage of?". 😊



HTH,

Kevin

This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.


RE: Migrating From CentOS-6 Oracle 11g R2 To CentOS-7 PostgreSQL ??

2017-12-05 Thread Kevin Brannen
From: Eugene Poole [mailto:etpool...@comcast.net]
Sent: Tuesday, December 05, 2017 12:59 PM
To: pgsql-gene...@postgresql.org
Subject: Migrating From CentOS-6 Oracle 11g R2 To CentOS-7 PostgreSQL ??


I've asked most of my OS based questions on another mail list, but I think this 
is more specific:

On a physical machine I'm running CentOS-6 64-bit with Oracle 11gR2 and I want 
to migrate to another physical machine running CentOS-7 64-bit with PostgreSQL 
??

1. What version of PostgreSQL should I use?

The latest that works for you. Some people don’t like .0 releases for various 
reasons and if that applies to you, then use the latest of 9.6.x (9.6.5 I 
think); else use 10.0.

2. I don't want to use a RPM because I like controlling where software is 
installed, so where can I locate the required tar ball?

You can download the source and build it yourself; there’s an arg for where you 
want to install it. The instructions are pretty easy to follow if you’ve ever 
built anything.

3. Would I use ora2pg to do the move? Is ora2pg still maintained?

4. Is there a better conversion package?

5. What additional information might be helpful?

I find building & testing on a VM to be useful, as I can snapshot every step of 
the way and revert if required. I’ve never done an Oracle -> Pg conversion, but 
it’s been discussed on the list so you might want to search the archives for 
advice. In fact the last time wasn’t all that long ago, maybe a month?

HTH, Kevin

TIA

Gene

--

Eugene Poole

Woodstock, Georgia

This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.