Re: Postgresql 9.3 Server will not start after Ubuntu Upgrade

2018-03-27 Thread Moreno Andreo

On 27/03/2018 20:00, Ken Beck wrote:

And, looking for log files, I find none.

Have you looked in /var/log/postgresql/ ?





Re: Postgresql 9.3 Server will not start after Ubuntu Upgrade

2018-03-28 Thread Moreno Andreo

On 27/03/2018 22:57, Adrian Klaver wrote:

On 03/27/2018 01:46 PM, Ken Beck wrote:

I am working on two systems, one running in Oracle VirtualBox on my
laptop, the other in a DigitalOcean droplet. I know on one of them I
tried to remove the postgres-9.6, and it must have been my laptop, here
is the output from pg_lsclusters:

Ver Cluster Port Status Owner    Data directory               Log file
9.3 main    5432 down   postgres /var/lib/postgresql/9.3/main /var/log/postgresql/postgresql-9.3-main.log


What happens if you do?:

sudo pg_ctlcluster 9.3 main start

If possible, I'll take a look at
/var/log/postgresql/postgresql-9.3-main.log.

That would help a lot in uncovering what's going on.

Cheers

Moreno.-




Re: Postgresql 9.3 Server will not start after Ubuntu Upgrade

2018-03-28 Thread Moreno Andreo

On 28/03/2018 18:28, Adrian Klaver wrote:

On 03/28/2018 09:24 AM, Moreno Andreo wrote:

On 27/03/2018 22:57, Adrian Klaver wrote:

On 03/27/2018 01:46 PM, Ken Beck wrote:

I am working on two systems, one running in Oracle VirtualBox on my
laptop, the other in a DigitalOcean droplet. I know on one of them I
tried to remove the postgres-9.6, and it must have been my laptop, 
here

is the output from pg_lsclusters:

Ver Cluster Port Status Owner    Data directory               Log file
9.3 main    5432 down   postgres /var/lib/postgresql/9.3/main /var/log/postgresql/postgresql-9.3-main.log


What happens if you do?:

sudo pg_ctlcluster 9.3 main start

If possible, I'll take a look at
/var/log/postgresql/postgresql-9.3-main.log.

That would help a lot in uncovering what's going on.


See this post:
https://www.postgresql.org/message-id/5bad1f01-d06a-1526-96d2-c9cfd6062a11%40gmail.com 

Yes, I saw it, but I can't tell whether that log was from the 9.3 or the 9.6 cluster... 
as you can see above, he wrote that he had a 9.6 cluster, which he tried to 
remove, and a 9.3 cluster, which was not starting. So it's not clear to me 
which version the log he sent came from...






Cheers

Moreno.-












Schema-based replication

2018-03-30 Thread Moreno Andreo

Hi folks:-)

I'm about to design and develop a cross-platform schema-based replicator 
as a module for an app scheduled for alpha around next mid-fall/winter.
Before digging into it I need to know whether I'm reinventing the wheel, 
but AFAIK pglogical can't give this kind of granularity, especially 
in a multi-master environment.


Each user should have their data (one database with one or more schemas, not 
necessarily all of which have to be in sync) synced across more than one 
device and with our server, where there is one database for everyone and one or 
more schemas per user.


Is there anything similar in PG core or among extensions, even still in 
progress or in beta?
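
(For reference, the closest thing I've looked at so far is a pglogical replication set scoped to one schema on a one-direction setup — a rough sketch below, with names invented and signatures to be double-checked against the pglogical docs; it still doesn't cover the multi-master part.)

-- on the provider: one replication set per user schema
SELECT pglogical.create_replication_set('user_a_set');
SELECT pglogical.replication_set_add_all_tables('user_a_set',
                                                ARRAY['user_a_schema'],
                                                true);  -- synchronize existing data

-- on one of that user's devices: subscribe only to that set
SELECT pglogical.create_subscription(
    subscription_name := 'user_a_sub',
    provider_dsn      := 'host=ourserver dbname=alldata user=repl',
    replication_sets  := ARRAY['user_a_set']
);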


Thanks
Moreno.






Re: Strange error in Windows 10 Pro

2018-04-23 Thread Moreno Andreo

On 21/04/2018 22:35, Adrian Klaver wrote:

On 04/21/2018 01:08 PM, Dale Seaburg wrote:
Thanks Adrian for the suggestion of running the installer with Admin 
rights.  Unfortunately, I get the same results.  It appears that all 
of the folders within C:\Program Files\PostgreSQL\9.6 path are 
created, and populated, BUT, when the items in the *data* folder are 
to be created, or copied into, it leaves an error message as noted 
previously. The *data* folder is empty.


It's almost as if the PC is missing a critical .dll needed in the 
*data* folder filling function (my guess).


Again, I am at a loss as to what to do.


Have you looked at the system logs e.g. Event Viewer?


If you look in %temp%, there should be one or more files named 
"bitrock_installer" or something similar (search for "bitrock"); that's 
the setup log, showing what went well and what did not.
Another hint I've had since 9.1 times (it was 2012, I guess) is to avoid 
installing Postgres under system folders (C:\Program Files, C:\Users, 
and so on), because in some cases there can be some nasty behavior... 
try installing in something like C:\PG96.


HTH
Cheers,
Moreno.-




Re: Strange error in Windows 10 Pro

2018-04-24 Thread Moreno Andreo

  
  
On 24/04/2018 04:09, Dale Seaburg wrote:


  
Thanks to Moreno and Igor for the Event Viewer suggestions.
Here are a few lines of the log file where they differ between a good
install and a bad, incomplete install. The good install was from my
Shop PC with Windows 10 Pro.
BTW, I had success with another Dell PC with Win 10 Pro about 4
months ago. This problem appears to be a one-off.
  Good Install:
  Called AclCheck(C:\Program Files\PostgreSQL\9.6\data)
  Called IsVistaOrNewer()...
      'winmgmts' object initialized...
      Version:10.
      MajorVersion:10
  Executing icacls to ensure the SHOP-PC\Dale account can
read the path C:\Program Files\PostgreSQL\9.6\data
      Executing batch file 'rad0510A.bat'...
      processed file: C:\Program Files\PostgreSQL\9.6\data
  Successfully processed 1 files; Failed processing 0 files
  
  Bad Install:
  Called AclCheck(D:\PostgreSQL\9.6\data)
  Called IsVistaOrNewer()...
      'winmgmts' object initialized...
      Version:10.
      MajorVersion:10
  Executing icacls to ensure the WINDOWS-6BEGVO1\don king
account can read the path D:\PostgreSQL\9.6\data
      Executing batch file 'rad6DBC7.bat'...
  

It would be interesting to see the next 2 lines of the "Bad install" log,
as in the "Good install" one, to know whether the rad6DBC7.bat execution
terminated with or without errors; if there is an error, it
would be nice to see it.
I assume D: is a secondary local drive, or at least a local
partition, not a network or external one...

Thanks
Moreno.-
  





pgdg-keyring (or apt-key) failure on fresh 9.6 install

2018-05-28 Thread Moreno Andreo

Hi folks,
I'm trying to install PostgreSQL 9.6 on a test machine in Google Cloud 
Platform.
After a fresh Debian 9 install (just after the instance has been 
created) I followed the steps from here:


https://wiki.postgresql.org/wiki/Apt

(instead of pg 10 I install pg 9.6)

During the installation process I encountered the following strange 
warnings which, even though this is a test machine, make me think twice 
before going ahead.


[...]
Processing triggers for man-db (2.7.6.1-2) ...
Setting up pgdg-keyring (2017.3) ...
Removing apt.postgresql.org key from trusted.gpg: Warning: The postinst 
maintainerscript of the package pgdg-keyring
Warning: seems to use apt-key (provided by apt) without depending on 
gnupg or gnupg2.
Warning: This will BREAK in the future and should be fixed by the 
package maintainer(s).
Note: Check first if apt-key functionality is needed at all - it 
probably isn't!

OK
Setting up xml-core (0.17) ...
[...]

I have to say that the installation completes successfully and the database 
server comes up, apparently with no problems at all.


Here's versions:

postgres=# select version();
                                                              version
-----------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.6.9 on x86_64-pc-linux-gnu (Debian 9.6.9-2.pgdg90+1), compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
(1 row)


Has anyone stumbled on this? Googling around, I saw a post saying it's about 
a missing dirmngr package; I tried that, but to no avail.


Do I need to worry?

Thanks

Moreno.-







Re: pgdg-keyring (or apt-key) failure on fresh 9.6 install

2018-05-29 Thread Moreno Andreo

Hi Tim,

On 29/05/2018 00:06, Tim Cross wrote:

Moreno Andreo  writes:


Hi folks,
I'm trying to install Postgresql 9.6 on a test machine in Google Cloud
Platform
After a fresh install with Debian 9 (just after the instance has been
created) I follow steps from here

https://wiki.postgresql.org/wiki/Apt

(instead of pg 10 I install pg 9.6)

During the installation process i encounter the following strange
warnings that, even if that's a test machine, make me think twice before
going ahead.

[...]
Processing triggers for man-db (2.7.6.1-2) ...
Setting up pgdg-keyring (2017.3) ...
Removing apt.postgresql.org key from trusted.gpg: Warning: The postinst
maintainerscript of the package pgdg-keyring
Warning: seems to use apt-key (provided by apt) without depending on
gnupg or gnupg2.
Warning: This will BREAK in the future and should be fixed by the
package maintainer(s).
Note: Check first if apt-key functionality is needed at all - it
probably isn't!
OK
Setting up xml-core (0.17) ...
[...]

I have to say that the installation completes successfully and the database
server comes up, apparently with no problems at all.


This looks like a warning for the package maintainers regarding ensuring
the package depends on either gnupg or gnupg2 and nothing you need to
worry about unless you are building/maintaining deb packages for postgres.
Brilliant. That's what I needed to know. Just to avoid bitter surprises 
in the future... :-)


The Debian package manager, apt, uses gpg keys to verify the
authenticity of packages it downloads. My guess is that previously, you
only needed to ensure the package had a dependency on apt-key and now
apt has/is changing such that you need to have an explicit dependency on
either gnupg or gnupg2.

... so if I update/upgrade this instance in the future it will be 
automatically fixed (and there shouldn't be issues), right?

Thanks a lot!

Moreno.-




Re: SQL problem (forgot to change header with earlier post!).

2018-05-29 Thread Moreno Andreo

On 29/05/2018 13:14, Paul Linehan wrote:

Hi all,

I have a problem that I just can't seem to solve:

I want to divide the count of one table by the count of another -
seems simple enough!
I created simple VIEWs with counts of the tables, but I just can't
grasp the logic!

If it's not an exercise, I don't think you need them.

DDL and DML (simplified) at the bottom of post.


I tried various combinations of things like basic SELECTs.


SELECT avg FROM ((SELECT cnt1 FROM v1)/(SELECT cnt2 FROM v2));


Maybe I didn't catch the problem, but

select (select count(*) from t1) / (select count(*) from t2)::float

should be a starting point (if you need an integer as a return value, 
simply remove the ::float at the end).
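
For example, a quick sketch with throwaway tables just to show the shape (names are made up):

create temp table t1 (id int);
create temp table t2 (id int);
insert into t1 select generate_series(1, 3);
insert into t2 select generate_series(1, 4);

-- 3 / 4 as float -> 0.75; without ::float it would be integer division -> 0
select (select count(*) from t1) / (select count(*) from t2)::float as ratio;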


HTH
Moreno.-




Re: pgdg-keyring (or apt-key) failure on fresh 9.6 install

2018-05-30 Thread Moreno Andreo

On 30/05/2018 00:25, Tim Cross wrote:


Personally, I tend to prefer using the packages which come with the
particular flavour of Linux you're installing as they are often more
in-line with the current version of the package management system being
used. I only grab packages from the specific Postgres repo if the
package is not in the current version of the distribution I'm
installing.


You mean "don't add pgdg repos if not necessary"?





catalog is missing n attribute(s) for relid xxxx at character yy

2018-06-14 Thread Moreno Andreo

Hi,
PostgreSQL 9.1 x86 on Windows 10 (EOL, but in this case it doesn't apply 
:-) )


When querying a table, I receive the error reported in the subject:
catalog is missing 5 attribute(s) for relid 33238 at character 15

So I decided to drop and recreate the table:
DROP TABLE tbl;
but I get the same error.

Is there anything else I can do to avoid dropping and recreating the whole 
database?
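
(In case it helps to look before dropping anything — a hedged diagnostic sketch, assuming the relid in the message is the table's pg_class OID:)

-- how many attribute rows the catalog actually holds vs. how many it expects
select c.relname,
       c.relnatts       as expected_columns,
       count(a.attnum)  as attribute_rows_present
from pg_class c
left join pg_attribute a on a.attrelid = c.oid and a.attnum > 0
where c.oid = 33238
group by c.relname, c.relnatts;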


Thanks
Moreno.-




pgp_sym_decrypt() - error 39000: wrong key or corrupt data

2018-06-21 Thread Moreno Andreo

Hi,
    while playing with pgcrypto I ran into a strange issue (postgresql 
9.5.3 x86 on Windows 7)


Having a table with a field
dateofbirth text

I made the following sequence of SQL commands
update tbl_p set dateofbirth = pgp_sym_encrypt('2018-06-21', 'AES_KEY') 
where codguid = '0001-0001-0001-0001-0001';

OK

select pgp_sym_decrypt(dateofbirth::bytea, 'AES_KEY') as datanasc from 
tbl_p where codguid = '0001-0001-0001-0001-0001'

'2018-06-21'

select * from tab_paz where pgp_sym_decrypt(natoil::bytea, 'AES_KEY') = 
'2018-06-21'

ERROR:  Wrong key or corrupt data
** Error **

ERROR: Wrong key or corrupt data
SQL state: 39000

Can't find reference anywhere...
Any help would be appreciated.
Thanks,
Moreno.-




Re: pgp_sym_decrypt() - error 39000: wrong key or corrupt data

2018-06-22 Thread Moreno Andreo

On 21/06/2018 23:31, Adrian Klaver wrote:

On 06/21/2018 08:36 AM, Moreno Andreo wrote:

Hi,
 while playing with pgcrypto I ran into a strange issue 
(postgresql 9.5.3 x86 on Windows 7)


Having a table with a field
dateofbirth text

I made the following sequence of SQL commands
update tbl_p set dateofbirth = pgp_sym_encrypt('2018-06-21', 
'AES_KEY') where codguid = '0001-0001-0001-0001-0001';

OK

select pgp_sym_decrypt(dateofbirth::bytea, 'AES_KEY') as datanasc 
from tbl_p where codguid = '0001-0001-0001-0001-0001'

'2018-06-21'

select * from tab_paz where pgp_sym_decrypt(natoil::bytea, 'AES_KEY') 
= '2018-06-21'


You switched gears above.

What is the data type of the natoil field in table tab_paz?

Sorry, just a typo... natoil is actually dateofbirth, so it's text.
You can read it as
select * from tbl_p where pgp_sym_decrypt(dateofbirth::bytea, 'AES_KEY') 
= '2018-06-21'


Was the data encrypted in it using the 'AES_KEY'?

Yes, the command sequence is exactly as reported above.
If I use pgp_sym_decrypt in the SELECT list it's OK, but if it's in a 
WHERE clause it doesn't seem to work.






Re: pgp_sym_decrypt() - error 39000: wrong key or corrupt data

2018-06-22 Thread Moreno Andreo

On 22/06/2018 15:18, Adrian Klaver wrote:

On 06/22/2018 01:46 AM, Moreno Andreo wrote:

On 21/06/2018 23:31, Adrian Klaver wrote:

On 06/21/2018 08:36 AM, Moreno Andreo wrote:

Hi,
 while playing with pgcrypto I ran into a strange issue 
(postgresql 9.5.3 x86 on Windows 7)


Having a table with a field
dateofbirth text

I made the following sequence of SQL commands
update tbl_p set dateofbirth = pgp_sym_encrypt('2018-06-21', 
'AES_KEY') where codguid = '0001-0001-0001-0001-0001';

OK

select pgp_sym_decrypt(dateofbirth::bytea, 'AES_KEY') as datanasc 
from tbl_p where codguid = '0001-0001-0001-0001-0001'

'2018-06-21'

select * from tab_paz where pgp_sym_decrypt(natoil::bytea, 
'AES_KEY') = '2018-06-21'


You switched gears above.

What is the data type of the natoil field in table tab_paz?

Sorry, just a typo... natoil is, actually dateofbirth, so it's text.
You can read it as
select * from tbl_p where pgp_sym_decrypt(dateofbirth::bytea, 
'AES_KEY') = '2018-06-21'


Was the data encrypted in it using the 'AES_KEY'?

Yes, the command sequence is exactly reported above.
If I use pgp_sym_decrypt in a SELECT statement it's OK, but if it's 
in a where clause it seems not to be working.


Are you sure that the entries were not encrypted with a different key? 
Because I can't replicate it. (More comments below):

(other replies below, inline)
I'm almost sure (you're never absolutely sure :-) ), since I kept all the 
commands I entered in the pgAdmin III SQL window, and they're reported above.
On the other hand, I tried the same procedure on another field and it 
succeeded.


The only difference between the two fields, and I don't know if it makes 
any sense, is that the field I tried now, which succeeded, was created 
as text, while the other field (dateofbirth) was a timestamp I ALTERed 
with the statement
alter table tbl_p alter column dateofbirth type text using 
to_char(dateofbirth, 'YYYY-MM-DD');


I'm just afraid it could happen in production.



create table pgp_test(id integer, fld_1 varchar);

insert into pgp_test values (1, pgp_sym_encrypt('2018-06-21', 'AES_KEY'))

select * from pgp_test ;

 id |                                                                     fld_1
----+------------------------------------------------------------------------------------------------------------------------------------------------------------
  1 | \xc30d04070302444e9b2792436e3d7ed23b01cf097f0a6a36298bab63ae4f22f39de54a9b0d8f905d48198ce76089de5f21669c46d96439718b6b0408c541427b6e7c11008bd3d0ebdae0dceb

select * from pgp_test where pgp_sym_decrypt(fld_1::bytea, 'AES_KEY') = '2018-06-21';

 id |                                                                     fld_1
----+------------------------------------------------------------------------------------------------------------------------------------------------------------
  1 | \xc30d04070302444e9b2792436e3d7ed23b01cf097f0a6a36298bab63ae4f22f39de54a9b0d8f905d48198ce76089de5f21669c46d96439718b6b0408c541427b6e7c11008bd3d0ebdae0dceb



Have you looked at the entry in its encrypted state to see if it looks 
the same as pgp_sym_encrypt('2018-06-21', 'AES_KEY')?

Yes, it seems to have the same value





Re: pgp_sym_decrypt() - error 39000: wrong key or corrupt data

2018-06-27 Thread Moreno Andreo

On 22/06/2018 19:56, Adrian Klaver wrote:

On 06/22/2018 09:50 AM, Moreno Andreo wrote:

On 22/06/2018 15:18, Adrian Klaver wrote:


Are you sure that the entries were not encrypted with a different 
key? Because I can't replicate it. (More comments below):

(other replies below, inline)
I'm almost sure (you're never absolutely sure :-) ), since I kept all 
commands I entered in PgAdminIII SQL Window, and they're reported above.
On the other side, I tried the same procedure on another field and it 
succeeded.


The only difference between the 2 fields, and I don't know if it can 
make any sense, is that the field I tried now and succeeded was 
created as text, while the other field (dateofbirth) was a timestamp 
I ALTERed with the statement
alter table tbl_p alter column dateofbirth type text using 
to_char(dateofbirth, 'YYYY-MM-DD');


Assuming the ALTER TABLE was done and then the values were encrypted, 
that does not seem to affect anything here (More below):


test=# create table pgp_alter_test(id integer, birthdate date);
CREATE TABLE
test=# \d pgp_alter_test
         Table "public.pgp_alter_test"
  Column   |  Type   | Collation | Nullable | Default
-----------+---------+-----------+----------+---------
 id        | integer |           |          |
 birthdate | date    |           |          |

test=# insert into pgp_alter_test values (1, '2018-06-21');
INSERT 0 1
test=# select * from pgp_alter_test ;
 id | birthdate
----+------------
  1 | 2018-06-21
(1 row)

test=# alter table pgp_alter_test alter column birthdate type text using to_char(birthdate, 'YYYY-MM-DD');
ALTER TABLE

test=# \d pgp_alter_test
         Table "public.pgp_alter_test"
  Column   |  Type   | Collation | Nullable | Default
-----------+---------+-----------+----------+---------
 id        | integer |           |          |
 birthdate | text    |           |          |

test=# select * from pgp_alter_test ;
 id | birthdate
----+------------
  1 | 2018-06-21
(1 row)

test=# update pgp_alter_test set birthdate = pgp_sym_encrypt(birthdate, 'AES_KEY') where id = 1;
UPDATE 1

test=# select * from pgp_alter_test ;
 id |                                                                  birthdate
----+------------------------------------------------------------------------------------------------------------------------------------------------------------
  1 | \xc30d04070302b3f55c80f9ab657c68d23b010cd9d4d82631f89c786394a8bceb35529db07c708e5a0c4f04cf91aed24b5ff397dd99c678ec4f5bb769e148cfae3cdfc5453daaeb34ddd2737c
(1 row)

test=# select * from pgp_alter_test where pgp_sym_decrypt(birthdate::bytea, 'AES_KEY') = '2018-06-21';
 id |                                                                  birthdate
----+------------------------------------------------------------------------------------------------------------------------------------------------------------
  1 | \xc30d04070302b3f55c80f9ab657c68d23b010cd9d4d82631f89c786394a8bceb35529db07c708e5a0c4f04cf91aed24b5ff397dd99c678ec4f5bb769e148cfae3cdfc5453daaeb34ddd2737c
(1 row)

I am at a loss now. The only thing I can think of is that data itself 
is actually corrupted. Maybe some sort of language encoding/collation 
issue. Just not sure how to test that at the moment.


Actually, I tried it in a bunch of other fields with varying data types 
and everything went fine.
I don't know whether, as you say, I mismatched keys (and need another 
pair of glasses) or it was something else. I'm just hoping (though fairly 
confident) it won't happen again.


Now I'm trying to speed up some queries that SELECT among these encrypted 
fields; if I get stuck I'll open a new thread.
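
(Side note, in case it helps someone else: a hedged PL/pgSQL sketch I put together to spot rows that refuse to decrypt, using the table and column names from the example above, so a bad row can be located without the whole query failing.)

do $$
declare
    r record;
begin
    for r in select codguid, dateofbirth from tbl_p loop
        begin
            -- try to decrypt; we only care about errors, not the result
            perform pgp_sym_decrypt(r.dateofbirth::bytea, 'AES_KEY');
        exception when sqlstate '39000' then
            raise notice 'row % cannot be decrypted with this key', r.codguid;
        end;
    end loop;
end
$$;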


Thanks,
Moreno.-






I'm just afraid it can happen in production



create table pgp_test(id integer, fld_1 varchar);

insert  into pgp_test values (1, pgp_sym_encrypt('2018-06-21', 
'AES_KEY'))




Have you looked at the entry in its encrypted state to see if it 
looks the same as pgp_sym_encrypt('2018-06-21', 'AES_KEY')?

Yes, it seems to have the same value


So













Re: Not able to update some rows in a table

2018-07-02 Thread Moreno Andreo

On 02/07/2018 16:51, Marco Fochesato wrote:

Dear all,
I have a table with 21 columns.
Primary key is done with 20 of these colums.

I have 3.313 records.
I don't know why, but I'm not able to update some of these records.
I don't understand, it seems that I'm not able to fetch.
Could you post some of the statements you execute? If you are unable to 
SELECT some records, it follows that you can't UPDATE them either.


It happens with Pgadmin, with Python Psycopg2 and also through 
Libreoffice Base.

OS? PG Version?



Looking to the properties of the table in Pgadmin, I can see only the 
estimated number

of rows. Counted rows are 'not counted'.
It seems that autovacuum is not active, or has never run (but I'm quite 
sure it's not related to your problem, so it can be investigated later).


Cheers

Moreno.-





Role problem in Windows

2018-07-06 Thread Moreno Andreo

Hi,
Running 9.1 on Windows 10, upgrading to 10 with pg_upgrade.

"Once upon a time" there was a bug in our automatic role creation 
procedure that did not mask vowels with accent (used in Italian 
language), like "ò, è" and the result was a role with an empty name.
We are now upgrading to 10, and pg_dumpall exits complaining with this 
role, showing its name (with mis-encoded UTF-8 accented vowel) as an 
invalid utf-8 character.


Trying to get rid of the role, which can't be deleted with DROP ROLE 
because of the empty name, I did

delete from pg_authid where oid = 

The role disappeared from the role list.

At the next run of pg_upgrade it complains that role "" 
does not exist while dumping a trigger function. I tried to remove the 
privilege from the function ACL, but got "role n does not exist".


Is there a way to recreate the deleted role, even just as a dummy, so I can 
finish the upgrade?

Is there another way to bypass the problem?

Any help would be appreciated.

Cheers,
Moreno.-




Re: Role problem in Windows

2018-07-06 Thread Moreno Andreo

  
  
On 06/07/2018 17:34, Melvin Davidson wrote:


  

On Fri, Jul 6, 2018 at 10:01 AM, Moreno Andreo <moreno.and...@evolu-s.it> wrote:
Hi,
  Running 9.1 on Windows 10, upgrading to 10 with
  pg_upgrade.
  
  "Once upon a time" there was a bug in our automatic role
  creation procedure that did not mask vowels with accent
  (used in Italian language), like "ò, è" and the result was
  a role with an empty name.
  We are now upgrading to 10, and pg_dumpall exits
  complaining with this role, showing its name (with
  mis-encoded UTF-8 accented vowel) as an invalid utf-8
  character.
  
  Trying to get rid of the role, that can't be deleted with
  a drop role because of the empty name, I did
  delete from pg_authid where oid = 
  
  Role disappeared from role list.
  
  At the next execution of the pg_upgrade it complains that
  role "" does not exist while dumping a trigger
  function. I tried remove the privilege from function ACL,
  but "role n does not exists".
  
  Is there a way to recreate the deleted role, either as a
  dummy, so I can finish upgrade?
  Is there another way to bypass the problem?
  
  Any help would be appreciated.
  
  Cheers,
  Moreno.-
  
  

  
  
>Is there a way to recreate the deleted role, either as a dummy, so I can finish upgrade?

I can't really suggest how to recreate the dummy role, but I do have an alternate solution.
Most probably pg_dump is complaining that role 'xxx' owns some tables.
  

IIRC the complaint was about "role  does not exist".
In the meantime I was able to pg_dump the databases individually (5 in total,
one of them complaining about the missing role but dumped with all its
data in place) and, to my surprise (since I was convinced that pg_dump
worked inside a single transaction), I found all roles (all but the
"failing" one) in their place on the new server.
So, lesson learned: don't mess with the system catalogs before RTFM
:-))


  
So you can use the attached script and add 'AND a.rolname = 'xxx' to the WHERE clause.
Then as a superuser you can use ALTER TABLE xyz OWNER TO new_owner for each table found.

  

I'll keep it, so if something similar happens again it may come in handy.

Thanks for your time
Moreno.-
  





New tablespace: just an advice

2018-10-15 Thread Moreno Andreo

Hi everyone!

Space on my Debian 8 DB server is running a bit low (10% left of a 
2TB disk), so, since it's not possible to have a primary MBR disk with a 
size > 2 TB, I decided to create another disk and map it on the server, 
creating another tablespace on it and moving databases across disks to 
balance disk usage.


After creating a test server from last night's production server image, I 
created a folder and assigned ownership to the postgres user:

mkdir /newdisk/tbsp_new
chown -R postgres /newdisk/tbsp_new

then created the new tablespace
create tablespace tb2 location '/newdisk/tbsp_new';

and moved a database
alter database db set tablespace tb2;

As you can see, a flat, basic tablespace with no customizations.
I just ran a VACUUM FULL on the cluster before creating the tablespace.
After the ALTER DATABASE command ran successfully, I checked disk space 
(df -h) and saw more free space on the primary disk, and the same amount 
now occupied on the new disk. Just what I needed.
I psql'd into the cluster with the user's credentials, connecting to the 
database just migrated, and was able to SELECT and UPDATE (I did not try 
INSERTing).


Now, 2 questions.
1. Is that all, or do I need to adjust something else about permissions, 
indexes, vacuuming, etc.?
ALTERing the database tablespace means copying its physical files to the new 
directory, but does it apply to all objects (indexes, triggers, etc.)?


2. What will happen to anyone who tries to access the database while it's being 
moved from one tablespace to another?


Thanks in advance,
Moreno.




Re: New tablespace: just an advice

2018-10-16 Thread Moreno Andreo

On 16/10/2018 10:18, Laurenz Albe wrote:

Moreno Andreo wrote:

Now, 2 questions.
1. Is it all or do I need to adjust something else about permissions,
indexes, vacuuming, etc...?
ALTERing the database namespace means copying its physical files to new
directory, but is it applied to all objects (indexes, triggers, etc)?

All objects that are located in the default tablespace of the database
will be moved.


2. What will happen to who tries to access the database while it's being
moved from one tablespace to another?

They will be "hanging" with a lock until the transaction is done.

Yours,
Laurenz Albe


Perfect!
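
(For the archives, a couple of hedged catalog queries I used to double-check where things ended up after the move:)

-- which tablespace the migrated database now defaults to
select d.datname, t.spcname
from pg_database d
join pg_tablespace t on t.oid = d.dattablespace
where d.datname = 'db';

-- any objects in the current database still pinned to a non-default tablespace
select c.relname, c.relkind, t.spcname
from pg_class c
join pg_tablespace t on t.oid = c.reltablespace
where c.reltablespace <> 0;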

Thanks for your time,

Moreno





Updating 3-table dataset

2019-08-09 Thread Moreno Andreo

Hi all,
    I don't know if it's the heat frying my brain, but I can't find a 
solution to what seemed like a simple operation to me.


I have 3 tables
create table t_all
{
id uuid,
ref_id uuid (FK to t_ana.id)
};
create table t_ana
{
id uuid,
code text
};
create table t_app
{
id uuid,
code text    (subset of t_ana.code)
}
I need to update t_all, setting t_all.id = t_app.id, where t_ana.code is in 
t_app.code (I wrote it in a kind of meta-SQL, but I hope it's clear).
I tried to create a view, but I would need an INSTEAD OF trigger since it 
spans 3 tables, so I hope there's a faster path to the solution.


Thanks
Moreno







Re: Updating 3-table dataset

2019-08-09 Thread Moreno Andreo

On 09/08/19 16:50, Luca Ferrari wrote:

On Fri, Aug 9, 2019 at 2:29 PM Moreno Andreo  wrote:

Hi all,
  I don't know if that's the heat burning my brain but I can't find a
solution to what seemed a simple operation to me.

I have 3 tables
create table t_all
{
id uuid,
ref_id uuid (FK to t_ana.id)
};
create table t_ana
{
id uuid,
code text
};
create table t_app
{
id uuid,
code text(subset of t_ana.code)
}
I need to update t_all set t_all.id = t_app.id having t_ana.code in
t_app.code (I wrote it in some kind of meta-sql but I hope it's clear)
I tried to create a view but I need an INSTEAD OF trigger, since it
spreads among 3 tables so I hope there's some faster path to achieve the
solution


Not sure I got what you need, and I've not tested, but something like
the following:

WITH must_update AS (
SELECT app.id AS app_id, ana.id AS ana_id
FROM t_app app, t_ana ana
WHERE app.code = ana.code
)

UPDATE t_all
SET id = ( SELECT app_id FROM must_update WHERE ref_id = must_update.ana_id );

I've written the CTE because it is a little clearer in my mind, but
you can push down as a subquery of course.


Thanks Luca,

that's the idea I needed... now some small trimming and I think I'll be 
there


(just for the logs... the UPDATE statement needs a WHERE clause, 
otherwise it will NULL the id field in all rows where ref_id is not 
present in must_update :-) )
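
Something like this is what I ended up with (a hedged sketch, assuming ref_id is the column that has to match):

with must_update as (
    select app.id as app_id, ana.id as ana_id
    from t_app app
    join t_ana ana on ana.code = app.code
)
update t_all
set id = mu.app_id
from must_update mu
where t_all.ref_id = mu.ana_id;   -- only rows with a match get touched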


I owe you a beer :-)

Cheers

Moreno.



Luca








How to conditionally change the 2nd part of a full join

2019-08-14 Thread Moreno Andreo
I need to obtain a single record from 2 records in the same table, 
grouping by a key, say

id  value  value2
1   2      5
1   2      7

the result would be
1   2   5   7
and that works fine with a full join:
SELECT * FROM
(SELECT idp, data, i1, i2, m1, m2 from tblwk WHERE recordkey = 4) s1
FULL JOIN
(SELECT idp, data, i1, m1, m2 from tblwk WHERE recordkey = 10) s2
USING (data, idp)

Now, to get another dataset, I need the second subselect to change based 
on a value acquired from the first one. I tried with a CASE:


SELECT * FROM (
    SELECT idp, data, idd, rif1, rif2, t1, t2, t3, t5, t7, t9, t10, i1, i2, i3, dg from tblwk WHERE recordkey = 1) s1

    FULL JOIN
       case
       when i1=1 then (SELECT idp, data, desc, rif1, rif3, t1, t2, t5 from tblwk WHERE recordkey = 2) s2
       when i1=2 then (SELECT idp, data, desc, rif1, t4, t5, i2 from tblwk WHERE recordkey = 3 order by i2) s2
       when i1=3 then (SELECT idp, data, desc, rif1, t2, t5, t6, i2 from tblwk WHERE recordkey = 4 order by i2) s2
       when i1=4 then (SELECT idp, data, desc, i2 from tblwk WHERE recordkey = 9) s2
       end

       USING (data, idp)
but it doesn't like "case" after a FULL JOIN.
I read a bit of docs and discovered LATERAL, but AFAIK it's useless here.
Is it necessary to write a function (which would be my last resort, and not 
just because I'm not very good at writing functions), or is there some SQL 
syntax that can help?

Thanks
Moreno.-





Rebuild pg_toast from scratch?

2019-09-05 Thread Moreno Andreo

Hi,
    I have an issue with a Windows 10 PC with Postgres 9.1 x86.
Running some SELECTs we got "ERROR: could not open file 
"base/48121/784576": No such file or directory"


I then ran

select
n.nspname AS tableschema,
    c.relname AS tablename
from pg_class c
inner join pg_namespace n on (c.relnamespace = n.oid)
where c.relfilenode = 784576;

and discovered that the involved table is pg_toast_49713.

Now I need to recreate an empty copy of this table (data can be 
recovered later), how can I achieve this?


Thanks in advance
Moreno.-







Re: Rebuild pg_toast from scratch?

2019-09-05 Thread Moreno Andreo

Hi Tom and thanks for your time,

On 05/09/19 15:53, Tom Lane wrote:

Moreno Andreo  writes:

      I have an issue with a Windows 10 PC with Postgres 9.1 x86.

You realize 9.1 is long out of support ...

Absolutely :-)
I'm about to migrate it to 11

Now I need to recreate an empty copy of this table (data can be
recovered later), how can I achieve this?
TRUNCATE ought to be enough.

truncate pg_toast_49713;

ERROR: relation "pg_toast_49713" does not exist

:-\
... or do I need to truncate the "master" table (the one the blobs are in)?
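
(Note for the archives: the relation wasn't found simply because TOAST tables live in the pg_toast schema, so the name has to be schema-qualified. A hedged sketch to find the owning table — truncating that table also empties its TOAST table:)

-- which user table owns this TOAST table?
select n.nspname, c.relname
from pg_class c
join pg_namespace n on n.oid = c.relnamespace
where c.reltoastrelid = 'pg_toast.pg_toast_49713'::regclass;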

Thanks





Pg11 -- MultiXactId xxxx has not been created yet -- apparent wraparound

2019-10-04 Thread Moreno Andreo

  
  
Hi all,
    I'm encountering this issue on a Windows 10 / PostgreSQL 11.5 installation.


I followed the thread @
https://postgrespro.com/list/thread-id/2380690

but examining heap pages is far beyond my knowledge, so if any of
the gurus could spend some time on it, I would be very grateful.

So, here are the facts:

database0=# select datminmxid from pg_database where datname = current_database();
 datminmxid
------------
        365
(1 row)


The CTID of the tuple BEFORE the failing one is 3159,51:

database0=# select ctid from tablename offset 368 limit 1;
ERROR:  MultiXactId 12800 has not been created yet -- apparent wraparound
database0=# select ctid from tablename offset 367 limit 1;
   ctid
-----------
 (3159,51)
(1 row)

Now, I started issuing the queries
select * from heap_page_items(get_raw_page('tablename',3159));
select * from heap_page_items(get_raw_page('tablename',3160));

and so on for about 5 or 6 pages.
What do I need to look for?
Or is it better to \copy them out and send them as an attachment?
How many pages do I need to examine?
Until I see the error again?

Thanks in advance
Moreno.-

  






Re: Pg11 -- MultiXactId xxxx has not been created yet -- apparent wraparound

2019-10-04 Thread Moreno Andreo

On 04/10/19 17:30, Alvaro Herrera wrote:

On 2019-Oct-04, Moreno Andreo wrote:


select * from heap_page_items(get_raw_page('tablename',3159));
select * from heap_page_items(get_raw_page('tablename',3160));

and so on for about 5 or 6 pages.

Please paste the output of that for pages 3159 and 3160, as well as the
output of pg_controldata.


Thanks Alvaro,
    you can find attached the data you requested


3159.csv
Description: MS-Excel spreadsheet


3160.csv
Description: MS-Excel spreadsheet
pg_control version number:            1100
Catalog version number:               201809051
Database system identifier:           6742478776920498196
Database cluster state:               in production
pg_control last modified:             04/10/2019 15.57.47
Latest checkpoint location:           3/5768A440
Latest checkpoint's REDO location:    3/5768A408
Latest checkpoint's REDO WAL file:    000100030057
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:2405812
Latest checkpoint's NextOID:          859841
Latest checkpoint's NextMultiXactId:  366
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        663
Latest checkpoint's oldestXID's DB:   18836
Latest checkpoint's oldestActiveXID:  2405812
Latest checkpoint's oldestMultiXid:   365
Latest checkpoint's oldestMulti's DB: 13011
Latest checkpoint's oldestCommitTsXid: 0
Latest checkpoint's newestCommitTsXid: 0
Time of latest checkpoint:            04/10/2019 15.57.43
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              1000
max_worker_processes setting:         8
max_prepared_xacts setting:           0
max_locks_per_xact setting:           64
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Size of a large-object chunk:         2048
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           0
Mock authentication nonce:            f4d747c6c35ca14eef04db5eb0b575e0ec8a8deb90c23c96bc986f27cdd5463a


Re: Pg11 -- MultiXactId xxxx has not been created yet -- apparent wraparound

2019-10-04 Thread Moreno Andreo

On 04/10/19 18:28, Alvaro Herrera wrote:

I wonder if it would work to just clear that multixact with
SELECT ... WHERE ctid='(3160,31)' FOR UPDATE
Select ... what? :-) Sorry, but this is totally beyond my knowledge; once the 
issue is resolved I'll surely go and read the docs to understand what we've done.


If this was in my hands, I would scan the WAL looking for the place that
last touched this page (and the latest FPI for this page, also).  It
might have an explanation of what went on.  Maybe use the page's LSN
(from pageinspect's page_header()) as starting point for the WAL
location that modified the page.  I hope you have a WAL archive that
goes back to well before the previous checkpoint.
One thing I forgot to report is that this cluster was just upgraded from 
a 9.1 instance that was crashing at least once a day (in many similar cases 
the upgrade itself resolved the issue).

Here's the log line:
2019-10-03 15:11:52 CEST LOG:  server process (PID 18668) was terminated 
by exception 0xC005

In this case the access violation was probably due to data corruption.
These are customer machines that are really badly kept, and NTFS issues 
are not that rare, so I won't bother investigating what happened; I'll 
just get the customer up & running again.


Thanks for your time
Moreno









Re: Pg11 -- MultiXactId xxxx has not been created yet -- apparent wraparound

2019-10-07 Thread Moreno Andreo

On 04/10/19 21:14, Alvaro Herrera wrote:

On 2019-Oct-04, Moreno Andreo wrote:


On 04/10/19 18:28, Alvaro Herrera wrote:

I wonder if it would work to just clear that multixact with
SELECT ... WHERE ctid='(3160,31)' FOR UPDATE

select ...what? :-) Sorry but it's totally beyond my knowledge and my
control after resolving the issue i'll surely go and search docs to
understand what we've done

This should do it:

SELECT * FROM the_broken_table WHERE 

But of course I make no promise of it working or even having any effect
at all ...

Unfortunately, it didn't work :(

db0=# select * from failing_table where ctid='(3160,31)' for update;
ERROR:  MultiXactId 12800 has not been created yet -- apparent wraparound

Since the probability that we're dealing with corruption is very high, what if I 
\copy all of the table except the failing row(s) to an external file, drop and 
recreate the table, and then \copy the clean data back in?


Thanks
Moreno.-





Re: Pg11 -- MultiXactId xxxx has not been created yet -- apparent wraparound

2019-10-15 Thread Moreno Andreo

Hi Alvaro,
    sorry for the late reply, I've been out of the office.

On 09/10/19 19:51, Alvaro Herrera wrote:

On 2019-Oct-07, Moreno Andreo wrote:


Unfortunately, it didn't work :(

db0=# select * from failing_table where ctid='(3160,31)' for update;
ERROR:  MultiXactId 12800 has not been created yet -- apparent wraparound

Oh well.  It was a long shot anyway ...

It was a long shot, but it was worth trying

Since the probability we are into corruption is very high, what if I \copy
all the table but the failing row(s) to an external file, drop and recreate
the table, and then \copy clean data back inside?

Yes, that should work.

It did not work... I think there was something badly wrong with the cluster itself.
To extract these small chunks of data I had to SELECT using OFFSET and LIMIT.
The same query (i.e. select * from table offset 35 limit 145) worked well 
run on its own, but the moment I put it into a COPY statement it triggered 
the multixact error again, even when I then went back to the plain query.

It ended with recovering the data from backups (2 days old, and that's the good news).

Thanks for your time
Moreno.-









CASE(?) to write in a different column based on a string pattern

2019-11-13 Thread Moreno Andreo

Hi,
    I need to create a CASE (I think) statement to check for a string 
pattern, and based on its value, write a substring into a different column 
(alias).
I'm trying to create a COPY statement to port a table into another 
database, which has a table with a different format (that's why the aliases).


Let's write it in pseudoSQL:

given this

select pattern from tbl;
pattern
--
foo1234
bar5678
baz9012

That's what I'm trying to achieve

select
    pattern,
        CASE when pattern like 'foo%' then ltrim(pattern, 'bar') as foo
              when pattern like 'bar%' then ltrim(pattern, 'bar') 
as bar
              when pattern like 'baz%' then ltrim(pattern, 'baz') 
as baz

        END
from tbl;

 foo  | bar  | baz
------+------+------
 1234 |      |
      | 5678 |
      |      | 9012
(hoping the text formatting is OK... 1234 should go in column foo, 5678 in 
bar and 9012 in baz)


Is it possible?

Thanks in advance
Moreno.-





Re: CASE(?) to write in a different column based on a string pattern

2019-11-13 Thread Moreno Andreo

On 13/11/19 17:48, Andrew Kerber wrote:
So what you are doing is transforming the table format from vertical 
to horizontal.  I think you will want to use a union to join the table 
to itself along with the case statement to produce the output you are 
looking for.


Not exactly; the string pattern is only one column of a bigger table (30 
columns in total). What I'm trying to achieve is what Geoff explained: 
just split the values into 3 different columns based on the string pattern.


Thanks for your time
Moreno.





Re: CASE(?) to write in a different column based on a string pattern

2019-11-13 Thread Moreno Andreo

On 13/11/19 17:36, Geoff Winkless wrote:


Simplest way in plain SQL would be individual case statements for each
column, I think.

SELECT pattern,
   CASE WHEN pattern LIKE 'foo%' THEN SUBSTR(pattern, 4) ELSE '' END AS foo
   CASE WHEN pattern LIKE 'bar%' THEN SUBSTR(pattern, 4) ELSE '' END AS bar
   CASE WHEN pattern LIKE 'baz%' THEN SUBSTR(pattern, 4) ELSE '' END AS baz
FROM tbl;

Geoff



Geoff,
    it worked perfectly!
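
For reference, a self-contained version of the query with throwaway sample values (the real table has many more columns):

with tbl(pattern) as (
    values ('foo1234'), ('bar5678'), ('baz9012')
)
select pattern,
       case when pattern like 'foo%' then substr(pattern, 4) else '' end as foo,
       case when pattern like 'bar%' then substr(pattern, 4) else '' end as bar,
       case when pattern like 'baz%' then substr(pattern, 4) else '' end as baz
from tbl;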

Thanks a lot!
Moreno.-





Re: [SPAM] Remote Connection Help

2019-11-21 Thread Moreno Andreo

  
  
On 21/11/19 15:21, Jason L. Amerson wrote:


  
  
  
  
   
could not connect to server: Connection refused (0x274D/10061) Is the server running on host "xx.xx.xx.xx" and accepting TCP/IP connections on port 5432'

  

Connection refused means something has blocked it. If everything was OK
and Postgres was simply not listening, you should have received a
"connection timed out" (10060) message.
Have you tried adjusting or temporarily disabling iptables to check what
happens, as Ekaterina also pointed out?

And, just as a side note, I normally don't enable IPv6 unless it's
necessary (it hasn't been necessary in the last 10 years :-) ),
because I've run into some troubles that cleared up after getting rid
of IPv6.
So I'd try editing postgresql.conf with
listen_addresses = '0.0.0.0'
to listen on IPv4 only.

HTH,
Moreno.-
  



Re: [SPAM] Remote Connection Help

2019-11-22 Thread Moreno Andreo

On 21/11/19 22:40, Peter J. Holzer wrote:

On 2019-11-21 17:27:04 +0100, Moreno Andreo wrote:

Connection refused means somthing has blocked it. If it was all OK and simply
Postgres was not listening, you should've received a "connection timed out"
(10060) message.

Almost exactly the other way around. If you try to connect to a port
where no server is listening, you get a connection refused message.
If something is blocking the connection you may get a timeout.

 hp



O.O
You're right, obviously. I don't know why, but ever since school days I 
sometimes swap the meanings of these two errors (and have to deal with the 
consequences)...
But I'm confident that one day there will be no more swapping... 
hopefully before I retire... :-)

Thanks for pointing out and sorry for the mistake
(blushing)

Moreno.-





Re: Should I reinstall over current installation?

2020-02-04 Thread Moreno Andreo

  
  

On 04/02/2020 00:16, Chris Charley wrote:


  
  
I tried items you suggested (1-5), but could find no helpful info.

Thanks for your help and going the extra mile!
  
  


Hope I'm in time to save you from a reinstall :-)

How to check if PostgreSQL is running
---
- Run Services.msc
- In the console you should have an entry with your postgres version
(I have 9.1 and 9.5 on this host)
- Check that it's set to start automatically and is running (mine is in
Italian: "In esecuzione" means "running", and "Automatico", well :-)).
- If it's not, try to start it (right click on the service and
select start)
- If you get an error, check the error message




How to check the error message
---
You have 2 choices:

1 - Postgres logs)
First of all, you should check the PostgreSQL logs under the data directory's
pg_log folder. There you should find some files showing most of the errors
you would otherwise find in Event Viewer.
Check the error messages and, if still needed, post them.

2 - Windows Event Log)

- Open Event Viewer
- Sort events by date, descending
- The first entries should reveal why your Postgres server is not running
(I captured the first error I got on this host just as an example)
- In the lower pane, check the error message (in my case,
"FATAL: the database system is starting up").


If you need further help, post the error message and we'll try to
help you.


Hope this helps
Moreno.-


  



Re: Should I reinstall over current installation?

2020-02-05 Thread Moreno Andreo

On 04/02/2020 21:18, Chris Charley wrote:

Hello Moreno

Thanks for the reply!

I ran Services and it reported postsql as Disabled.


A disabled service will never run nor leave error messages anywhere

Right-click on the PostgreSQL service and select Properties.
In the next window, choose the startup type (the combo box under the executable 
path), select Automatic, then click the Start button below.

Click OK and close the window.

Now something should happen. If everything goes well, you'll see 
"Started" and "Automatic" in the two columns next to the service 
description, and you can start your application.
If there is an error message, then it should log something. See my last 
mail for how to extract the log information, and post it here.


I am considering reinstall because I couldn't find any helpful info 
from these services.


You got no helpful info because the service does not start, and so it does not 
log anything :-)



Thanks again for your help.
 I suspect when Windows was reinstalled, the firewall was reconfigured 
and PostgreSQL wasn't allowed to access the port?


ATM there's a service start issue to resolve; if there's also 
something about the firewall, we'll think about that later.


Moreno.-





Unable to run psql on 9.5 after broken 12 remove

2020-07-03 Thread Moreno Andreo

I have a production server running pg9.5 seamlessly.
Yesterday I decided to install libpq to have some crypto functions. 
Unexpectedly, it installed postgresql 12 and its libpq.

I don't need pg 12, so I decided to remove it.
It did not go well:

root@datastore-1:/home/user# apt-get --purge remove postgresql-client-12
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following package was automatically installed and is no longer required:
  libuuid-perl
Use 'apt-get autoremove' to remove it.
The following packages will be REMOVED:
  postgresql-12* postgresql-client-12* postgresql-contrib*
0 upgraded, 0 newly installed, 3 to remove and 89 not upgraded.
1 not fully installed or removed.
After this operation, 30.9 MB disk space will be freed.
Do you want to continue? [Y/n] y
(Reading database ... 32065 files and directories currently installed.)
Removing postgresql-contrib (12+215.pgdg80+1) ...
Removing postgresql-12 (12.3-1.pgdg80+1) ...
update-alternatives: using 
/usr/share/postgresql/9.5/man/man1/postmaster.1.gz to provide 
/usr/share/man/man1/postmaster.1.gz (postmaster.1.gz) in auto mode

Purging configuration files for postgresql-12 (12.3-1.pgdg80+1) ...
Removing postgresql-client-12 (12.3-1.pgdg80+1) ...
update-alternatives: using /usr/share/postgresql/9.5/man/man1/psql.1.gz 
to provide /usr/share/man/man1/psql.1.gz (psql.1.gz) in auto mode

Processing triggers for postgresql-common (215.pgdg80+1) ...
Building PostgreSQL dictionaries from installed myspell/hunspell packages...
Removing obsolete dictionary files:
Setting up python3.4 (3.4.2-1+deb8u3) ...
  File "/usr/lib/python3.4/http/client.py", line 1014
    raise InvalidURL(f"URL can't contain control characters. {url!r} "
^
SyntaxError: invalid syntax
dpkg: error processing package python3.4 (--configure):
 subprocess installed post-installation script returned error exit status 1
Errors were encountered while processing:
 python3.4
E: Sub-process /usr/bin/dpkg returned an error code (1)

After this, the package is no longer in the installed list and I'm not 
able to issue the psql command:


root@datastore-1:/home/user# sudo -u postgres psql
Error: PostgreSQL version 12 is not installed

How can I repair this?
Thanks in advance
Moreno.





Re: Unable to run psql on 9.5 after broken 12 remove

2020-07-03 Thread Moreno Andreo

After upgrading Python, the InvalidURL error is gone, but I still can't run psql.

Output of pg_lsclusters, if needed
root@datastore-1:/usr/share/postgresql-common# pg_lsclusters
Ver Cluster Port Status                Owner    Data directory               Log file
9.5 main    6543 online                postgres /var/lib/postgresql/9.5/main /var/log/postgresql/postgresql-9.5-main.log
12  main    5432 down,binaries_missing          /var/lib/postgresql/12/main  /var/log/postgresql/postgresql-12-main.log


I followed
https://askubuntu.com/questions/1223270/psql-command-error-postgresql-version-12-is-not-installed
with no luck; apt purge simply states that pg 12 is not installed.


On 03/07/2020 10:37, Moreno Andreo wrote:

I have a production server running pg9.5 seamlessly.
Yesterday I decided to install libpq to have some crypto functions. 
Unexpectedly, it installed postgresql 12 and its libpq.

I don't need pg 12, so I decided to remove it.
It did not went well

root@datastore-1:/home/user# apt-get --purge remove postgresql-client-12
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following package was automatically installed and is no longer 
required:

  libuuid-perl
Use 'apt-get autoremove' to remove it.
The following packages will be REMOVED:
  postgresql-12* postgresql-client-12* postgresql-contrib*
0 upgraded, 0 newly installed, 3 to remove and 89 not upgraded.
1 not fully installed or removed.
After this operation, 30.9 MB disk space will be freed.
Do you want to continue? [Y/n] y
(Reading database ... 32065 files and directories currently installed.)
Removing postgresql-contrib (12+215.pgdg80+1) ...
Removing postgresql-12 (12.3-1.pgdg80+1) ...
update-alternatives: using 
/usr/share/postgresql/9.5/man/man1/postmaster.1.gz to provide 
/usr/share/man/man1/postmaster.1.gz (postmaster.1.gz) in auto mode

Purging configuration files for postgresql-12 (12.3-1.pgdg80+1) ...
Removing postgresql-client-12 (12.3-1.pgdg80+1) ...
update-alternatives: using 
/usr/share/postgresql/9.5/man/man1/psql.1.gz to provide 
/usr/share/man/man1/psql.1.gz (psql.1.gz) in auto mode

Processing triggers for postgresql-common (215.pgdg80+1) ...
Building PostgreSQL dictionaries from installed myspell/hunspell 
packages...

Removing obsolete dictionary files:
Setting up python3.4 (3.4.2-1+deb8u3) ...
  File "/usr/lib/python3.4/http/client.py", line 1014
    raise InvalidURL(f"URL can't contain control characters. {url!r} "
^
SyntaxError: invalid syntax
dpkg: error processing package python3.4 (--configure):
 subprocess installed post-installation script returned error exit 
status 1

Errors were encountered while processing:
 python3.4
E: Sub-process /usr/bin/dpkg returned an error code (1)

After this, the package is not anymore on the installed list and I'm 
not able to issue the psql command:


root@datastore-1:/home/user# sudo -u postgres psql
Error: PostgreSQL version 12 is not installed

How can I repair this?
Thanks in advance
Moreno.










Re: Unable to run psql on 9.5 after broken 12 remove

2020-07-03 Thread Moreno Andreo

On 03/07/2020 16:51, Adrian Klaver wrote:

On 7/3/20 1:54 AM, Moreno Andreo wrote:

Looks like the cluster was not removed from the pgcommon setup. This 
would explain why you can't run psql. By default pgcommon looks for 
the version of psql connected with the most recent version of Postgres 
it knows about, in this case 12. Unfortunately that binary no longer 
exists. You still have psql. Do:


/usr/lib/postgresql/9.5/bin/psql --help

OK, it confirms that 9.5 client is still ok


To avoid having to do that try:

sudo pg_dropcluster  12 main

This reported a warning

root@datastore-1:/home/user# pg_dropcluster 12 main
Warning: corrupted cluster: data directory does not exist

... but it did its job

root@datastore-1:/home/user# sudo -u postgres psql
psql (9.5.6)
Type "help" for help.

postgres=# \q


Thanks, you saved my day again!

Moreno.






On 03/07/2020 10:37, Moreno Andreo wrote:

I have a production server running pg9.5 seamlessly.
Yesterday I decided to install libpq to have some crypto functions. 
Unexpectedly, it installed postgresql 12 and its libpq.

I don't need pg 12, so I decided to remove it.
It did not went well

root@datastore-1:/home/user# apt-get --purge remove 
postgresql-client-12

Reading package lists... Done
Building dependency tree
Reading state information... Done
The following package was automatically installed and is no longer 
required:

  libuuid-perl
Use 'apt-get autoremove' to remove it.
The following packages will be REMOVED:
  postgresql-12* postgresql-client-12* postgresql-contrib*
0 upgraded, 0 newly installed, 3 to remove and 89 not upgraded.
1 not fully installed or removed.
After this operation, 30.9 MB disk space will be freed.
Do you want to continue? [Y/n] y
(Reading database ... 32065 files and directories currently installed.)
Removing postgresql-contrib (12+215.pgdg80+1) ...
Removing postgresql-12 (12.3-1.pgdg80+1) ...
update-alternatives: using 
/usr/share/postgresql/9.5/man/man1/postmaster.1.gz to provide 
/usr/share/man/man1/postmaster.1.gz (postmaster.1.gz) in auto mode

Purging configuration files for postgresql-12 (12.3-1.pgdg80+1) ...
Removing postgresql-client-12 (12.3-1.pgdg80+1) ...
update-alternatives: using 
/usr/share/postgresql/9.5/man/man1/psql.1.gz to provide 
/usr/share/man/man1/psql.1.gz (psql.1.gz) in auto mode

Processing triggers for postgresql-common (215.pgdg80+1) ...
Building PostgreSQL dictionaries from installed myspell/hunspell 
packages...

Removing obsolete dictionary files:
Setting up python3.4 (3.4.2-1+deb8u3) ...
  File "/usr/lib/python3.4/http/client.py", line 1014
    raise InvalidURL(f"URL can't contain control characters. {url!r} "
^
SyntaxError: invalid syntax
dpkg: error processing package python3.4 (--configure):
 subprocess installed post-installation script returned error exit 
status 1

Errors were encountered while processing:
 python3.4
E: Sub-process /usr/bin/dpkg returned an error code (1)

After this, the package is not anymore on the installed list and I'm 
not able to issue the psql command:


root@datastore-1:/home/user# sudo -u postgres psql
Error: PostgreSQL version 12 is not installed

How can I repair this?
Thanks in advance
Moreno.


















Debian : No echo after pg_dump | psql

2018-11-29 Thread Moreno Andreo

Hi guys,
    I'm facing a strange thing on my test server (Google Cloud)

On my Debian 9 box I'm running Postgres 9.6.10, and I'm transferring 
some databases from another server (Debian 8, PG 9.5.15).


The command I'm using is
root@x:~# pg_dump -v -C -h  -p 6543 -U postgres 
 | psql -h localhost -p 6543 -U postgres 

It presents a double password prompt after I run it:
Password: Password for user postgres:

I enter _once_ my postgres password, and it starts processing data.

At a certain point it suddenly stops processing without prompting 
anything, and goes on only if I enter my postgres password again, as if 
this were the second of the two prompts it showed at the beginning, where 
I entered only one password.
After another bit of processing, it asks for my postgres password yet 
another time (the third!). After I enter it, it runs to the end (and there 
are no issues with the data or anything similar).


After this, I get no more echo of what I'm typing, but if I type 
something and press Enter, I can see the result of the command I just 
typed (but could not see) on the console.


Restarting the SSH session resolves the problem.

The strange behavior is the same if I swap the servers (running 
pg_dump on Debian 8/PG 9.5 and connecting to Debian 9/PG 9.6).


Another thing I was forgetting to report:

If I create the database (just the database, not the schema) on the target 
machine, I receive the error "database  already exists" but the dump goes on.
If I don't create it, I receive the error "database  does not exist" 
and processing aborts.


I hope I've been clear enough...

Has anyone bumped into this?

Thanks in advance

Moreno.-





Re: Debian : No echo after pg_dump | psql

2018-12-12 Thread Moreno Andreo

Adrian, Andrew, thanks and apologies for the late reply

On 30/11/2018 05:08, Andrew Gierth wrote:

"Moreno" == Moreno Andreo  writes:

  Moreno> The command I'm using is
  Moreno> root@x:~# pg_dump -v -C -h  -p 6543 -U postgres
  Moreno>  | psql -h localhost -p 6543 -U postgres 
  Moreno> It presents a double password prompt after I run it:
  Moreno> Password: Password for user postgres:

This is going to prompt once for the remote host's password and twice
for the local one (because -C), and the concurrently-running commands
are going to be fighting over access to the terminal to do it. Best
avoided by using pgpass or non-password-based auth methods.

More seriously, you're misunderstanding how -C works. When you use -C,
the database you specify to psql (or pg_restore) is NOT the database
you're restoring into - the restored db will ALWAYS have the same name
as it had when dumped (if that's not what you want then don't use -C).
Instead, the database you specify to psql or pg_restore is the database
to connect to to issue the CREATE DATABASE command, which should usually
be 'postgres'. That explains this bit:

  Moreno> If I create database (just database, not schema) on target
  Moreno> machine, I receive the error "database  already exists" but
  Moreno> the dump goes on If I don't create it, I receive the error
  Moreno> "database  does not exist" and processing aborts.

I removed the -C option, since the target database (the database itself, 
not the schema) is created with a CREATE DATABASE before issuing pg_dump 
(so I don't need it), but the strange prompt behavior remained the same.
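
For reference, Andrew's pgpass suggestion would look something like this 
(hostnames, port and password below are placeholders); with it in place 
neither side should prompt at all:

# ~/.pgpass of the user running the pipe -- must be chmod 600 or libpq ignores it
# format: hostname:port:database:username:password
remotehost:6543:*:postgres:secret
localhost:6543:*:postgres:secret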


I'm having a really hard time these days, so I can't investigate 
further. Maybe on holidays, when I hope the pressure will be released a 
bit.


Will report as soon as I can.

THanks again

Cheers,

Moreno.-





REVOKE to an user that doesn't exist

2018-12-12 Thread Moreno Andreo

Hi all,

I'm trying to pg_upgrade a cluster from 9.1 to 10 (Windows 10, but I 
don't think it matters).


At a certain point an error is thrown while parsing a trigger:

could not execute query: ERROR:  role "1067431" does not exist

command was: REVOKE ALL ON FUNCTION "x"() FROM PUBLIC;
GRANT ALL ON FUNCTION "public"."x"() TO "1067431" WITH GRANT OPTION;

Here's the evidence :-)

postgres=# \du
                              List of roles
 Role name |                   Attributes                   | Member of
-----------+------------------------------------------------+-----------
 user5     | Create role, Create DB                         | {}
 user2     | Superuser, Create role, Create DB, Replication | {}
 user4     | Create role, Create DB                         | {}
 user3     |                                                | {}
 user1     | Superuser, Create role, Create DB, Replication | {}
 postgres  | Superuser, Create role, Create DB, Replication | {}

postgres=# select * from pg_roles;
 rolname  | rolsuper | rolinherit | rolcreaterole | rolcreatedb | rolcatupdate | rolcanlogin | rolreplication | rolconnlimit | rolpassword | rolvaliduntil | rolconfig |   oid
----------+----------+------------+---------------+-------------+--------------+-------------+----------------+--------------+-------------+---------------+-----------+---------
 postgres | t        | t          | t             | t           | t            | t           | t              |           -1 |             |               |           |      10
 user1    | t        | t          | t             | t           | t            | t           | t              |           -1 |             |               |           |   16393
 user2    | t        | t          | t             | t           | t            | t           | t              |           -1 |             |               |           |   16394
 user3    | f        | t          | f             | f           | f            | t           | f              |           -1 |             |               |           |   16395
 user4    | f        | t          | t             | t           | f            | t           | f              |           -1 |             |               |           | 1067432
 user5    | f        | t          | t             | t           | f            | t           | f              |           -1 |             |               |           |   30602

(6 rows)

So, no user with that OID. I checked the trigger the error points to and 
I found


GRANT EXECUTE ON FUNCTION x() TO "1067431" WITH GRANT OPTION;

How to REVOKE that non-existing user so pg_upgrade can proceed?

thanks

Moreno.-





Re: REVOKE to an user that doesn't exist

2018-12-12 Thread Moreno Andreo

Il 12/12/2018 15:39, Adrian Klaver ha scritto:

On 12/12/18 5:11 AM, Moreno Andreo wrote:

Hi all,

I'm trying to pg_upgrade a cluster from 9.1 to 10 (Windows 10, but I 
don't think it matters).


At a certain point an error is thrown while parsing a trigger:

could not execute query: ERROR:  role "1067431" does not exist

command was: REVOKE ALL ON FUNCTION "x"() FROM PUBLIC;
GRANT ALL ON FUNCTION "public"."x"() TO "1067431" WITH GRANT OPTION;

Here's the evidence :-)


The below is from the 9.1 cluster, correct?

Correct, 9.1.6, IIRC



postgres=# select * from pg_roles;
 rolname  | rolsuper | rolinherit | rolcreaterole | rolcreatedb | rolcatupdate | rolcanlogin | rolreplication | rolconnlimit | rolpassword | rolvaliduntil | rolconfig |   oid
----------+----------+------------+---------------+-------------+--------------+-------------+----------------+--------------+-------------+---------------+-----------+---------
 postgres | t        | t          | t             | t           | t            | t           | t              |           -1 |             |               |           |      10
 user1    | t        | t          | t             | t           | t            | t           | t              |           -1 |             |               |           |   16393
 user2    | t        | t          | t             | t           | t            | t           | t              |           -1 |             |               |           |   16394
 user3    | f        | t          | f             | f           | f            | t           | f              |           -1 |             |               |           |   16395
 user4    | f        | t          | t             | t           | f            | t           | f              |           -1 |             |               |           | 1067432
 user5    | f        | t          | t             | t           | f            | t           | f              |           -1 |             |               |           |   30602

(6 rows)

So, no user with that OID. I checked in the trigger pointed to the 
error and I found


GRANT EXECUTE ON FUNCTION x() TO "1067431" WITH GRANT OPTION;


I am not following as a trigger would not have that in its code.

Are you referring to the function x()?

If so is the GRANT in the function?
Sorry, the grant above is extracted from the CREATE statement that 
PgAdmin3 shows when you click on the trigger






Re: REVOKE to an user that doesn't exist

2018-12-12 Thread Moreno Andreo

Il 12/12/2018 16:01, Tom Lane ha scritto:

Moreno Andreo  writes:

I'm trying to pg_upgrade a cluster from 9.1 to 10 (Windows 10, but I
don't think it matters).
At a certain point an error is thrown while parsing a trigger:
could not execute query: ERROR:  role "1067431" does not exist
command was: REVOKE ALL ON FUNCTION "x"() FROM PUBLIC;
GRANT ALL ON FUNCTION "public"."x"() TO "1067431" WITH GRANT OPTION;

Hm.  We've seen similar bugs before; the mechanism is that at some
point the function owner granted privileges to somebody else, and
at some later point the somebody-else role got dropped, but the
privilege grant stayed behind because the system had lost, or never
made, the pg_shdepend entry indicating that this function had an ACL
entry mentioning that role.  The extra ACL entry is harmless, until
you wonder why pg_dump is printing a nonsensical command due to it.
That's the case. The customer never complained about any problem but 
today, while upgrading his cluster, we bumped into it.


We fixed a couple of bugs of that ilk just last month [1], but they were
for cases involving types and schemas, not functions.  The last case
involving function privileges that I see in a quick trawl of the commit
log predates 9.0 release [2].  I wonder how old this cluster is ...

The cluster is version 9.1.6, IIRC



How to REVOKE that non-existing user so pg_upgrade can proceed?

The safest way to clean it up manually would be to set the pg_proc.proacl
field for that function to NULL.  If there are other grants about the
function, you could try removing the bad entry, but it would likely be
safer to just re-grant after the upgrade.


Wouldn't it be enough to edit the proacl column and just remove the 
unwanted role? Or is it faster and safer to null the whole column and just 
re-GRANT?


Thanks

Moreno.-





Re: REVOKE to an user that doesn't exist

2018-12-17 Thread Moreno Andreo

Il 12/12/2018 16:01, Tom Lane ha scritto:


The safest way to clean it up manually would be to set the pg_proc.proacl
field for that function to NULL.  If there are other grants about the
function, you could try removing the bad entry, but it would likely be
safer to just re-grant after the upgrade.

Setting the column (and several analogous ones in other catalogs such as 
pg_class) to NULL made it possible for me to upgrade the cluster. After 
the upgrade, I issued the necessary GRANTs and everything is up & running 
on Postgres 10 now!
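
For the archives, a minimal sketch of that kind of cleanup, run as a 
superuser on the old cluster before re-running pg_upgrade (the function 
and schema names are the placeholders used earlier in the thread; relacl 
in pg_class needs the analogous treatment):

UPDATE pg_proc
   SET proacl = NULL
 WHERE proname = 'x'
   AND pronamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'public');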


Thanks again

Moreno.-





Re: problem

2019-02-01 Thread Moreno Andreo

  
  
Check in your %temp% directory: there should be some bitrock* or 
bitrock_installer* files; these are setup logs and can point you to the 
problem.

If initdb failed, you can try running something like this

initdb -D "C:\My\Postgres\Setup\Path" --encoding=UTF8 --locale="Italian, Italy" -W -n

and see if you get some error.

(If you are installing an old version (<9.3 IIRC) you may have to add 
-U postgres to have the user postgres -- which has been removed in recent 
installers -- be the owner of the data directory.)

If you point us to the OS/Postgres versions we could try to be more 
accurate :-)

Cheers
Moreno.-



Il 31/01/2019 14:10, Mirco Gallazzi ha scritto:

Good morning,
I can't install postgres on my pc because I always get this error: 
"Problem running post-install step. Installation may not complete 
correctly. The database cluster initialisation failed."
I looked for a guide that could help me solve this problem, but it was 
all a big fail.
I tried to create a new user on my pc to create the postgres server, but 
nothing. I tried to follow some guides on the web/YouTube, but nothing.
Can you please help me to solve this problem?
I really need to solve this because at my university they use postgres 
and I need it to complete the exam and the project on databases.
Thanks for your patience. Have a good day, mirco

Inviato da Posta per Windows 10
  



  





Re: Connection pooling for differing databases?

2019-03-07 Thread Moreno Andreo

Il 07/03/2019 20:27, Arjun Ranade ha scritto:

Hi all,

I'm wondering if there's a tool like pgpool that can provide a single 
origin point (host/port) that will proxy/direct connections to the 
specific servers that contain the db needing to be accessed.
Yes, I think there are many, but I'm encouraging you to take a look at 
pgbouncer


https://pgbouncer.github.io/

in pgbouncer.ini you enter database configuration values like

database = host=hostname port=xyzk, like
mydb1 = host=cluster1 port=6543 or
mydb2 = host=cluster1 port=9876
mydb3 = host=cluster2 port=6543

but there are many other parameters to refine your config (for instance 
"proxying" database names, so if you share names across clusters you can 
easily avoid conflicts -- see the sketch below).
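
A minimal sketch of such a config (names, ports and paths are made up):

[databases]
; names clients connect to, mapped to the real cluster/port;
; dbname= lets you expose the same physical db under a different name
mydb1    = host=cluster1 port=6543
mydb2    = host=cluster1 port=9876
mydb1_c2 = host=cluster2 port=6543 dbname=mydb1

[pgbouncer]
listen_addr = *
listen_port = 5431
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt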


Pgbouncer can be installed on the same server as the databases or on a 
separate one, and it listens on a different port than Postgres (say 5431 
while Postgres is on 5432).
I'm actively using it in my environment with 2 clusters and about 500 
databases; it works flawlessly.


One thing you have to consider under heavy workload (say hundreds of 
connections) is raising the kernel/user limit on the maximum number of 
open files.
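
For example (values are illustrative):

# temporary, in the shell that starts pgbouncer
ulimit -n 65536

# persistent, e.g. in /etc/security/limits.conf
#   pgbouncer  soft  nofile  65536
#   pgbouncer  hard  nofile  65536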


Cheers

Moreno.-





Re: Connection pooling for differing databases?

2019-03-08 Thread Moreno Andreo

  
  
Il 07/03/2019 21:19, Arjun Ranade ha scritto:

I'm looking at pgbouncer and it does most of what I need. I'm wondering 
about clients connecting via pgadmin, is there a way for users using 
pgadmin or another tool to see all the databases that are part of the 
configs?

It's an issue I ran into when I set up my pgbouncer architecture, but 
since all servers are reachable from the same private network the pgAdmin 
host is on, there's no security issue in connecting directly to them 
instead of passing through pgbouncer, so I did not spend time (which I 
didn't have :-)) investigating.
If you resolve this (or someone already has), sharing the solution would 
be much appreciated.
Cheers,
Moreno.-


  
On Thu, Mar 7, 2019 at 2:39 PM
  Moreno Andreo <moreno.and...@evolu-s.it>
  wrote:

Il
  07/03/2019 20:27, Arjun Ranade ha scritto:
  > Hi all,
  >
  > I'm wondering if there's a tool like pgpool that can
  provide a single 
  > origin point (host/port) that will proxy/direct
  connections to the 
  > specific servers that contain the db needing to be
  accessed.
  Yes, I think there are many, but I'm encouraging you to take a
  look at 
  pgbouncer
  
  https://pgbouncer.github.io/
  
  in pgbouncer.ini you enter database configuration values like
  
  database = host=hostname port=xyzk, like
  mydb1 = host=cluster1 port=6543 or
  mydb2 = host=cluster1 port=9876
  mydb3 = host=cluster2 port=6543
  
  but there many other parameters to refine your config (like
  "proxying" 
  database names, so if you share names across clusters you can
  easily 
  avoid conflicts)
  
  Pgbouncer should be installed on the same server as the
  databases or in 
  another and listens on a different port than Postgres' (say
  5431 while 
  postgres is on 5432)
  I'm actively using in my environment with 2 clusters and about
  500 
  databases, works flawlessly.
  
  One thing you have to consider, if under heavy workload (say
  100's of 
  connections) is to raise kernel value of maximum open files
  
  Cheers
  
  Moreno.-
  
  
  

  



  





Key encryption and relational integrity

2019-03-26 Thread Moreno Andreo

Hello folks :-)

Is there any workaround to implement key encryption without breaking 
relational integrity?


Thanks

Moreno.-





Re: Key encryption and relational integrity

2019-03-26 Thread Moreno Andreo

Il 26/03/2019 15:24, Adrian Klaver ha scritto:

On 3/26/19 7:19 AM, Moreno Andreo wrote:

Hello folks :-)

Is there any workaround to implement key encryption without breaking 
relational integrity?


This is going to need more information. 

OK, I'll try to be as clear as I can.
For starters 'key' has separate meanings for encryption and RI. I 
could make some guesses about what you want, but to avoid false 
assumptions a simple example would be helpful.
In a master-detail relation, I need to encrypt either the master table PK 
or the detail table FK, in order to achieve pseudonymisation, required by 
the GDPR in Europe when managing special categories of data.

Imagine I have
Table users
id   surname    last name
1    John            Doe
2    Jane            Doe
3    Foo         Bar

Table medications
id    user_id    med
1 1                Medication
2 1                Ear check
...
...
medications.user_id is FK on users.id
we should achieve

Table medications
id    user_id        med
1    sgkighs98    Medication
2    sghighs98    Ear check

or the opposite (users.id encryption and medications.user_id kept plain)

At first glance, it IS breaking relational integrity, so is there a way 
to manage this encryption internally so that RI is kept intact?


Thanks

Moreno.-





Re: Key encryption and relational integrity

2019-03-28 Thread Moreno Andreo

  
  
Il 27/03/2019 07:42, Tony Shelver ha
  scritto:


  
  

  Not in Europe, but have worked a bit with medical records
systems in the USA, including sharing across providers.
  
  
  
  The primary key of the user is _should_ be system
generated, and this is meaningless from a user identity
standpoint.  

  

It is, I was planning to use UUID

  

  If you encrypt user name and other significant personal
data on the user id record, you can query the id column all
day long, and there would be no way to identify who the user
is without the encryption key.

  

That's the first idea that came to my mind. Unfortunately, the customer
needs to do "real time" search on personal data, so for each key that's
pressed I need to query the whole table, filtering on the encrypted value
(1st layer of slowness) with LIKE (2nd layer of slowness) against the
value the user is typing. I ran a test on my i7-2660 with SSD and 16 GB
RAM, and on a 2500-row table these queries take about 2 seconds per
keypress. So it's not the way to go; we have to find something different.
Same thing with clinical data: it would be a mess, because when I open a
patient's clinical record I need to see all treatments, all drugs that
have been prescribed, all diseases and so on, and it would not be so fast
if every clinical data name (i.e. all drug names) were encrypted.

  

  
  
  The only other way to do it would be to store the
encrypted key value in both user.id and
medications.user_id.  That would encrypt the data and
maintain relational integrity.
  

  

Hmmm... if user.id and medications.user_id are the same, I can link the
user with the medication... and the GDPR rule is not met. Or am I missing
something?

  

  
  
  For US medical record systems, there is a strict act
(HIPAA) that specifies both privacy and security rules, with
lists of what is regarded as sensitive or private
information, what can be shared with various providers and
outside organizations, and so on..  As far as user name
goes, that is almost never a decisive form of identification
for any person in a system.

  

GDPR is almost the same concept, even if some rules may differ

  

  
  
  While GDPR is relatively young and untested, surely
someone in your organization (or outside) has a list of the
relevant rules broken down to specific requirements.
  

  

You sure? :-)

  

  Also, securing the data internally on a database system
MAY have very different requirements to making that data
visible to applications or transmitting it to external
parties.
  

  

Profiling, IMHO, has to be designed in the application; here I'm trying
to find a way so that nobody can match a patient with their clinical
records without using the application (i.e. someone breaking into the
server -- a data breach).

  

  
  
  Storing the user id in plain on the medications record
and encrypted on the user primary key would seem
meaningless, assuming some meaning could be derived from a
system generated ID.

  

It is a system-generated ID; obviously the query is more logical and
quicker if I look from master to detail, so I SELECT something FROM
medications WHERE medications.user_id = encrypt(user.id). Encrypting the
(ex-)FK appears to be the best solution so far, but I'm afraid of the
consequences of losing RI.

  

   
  I would suggest you sit down with the application / legal
guys generating the requirements to see what the real rules
are.  if they want everything encrypted, then automatic
primary key generation in the DB has to be invalidated, and
they should provide that from the application side.
  

  

Having everything encrypted would be a big performance hit. We
  are trying to achieve best performance with the right masking. 

Thanks
Moreno.-


  






Re: Key encryption and relational integrity

2019-03-28 Thread Moreno Andreo

Il 26/03/2019 18:08, Adrian Klaver ha scritto:

On 3/26/19 9:08 AM, Moreno Andreo wrote:

Il 26/03/2019 15:24, Adrian Klaver ha scritto:

On 3/26/19 7:19 AM, Moreno Andreo wrote:

Hello folks :-)

Is there any workaround to implement key encryption without 
breaking relational integrity?


This is going to need more information. 

OK, I'll try to be as clearer as I can
For starters 'key' has separate meanings for encryption and RI. I 
could make some guesses about what you want, but to avoid false 
assumptions a simple example would be helpful.
In a master-detail relation, I need to encrypt one of master table PK 
or detail table FK, in order to achieve pseudonimization, required by 
GDPR in Europe when managing particular data

Imagine I have
Table users
id   surname    last name
1    John            Doe
2    Jane            Doe
3    Foo         Bar

Table medications
id    user_id    med
1 1                Medication
2 1                Ear check
...
...
medications.user_id is FK on users.id
we should achieve

Table medications
id    user_id        med
1    sgkighs98    Medication
2    sghighs98    Ear check

or the opposite (users.id encryption and medications.user_id kept plain)

At a first glance, it IS breaking relational integrity, so is there a 
way to manage this encryption internally so RI is kept safe?


Not that I know of. RI is based on maintaining a link between parent 
and child. So by definition you would be able to get to the parent 
record via the child.

That's what I was afraid of :-(


A quick search on pseudonymisation found a boatload of interpretations 
of how to implement this:


"Pseudonymisation' means the processing of personal data in such a 
manner that the personal data can no longer be attributed to a 
specific data subject without the use of additional information, 
provided that such additional information is kept separately and is 
subject to technical and organisational measures to ensure that the 
personal data are not attributed to an identified or identifiable 
natural person."



To me it would seem something like:

Table medications
id    user_id    med
1    sgkighs98    Medication
2    sghighs98    Ear check



Table users
id    surname    last name
sgkighs98 John    Doe
jkopkl1   Jane    Doe
uepoti21  Foo Bar

Where there is no direct link between the two. 


Are you sure there isn't? The key "sgkighs98" is present in both tables 
and I can join them on that field, so pseudonymisation does not apply; 
it's just "separation" (which was OK under the previous privacy act, but 
not under the GDPR).


The problem is not on the application side... there you can do almost 
anything you want. The problem is that if someone breaks into the server 
(data breach) it is easy to join patients and their medications.


Instead permissions would prevent linking from medications to users 
even via a SELECT. One could also use pgcrypto:


https://www.postgresql.org/docs/10/pgcrypto.html

on the users table to further hide the personal info.
That's what I tried in order to encrypt first name, last name, street 
address and some other fields (that would be the best solution because RI 
was not impacted at all), but the customer stated that they have to 
perform real-time search (like when you type in the Google search box), 
and the query that has to decrypt all names and return only the ones that 
begin with a certain set of characters is way too slow (tried on a good 
i7 configuration: about 2 seconds for each key pressed on a 2500-row 
table). So I dropped this approach.
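
To give an idea, the per-keystroke query looked roughly like this (column 
and key names here are made up); since every row has to be decrypted 
before the filter can be applied, no index can help:

SELECT id
  FROM users
 WHERE pgp_sym_decrypt(last_name_enc, 'secret-key') ILIKE 'do%';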


*NOTE* I am not a lawyer so any advice on my part as to meeting legal 
requirements are just me thinking out loud. I would suggest, if not 
already done, getting proper legal advice on what the section quoted 
above actually means.

Relax, I'm not here to ask and then sue anyone :-)






Re: Key encryption and relational integrity

2019-03-28 Thread Moreno Andreo

Il 28/03/2019 15:45, Adrian Klaver ha scritto:

On 3/28/19 7:29 AM, Moreno Andreo wrote:

Il 27/03/2019 07:42, Tony Shelver ha scritto:
Not in Europe, but have worked a bit with medical records systems in 
the USA, including sharing across providers.


The only other way to do it would be to store the encrypted key 
value in both user.id <http://user.id> and medications.user_id.  
That would encrypt the data and maintain relational integrity.
Hmmm... if user.id and medications.user_id are the same, I can link 
user with medication... and GDPR rule does not apply. or am I 
missing something?


Yes the link means that someone could use the medications.user_id to 
fetch the rest of the user information from the user table. Unless you 
encrypted that information also, which I gather you do not want to do 
for performance reasons.



Yes, but to be GDPR compliant I _have_ to remove that link. As you 
reported in an earlier email, they say that you shouldn't be able to link 
a patient and their medications without an external resource. In this 
case we are linking them without any external resource.


Wandering around the web I also bumped into an article that suggested 
keeping the tables in 2 databases on 2 different servers with different 
credentials. Interesting, but still without RI.


I think I have to answer this question: "is it acceptable to have this 
kind of master-detail structure in a database without RI?" (The example 
is about 2 tables, obviously in the real world the master will have 50+ 
detail tables)


Thanks

Moreno.






Re: Key encryption and relational integrity

2019-03-29 Thread Moreno Andreo

Il 28/03/2019 23:29, Peter J. Holzer ha scritto:

On 2019-03-28 18:36:40 +0100, Moreno Andreo wrote:

Il 26/03/2019 18:08, Adrian Klaver ha scritto:

To me it would seem something like:

Table medications
id    user_id    med
1    sgkighs98    Medication
2    sghighs98    Ear check



Table users
id    surname    last name
sgkighs98 John    Doe
jkopkl1   Jane    Doe
uepoti21  Foo Bar

Where there is no direct link between the two.

Are you sure there isn't?... the key "sgkighs98" is present on both
tables and I can join tables on that field, so the pseudonimysation
does not apply,

Yes. It doesn't matter whether the key is 'sgkighs98' or 1438 or
692da0c1-cf2d-476d-8910-7f82c050f8fe.


it's just "separation" (that was OK with the last privacy act, but not
with GDPR

I doubt that this is correct. The GDPR doesn't prescribe specific
technical means (there may be laws or standards in your country which
prescribe such means for medical data, but that's not the GDPR).


That's what a privacy consultant told me; there was an Italian law 
(196/2003) that introduced "minimal security measures", which was revoked 
when the GDPR came into force.




The problem is not on the application side... there you can do almost
anything you want to do. The prolem is that if someone breaks in the
server (data breach) it is easy to join patients and their
medications.

I sure hope that the doctors are able to join patients and their
medications. So at some level that has to be possible.
It would be possible at the application level, which resides on another 
server (so the separation between the pseudonymised data and the means to 
reverse it would be preserved).

If you assume a
break-in into the server, there will always be a level of penetration at
which the attacker will be able to access any data an authorized user
can access.


That's not what I got from reading the GDPR article... but I may have 
misunderstood (legal text is not my cup of tea). My understanding was 
that even in the event of a data breach there should be a mechanism that 
prevents (or "mitigates the risk that") the attacker from gaining access 
to the data in its "joined" form, so he cannot learn that patient John 
Doe has Alzheimer's, for instance, but only that in that database there 
is a patient named John Doe and someone who has Alzheimer's.


And I tried to find a solution, and since I did not like what I found 
that much (and it seems that neither do you :-) ), I came here hoping 
that someone would share their experience and shed some light on the 
topic.




 hp








Re: Key encryption and relational integrity

2019-03-29 Thread Moreno Andreo

Il 28/03/2019 23:50, Peter J. Holzer ha scritto:

On 2019-03-28 15:29:50 +0100, Moreno Andreo wrote:

here I'm trying to find a way so nobody can, without the use of the
application, match a patient with their clinical records (i.e. someone
breaking into the server -- data breach)

I think it is very optimistic to assume that an intruder would get
access to the database but not the application.

 hp

Well, the application resides on another instance (server), but if the 
attacker has been able to take control of one server, he could surely try 
to break into another one; it just takes time.







Re: Key encryption and relational integrity

2019-04-01 Thread Moreno Andreo

Il 29/03/2019 20:23, Adrian Klaver ha scritto:

On 3/29/19 9:01 AM, Moreno Andreo wrote:

And I tried to find a solution, and since I did not like that much 
what I found (and it seems that neither you do :-) ), I came here 
hoping that someone would share his experience to shed some light on 
the topic.


From what you have posted the biggest issue you are having is less 
then real time search on patient names due to the need to meet 
pseudonymisation. To me that is always going to be a problem as there 
are two opposing forces at work, overhead to implement 
pseudonymisation vs quick lookup. Might be time to lower expectations 
on what can be done.



... or just do NOT apply pseudonymisation at all, and instead try to 
enforce the other safeguards suggested by the GDPR.


Peter put in evidence a concept

"

The GDPR
doesn't say how to do that at all (the legislators were wise enough that
any attempt to do that would result in a mess). So you can't say "the
GDPR says we have to do it this way" (and if your consultant says that
it is probably time to get a different one). You have to consider all
the risks (and yes, an attacker getting access to some or all of the
data is a risk, but a doctor not being able to access a patient's
records is also a risk) and implement the best you can do considering
"the state of the art, the costs of implementation", etc.

"

that would be absolutely right. I'm not forced to use pseudonymisation if 
there's a risk of making things worse in the system. I've got to discuss 
these "two opposing forces at work" with a privacy expert (maybe choosing 
another one, as Peter suggested :-) ) and ask whether they can be grounds 
for declining pseudonymisation, on the basis that "pseudonymisation puts 
overall performance or database integrity at risk".


What do you think?







 hp

















Re: [SPAM] Re: Key encryption and relational integrity

2019-04-02 Thread Moreno Andreo

Il 01/04/2019 20:48, Rory Campbell-Lange ha scritto:

On 01/04/19, Moreno Andreo (moreno.and...@evolu-s.it) wrote:
...

I'm not forced to use pseudonimysation if there's the risk to get
things worse in a system. I've got to speak about these"two opposing
forces at work" to a privacy expert (maybe choosing another one, as
Peter suggested :-) ) and ask him if it could be used as a matter of
declining pseudonymisation because of "pseudonimysation puts at risk
overall performance or database integrity"

How to interpret the pseudonymisation conditions is ... complicated.

Yes, it is indeed... :-)

  The
UK's Information Commissioner's Office (ICO) writes that
pseudoanonymisation relates to:

 “…the processing of personal data in such a manner that the personal
 data can no longer be attributed to a specific data subject without
 the use of additional information, provided that such additional
 information is kept separately and is subject to technical and
 organisational measures to ensure that the personal data are not
 attributed to an identified or identifiable natural person.”

and that this "...can reduce the risks to the data subjects".

The concept of application realms may be relevant to consider here. An
application may be considered GDPR compliant without pseudonymisation if
other measures are taken and the use case is appropriate.
That could be my case, so I'll have to discuss the strategy and measures 
to be adopted with a privacy consultant.


On the other hand, a copy of a production database in testing which has
been pseudonymised may, if compromised, still leak personal data. As the
ICO states:

 “…Personal data which have undergone pseudonymisation, which could
 be attributed to a natural person by the use of additional
 information should be considered to be information on an
 identifiable natural person…”

https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/what-is-personal-data/what-is-personal-data/

If leakage occurs pseudonymisation has achieved nothing.


That's another aspect of the question.

Thanks for the clarification,

Moreno.-







Re: Connection refused (0x0000274D/10061)

2019-06-18 Thread Moreno Andreo

Il 18/06/19 15:57, Adrian Klaver ha scritto:

On 6/18/19 1:06 AM, Sourav Majumdar wrote:
Hii, I have checked the logged file. I am attaching it with this 
mail. PFA


In future please do not attach a 1.4MB file. Most of it was Unpacking 
info. The relevant part was at the end:


"
fixing permissions on existing directory C:/PostgreSQL/data ... 
initdb: could not change permissions of directory 
"C:/PostgreSQL/data": Permission denied


[...]

Did you follow the instructions here?:

https://www.enterprisedb.com/docs/en/11.0/PG_Inst_Guide_v11/PostgreSQL_Installation_Guide.1.08.html# 



In particular the part about:

"To perform an installation using the graphical installation wizard, 
you must have superuser or administrator privileges. To start the 
installation wizard, assume sufficient privileges and double-click the 
installer icon; if prompted, provide a password.
Note that in some versions of Windows, you can invoke the installer 
with Administrator privileges by right clicking on the installer icon 
and selecting Run as Administrator from the context menu."



We had this issue in the past, and noted that re-issuing initdb from an 
Administrator Command Prompt (not doing anything else) from the Bin 
directory worked perfectly.
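
Something along these lines, with the data directory taken from the log 
above (encoding and the -U/-W options are illustrative):

cd /d "C:\PostgreSQL\bin"
initdb -D "C:\PostgreSQL\data" -E UTF8 -U postgres -W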


This was applied for 9.5, 9.6 and 10.x. Still not using 11 in 
production, so I have no cases to test at the moment.


My 2 cent

Moreno.-







Partial table duplication via triggger

2024-02-22 Thread Moreno Andreo

Hi *,
    suppose I have 2 tables

CREATE TABLE t1(
    id uuid,
    name text,
    surname text,
    ...
    PRIMARY KEY(id)
)

CREATE TABLE t2(
    id uuid,
    master_id uuid,
    op_ts timestamp with time zone,
    name text,
    surname text,
    ...
    PRIMARY KEY(id)
)

I need to write an AFTER TRIGGER on UPDATE so all columns of t1 go in 
the same columns in t2 (except for t1.id that goes in t2.master_id, and 
t2.op_ts gets now())


I cannot write a 1-to-1 column assignment (like NEW.name := OLD.name and 
so on) because the trigger has to be used on many tables that have 
different (and evolving) schemas, and I don't want to write dozens of 
functions that have to be frequently maintained.


I'm quite a noob at PL/pgSQL; at the moment I wrote this, but I can't 
"tell the function that the fields are from the OLD row" (the error is 
"missing FROM-clause entry for table 'old'").


I tried also with field names alone (without OLD.), with no success.
Trigger is fired AFTER UPDATE in t1 (CREATE TRIGGER update_id AFTER 
UPDATE ON t1 FOR EACH ROW EXECUTE PROCEDURE update_op());


CREATE OR REPLACE FUNCTION update_op() RETURNS TRIGGER
AS $$
DECLARE
    fieldlist text := (select string_agg(column_name, ', ')
                    from information_schema.columns c
                    where table_name = TG_TABLE_NAME and
                      (column_name <> 'id'));

    oldfieldlist text := (select string_agg(column_name, ', OLD.')
                    from information_schema.columns c
                    where table_name = TG_TABLE_NAME and
                      (column_name <> 'id'));

BEGIN
    EXECUTE 'INSERT INTO t2 (master_id, op_ts, '|| fieldlist ||') 
VALUES (OLD.id, now(), OLD.'||oldfieldlist||')' USING OLD;

    RETURN NULL;
END;
$$
LANGUAGE 'plpgsql';

What am I missing?
Thanks,
Moreno

Re: [SPAM] Re: Partial table duplication via triggger

2024-02-22 Thread Moreno Andreo




On 22/02/24 17:49, Erik Wienhold wrote:

On 2024-02-22 15:14 +0100, Moreno Andreo wrote:

suppose I have 2 tables
[snip]
What am I missing?

The parameters you pass in with USING have to be referenced as $1, $2,
and so on.  For example:

DECLARE
fieldlist text := (
SELECT string_agg(quote_ident(column_name), ', ')
FROM information_schema.columns
WHERE table_name = TG_TABLE_NAME AND column_name <> 'id'
);
oldfieldlist text := (
SELECT string_agg('$1.' || quote_ident(column_name), ', ')
FROM information_schema.columns
WHERE table_name = TG_TABLE_NAME AND column_name <> 'id'
);
BEGIN
EXECUTE '
INSERT INTO t2 (id, master_id, op_ts, ' || fieldlist || ')
VALUES (gen_random_uuid(), $1.id, now(), ' || oldfieldlist || ')
' USING OLD;
RETURN NULL;
END;

Also make sure to use quote_ident() when constructing statements that
way to avoid SQL injections via column names in this case.  Or use
format() with placeholder %I, although it's not simpler when you need to
construct that variable list of identifiers.


Erik,
It worked perfectly!
It wasn't clear to me how to use $1, $2, etc. with USING; after your 
reply I had a closer look at the docs and now it's clearer to me.
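
For the archives, the oldfieldlist expression can equivalently be built 
with format() and %I, which Erik mentioned as the alternative to 
quote_ident() (a sketch):

oldfieldlist text := (
    SELECT string_agg(format('$1.%I', column_name), ', ')
    FROM information_schema.columns
    WHERE table_name = TG_TABLE_NAME AND column_name <> 'id'
);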


Many thanks,
Moreno.






Re: Windows XP to Win 10 migration issue

2017-12-12 Thread Moreno Andreo

Il 11/12/2017 14:37, Vincent Veyron ha scritto:

On Sat, 9 Dec 2017 10:11:42 -0600
Dale Seaburg  wrote:


   No Go!  Would not start.

Any error message in your logs?

I would certainly second Scott's suggestion to check the processors. I've had 
to do what you describe once, and it took me four machines before I got one 
that would start postgresql with my rescued data folder.

You might have better luck finding an old XP machine similar to your deceased 
one, installing postgresql on it, and using that to start your cluster.


I'd take a peek into the logs (maybe the Event Log); it happened to me 
that the postgres user in Windows 10 did not have permission to access 
the data directory. Setting the right permissions could make it start. 
Obviously the processor architecture (32/64 bit) must be honored.
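
If it's a permissions problem, something like this (run from an elevated 
prompt) grants the service account full control recursively; the path and 
account name are placeholders, and on newer installers the service runs 
as NETWORK SERVICE rather than a local postgres account:

icacls "C:\PostgreSQL\data" /grant postgres:(OI)(CI)F /T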


My 2 cent





Re: Functions and Indexes

2024-11-19 Thread Moreno Andreo




On 18/11/24 15:43, Gianni Ceccarelli wrote:

On Mon, 18 Nov 2024 15:37:57 +0100
Moreno Andreo  wrote:


If a column appears in the WHERE clause (and so it should be placed
in index), in case it is "processed" in a function (see below), is it
possible to insert this function to further narrow down things?

You probably want to look at
https://www.postgresql.org/docs/current/indexes-expressional.html


Hi Gianni,
    correct. That's a good starting point (now I have to add complexity 
since the index should have 4 columns, one of them is referenced with LIKE)

Thanks!
Moreno





Re: Functions and Indexes

2024-11-19 Thread Moreno Andreo




On 18/11/24 20:05, Laurenz Albe wrote:

On Mon, 2024-11-18 at 15:37 +0100, Moreno Andreo wrote:

I'm creating indexes for some tables, and I came across a doubt.

If a column appears in the WHERE clause (and so it should be placed in
index), in case it is "processed" in a function (see below), is it
possible to insert this function to further narrow down things?

Common index:
SELECT foo1, foo2 FROM bar WHERE foo1 = 2
CREATE index bar1_idx ON bar USING btree(foo1);

What about if query becomes
SELECT foo1, foo2 FROM bar WHERE (POSITION(foo1 IN 'blah blah') >0)

You could create an index like

   CREATE INDEX ON bar (position(foo1 IN 'blah blah'));

Alternatively, you could have a partial index:

   CREATE INDEX ON bar (foo1) INCLUDE (foo2)
   WHERE position(foo1 IN 'blah blah') > 0;

Interesting. Never seen this form, I'll look further on it.

I stumbled into
https://www.cybertec-postgresql.com/en/indexing-like-postgresql-oracle/
and discovered text_pattern_ops.
I'm wondering if it can be of any use in my index, that should hold a 
WHERE condition with a combination of LIKE and the POSITION expression 
above.

More docs to read ... :-)



Second question: I 've seen contrasting opinions about putting JOIN
parameters (ON a.field1 = b.field2) in an index and I'd like to know
your thoughts.

That depends on the join strategy PostgreSQL chooses.
You can use EXPLAIN to figure out the join strategy.
This article should explain details:
https://www.cybertec-postgresql.com/en/join-strategies-and-performance-in-postgresql/

Very nice article, clear and easy to understand!


Yours,
Laurenz Albe



Thanks,
Moreno.





Functions and Indexes

2024-11-18 Thread Moreno Andreo

Hi folks,
I'm creating indexes for some tables, and I came across a doubt.

If a column appears in the WHERE clause (and so it should be placed in 
index), in case it is "processed" in a function (see below), is it 
possible to insert this function to further narrow down things?


Common index:
SELECT foo1, foo2 FROM bar WHERE foo1 = 2
CREATE index bar1_idx ON bar USING btree(foo1);

What about if query becomes
SELECT foo1, foo2 FROM bar WHERE (POSITION(foo1 IN 'blah blah') >0)

Second question: I 've seen contrasting opinions about putting JOIN 
parameters (ON a.field1 = b.field2) in an index and I'd like to know 
your thoughts.


Thanks,
Moreno.





Re: Functions and Indexes

2024-11-19 Thread Moreno Andreo



On 19/11/24 12:34, Laurenz Albe wrote:

On Tue, 2024-11-19 at 11:53 +0100, Moreno Andreo wrote:

What about if query becomes
SELECT foo1, foo2 FROM bar WHERE (POSITION(foo1 IN 'blah blah') >0)

You could create an index like

     CREATE INDEX ON bar (position(foo1 IN 'blah blah'));

Alternatively, you could have a partial index:

     CREATE INDEX ON bar (foo1) INCLUDE (foo2)
     WHERE position(foo1 IN 'blah blah') > 0;

Interesting. Never seen this form, I'll look further on it.

I stumbled into
https://www.cybertec-postgresql.com/en/indexing-like-postgresql-oracle/
and discovered text_pattern_ops.
I'm wondering if it can be of any use in my index, that should hold a
WHERE condition with a combination of LIKE and the POSITION expression
above.
More docs to read ... :-)

I don't think "text_pattern_ops" will help here - queries that use LIKE
to search for a substring (LIKE '%string%') cannot make use of a b-tree
index.
Oh, OK, I was happy to use BTREEs because I had some issues with GIN/GiST 
(like indexes way bigger than the table, and so inefficient). OK, I'll 
stick with these and try harder to obtain better results.
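
For the LIKE '%string%' case specifically, the usual workaround is a 
trigram GIN index; a sketch on the column from the earlier example, 
assuming the pg_trgm extension can be installed:

CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX bar_foo1_trgm_idx ON bar USING gin (foo1 gin_trgm_ops);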


One thing I can't understand well.
In 
https://www.cybertec-postgresql.com/en/join-strategies-and-performance-in-postgresql/

you say
"Note that for inner joins there is no distinction between the join 
condition and the|WHERE|condition, but that doesn't hold for outer joins."

What do you mean?

Thanks
Moreno

BTREE index: field ordering

2025-03-28 Thread Moreno Andreo

Hi,
Postgres 16.4 (planning to go on 17.4)
I'm creating some indexes based on some slow query reported by logs.
These queries involve a WHERE clause with more than 5 fields, matched by 
=, <>, LIKE and IN().

I read that the equality fields must come first in the index, then the 
others. Is that correct?

Based on this query

SELECT COUNT(id) AS total
    FROM nx.tbl1
    WHERE
    (date_order >= '2025-03-21')
    AND (date_order <= '2025-03-29')
    AND (flag = TRUE)
    AND ((
  -- (flag = TRUE)
  -- AND (((tipo <> 'C') AND (tipo <> 'V') AND (tipo <> 
'F')) OR (tipo IS NULL) OR (tipo = ''))

  (((op <> 'C') OR (op IS NULL)) OR (tipo = 'F'))
  AND (s_state IN 
('ENQ','WFR','BLQ','BLR','WFA','FRW','FRO','','0001'))

  AND (tiporic IS NOT NULL)
  AND (tiporic NOT LIKE '%cart%')
    ) OR (
  (tiporic LIKE '%cart%') AND (S_state <> 'CON') AND 
(s_state <> '0002') AND ((op <> 'C') OR (op IS NULL))
    ))  AND (priv IS NULL OR priv = false OR (priv = true and 
idpriv = 'TEST')));


Should the following index be correct?

CREATE INDEX IF NOT EXISTS tbl1_count_idx on nx.tbl1 USING BTREE(flag, 
tipo, op, priv, idpriv, date_order, s_state, tiporic);


Would it be better to create a separate GIN/GIST index for the field 
matched with LIKE?


Thanks in advance,
Moreno

Logical replication, need to reclaim big disk space

2025-05-16 Thread Moreno Andreo

Hi,
    we are changing our old approach to binary data, moving it from bytea 
fields in a table to external storage (making the database smaller and 
related operations faster and smarter).
In short, we have a job that runs in background and copies data from the 
table to an external file and then sets the bytea field to NULL.

(UPDATE tbl SET blob = NULL, ref = 'path/to/file' WHERE id = )

This results, at the end of the operations, in a table that's less than 
one tenth of its original size.
We have a multi-tenant architecture (100s of schemas with identical 
architecture, all inheriting from public) and we are performing the task 
on one table per schema.


The problem is: this is generating BIG table bloat, as you may imagine.
Running a VACUUM FULL on an ex-22GB table on a standalone test server is 
almost immediate.
If I had only one server, I'd process one table at a time with a nightly 
script, and issue a VACUUM FULL on the tables that have already been 
processed.
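
Something like this sketch is what I had in mind (the schema pattern and 
table name are placeholders, and it assumes enough free disk for the 
table being rewritten):

for s in $(psql -At -c "SELECT schema_name FROM information_schema.schemata WHERE schema_name LIKE 'tenant_%'"); do
    psql -c "VACUUM (FULL, ANALYZE) ${s}.documents;"
done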


But I'm in a logical replication architecture (we are using a 
multimaster system called pgEdge, but I don't think it will make big 
difference, since it's based on logical replication), and I'm building a 
test cluster.


I've been instructed to issue VACUUM FULL on both nodes, nightly, but 
before proceeding I read in the docs that VACUUM FULL can disrupt logical 
replication, so I'm a bit concerned about how to proceed. Rows are 
cleared one at a time (one transaction, one row, to confine errors to the 
record that caused them).


I read about extensions like pg_squeeze, but I wonder whether they, too, 
are risky for replication.


Thanks for your help.
Moreno.-





Re: Logical replication, need to reclaim big disk space

2025-05-20 Thread Moreno Andreo



On 19/05/25 20:49, Achilleas Mantzios wrote:


On 19/5/25 17:38, Moreno Andreo wrote:




On 19/05/25 14:41, Achilleas Mantzios wrote:



On 5/19/25 09:14, Moreno Andreo wrote:


On 16/05/25 21:33, Achilleas Mantzios wrote:

On 16/5/25 18:45, Moreno Andreo wrote:


Hi,
    we are moving our old binary data approach, moving them from 
bytea fields in a table to external storage (making database 
smaller and related operations faster and smarter).
In short, we have a job that runs in background and copies data 
from the table to an external file and then sets the bytea field 
to NULL.

(UPDATE tbl SET blob = NULL, ref = 'path/to/file' WHERE id = )

This results, at the end of the operations, to a table that's 
less than one tenth in size.
We have a multi-tenant architecture (100s of schemas with 
identical architecture, all inheriting from public) and we are 
performing the task on one table per schema.


So? toasted data are kept on separate TOAST tables, unless those 
bytea cols are selected, you won't even touch them. I cannot 
understand what you are trying to achieve here.


Years ago, when I made the mistake to go for a coffee and let my 
developers "improvise" , the result was a design similar to what 
you are trying to achieve. Years after, I am seriously considering 
moving those data back to PostgreSQL.
The "related operations" I was talking about are backups and 
database maintenance when needed, cluster/replica management, etc. 
With a smaller database size they would be easier in timing and 
effort, right?
Ok, but you'll lose replica functionality for those blobs, which 
means you don't care about them, correct me if I am wrong.
I'm not saying I don't care about them, quite the opposite: they are 
protected with Object Versioning and soft deletion, which should ensure 
good protection against e.g. ransomware if someone manages to get in 
there (and if that happens, we'll have bigger troubles than this).
PostgreSQL has become very popular because of ppl who care about their 
data.
Yeah, it's always been famous for its robustness, and that's why I chose 
PostgreSQL more than 10 years ago, and, in spite of how a "normal" user 
treats his PC, we never had corruption (only where FS/disk were failing, 
but that's not PG fault)
We are mostly talking about costs, here. To give things their 
names, I'm moving bytea contents (85% of total data) to files into 
Google Cloud Storage buckets, that has a fraction of the cost of 
the disks holding my database (on GCE, to be clear ).

May I ask the size of the bytea data (uncompressed) ?.
single records vary from 150k to 80 MB, the grand total is more than 
8,5 TB in a circa 10 TB data footprint
This data is not accessed frequently (just by the owner when he 
needs to do it), so no need to keep it on expensive hardware.
I've already read in these years that keeping many big bytea fields 
in databases is not recommended, but might have misunderstood this.


Ok, I assume those are unimportant data, but let me ask, what is the 
longevity or expected legitimacy of those ? I haven't worked with 
those just reading  :


https://cloud.google.com/storage/pricing?_gl=1*1b25r8o*_up*MQ..&gclid=CjwKCAjwravBBhBjEiwAIr30VKfaOJytxmk7J29vjG4rBBkk2EUimPU5zPibST73nm3XRL2h0O9SxRoCaogQAvD_BwE&gclsrc=aw.ds#storage-pricing

would you choose e.g. "Anywhere Cache" storage?

Absolutely not, this is *not* unimportant data, and we are using 
Standard Storage, for 0,02$/GB/month + operations, that compared to a 
0.17$/GB/month of an SSD or even more for the Hyperdisks we are 
using, is a good price drop.

How about hosting your data in your own storage and spend 0$/GB/month ?
If we could host on our own hardware I'd not be here talking. Maybe we 
would have a 10-node full-mesh multimaster architecture with barman 
backup on 2 separate SANs.
But we are a small company that has to balance performance, consistency, 
security and, last but not least, costs. And margins are tightening.




Another way would have been to move these tables to a different 
tablespace, in cheaper storage, but it still would have been 3 
times the buckets cost.


can you actually mount those Cloud Storage Buckets under a supported 
FS in linux and just move them to tablespaces backed by this storage ?


Never tried, I mounted this via FUSE and had some simple operations 
in the past, but not sure it can handle database operations in terms 
of I/O bandwidth



Why are you considering to get data back to database tables?
Because now if we need to migrate from cloud to on-premise, or just 
upgrade or move the specific server which holds those data I will 
have an extra headache. Also this is a single point of failure, or 
best case a cause for fragmented technology introduced just for the 
sake of keeping things out of the DB.
This is managed as an hierarchical disk structure, so

Re: Logical replication, need to reclaim big disk space

2025-05-20 Thread Moreno Andreo



On 20/05/25 12:58, Achilleas Mantzios wrote:


Στις 20/5/25 12:17, ο/η Moreno Andreo έγραψε:




On 19/05/25 20:49, Achilleas Mantzios wrote:


On 19/5/25 17:38, Moreno Andreo wrote:




On 19/05/25 14:41, Achilleas Mantzios wrote:



On 5/19/25 09:14, Moreno Andreo wrote:


On 16/05/25 21:33, Achilleas Mantzios wrote:

On 16/5/25 18:45, Moreno Andreo wrote:


Hi,
    we are moving our old binary data approach, moving them 
from bytea fields in a table to external storage (making 
database smaller and related operations faster and smarter).
In short, we have a job that runs in background and copies data 
from the table to an external file and then sets the bytea 
field to NULL.
(UPDATE tbl SET blob = NULL, ref = 'path/to/file' WHERE id = 
)


This results, at the end of the operations, to a table that's 
less than one tenth in size.
We have a multi-tenant architecture (100s of schemas with 
identical architecture, all inheriting from public) and we are 
performing the task on one table per schema.


So? toasted data are kept on separate TOAST tables, unless those 
bytea cols are selected, you won't even touch them. I cannot 
understand what you are trying to achieve here.


Years ago, when I made the mistake to go for a coffee and let my 
developers "improvise" , the result was a design similar to what 
you are trying to achieve. Years after, I am seriously 
considering moving those data back to PostgreSQL.
The "related operations" I was talking about are backups and 
database maintenance when needed, cluster/replica management, 
etc. With a smaller database size they would be easier in timing 
and effort, right?
Ok, but you'll lose replica functionality for those blobs, which 
means you don't care about them, correct me if I am wrong.
I'm not saying I don't care about them, the opposite, they are 
protected with Object Versioning and soft deletion, this should 
assure a good protection against e.g. ransomware, if someone 
manages to get in there (and if this happens, we'll have bigger 
troubles than this).
PostgreSQL has become very popular because of ppl who care about 
their data.
Yeah, it's always been famous for its robustness, and that's why I 
chose PostgreSQL more than 10 years ago, and, in spite of how a 
"normal" user treats his PC, we never had corruption (only where 
FS/disk were failing, but that's not PG fault)
We are mostly talking about costs, here. To give things their 
names, I'm moving bytea contents (85% of total data) to files 
into Google Cloud Storage buckets, that has a fraction of the 
cost of the disks holding my database (on GCE, to be clear ).

May I ask the size of the bytea data (uncompressed) ?.
single records vary from 150k to 80 MB, the grand total is more 
than 8,5 TB in a circa 10 TB data footprint
This data is not accessed frequently (just by the owner when he 
needs to do it), so no need to keep it on expensive hardware.
I've already read in these years that keeping many big bytea 
fields in databases is not recommended, but might have 
misunderstood this.


Ok, I assume those are unimportant data, but let me ask, what is 
the longevity or expected legitimacy of those ? I haven't worked 
with those just reading  :


https://cloud.google.com/storage/pricing?_gl=1*1b25r8o*_up*MQ..&gclid=CjwKCAjwravBBhBjEiwAIr30VKfaOJytxmk7J29vjG4rBBkk2EUimPU5zPibST73nm3XRL2h0O9SxRoCaogQAvD_BwE&gclsrc=aw.ds#storage-pricing

would you choose e.g. "Anywhere Cache" storage?

Absolutely not, this is *not* unimportant data, and we are using 
Standard Storage, for 0,02$/GB/month + operations, that compared to 
a 0.17$/GB/month of an SSD or even more for the Hyperdisks we are 
using, is a good price drop.

How about hosting your data in your own storage and spend 0$/GB/month ?
If we could host on our own hardware I'd not be here talking. Maybe 
we would have a 10-node full-mesh multimaster architecture with 
barman backup on 2 separate SANs.
But we are a small company that has to balance performance, consistency, 
security and, last but not least, costs. And margins are tightening.




Another way would have been to move these tables to a different 
tablespace, in cheaper storage, but it still would have been 3 
times the buckets cost.


can you actually mount those Cloud Storage Buckets under a 
supported FS in linux and just move them to tablespaces backed by 
this storage ?


Never tried, I mounted this via FUSE and had some simple operations 
in the past, but not sure it can handle database operations in 
terms of I/O bandwidth



Why are you considering to get data back to database tables?
Because now if we need to migrate from cloud to on-premise, or 
just upgrade or move the specific server which holds those data I 
will have an extra headache. Also this is a single point of 
failure, or best case a cause for fragmented technology introduced 
jus

Re: Logical replication, need to reclaim big disk space

2025-05-19 Thread Moreno Andreo



On 16/05/25 21:33, Achilleas Mantzios wrote:

On 16/5/25 18:45, Moreno Andreo wrote:


Hi,
    we are moving our old binary data approach, moving them from 
bytea fields in a table to external storage (making database smaller 
and related operations faster and smarter).
In short, we have a job that runs in background and copies data from 
the table to an external file and then sets the bytea field to NULL.

(UPDATE tbl SET blob = NULL, ref = 'path/to/file' WHERE id = )

This results, at the end of the operations, to a table that's less 
than one tenth in size.
We have a multi-tenant architecture (100s of schemas with identical 
architecture, all inheriting from public) and we are performing the 
task on one table per schema.


So? toasted data are kept on separate TOAST tables, unless those bytea 
cols are selected, you won't even touch them. I cannot understand what 
you are trying to achieve here.


Years ago, when I made the mistake to go for a coffee and let my 
developers "improvise" , the result was a design similar to what you 
are trying to achieve. Years after, I am seriously considering moving 
those data back to PostgreSQL.
The "related operations" I was talking about are backups and database 
maintenance when needed, cluster/replica management, etc. With a smaller 
database size they would be easier in timing and effort, right?
We are mostly talking about costs, here. To give things their names, I'm 
moving bytea contents (85% of total data) to files into Google Cloud 
Storage buckets, that has a fraction of the cost of the disks holding my 
database (on GCE, to be clear ).
This data is not accessed frequently (just by the owner when he needs to 
do it), so no need to keep it on expensive hardware.
I've already read in these years that keeping many big bytea fields in 
databases is not recommended, but might have misunderstood this.
Another way would have been to move these tables to a different 
tablespace, in cheaper storage, but it still would have been 3 times the 
buckets cost.


Why are you considering to get data back to database tables?




The problem is: this is generating BIG table bloat, as you may imagine.
Running a VACUUM FULL on an ex-22GB table on a standalone test server 
is almost immediate.
If I had only one server, I'll process a table a time, with a nightly 
script, and issue a VACUUM FULL to tables that have already been 
processed.


But I'm in a logical replication architecture (we are using a 
multimaster system called pgEdge, but I don't think it will make big 
difference, since it's based on logical replication), and I'm 
building a test cluster.


So you use PgEdge , but you wanna lose all the benefits of 
multi-master , since your binary data won't be replicated ...
I don't think I need it to be replicated, since this data cannot be 
"edited", so either it's there or it's been deleted. Buckets have 
protections for data deletions or events like ransomware attacks and such.
Also, multi-master was an absolute requirement one year ago because of a 
project we were building, but that has been abandoned and now simple 
logical replication would be enough; but let's do one thing at a time.
I've been instructed to issue VACUUM FULL on both nodes, nightly, but 
before proceeding I read in the docs that VACUUM FULL can disrupt logical 
replication, so I'm a bit concerned about how to proceed. Rows are 
cleared one at a time (one transaction, one row, to confine errors to 
the record that caused them).
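
The SQL side of that row-at-a-time clearing can be pictured like this (a 
sketch only: the copy to the bucket happens in the application, and 
"someschema", the id value and the file path are placeholders; column 
names follow the earlier example):

BEGIN;
-- pick one candidate row; SKIP LOCKED lets several workers run in parallel
SELECT id, blob
FROM someschema.tbl
WHERE blob IS NOT NULL
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- (the application writes the blob to external storage here)
UPDATE someschema.tbl
   SET blob = NULL, ref = 'path/to/file'
 WHERE id = 42;   -- the id returned above; 42 is just a placeholder
COMMIT;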


PgEdge is based on the old pglogical, the 2ndQuadrant extension, 
not the native logical replication we have had since PostgreSQL 10. But 
I might be mistaken.
I don't know about this; it keeps running on the latest PG versions (we 
are about to upgrade to 17.4, if I'm not wrong), but I'll ask.
I've read about extensions like pg_squeeze, but I wonder whether they 
are safe to use with replication.


What's pgEdge's take on that, I mean the bytea thing you are trying to 
achieve here?
They are positive; they are the ones who suggested doing VACUUM FULL on 
both nodes... I'm quite new to replication, so I'm looking for some 
advice here.
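
In case it helps, a basic health check after such heavy maintenance (a 
sketch that assumes standard logical replication slots are visible on the 
node; pgEdge may manage its slots differently):

-- Are the logical slots still active, and how much WAL are they retaining?
SELECT slot_name,
       active,
       restart_lsn,
       confirmed_flush_lsn,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
WHERE slot_type = 'logical';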

Thanks for your help.
Moreno.-













Re: Logical replication, need to reclaim big disk space

2025-05-19 Thread Moreno Andreo



On 19/05/25 14:41, Achilleas Mantzios wrote:



On 5/19/25 09:14, Moreno Andreo wrote:


On 16/05/25 21:33, Achilleas Mantzios wrote:

On 16/5/25 18:45, Moreno Andreo wrote:


Hi,
we are changing our old approach to binary data, moving it from 
bytea fields in a table to external storage (making the database 
smaller and related operations faster and smarter).
In short, we have a job that runs in the background, copies data 
from the table to an external file and then sets the bytea field to 
NULL.

(UPDATE tbl SET blob = NULL, ref = 'path/to/file' WHERE id = <id>)

This results, at the end of the operations, in a table that's less 
than one tenth of its original size.
We have a multi-tenant architecture (100s of schemas with identical 
structure, all inheriting from public) and we are performing the 
task on one table per schema.


So? Toasted data are kept in separate TOAST tables; unless those 
bytea cols are selected, you won't even touch them. I cannot 
understand what you are trying to achieve here.


Years ago, when I made the mistake of going for a coffee and letting my 
developers "improvise", the result was a design similar to what you 
are trying to achieve. Years later, I am seriously considering 
moving those data back to PostgreSQL.
The "related operations" I was talking about are backups and database 
maintenance when needed, cluster/replica management, etc. With a 
smaller database they would take less time and effort, right?
Ok, but you'll lose replica functionality for those blobs, which means 
you don't care about them; correct me if I am wrong.
I'm not saying I don't care about them, quite the opposite: they are 
protected with Object Versioning and soft deletion, which should ensure 
good protection against e.g. ransomware if someone manages to get in there 
(and if that happens, we'll have bigger troubles than this).
We are mostly talking about costs here. To call things by their name, 
I'm moving bytea contents (85% of total data) to files in Google 
Cloud Storage buckets, which cost a fraction of what we pay for the 
disks holding my database (on GCE, to be clear).

May I ask the size of the bytea data (uncompressed)?
Single records vary from 150 kB to 80 MB; the grand total is more than 
8.5 TB in a roughly 10 TB data footprint.
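
(For reference, those per-table figures can be measured with something 
like the sketch below; "someschema.tbl" and "blob" are placeholders. 
octet_length() reports the uncompressed length, pg_column_size() what is 
actually stored.)

SELECT count(*)                                          AS rows_with_blob,
       pg_size_pretty(sum(octet_length(blob))::bigint)   AS uncompressed,
       pg_size_pretty(sum(pg_column_size(blob))::bigint) AS stored
FROM someschema.tbl
WHERE blob IS NOT NULL;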
This data is not accessed frequently (just by the owner when he needs 
it), so there's no need to keep it on expensive hardware.
Over the years I've read that keeping many big bytea fields 
in databases is not recommended, but I might have misunderstood this.


Ok, I assume those are unimportant data, but let me ask, what is the 
longevity or expected legitimacy of those? I haven't worked with 
those, just reading:


https://cloud.google.com/storage/pricing?_gl=1*1b25r8o*_up*MQ..&gclid=CjwKCAjwravBBhBjEiwAIr30VKfaOJytxmk7J29vjG4rBBkk2EUimPU5zPibST73nm3XRL2h0O9SxRoCaogQAvD_BwE&gclsrc=aw.ds#storage-pricing

would you choose e.g. "Anywhere Cache storage"?

Absolutely not, this is *not* unimportant data, and we are using 
Standard Storage, at $0.02/GB/month plus operations, which, compared to 
$0.17/GB/month for an SSD or even more for the Hyperdisks we are using, 
is a good price drop.



Another way would have been to move these tables to a different 
tablespace on cheaper storage, but it would still have cost 3 times 
as much as the buckets.


Can you actually mount those Cloud Storage buckets under a supported 
FS in Linux and just move them to tablespaces backed by this storage?


Never tried; I mounted one via FUSE and did some simple operations in 
the past, but I'm not sure it can handle database workloads in terms of 
I/O bandwidth.



Why are you considering moving the data back to database tables?
Because now if we need to migrate from cloud to on-premise, or just 
upgrade or move the specific server which holds those data, I will have 
an extra headache. Also this is a single point of failure, or, at best, 
a cause for fragmented technology introduced just for the sake of 
keeping things out of the DB.
This is managed as a hierarchical disk structure, so the calling server 
may literally be anywhere; it just needs an account (or a service 
account) to get in there.



The problem is that this is generating BIG table bloat, as you may 
imagine.
Running a VACUUM FULL on an ex-22 GB table on a standalone test 
server is almost immediate.
If I had only one server, I'd process one table at a time with a 
nightly script and issue a VACUUM FULL on tables that have already 
been processed.


But I'm in a logical replication architecture (we are using a 
multimaster system called pgEdge, but I don't think it makes a 
big difference, since it's based on logical replication), and I'm 
building a test cluster.


So you use PgEdge, but you wanna lose all the benefits of 
multi-master, since your binary data won't be replicated...
I don't think I need it