Re: Is there a continuous backup for pg ?

2018-03-04 Thread raf
Gary M wrote:

> Vick,
> 
> I would love to use ZFS, this project requires RHEL/SeLinux MLS. Without
> MLS and RH support, ZFS is a no-go.

btrfs could be an option. it does mostly the same things as zfs.

> On Fri, Mar 2, 2018 at 2:34 PM, Vick Khera  wrote:
> 
> > On Fri, Mar 2, 2018 at 2:34 PM, Gary M  wrote:
> >
> >> Thanks Josh,
> >>
> >> PITR is the option I was recommending. Project management diligence
> >> dictates I need at least one other option from a different recommending
> >> source, two other options optimally.
> >>
> >
> > File system snapshot on a robust system like ZFS. Rollback to any snapshot
> > almost trivially, and view any snapshot trivially.
> >



Inconsistent compilation error

2018-04-18 Thread raf
Hi,

postgresql-9.5.12 on debian-9

I have a stored function with code that looks like:

create or replace function tla_audit_delete_thing()
returns boolean stable language plpgsql as $$
declare
    r record;
    status boolean := 1;
begin
    for r in select _.* from blah_history _ where _.original_id not in
        (select id from blah)
    loop
        raise notice '% %', 'blah_history.original_id', r;
        status := 0;
    end loop;
    [...]
end
$$
security definer
set search_path = public, pg_temp;
revoke all on function tla_audit_delete_thing() from public;
grant execute on function tla_audit_delete_thing() to staff;

And I have a program that loads stored functions from disk when
they differ from what's in the database. I have just loaded a very
old database backup, brought the schema up to date, and tried to
bring the stored functions up to date.

But I'm getting this compilation error when it tries to load this
function:

ERROR:  too many parameters specified for RAISE
CONTEXT:  compilation of PL/pgSQL function "tla_audit_delete_thing" near 
line 9

Traceback (most recent call last):
  File "lib/loadfunc.py", line 228, in main
db.cursor().execute(src)
  File "/usr/lib/python2.7/dist-packages/pgdb.py", line 1026, in execute
return self.executemany(operation, [parameters])
  File "/usr/lib/python2.7/dist-packages/pgdb.py", line 1050, in executemany
rows = self._src.execute(sql)
ProgrammingError: ERROR:  too many parameters specified for RAISE
CONTEXT:  compilation of PL/pgSQL function "tla_audit_delete_thing" near 
line 9

The line in question is:

raise notice '% %', 'blah_history.original_id', r;

Which looks fine. The really weird thing is that this happens when done on a
debian9 host but when I load the same function from another host (my macos
laptop) into the same database, it works fine.

I've never encountered an inconsistency like this before.

Any suggestions as to what might be causing it?

The python versions are slightly different and the pgdb module versions
are different but I wouldn't have thought that that would affect the
compilation performed by the database server itself:

  debian9:   python-2.7.13 pgdb-5.0.3
  macos-10.11.6: python-2.7.14 pgdb-4.2.2

And the sql sent to the database server is identical from both hosts.

And I don't think anything much has changed on the debian host recently.

And it's not just the old backup. The same is happening with other copies of
essentially the same database.

And all the other stored functions were loaded fine. It's just this one that
went wrong.

Thanks in advance for any insights you can share.

cheers,
raf




Re: Inconsistent compilation error

2018-04-25 Thread raf
Tom Lane wrote:

> r...@raf.org writes:
> > I have a stored function with code that looks like:
> 
> > raise notice '% %', 'blah_history.original_id', r;
> 
> > But I'm getting this compilation error when it tries to load this
> > function:
> 
> > ERROR:  too many parameters specified for RAISE
> 
> That is ... weird.  The code is checking that the number of % symbols in
> the string literal matches the number of comma-separated expressions after
> the literal, and it sure looks like those match.
> 
> > Which looks fine. The really weird thing is that this happens when done on a
> > debian9 host but when I load the same function from another host (my macos
> > laptop) into the same database, it works fine.
> 
> Weirder yet.
> 
> The only idea that comes to mind is that '%%', with nothing between the
> percent signs, means a literal '%' character not two parameter markers.
> If the contents of the string literal were '%%' then this is exactly
> the error message you'd get.
> 
> So here's a long-shot theory: what's in your source code is not a plain
> space but something weird like a no-break space, and when you transfer
> that text from machine A to machine B, the no-break space is getting
> dropped.
> 
>   regards, tom lane

thanks but it is a normal space character.




Re: Inconsistent compilation error

2018-04-25 Thread raf
David G. Johnston wrote:

> On Wed, Apr 18, 2018 at 7:04 PM, Melvin Davidson 
> wrote:
> 
> >
> > Normally, literals are inside the first quotes. IE: raise notice '
> > blah_history.original_id' %', r;
> >
> 
> ​But a compiler doesn't care about human concepts like "normally" - it just
> cares about "syntactically correct" and as David and Tom observe the
> original post as visually observed is correct.  Which leads one to think
> that reality and the original post somehow don't match, or as Tom said what
> the server sees and what is sent end up being different (client encoding
> issues or some such).
> 
> And pretty sure "r" being NULL just puts an empty string where the % is.
> 
> David J.

passing a null value to raise notice outputs it as "<NULL>".
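
for example (a minimal sketch):

  do $$
  declare
      v text; -- never assigned, so it is null
  begin
      raise notice '% %', 'blah_history.original_id', v;
      -- NOTICE:  blah_history.original_id <NULL>
  end
  $$;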




Re: Inconsistent compilation error

2018-04-25 Thread raf
Adrian Klaver wrote:

> On 04/18/2018 06:02 PM, r...@raf.org wrote:
> > Hi,
> > 
> > postgresql-9.5.12 on debian-9
> > 
> > I have a stored function with code that looks like:
> > 
> >  create or replace function tla_audit_delete_thing()
> >  returns boolean stable language plpgsql as $$
> >  declare
> >  r record;
> >  status boolean := 1;
> >  begin
> >  for r in select _.* from blah_history _ where _.original_id not in 
> > (select id from blah)
> >  loop
> >  raise notice '% %', 'blah_history.original_id', r;
> >  status := 0;
> >  end loop;
> >  [...]
> >  end
> >  $$
> >  security definer
> >  set search_path = public, pg_temp;
> >  revoke all on function tla_audit_delete_thing() from public;
> >  grant execute on function tla_audit_delete_thing() to staff;
> 
> So if you cut and paste the above directly into the database via psql on the
> Debian host do you see the same issue?

No. Cutting and pasting the text into psql works fine.

> > And I have a program that loads stored functions from disk when
> > they differ from what's in the database. I have just loaded a very
> > old database backup, brought the schema up to date, and tried to
> > bring the stored functions up to date.
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com



Re: Inconsistent compilation error

2018-04-25 Thread raf
Adrian Klaver wrote:

> On 04/18/2018 06:02 PM, r...@raf.org wrote:
> > Hi,
> > 
> 
> > 
> > But I'm getting this compilation error when it tries to load this
> > function:
> > 
> >  ERROR:  too many parameters specified for RAISE
> >  CONTEXT:  compilation of PL/pgSQL function "tla_audit_delete_thing" 
> > near line 9
> > 
> >  Traceback (most recent call last):
> >File "lib/loadfunc.py", line 228, in main
> >  db.cursor().execute(src)
> >File "/usr/lib/python2.7/dist-packages/pgdb.py", line 1026, in 
> > execute
> >  return self.executemany(operation, [parameters])
> >File "/usr/lib/python2.7/dist-packages/pgdb.py", line 1050, in 
> > executemany
> >  rows = self._src.execute(sql)
> >  ProgrammingError: ERROR:  too many parameters specified for RAISE
> >  CONTEXT:  compilation of PL/pgSQL function "tla_audit_delete_thing" 
> > near line 9
> > 
> > The line in question is:
> > 
> >  raise notice '% %', 'blah_history.original_id', r;
> > 
> > Which looks fine. The really weird thing is that this happens when done on a
> > debian9 host but when I load the same function from another host (my macos
> > laptop) into the same database, it works fine.
> > 
> > I've never encountered an inconsistency like this before.
> > 
> > Any suggestions as to what might be causing it?
> > 
> > The python versions are slightly different and the pgdb module versions
> > are different but I wouldn't have thought that that would affect the
> > compilation performed by the database server itself:
> > 
> >debian9:   python-2.7.13 pgdb-5.0.3
> 
> Hmm, wonder if there is an oops in the below:
> 
> http://www.pygresql.org/contents/changelog.html
> 
> Version 5.0 (2016-03-20)
> Changes in the DB-API 2 module (pgdb):
> "SQL commands are always handled as if they include parameters, i.e. literal
> percent signs must always be doubled. This consistent behavior is necessary
> for using pgdb with wrappers like SQLAlchemy."

well spotted! but i'm not sure. it depends on what they mean by
"literal percent signs". that might just mean percent signs that
appear in SQL string literals that need to be output ultimately
as actual percent signs but i thought that they always had to be
doubled. so i'm not sure what they are saying has changed in
that version. so maybe you are right.

but if they are suggesting that every single percent sign needs
to be doubled by the caller before passing sql to the pgdb
module, that sounds like an annoying change to have made.
but no doubt they have their reasons.

i've encountered other new behaviour with pygresql-5+ that i had
to find ways to disable/revert so it's not surprising that there
might be other oddities to encounter. i'm surprised it's only
become a problem now.

i think you're definitely right. when i change my function
loading program to duplicate all percent signs in all the source
code before passing it to pgdb, they all load successfully and a
subsequent audit of the code inside the database and on disk
still shows that they match, so pgdb must be de-duplicating all
the duplicated percent signs.
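
i.e. to get the line in question onto the server unchanged,
something like this has to be sent through pgdb 5+ (a sketch):

  -- what the stored function's source contains:
  raise notice '% %', 'blah_history.original_id', r;
  -- what now has to be sent through pgdb so that the server
  -- receives the line above (pgdb halves doubled percent signs):
  raise notice '%% %%', 'blah_history.original_id', r;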

thanks so much for spotting this.

cheers,
raf




Re: Rationale for aversion to the central database?

2018-04-27 Thread raf
> On Sun, 8 Apr 2018 14:39:49 -0700
> Guyren Howe wrote:
> 
> When it comes to databases, I have universally encountered the
> attitude that one should treat the database as a dumb data bucket.
> There is a *very* strong aversion to putting much of any business
> logic in the database. I encounter substantial aversion to have
> multiple applications access one database, or even the reverse: all
> abstraction should be at the application layer.

That seems un-pragmatic to me. IMHO if any business logic needs
access to lots of data, it's best implemented by code that
resides in the database itself, close to the data. I once had a
job where, one night a week, I couldn't go home until a certain
program had been run and completed successfully. That meant I
wasn't going home until midnight. I realised that the person
that wrote it was (at least for the purposes of that program)
treating the database like "a dumb data bucket". Millions of
records were being selected, transferred over a network to
another host, effectively grouped and summarised, then each
resulting summary record was inserted into another table, one
stored function call at a time (with all the network round trip
times that that implies). It took 2-3 hours to complete. I
replaced it with a stored function that took two minutes to run
and I was able to start going home hours earlier. So, as you can
imagine, it would take an incredibly good argument to convince
me that business logic shouldn't reside in the database. :-)

I've always assumed (perhaps incorrectly) that not wanting
business logic in the database (and not using all of the
features that databases provide) was just a way to justify
programmers not having to learn SQL but it's just another
language and paradigm and programmers know so many languages and
paradigms anyway that I never understood why knowing SQL was a
problem. My assumption is probably wrong. And I guess as long as
the resulting software runs as quickly as it needs to, it
probably isn't an issue. If it's a widely held view, then it
must be workable.

I use stored functions exclusively, partly so that the business
logic is close to the data it needs to be close to for
efficiency, but also for security reasons. Users have no
permissions to select, update, insert or delete anything. All
they can do is execute stored functions that have been
previously created by the database owner who does have those
permissions. For a system that's accessible to the public, it's
a great way to guarantee that SQL injection can't do any harm
(even if all the other protections fail to work or are
bypassed). For a system that's only accessible to internal
staff, it's a great way to defend against their workstations
being infected by malware that goes looking for databases to
attack.
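
A minimal sketch of that setup (the table and function names
are hypothetical):

  create function get_thing(p_id integer) returns text
  language sql
  security definer
  set search_path = public, pg_temp
  as $$ select name from thing where id = p_id $$;

  revoke all on function get_thing(integer) from public;
  grant execute on function get_thing(integer) to staff;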

cheers,
raf




Re: CSVQL? CSV SQL? tab-separated table I/O? RENAME COLUMN

2018-05-02 Thread raf
Ron wrote:

> On 05/02/2018 04:49 PM, David G. Johnston wrote:
> [snip]
> > 
> > - the microsoft patented CSV would be required for implementation. it
> > handles special data with commas and double-quotes in them
> > 
> > 
> > If true this seems like a show-stopper to anything PostgreSQL would 
> > implement
> 
> If MSFT really holds a patent on the CSV format, then Postgresql is already
> in a world of hurt.

Even if the CSV format was patented, don't patents only last 17 years?
Has anyone found the patent? When was it granted? I would guess in the 1980s.
And can anyone remember MS ever demanding a fee for using it?




Re: Control PhoneNumber Via SQL

2018-05-15 Thread raf
haman...@t-online.de wrote:

> >> Hi,
> >> 
> >> I would like to know if it is possible to control the phone number in SQL
> >> before inserting it to the destination DB?
> >> 
> >> I have a model in Django:
> >> 
> >> class BasePerson(TimeStampedModel):
> >>  phone_number = PhoneNumberField(max_length=50, verbose_name=_(u'phone
> >> number'), blank=True)
> >> 
> >> The data for phone number that I am migrating doesn't have country code. I
> >> want to determine first if the number has country code in it, if it doesn't
> >> then I will add the country code on the number before INSERTING it to the
> >> destination database.
> >> 
> Hi, something like
> insert into newtable (phone, ...)
>  select case when phone ~ '^0[1-9]' then regexp_replace(phone, '^0', '+49')
>   when phone ~ '^00' then regexp_replace(phone, '^00', '+')
>   else phone end, ...
>  from oldtable;
> 
> Regards
> Wolfgang

it might be better to ask this on a django forum since it sounds
like you want django's orm to handle this. you probably just
need to subclass PhoneNumberField so its constructor will
reformat whatever is given to it as the phone number to be
inserted. maybe you need a Manager class for the model (probably
not). maybe you just need a function that takes the default
country code or country dialling code and the phone number and
returns what you want to insert and then always use its return
value when assigning a value to the phone_number field. i
suspect that subclassing PhoneNumberField is probably the best
approach.

cheers,
raf




Re: How to search particular line/text code in all Postgres all database objects

2018-08-27 Thread raf
Raghavendra Rao J S V wrote:

> Hi All,
> 
> How to search particular line/text code in all Postgres all database
> object's like functions,triggers,views etc ?
> 
> Is there any data dictionary table in Postgres?
> 
> Eg:- My requirement is , I want to found data related to employee table in
> any function, trigger,view etc.
> 
> Kindly help me.
> -- 
> Regards,
> Raghavendra Rao J S V
> Mobile- 8861161425

This may not be helpful but all function, trigger and view
source code (and table definitions) should exist outside the
database in a git repository or similar and that can be
searched. Otherwise, you probably can't audit whether or
not the code in the database is up to date because you have
nothing to compare it to.

I only use functions and the code for them resides in pg_proc.
My tool for auditing functions contains this query which might
be a starting point for you:

select
    p.proname,        -- name
    p.proretset,      -- returns setof?
    p.proisstrict,    -- strict 't' or 'f'
    p.provolatile,    -- volatile or stable 'v' or 's'
    p.prosecdef,      -- security definer 't' or 'f'
    p.pronargs,       -- number of in arguments
    p.prorettype,     -- return type
    p.proargtypes,    -- space-separated list of in arg types
    p.proallargtypes, -- array of in/out arg types (iff there are out args)
    p.proargmodes,    -- array of in/out arg modes like {i,o,o} (iff there are out args)
    p.proargnames,    -- array of in/out arg names like {id,code,name}
    p.prosrc,         -- source code
    cast(cast(p.oid as regprocedure) as text) -- nice signature
from
    pg_user u,
    pg_proc p
where
    u.usename = current_user and
    p.proowner = u.usesysid and
    p.proname like 'aps%' -- 'aps' is the prefix for our functions
order by
    p.proname

Or just: select proname, prosrc from pg_proc where proname like ...
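
For the original question, something like this might be a
starting point (assuming the table of interest is named
employee):

  -- functions (and trigger functions) whose source mentions it:
  select proname from pg_proc where prosrc ilike '%employee%';

  -- views whose definitions mention it:
  select viewname from pg_views where definition ilike '%employee%';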

cheers,
raf




question about client/server version mismatches

2019-08-12 Thread raf
Hi,

macos-10.11.6
postgresql-9.5.16 (server)
postgresql-9.6.12 (client)

I've just rebooted my macos laptop for the first time in months
and have started having a problem loading database backups there.
I get this error output:

  pg_restore: [archiver (db)] Error while INITIALIZING:
  pg_restore: [archiver (db)] could not execute query: ERROR:  unrecognized configuration parameter "idle_in_transaction_session_timeout"
  Command was: SET idle_in_transaction_session_timeout = 0;

It seems that I was using 9.6 clients and a 9.5 server.
I'm not sure what the situation was before the reboot.
The server was probably 9.6.12 and all was well.
I never got around to fully upgrading the server
(or the old version wouldn't still be there).

Loading a 9.5 backup using the 9.6 pg_restore led to
the above error. Changing to 9.5 clients fixed it.

Since the backup itself was from a 9.5.12 server, it
seems that the 9.6 parameter, idle_in_transaction_session_timeout,
must have been set by the 9.6 client even though it was
connected to a 9.5 server. Is that expected behaviour?

Do the clients know when each configuration parameter
was introduced and only use them when connected to servers
where they mean something? Or am I just misunderstanding
what's happening? I would have thought a new client
would be able to work with an old but supported server.

It's not a big deal. I'll get around to completing the
upgrade and it'll be fine again but I'm curious.

cheers,
raf





Re: Work hours?

2019-08-27 Thread raf
Rob Sargent wrote:

> On 8/27/19 4:59 PM, Adrian Klaver wrote:
> > On 8/27/19 3:27 PM, stan wrote:
> > > I am just starting to explore the power of PostgreSQL's time and date
> > > functionality. I must say they seem very powerful.
> > > 
> > > I need to write a function that, given a month, and a year as input
> > > returns
> > > the "work hours" in that month. In other words something like
> > > 
> > > 8 * the count of all days in the range Monday to Friday) within that
> > > calendar month.
> > > 
> > > Any thoughts as to the best way to approach this?
> > 
> > Use generate_series:
> > 
> > https://www.postgresql.org/docs/11/functions-srf.html
> > 
> > to generate all the days in the month.
> > 
> > Loop over the days and use EXTRACT:
> > 
> > https://www.postgresql.org/docs/11/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT
> > 
> > 
> > to find the dates with a dow(The day of the week as Sunday (0) to
> > Saturday (6)) or isodow(The day of the week as Monday (1) to Sunday (7))
> > that falls in Mon-Fri and add to counter.
> > 
> > > 
> Don't you also need a feed from something like google US holidays (assuming
> OP is stateside)

The definition of "work hours" differs depending on who
it's being calculated for. If your definition above is
sufficient for your needs then ignore the following but
some people work an 8 hour day, others a 7.6 hour day,
etc. It varies from one agreement to another. Some
people work 7 days a week with several weeks "on" and
several weeks "off". Some are full-time. Others are
part-time. Some might have regular hours. Others might
have an arbitrary roster that changes from week to
week. Some public holidays are state-wide. Others are
regional so you need to know where they work and the
definitions of the regions. Some public holidays aren't
even for the whole day. And no doubt every country is
different.

All of that is by far the biggest component of such a
calculation. The postgres-specific bit is easy and yes,
what Adrian suggests will be fine if you can use the
dates returned by generate_series to look up the
working conditions of the person involved. I've done it
in plpgsql with a loop over the dates rather than using
generate_series in plain sql. Both are fine but plain
sql is probably faster. Do whatever is most readable.
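
For example, a minimal sketch in plain sql, for August 2019,
assuming an 8 hour day, Monday to Friday, and ignoring public
holidays:

  select 8 * count(*) as work_hours
  from generate_series(date '2019-08-01',
                       date '2019-08-31',
                       interval '1 day') as d
  where extract(isodow from d) between 1 and 5;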

cheers,
raf





update returning order by syntax error question

2019-09-11 Thread raf
Hi,

postgresql-9.6.15

I just tried something like:

  select * from
  (update tblname t set ... where ... returning ...)
  order by ...;

assuming it would work but it didn't.
That's OK. I found on stackoverflow
that a CTE can be used to do it:

  with u as
  (update tblname t set ... where ... returning ...)
  select * from u order by ...;

What surprises me is the syntax error:

  ERROR:  syntax error at or near "t"
  LINE 2:  tblname t
                   ^

If the syntax was invalid because an update returning
statement can't appear in a from clause, I'd expect the
error to be at the token "update".

It's almost as if the parser sees "update" as a possible
table name (rather than a reserved word) and "tblname"
as the alias for that table and it's expecting a comma
or left/right/full etc. when it sees the "t".

Anyone care to explain why the error is what it is?
It's no big deal. I'm just curious.

cheers,
raf





Re: Redis 16 times faster than Postgres?

2019-09-29 Thread raf
Steve Litt wrote:

> On Mon, 30 Sep 2019 07:46:14 +1000
> Nathan Woodrow  wrote:
> 
> > Redis is an in-memory database so I would expect it to always be much
> > faster.
> 
> Is there a way to have Redis periodically update an on-disk backup?
> That would be great, but otherwise you're at the mercy of your power
> company (here in Central Florida it's routine for power to go down and
> stay down for five hours).
> 
> SteveT
>  
> Steve Litt
> Author: The Key to Everyday Excellence
> http://www.troubleshooters.com/key
> Twitter: http://www.twitter.com/stevelitt

i don't know but voltdb, another in-memory database,
replicates to other instances which can be in different
geographical locations and so not prone to a single
power failure. perhaps all in-memory databases are
aware of the need for this.

cheers,
raf





Re: Users, Roles and Connection Pooling

2019-10-02 Thread raf
> On Wed, 2019-10-02 at 08:46 +1000, Matt Andrews wrote:
> > Here’s a question I’ve been asking for a while and just can’t find an
> > answer to, so I thought I’d ask it here. The answer could be
> > subjective, but here goes...
> > 
> > When a web app connects to Postgres via a connection pooler, what is
> > the best way to manage privileges for the connecting user? Especially
> > when their is a complex hierarchy of privileges?
> > 
> > Should each signed up user have their own role which inherits from
> > whichever roles they are members of? This means that the connection
> > pool role should then be granted in every user?
> > 
> > Does creating a Postgres role for every app user scale? Roles can
> > only have names, is an email address a good name for individual app
> > users?
> > 
> > Are their any resources out there that anyone knows of that addresses
> > these questions?
> > 
> > There’s plenty of stuff out there on roles and permissions, I’ve read
> > a lot of it, but not much on dealing with individual app users
> > through a connection pool. 
> > 
> > I’m thinking that any queries should start with SET ROLE ‘username’
> > and end with RESET ROLE. Is this how it could work?
> > 
> > Any help would be greatly appreciated. 

I can't help with questions about scale but I like to give roles/users
almost no permissions at all. i.e. They can't select, insert, update
or delete anything. All they have permission to do is to execute stored
functions that were installed by a role with the necessary permissions
and they are security definer functions so the permissions of the role
that created them apply when the functions are called. This means that
there will never be any successful SQL injection, even if the application
code is buggy, so it's more important for web applications, but I apply
this method to internal systems as well. This approach might help with
scaling because fewer users might be needed but I'm not sure.

cheers,
raf





Re: Users, Roles and Connection Pooling

2019-10-02 Thread raf
Rob Sargent wrote:

> On 10/2/19 5:27 PM, raf wrote:
> > > 
> > I can't help with questions about scale but I like to give roles/users
> > almost no permissions at all. i.e. They can't select, insert, update
> > or delete anything. All they have permission to do is to execute stored
> > functions that were installed by a role with the necessary permissions
> > and they are security definer functions so the permissions of the role
> > that created them apply when the functions are called. This means that
> > there will never be any successful SQL injection, even if the application
> > code is buggy, so it's more important for web applications, but I apply
> > this method to internal systems as well. This approach might help with
> > scaling because fewer users might be needed but I'm not sure.
> > 
> > cheers,
> > raf
> > 
> How easy is it to introduce a new function call all the way up to the app
> user?

It's easy when you have the right tools to make it
easy: i.e. you write the stored function, then run a
tool to generate the python class for the result set
and the python function that the client applications
can then call to execute the stored function and return
its results.

> Does this approach preclude making use of any query generation
> techniques available?

Yes, it does. I'm happy to write my own plpgsql and
sql. I find that usually produces faster results
(mainly by reducing the temptation to process data
outside the database) as well as being more secure.

I once worked in a job where I couldn't go home until
some program had finished and it was sucking data out
of the database just to summarise it and insert the
summaries. It took hours. I replaced it with a stored
procedure that took two minutes and I started going
home much earlier. Rightly or wrongly, that made me
prefer processing data inside the database.

cheers,
raf





Re: jsonb_set() strictness considered harmful to data

2019-10-20 Thread raf
Steven Pousty wrote:

> I would think though that raising an exception is better than a
> default behavior which deletes data.

I can't help but feel the need to make the point that
the function is not deleting anything. It is just
returning null. The deletion of data is being performed
by an update statement that uses the function's return
value to set a column value.
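
A minimal sketch of the behaviour (with a hypothetical table t
and jsonb column doc):

  -- jsonb_set is strict: a null argument makes the whole result null
  select jsonb_set('{"a": 1}'::jsonb, '{b}', null::jsonb); -- returns null

  -- so an update like this silently nulls the whole column
  -- whenever new_value is null (if the column is nullable):
  -- update t set doc = jsonb_set(doc, '{b}', new_value);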

I don't agree that raising an exception in the function
is a good idea (perhaps unless it's valid to assume
that this function will only ever be used in such a
context). Making the column not null (as already
suggested) and having the update statement itself raise
the exception seems more appropriate if an exception is
desirable. But that presumes an accurate understanding
of the behaviour of jsonb_set.

Really, I think the best fix would be in the
documentation so that everyone who finds the function
in the documentation understands its behaviour
immediately. I didn't even know there was such a thing
as a strict function or what it means and the
documentation for jsonb_set doesn't mention that it is
a strict function and the examples of its use don't
demonstrate this behaviour. I'm referring to
https://www.postgresql.org/docs/9.5/functions-json.html.

All of this contributes to the astonishment encountered
here. Least astonishment can probably be achieved with
additional documentation but it has to be where the
reader is looking when they first encounter the
function in the documentation so that their
expectations are set correctly and set early. And
documentation can be "fixed" sooner than postgresql 13.

Perhaps an audit of the documentation for all strict
functions would be a good idea to see if they need
work. Knowing that a function won't be executed at all
and will effectively return null when given a null
argument might be important to know for other functions
as well.

cheers,
raf





Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread raf
Steven Pousty wrote:

> On Sun, Oct 20, 2019 at 4:31 PM raf  wrote:
> 
> > Steven Pousty wrote:
> >
> > > I would think though that raising an exception is better than a
> > > default behavior which deletes data.
> >
> > I can't help but feel the need to make the point that
> > the function is not deleting anything. It is just
> > returning null. The deletion of data is being performed
> > by an update statement that uses the function's return
> > value to set a column value.
> >
> > I don't agree that raising an exception in the function
> > is a good idea (perhaps unless it's valid to assume
> > that this function will only ever be used in such a
> > context). Making the column not null (as already
> > suggested) and having the update statement itself raise
> > the exception seems more appropriate if an exception is
> > desirable. But that presumes an accurate understanding
> > of the behaviour of jsonb_set.
> >
> > Really, I think the best fix would be in the
> > documentation so that everyone who finds the function
> > in the documentation understands its behaviour
> > immediately.
> >
> Hey Raf
> 
> In a perfect world I would agree with you. But often users do not read ALL
> the documentation before they use the function in their code OR they are
> not sure that the condition applies to them (until it does).

I'm well aware of that, hence the statement that this
information needs to appear at the place in the
documentation where the user is first going to
encounter the function (i.e. in the table where its
examples are). Even putting it in a note box further
down the page might not be enough (but hopefully it
will be).

cheers,
raf





Re: PostgreSQL vs PostgresQL

2019-12-10 Thread raf
Олег Самойлов wrote:

> My vote is to change the official name from PostgreSQL to PostgresQL.
> The reason: many young people look at PostgreSQL and think that the
> short name is Postgre. That sounds awful, especially in Russian, and
> is hard to fix. With PostgresQL they will think that the short name
> is Postgres, and that is good. There will also be the opinion that
> PostgresQL = Postgres Query Language, which is also true, given the
> many standard SQL features that Postgres still lacks.

-1





Re: Restrict connection from pgadmin.

2020-02-04 Thread raf
Tom Lane wrote:

> Pawan Sharma  writes:
> > Yes, pgadmin will show in pg_stat_activity, but how can I block it?
> > I think I need to create a job that terminates pgadmin connections,
> > schedule it for every 5 minutes, and so check for any new connections
> > from pgadmin.
> 
> I think onlookers are still completely mystified as to why you consider
> this a useful activity.
> 
> pgadmin is nothing but a GUI.  Whatever can be done through it can be
> done equally well through psql, or any other client software.  So if
> you're looking for security against unwanted SQL commands, you're going
> about it the wrong way (better to look at preventing logins of privileged
> accounts, and/or use of SQL permissions to limit what can be done).
> If your objective is something else, you haven't explained what that is.
> 
>   regards, tom lane

You could give normal/application users/roles very
limited permissions (i.e. just the ability to execute
pre-existing security definer functions and nothing
else), and have a database owner user/role with all the
permissions to create those functions. That's my insane
setup and I love it. Then use pg_hba.conf to limit
which IP addresses the database owner user/role can log
in from. You could also uninstall pg_admin. :-)
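
A hypothetical pg_hba.conf line for that last restriction might
look like this (names and address invented):

  # the database owner role may only log in from one admin host
  host    mydb    db_owner    192.0.2.10/32    scram-sha-256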

Also, if you are worried about "doing a lot of damage
in a minute", always script everything and test it
first either in a transaction that will rollback or on
a test server before executing it in production. I
don't think a GUI is suitable for this.

cheers,
raf





Re: Backing out of privilege grants rabbit hole

2020-04-02 Thread raf
It's probably more sensible to grant permissions to roles that
represent groups, and have roles for individual users that
inherit the permissions of the group roles. Then you don't
need to revoke the permissions just because an individual
has left.
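
For example (a minimal sketch with hypothetical names):

  create role accounting nologin;
  grant select, insert, update on ledger to accounting;
  create role alice login in role accounting;
  -- when alice leaves, this is all that's needed:
  drop role alice;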

cheers,
raf

AC Gomez wrote:

> Thanks for the quick response. The problem is, in most cases the owner is
> not the grantee. So if a role, let's say a temp employee, gets grants, then
> leaves, I can't do a drop owned because that temp never owned those
> objects, he just was granted access. Is there a "drop granted" kind of
> thing?
> 
> On Thu, Apr 2, 2020, 11:37 PM Guyren Howe  wrote:
> 
> > https://www.postgresql.org/docs/12/sql-drop-owned.html
> >
> > On Apr 2, 2020, at 20:34 , AC Gomez  wrote:
> >
> > Do I understand correctly that if a role was assigned countless object
> > privileges and you want to delete that role you have to sift through a
> > myriad of privilege grants in what amounts to a time consuming trial and
> > error exercise until you've got them all?
> >
> > Or is there a single command that with just delete the role and do a
> > blanket grant removal at the same time?




Re: timestamp and timestamptz

2020-04-15 Thread raf
Tim Cross wrote:

> Niels Jespersen  writes:
> 
> > Hello all
> >
> > We have some data that have entered a timestamp column from a csv. The data 
> > in the csv are in utc. We want to access the data in
> > our native timezone (CET). 
> >
> > I am considering a few alternatives: 
> >
> > 1.   Early in the process, convert to timestamptz and keep this 
> > datatype. 
> >
> > 2.   Early in the process, convert to timestamp as understood in CET.  
> > This will imply by convention that the data in the timestamp
> > column represents CET. Users will need to be told that data represents CET, 
> > even if data is somwhere in the future kept in another
> > country in another timezone. 
> >
> > I probably should choose 1 over 2. But I am a bit hesitant, probably 
> > because we almost never have used timestamptz. 
> >
> > Can we agree that the below query is selecting both the original utc 
> > timestamps and 2 and 1 (as described above)?
> >
> > set timezone to 'cet';
> >
> > select read_time read_time_utc, (read_time at time zone 'utc')::timestamp 
> > read_time_cet, (read_time at time zone 'utc')::timestamptz
> > read_time_tz from t limit 10;
> >
> > We are on Postgres 12. 
> 
> Keep your life simple - just go with option 1. Keep all timestamps in
> UTC and let clients deal with it in whatever way they need to.

That suggestion (keep timestamps in UTC) doesn't sound like option 1 to me
(i.e. convert timestamps-that-are-in-UTC-but-don't-record-the-fact to 
timestamptz).
Option 1 could be done using any timezone. The point is to record what the
timezone is. It doesn't matter whether it's left as UTC or converted to CET
as long as that decision is recorded in the data by using timestamptz (and
as long as you get postgres to perform the time zone conversion because it
will get it right).

But presumably there is some reason why the data is already in UTC
and there's probably no reason to change that. But its timezone should
be recorded.

> This will
> also help deal with issues associated with daylight savings time (DST
> can be a real pain as different locations have it and others don't and
> the 'switchover' date is subject to political whims and can change).

That's a dim view of the tzdata database which gets updated regularly
to take such political decisions into account. As long as postgres uses
the same tzdata as all good UNIX-based operating systems do, and it's kept
up to date, it should be fine (as long as you never trust what a Windows
host says the timezone is for arbitrary timestamps).

This is required even if you leave data in UTC if it ever needs to be
displayed in any other time zone. Postgres still needs up to date tzdata
to perform the conversions later for users.

> Your option 2 will cause all sorts of issues and keep in mind that most
> larger countries have multiple timezones, so even if your data is all
> associated with a single country, you can potentially have multiple
> conversion routines required. On most *nix systems, clock time is UTC as
> well, so having everything in UTC really helps when you want to do
> diagnosis across your database and system log files etc. 

I don't see much difference in storing a timestamptz in UTC or a timestamptz
in CET. As long as the intended offset from UTC is recorded (which it is
in a timestamptz) it should be fine. If the timestamp is CET, then that fact
is in the data and the user doesn't need to be told it separately. It's obvious
when they see the data because the timezone is part of the data.

And you don't need conversion routines. Postgres can compare timestamptz
values and convert from one time zone to another for display purposes.
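
For example (a minimal sketch):

  -- store once, display in whatever zone the user wants later:
  select timestamptz '2020-04-15 12:00:00+00' at time zone 'Europe/Paris';
  -- returns 2020-04-15 14:00:00 (+02 applies on that date)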

However, option 2 seems to be converting the data to CET but not using the
timestamptz datatype. That would be a bad idea. I think timestamptz should
always be preferred to timestamp. Not recording the timezone is where the
problems come from.

Although having a single timezone for log files is a great idea. They hardly
ever include timezone information so keeping everything in the same timezone
is important. Mind you, they hardly ever even include the year. What's with
that? (yes, I'm looking at you /var/log).

> -- 
> Tim Cross

cheers,
raf





Re: timestamp and timestamptz

2020-04-16 Thread raf
David G. Johnston wrote:

> On Wed, Apr 15, 2020 at 4:53 PM raf  wrote:
> 
> > I don't see much difference in storing a timestamptz in UTC or a
> > timestamptz
> > in CET. As long as the intended offset from UTC is recorded (which it is
> > in a timestamptz) it should be fine.
> 
> I only really skimmed the entire response but this framing of how
> timestamptz stores results is wrong.  Once you've stored a timestamptz in
> PostgreSQL you no longer have any knowledge of the timezone.  If you truly
> need that you need to record that in a different field.  What you do know
> is that PostgreSQL has a known point-in-time in UTC and can give you back
> the same value expressed in any other timezone according to the rules in
> the timezone database.
> 
> Or, as written verbatim in the documentation:
> """
> For timestamp with time zone, the internally stored value is always in UTC
> (Universal Coordinated Time, traditionally known as Greenwich Mean Time,
> GMT). An input value that has an explicit time zone specified is converted
> to UTC using the appropriate offset for that time zone. If no time zone is
> stated in the input string, then it is assumed to be in the time zone
> indicated by the system's TimeZone parameter, and is converted to UTC using
> the offset for the timezone zone.
> """
> 
> https://www.postgresql.org/docs/12/datatype-datetime.html
> 
> David J.

You are misinterpreting the documentation, or the
documentation is incomplete/misleading at that
location. It doesn't just convert and store the time in
UTC. It stores the time in UTC and it also stores the
offset from UTC as determined by the time zone
specified on input (that section of the documentation
might not mention that fact but it is true
nonetheless).

I store localtimes in the database and always see the
offset when selecting it later. The timezone
information doesn't just disappear as you (or that
piece of documentation) might be suggesting.

If you don't believe me, try this:

  create table example (t timestamptz not null);
  insert into example (t) values (timestamptz '2020-04-16 17:12:33.71768 Australia/Sydney');
  select * from  example;
  drop table example;

When I run that, I see:

  CREATE TABLE
  INSERT 0 1
                 t
  -------------------------------
   2020-04-16 17:12:33.71768+10
  (1 row)

  DROP TABLE

So the timezone is stored (or the offset is stored if
you prefer). Had it been daylight savings on that date
in Sydney, the offset would have been "+11". It's all
fine.

cheers,
raf




Re: timestamp and timestamptz

2020-04-16 Thread raf
Steve Baldwin wrote:

> I'm pretty sure you are mistaken. Postgres doesn't store the 'creating'
> time zone in a timestamptz column.
> 
> Try doing this before re-running your test:
> 
> set timezone to 'utc';
> 
> What you are seeing in your test is an artifact of that timezone setting.
> 
> Steve

Thanks. You're right.

  create table example (t timestamptz not null);
  insert into example (t) values (timestamptz '2020-04-16 17:12:33.71768 Australia/Sydney');
  select * from  example;
  set timezone to 'utc';
  select * from  example;
  drop table example;

Does this:

  CREATE TABLE
  INSERT 0 1
                 t
  -------------------------------
   2020-04-16 17:12:33.71768+10
  (1 row)

  SET
                 t
  -------------------------------
   2020-04-16 07:12:33.71768+00
  (1 row)

  DROP TABLE

So it doesn't store the offset, but I've used postgres
for 12 years without knowing that and it hasn't been
a problem. Yay, postgres!

It doesn't store the offset but, by using timestamptz,
it knows that the timezone is UTC. That's what matters.
The fact that it knows the time zone is what makes everything
work. Timestamp without time zone is best avoided I think.

cheers,
raf







Re: timestamp and timestamptz

2020-04-16 Thread raf
Adrian Klaver wrote:

> On 4/16/20 1:23 AM, raf wrote:
> > Steve Baldwin wrote:
> > 
> > > I'm pretty sure you are mistaken. Postgres doesn't store the 'creating'
> > > time zone in a timestamptz column.
> > > 
> > > Try doing this before re-running your test:
> > > 
> > > set timezone to 'utc';
> > > 
> > > What you are seeing in your test is an artifact of that timezone setting.
> > > 
> > > Steve
> > 
> > Thanks. You're right.
> > 
> >create table example (t timestamptz not null);
> >insert into example (t) values (timestamptz '2020-04-16 17:12:33.71768 
> > Australia/Sydney');
> >select * from  example;
> >set timezone to 'utc';
> >select * from  example;
> >drop table example;
> > 
> > Does this:
> > 
> >CREATE TABLE
> >INSERT 0 1
> >t
> >--
> > 2020-04-16 17:12:33.71768+10
> >(1 row)
> > 
> >SET
> >  t
> >--
> > 2020-04-16 07:12:33.71768+00
> >(1 row)
> > 
> >DROP TABLE
> > 
> > So it doesn't store the offset, but I've used postgres
> > for 12 years without knowing that and it hasn't been
> > a problem. Yay, postgres!
> > 
> > It doesn't store the offset but, by using timestamptz,
> > it knows that the timezone is UTC. That's what matters.
> 
> Well that is somewhat misleading. The value entered is stored as timestamp
> with an implicit tz of 'UTC'. The issue that trips people up is the format
> of the input value. If you enter an input with an offset or correct tz value
> then you have given Postgres an explicit value to work off for converting it
> to 'UTC'.

When I said "it knows that the timezone is UTC", I was only referring
to the values once stored as a timestamptz, not the input. Sorry that
wasn't clear enough. I meant to say that once a value is stored in a
timestamptz (as opposed to a timestamp), postgres knows that it is
stored in UTC and will do the right things with it.

> For what is correct see here:
> 
> https://www.postgresql.org/docs/12/datatype-datetime.html#DATATYPE-TIMEZONES
> 
> If you input a value that does not have the above then Postgres uses the SET
> TimeZone value to implicitly set the input value's tz (as pointed out by
> David Johnston upstream). In other words Postgres does not assume an input
> value is in 'UTC'. For the OP's case that could cause issues if the
> timestamp in the CSV file does not have a proper offset/tz and the client is
> using a tz other than 'UTC' (as pointed out by others upstream). The bottom
> line is that when dealing with timestamps explicit is better than implicit.

I couldn't agree more.

> > The fact that it knows the time zone is what makes everything
> > work. Timestamp without time zone is best avoided I think.
> > 
> > cheers,
> > raf
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com
> 




Re: GPG signing

2020-05-26 Thread raf
Marc Munro wrote:

> On Tue, 2020-05-26 at 12:04 -0700, Adrian Klaver wrote:
> > On 5/26/20 12:01 PM, Marc Munro wrote:
> > > I need to be able to cryptographically sign objects in my database
> > > using a public key scheme.
> > > [ . . . ]
> > > Any other options?  Am I missing something?
> > 
> > https://www.postgresql.org/docs/12/pgcrypto.html#id-1.11.7.34.7
> 
> I looked at that but I must be missing something.  In order to usefully
> sign something, the private, secret, key must be used to encrypt a
> disgest of the thing being signed (something of a simplification, but
> that's the gist).  This can then be verified, by anyone, using the
> public key.
> 
> But the pgcrypto functions, for good reasons, do not allow the private
> (secret) key to be used in this way.  Encryption and signing algorithms
> are necessarily different as the secret key must be protected; and we
> don't want signatures to be huge, and it seems that pgcrypto has not
> implemented signing algorithms.
> 
> What am I missing?
> 
> __
> Marc

That page linked to above says:

  F.25.3.10. Limitations of PGP Code

  No support for signing.





Re: Oracle vs. PostgreSQL - a comment

2020-06-02 Thread raf
Ron wrote:

> On 6/2/20 1:56 PM, Tim Clarke wrote:
> > On 02/06/2020 19:43, Stephen Frost wrote:
> > > > But require a new port, and Enterprises have Processes that must be 
> > > > followed.
> > > Sure they do.  Automate them.
> > > 
> > > :)
> > > 
> > > Thanks,
> > > Stephen
> > 
> > +1 for automation, isoX != slow
> It is when FW rules must be manually approved (and they do review them all),
> then the TASK is converted to a CHANGE and that goes before a CAB meeting.
> 
> That's all bypassed with SQL Server and Oracle, though.

Presumably, these processes have to be followed for SQL
Server and Oracle at least once too. If someone with
the same process requirements wanted to use Postgresql
instead, and they knew that they might need multiple
ports, presumably the process could be followed once
for a (possibly pre-allocated) set of ports. It doesn't
have to be any less efficient. Same number of meetings,
just in relation to a different number of ports. The
only problem would be when the processes were followed
for a single port before it was realised that more
ports would be needed later. Then the process would
have to be followed twice, once for the first port, and
once again for all the other ports that might become
necessary.

cheers,
raf





Re: Changing from security definer to security invoker without dropping ?

2020-06-12 Thread raf
Laura Smith wrote:

> ‐‐‐ Original Message ‐‐‐
> On Thursday, 11 June 2020 08:39, Pavel Stehule  
> wrote:
> 
> > Hi
> >
> > čt 11. 6. 2020 v 9:29 odesílatel Laura Smith 
> >  napsal:
> >
> > > Hi,
> > >
> > > Just curious if there is a way to switch a function from definer to 
> > > invoker without dropping ?
> >
> > create function foo(a int) returns int as $$ begin return $1; end $$ 
> > language plpgsql;
> >
> > postgres=# alter function foo (int) security definer;
> > ALTER FUNCTION
> > postgres=# alter function foo (int) security invoker;
> > ALTER FUNCTION
> >
> > regards
> >
> > Pavel
> 
> Thanks Pavel !  Didn't realise it was that easy.

"create or replace function..." with "security invoker"
(or without "security definer" since security invoker
is the default) is probably another way to do it, but
it would be slower than "alter function" since it needs
to parse the code again. That might matter if you have
many functions to change.
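
i.e. something like this sketch:

  create or replace function foo(a int) returns int
  security invoker -- the default, so this line could simply be omitted
  language plpgsql as $$ begin return a; end $$;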

Bear in mind that things might break with such a change.
There might be a reason that the functions needed to be
created as security definers. I'd recommend checking
each function's need to be a security definer before
changing it (or just test it thoroughly somewhere).

cheers,
raf





Re: Hiding a GUC from SQL

2020-06-21 Thread raf
Laurenz Albe wrote:

> On Wed, 2020-06-17 at 13:23 -0700, Michel Pelletier wrote:
> > In my extension pgsodium I'm defining a custom variable at startup to store 
> > a key:
> > 
> > https://github.com/michelp/pgsodium/blob/master/src/pgsodium.c#L1107
> > 
> > I'm using the flags GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE 
> > | GUC_DISALLOW_IN_FILE,
> > and a custom "no show" show hook that obscures the value.  This idea was 
> > inspired from the
> > pgcryptokey module from Bruce Momjian.
> > 
> > The value cannot be shown either with SHOW or current_setting() and it does 
> > not appear in pg_settings.
> >  From what I can tell, the value is inaccessible from SQL, but I think it's 
> > worth asking
> > the experts if there is some other demonstrable way, from SQL, that this 
> > value could be
> > leaked even to a superuser.  no sql level user should be able to see this 
> > value, only a C function,
> > like the pgsodium_derive() from which to derive other keys, should be able 
> > to see it.
> > I realize that someone with external process access can get the key, my  
> > goal is to prevent
> > accessing it from SQL.
> > 
> > Any thoughts on weaknesses to this approach would be welcome.  Thanks!
> > 
> > -Michel
> 
> A superuser can access files and start programs on the server machine.
> 
> A dedicated superuser may for example attach to PostgreSQL with a debugger
> and read the value of the variable.
> 
> And if that doesn't work, there may be other things to try.
> 
> It is mostly useless to try to keep a superuser from doing anything that
> the "postgres" operating system user can do.
> 
> Yours,
> Laurenz Albe

But only mostly useless. :-) There are ways to limit the power of the
superuser. On Linux, for instance, "sysctl kernel.yama.ptrace_scope=3"
prevents tracing, debugging, and reading another process's memory, even
by the superuser, and the only way to turn it off is via a (hopefully
noticeable) reboot. And, if the keys aren't present on the server at
boot time, and aren't fetched from their remote source (or read from a
user) unless that yama setting is in place, then it will be very hard
for a superuser to obtain the keys. If a remote source KMS is used,
ideally, you'd also want it to cryptographically verify that its client
hadn't been tampered with (somehow), or to not hand out the keys except
for planned reboots. The point is that it's not useless to make things
harder for a superuser.

You might not stop a legitimate sitewide superuser whose family is being
held hostage, but you can stop, or at least make things much more
difficult, for a superuser process on a single host that is the result
of a software vulnerability that wasn't nobbled by apparmor or selinux
or grsecurity.

cheers,
raf





Re: Hiding a GUC from SQL

2020-06-21 Thread raf
Laurenz Albe wrote:

> On Mon, 2020-06-22 at 09:44 +1000, raf wrote:
> > A superuser can access files and start programs on the server machine.
> > > A dedicated superuser may for example attach to PostgreSQL with a debugger
> > > and read the value of the variable.
> > > 
> > > And if that doesn't work, there may be other things to try.
> > > 
> > > It is mostly useless to try to keep a superuser from doing anything that
> > > the "postgres" operating system user can do.
> > 
> > But only mostly useless. :-) There are ways to limit the power of the
> > superuser. On Linux, for instance, "sysctl kernel.yama.ptrace_scope=3"
> > prevents tracing, debugging, and reading another process's memory, even
> > by the superuser, and the only way to turn it off is via a (hopefully
> > noticeable) reboot.
> 
> Interesting.  Will this block a user from debugging his own processes?

Yes.

> Perhaps you can plug that hole that way, but that was just the first thing
> that popped in my head.  Don't underestimate the creativity of attackers.
> I for one would not trust my ability to anticipate all possible attacks,
> and I think that would be a bad security practice.

Yes, but that's no reason not to perform as much risk
assessment and mitigation as you can afford/justify.
Not being able to prevent all attacks is no reason not
to prevent those that you can. :-) Nobody said anything
about underestimating anyone or trusting anyone.

> Yours,
> Laurenz Albe

cheers,
raf





Re: PostgreSQL 12 - ERROR: could not rename file "pg_logical/snapshots/1A-7C00D890.snap.13594.tmp" to "pg_logical/snapshots/1A-7C00D890.snap": No space left on device

2020-06-30 Thread raf
FOUTE K. Jaurès wrote:

> Hi everyone,
> 
> I am use a native logical replication in PostgreSQL 12 on Ubuntu 16.04 and
> all is working fine until i faced this error below.
> 
> ERROR:  could not rename file
> "pg_logical/snapshots/1A-7C00D890.snap.13594.tmp" to
> "pg_logical/snapshots/1A-7C00D890.snap": No space left on device
> 
> Is there someone who faced this problem?
> Any idea how I can solve it ?
> 
> BTW: I don't have any problem with space.
> 
> Best Regard
> 
> -- 
> Jaurès FOUTE

If you really haven't run out of space,
you might have run out of inodes.
The -i option of df should show you.
100,000 empty files could cause that.
I wouldn't think that renaming would
require a new inode, but I also wouldn't
think that renaming would require any
more space on a file system.

cheers,
raf





Re: PostgreSQL 12 - ERROR: could not rename file "pg_logical/snapshots/1A-7C00D890.snap.13594.tmp" to "pg_logical/snapshots/1A-7C00D890.snap": No space left on device

2020-07-01 Thread raf
FOUTE K. Jaurès wrote:

> On Wed, 1 Jul 2020 at 00:11, raf  wrote:
> 
> > FOUTE K. Jaurès wrote:
> >
> > > Hi everyone,
> > >
> > > I am use a native logical replication in PostgreSQL 12 on Ubuntu 16.04
> > and
> > > all is working fine until i faced this error below.
> > >
> > > ERROR:  could not rename file
> > > "pg_logical/snapshots/1A-7C00D890.snap.13594.tmp" to
> > > "pg_logical/snapshots/1A-7C00D890.snap": No space left on device
> > >
> > > Is there someone who faced this problem?
> > > Any idea how I can solve it ?
> > >
> > > BTW: I don't have any problem with space.
> > >
> > > Best Regard
> > >
> > > --
> > > Jaurès FOUTE
> >
> > If you really haven't run out of space,
> > you might have run out of inodes.
> > The -i option of df should show you.
> > 100,000 empty files could cause that.
> > I wouldn't think that renaming would
> > require a new inode, but I also wouldn't
> > think that renaming would require any
> > more space on a file system.
> 
> 
> The result of df -i
> 
> Sys. de fichiers    Inœuds  IUtil.    ILibre IUti% Monté sur
> udev               3065149     433   3064716    1% /dev
> tmpfs              3072780     665   3072115    1% /run
> /dev/sdb2         59973632  356029  59617603    1% /
> tmpfs              3072780       7   3072773    1% /dev/shm
> tmpfs              3072780      10   3072770    1% /run/lock
> tmpfs              3072780      17   3072763    1% /sys/fs/cgroup
> /dev/sdb1                0       0         0     - /boot/efi
> tmpfs              3072780      19   3072761    1% /run/user/108
> tmpfs              3072780       5   3072775    1% /run/user/1001
> tmpfs              3072780       5   3072775    1% /run/user/1000

So that's not it. It would take ~60 million
files to fill up your / inode table. I can't
think of another explanation for that error
message if df without -i also shows free space.

I tried googling for "No space left on device"
and everything suggested checking inodes with "df -i".
One suggested looking for files that were deleted
but that were still open by a process (can be found
with "lsof / | grep deleted") but if that were the
problem, df would show a full disk (but du wouldn't
be able to account for it).

Could it be some other "device"? like shared memory
segment space or IPC resources? That seems unlikely if
the operation was just a file system rename. Perhaps
strace/stap could help check if it really was a file
system rename that went wrong (if there's any reason to
doubt it). It does seem to clearly be a file rename
though.

Is there any chance that the drive is failing? But
you'd expect "I/O error" messages if that were the case.

Is it on a journalling file system and could the journal
be the device that is full? I don't know how to check
that (or if it even makes sense).

cheers,
raf





Re: survey: psql syntax errors abort my transactions

2020-07-05 Thread raf
Jeremy Schneider wrote:

> Survey for the user community here on the pgsql-general list: it would
> be great if lots of people could chime in by answering two questions
> about your very own production environment:
> 
> question 1) are you worried about scripts in your production environment
> where damage could be caused by a different default in a future new
> major version of postgresql?  (not aborting transactions in interactive
> mode when syntax errors occur)

No. Any production scripts would be a single transaction.
I think anything else is a disaster waiting to happen
(and waiting for me to clean up afterwards).

> question 2) do you think the increased user-friendliness of changing
> this default behavior would be worthwhile for specific users in your
> organization who use postgresql?  (including both yourself and others
> you know of)

No. In fact, I think aborting on error is more
user-friendly than not doing so. I often send ad hoc
sql to psql via vim key bindings. I think that counts
as interactive from psql's point of view. The only
problem I have with that is when I've mistyped "begin
transaction;" (or forgotten the semicolon) and so an
error halfway through doesn't result in a full
rollback. But I don't see what can be done about that
(other than automating the insertion of "begin
transaction;" and "commit transaction;" in my ad hoc
sql vim buffers).

What might be actually user-friendly is the ability,
after such an error, of being able to edit the contents
of the failed statement/transaction in the interactive
session in an editor to fix the typo. But that's for
real interactive use where stdin is a pty. That way,
you wouldn't have to retype or copy and paste the good
bits. That might already be possible. Yes, it's "\e".
And it works after an error, bringing up the
erroneous sql in an editor, and it executes it when
you save and exit the editor. I think that's probably
what you want and it's already there. But it only edits
the last statement, not the whole transaction. It might
not be exactly what you need.
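
For example (the table name is made up):

  db=> select * frm mytable;
  ERROR:  syntax error at or near "frm"
  db=> \e
  (the failed statement opens in $EDITOR and runs again on save+exit)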

> As someone working at a large company with an awful lot of PostgreSQL,
> thinking of the internal production systems I'm aware of, I'd personally
> vote pretty strongly in favor of changing the default.
> 
> -Jeremy
> 
> Jeremy Schneider
> Database Engineer
> Amazon Web Services

cheers,
raf





Re: TDE implementation in postgres which is in docker container

2020-07-26 Thread raf
On Sat, Jul 25, 2020 at 10:24:26AM -0500, Ron  wrote:

> On 7/24/20 11:38 AM, Vasu Madhineni wrote:
> > Hi All,
> > 
> > How to implement TDE in postgres which is running docker containers.
> 
> Postgres does not have TDE.  You need to encrypt at the filesystem level.

ecryptfs works really well with postgres (i.e. no performance penalty that
I could measure) and should probably be usable with docker (but I haven't
done that so I don't know the details).

cheers,
raf





Re: Implement a new data type

2020-08-11 Thread raf
On Tue, Aug 11, 2020 at 06:38:39AM -0700, Miles Elam 
 wrote:

> Also of note: PostgreSQL already has a money type (
> https://www.postgresql.org/docs/current/datatype-money.html)
> But you shouldn't use it (
> https://wiki.postgresql.org/wiki/Don%27t_Do_This#Don.27t_use_money).
> 
> I only bring it up so that you can know to make your money type a slightly
> different name to avoid a conflict. Money is deceptively hard to implement
> correctly. I'd recommend reading the second link if you have not already to
> avoid previously known issues.

I use decimal(10,2) for whole cents, and decimal(12,6)
for sub-cents. Single currency only. I didn't know
there was a money type originally, but it wouldn't be
usable for me anyway without the ability to specify the
scale and precision.
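
For what it's worth, a minimal sketch of those column
definitions (the table and column names are invented):

  create table ledger (
      id         serial primary key,
      amount     decimal(10,2) not null, -- whole cents
      unit_price decimal(12,6) not null  -- sub-cents
  );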

I recommend considering passing values to the database
as "decimal '1.23'" rather than bare numeric literals,
just so there's no chance of the value being
interpreted as a float at any stage by postgres. Maybe
that's being too paranoid but that's a good idea when
it comes to money. :-)
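
e.g. a quick check of the types involved:

  select pg_typeof(decimal '1.23');        -- numeric, guaranteed
  select pg_typeof(1.23::float8);          -- double precision: avoid for money
  select decimal '1.23' + decimal '0.10';  -- exact: 1.33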

Perhaps the incorporation of currency would make a new
money type interesting. Currencies probably don't
change as often as timezones but there would probably
still be some ongoing need for updates.

cheers,
raf





Re: Implement a new data type

2020-08-12 Thread raf
On Wed, Aug 12, 2020 at 12:44:21PM -0400, Philip Semanchuk 
 wrote:

> > On Aug 11, 2020, at 8:01 PM, raf  wrote:
> > 
> > On Tue, Aug 11, 2020 at 06:38:39AM -0700, Miles Elam 
> >  wrote:
> > 
> >> Also of note: PostgreSQL already has a money type (
> >> https://www.postgresql.org/docs/current/datatype-money.html)
> >> But you shouldn't use it (
> >> https://wiki.postgresql.org/wiki/Don%27t_Do_This#Don.27t_use_money).
> >> 
> >> I only bring it up so that you can know to make your money type a slightly
> >> different name to avoid a conflict. Money is deceptively hard to implement
> >> correctly. I'd recommend reading the second link if you have not already to
> >> avoid previously known issues.
> > 
> > I use decimal(10,2) for whole cents, and decimal(12,6)
> > for sub-cents. Single currency only. I didn't know
> > there was a money type originally, but it wouldn't be
> > usable for me anyway without the ability to specify the
> > scale and precision.
> > 
> > I recommend considering passing values to the database
> > as "decimal '1.23'" rather than bare numeric literals,
> > just so there's no chance of the value being
> > interpreted as a float at any stage by postgres. Maybe
> > that's being too paranoid but that's a good idea when
> > it comes to money. :-)
> 
> Yes, I agree, this is also important (and easy to overlook) if
> you’re accessing the database via a non-SQL language. We use Python
> which, like most (all?) languages that rely on the underlying C
> library for floating point support, is vulnerable to floating point
> noise. Python has a fixed precision type, and like Postgres it also
> accepts character and float input. The float input can give surprising
> results.
> 
> >>> decimal.Decimal('1.79')   # This is OK
> Decimal('1.79')
> >>> decimal.Decimal(1.79) # This will not end well!
> Decimal('1.79000000000000003552713678800500929355621337890625')
> >>>
> 
> In the case of a Postgres column like numeric(10,2), input like
> 1.79000000000000003552713678800500929355621337890625 will get rounded
> to 1.79 anyway and no harm will be done. But like you said, raf, it’s
> a good idea to be too paranoid. :-)
> 
> Cheers
> Philip

Chris Travers pointed out to me that Postgres itself
parses floating point literals as the numeric type
(i.e. "select pg_typeof(1.23);" returns numeric) so
Postgres has made the right choice for its parser,
unlike most(?) languages (except raku). But yes, in
Python, it's decimal.Decimal with integer/string input
all the way (with dec=decimal.Decimal if you have a lot
of them).

cheers,
raf





Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

2020-09-14 Thread raf
On Mon, Sep 14, 2020 at 05:39:57PM -0400, Tom Lane  wrote:

> tutilu...@tutanota.com writes:
> > 1. All non-ANSI characters are turned into "?"s for application_name.
> 
> Yeah, that's hard to do much with unfortunately.  We cannot assume that
> all databases in an installation share the same encoding, so for globally
> visible strings like application_name, the only safe solution is to
> restrict them to ASCII.
> 
> On the other hand, the very same thing could be said of database names
> and role names, yet we have never worried much about whether those were
> encoding-safe when viewed from databases with different encodings, nor
> have there been many complaints about the theoretical unsafety.  So maybe
> this is just overly anal-retentive and we should drop the restriction,
> or at least pass through data that doesn't appear to be invalidly
> encoded.

Perhaps recode database/role names from the source
database's encoding into utf8, and then recode from utf8
to the destination database's encoding?

For "globally visible strings", maybe recode to the
client_encoding setting, or a new encoding setting for
this purpose since client_encoding seems to be linked
to the database that the client is connected to. I'd've
thought that the application name would arrive encoded
as client_encoding (which defaults to the database
encoding). Maybe globally visible strings need to be
recoded on arrival from the client_encoding to utf8 (or
a "server_encoding" or "global_encoding" setting) so
they are always stored in a known encoding so they can
be recoded as necessary when viewed via connections to
other databases using a different encoding.

Just some thoughts. If they don't make any sense, feel
free to ignore them. :-)

> > 5. Ever since my early days with PG in the mid-2000s, I've tried
> > numerous times to read the manual, wikis and comments for the
> > configuration files, specifically the performance directives, and asked
> > many, many times for help about that, yet never been able to figure out
> > what they want me to enter for all the numerous options. At this point,
> > it isn't me being lazy/stupid; it's objectively very difficult to
> > understand all of that.
> 
> We do the best we can as far as the documentation goes; if you have
> concrete suggestions about how to improve that, we'll surely listen.

I thought the documentation on postgres performance tuning was fine.
Here's what I got from it (for a dedicated database server host):

  shared_buffers = 1GB # Should be 1/4 of RAM
  work_mem = 16MB # Should be bigger to do more sorts in-memory but it's per sort per user so not too big
  maintenance_work_mem = 128MB # Should be <= 256MB to make vacuum fast without taking away too much RAM from other tasks
  min_wal_size = 480MB # Should be at least 3 * 10 * 16MB to avoid too many checkpoints slowing down writes
  checkpoint_completion_target = 0.9 # Should be 0.9 if checkpoint_segments was increased to spread out checkpoint writes
  random_page_cost = 4.0 # Should be 4.0 for HDD, 1.5-2.5 for SSD, 1.0-1.01 if db fits entirely in RAM
  effective_cache_size = 3GB # Should be 1/2 to 3/4 of RAM

It's the "Should..." comments that matter. And it might be out of date...
Actually, it is out of date. The comment for checkpoint_completion_target
refers to checkpoint_segments which no longer exists (since 9.5) so
disregard that.

cheers,
raf





Re: PostgreSQL processes use large amount of private memory on Windows

2020-09-17 Thread raf
On Thu, Sep 17, 2020 at 10:06:07AM -0400, Tom Lane  wrote:

> Øystein Kolsrud  writes:
> > So my question is: When does a postgres process forked for a connection use
> > private memory instead of shared, and what can I do to avoid this?
> 
> The only significant long-term consumption of private memory is for
> caches.  There are catalog caches, which can get large if the session
> accesses a whole lot of database objects (e.g., thousands of different
> tables).  Some of the PLs maintain caches with parsed versions of any
> function that's been executed.  (An ex-employer of mine had a lot of
> trouble in that regard, because they had hundreds of thousands of lines
> worth of plpgsql functions.)  There isn't any user-accessible knob for
> limiting the size of those caches.  If you have a problem of that sort,
> about the only way to mitigate it is to use fewer backends so that the
> total memory consumption stays in bounds, or redesign your application.
> In some cases it might help to restart your sessions when they get too
> big, but that seems like at best a band-aid.
> 
>   regards, tom lane

For what it's worth, I have 171305 lines of plpgsql/sql
functions and it hasn't caused any problem on a server
with 4GB RAM. With a small number of long-lived
connections.

cheers,
raf





I'm surprised that this worked

2020-09-22 Thread raf
Hi,

I just wrote a query that I didn't expect to work but I
was pleasantly surprised that it did. It looked
something like this:

  select
a.aaa,
c.ccc,
d.ddd1,
d.ddd2
  from
tbla a,
tblb b,
tblc c,
funcd(c.id) d
  where
a.something = something and
b.something = a.something and
c.something = b.something

How does it know which c.id to use for the function
without going all cartesian product on me? Maybe it
makes no sense for such a parameterised function to be
part of a cartesian product. Actually, that function
returns a setof record but only ever a single record.
That might be relevant.
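
Maybe it's postgres treating the function call in the
from clause as an implicit lateral join, i.e. as if I'd
written:

  select
    a.aaa,
    c.ccc,
    d.ddd1,
    d.ddd2
  from
    tbla a,
    tblb b,
    tblc c,
    lateral funcd(c.id) d
  where
    a.something = something and
    b.something = a.something and
    c.something = b.something

in which case the function is evaluated once per row of
the preceding from items rather than joining against
every row.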

I was sure I'd done something similar once that
(sensibly) didn't work, and I needed a loop to call the
function in, but I might be thinking of something in an
outer join's "on" clause. Does that make sense?

Even more impressed with Postgresql than usual. :-)

cheers,
raf





Re: PostgreSQL on Windows' state

2020-09-23 Thread raf
On Wed, Sep 23, 2020 at 12:52:58PM +0200, Alessandro Dentella 
 wrote:

> On Wed, Sep 23, 2020 at 12:21:39PM +0200, Pavel Stehule wrote:
> > On Wed, 23 Sep 2020 at 10:38, Alessandro Dentella <
> > sandro.dente...@gmail.com> wrote:
> > 
> > >
> > > Hi,
> > >
> > > disclaimer: I've never been a Windows user and I send this email just on
> > > behalf
> > > of a friend that has problems convincing his team to use PostgreSQL.
> > >
> > > I'd like to understand if what the team replies to him is correct or not.
> > >
> > > The final project should work with Python (we know it just works) and with
> > > c#.
> > > The team states that:
> > >
> > >   "It does not exist a native driver. You need to use ODBC, that is an old
> > >   technology and doesn't give warranties in terms of security, reliability
> > > and
> > >   performance."
> > >
> > > Is it any true? is it biased?
> > >
> > 
> > https://kb.objectrocket.com/postgresql/how-to-install-psycopg2-in-windows-1460
> > 
> > So minimally psycopg2 - native driver for Python exists for Win
> 
> Thanks Pavel, but psycopg2 (which I always use) is just for Python.
> They claim/complain that from c# there's no native solution.
> 
> Personally I don't even know if ODBC is really to be considered a second class
> choice. It sounds awkward to me that such a great db has flaws like this (I'm
> NOT stating it *has* flaws)
> 
> sandro

Surely, it's a flaw in the C# ecosystem, not a flaw in the database?
(assuming it's true, and assuming there's anything wrong with ODBC). :-)

cheers,
raf





Re: Which is the setup with lowest resources you know Postgres is used in?

2020-10-07 Thread raf
On Wed, Oct 07, 2020 at 01:53:44PM +0300, Dmitry Igrishin  
wrote:

> In many cases concurrency is not a problem and in fact SQLite may
> handle concurrent requests faster than Postgres. Since SQLite is
> server-less and access overhead is near to zero (compared to Postgres)
> each writer does its work quickly and no lock lasts for more than a
> few dozen milliseconds.
> On the other hand, Postgres is better in cases of really high concurrency.

Presumably, this is no longer a problem, but many years
ago (between 14 and 10 years ago) I was using sqlite
for a low traffic website (probably no more than 40
users at a time), and the database became corrupted so
often that I had to automate rebuilding it from the
latest backup and my own sql logs. I was very silly.
Switching to postgres was the real solution.

cheers,
raf





Re: Which is the setup with lowest resources you know Postgres is used in?

2020-10-07 Thread raf
On Thu, Oct 08, 2020 at 01:14:02AM +0300, Dmitry Igrishin  
wrote:

> On Thu, 8 Oct 2020 at 00:14, raf :
> >
> > On Wed, Oct 07, 2020 at 01:53:44PM +0300, Dmitry Igrishin 
> >  wrote:
> >
> > > In many cases concurrency is not a problem and in fact SQLite may
> > > handle concurrent requests faster than Postgres. Since SQLite is
> > > server-less and access overhead is near to zero (compared to Postgres)
> > > each writer does its work quickly and no lock lasts for more than a
> > > few dozen milliseconds.
> > > On the other hand, Postgres is better in cases of really high concurrency.
> >
> > Presumably, this is no longer a problem, but many years
> > ago (between 14 and 10 years ago) I was using sqlite
> > for a low traffic website (probably no more than 40
> > users at a time), and the database became corrupted so
> > often that I had to automate rebuilding it from the
> > latest backup and my own sql logs. I was very silly.
> > Switching to postgres was the real solution.
>
> As for now SQLite is a very robust solution if used properly.

That's great to hear.





pg_restore: error: schema "public" already exists

2020-10-27 Thread raf
Hi,

A colleague is getting this error when trying to load a
database dump:

  pg_restore: error: could not execute query: ERROR:  schema "public" already exists

I'm wondering if anyone can explain it.

Here's the background.

I have a "dump" script that calls pg_dump and pipes the
output into gpg to encrypt it, and a I have "load"
script that decrypts a dump with gpg and pipes the
output into pg_restore (followed by a bunch of other
things).

The dump+encrypt and decrypt+load commands look like this:

For dump:

  pg_dump -U OWNERNAME -Fc DBNAME | \
  gpg --recipient ke...@domain.com --encrypt --output MMDD-HHMMSS-LABEL.pgdump.gpg.key20

For load:

  dropdb -U postgres DBNAME # if it exists
  createdb -U postgres -T template0 -E utf8 -O OWNERNAME DBNAME
  gpg --decrypt MMDD-HHMMSS-LABEL.pgdump.gpg.key20 | pg_restore -1 -U postgres -d DBNAME -Fc

I've just added support for choosing the key to encrypt
to, so that I can encrypt a backup to the key of a new
colleague. The only real change to the above commands
was to replace gpg's "--default-recipient XXX" option
with "--recipient XXX", which shouldn't really change
anything.

The dumps happen on debian-9, postgresql-9.6.15,
gpg-2.1.18. The loads happen there as well, but also on
macOS hosts (10.14 and 10.15), postgresql-9.6 (and 12),
gpg-2.2.23.

I use macOS 10.14, postgresql-9.6, gpg-2.2.23 (or
gpg-1.4.23) and the above has worked without problem.

But my colleague (macOS-10.15, postgresql-12,
gpg-2.2.23) is encountering a strange error when he
tries to load a dump:

  Restoring...
  gpg: encrypted with 4096-bit RSA key, ID A44C904AA8B9BB48, created 2020-09-21
 "colleague "
   pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 3; 2615 2200 SCHEMA public postgres
  pg_restore: error: could not execute query: ERROR:  schema "public" already exists
  Command was: CREATE SCHEMA public;

  pgrestore encountered errors
  Continue? ([y]/n): y

The next bit was from a subsequent SQL statement
failing because pg_restore had failed:

  Failed to run pg_restore: ERROR:  relation "lock_company" does not exist
  LINE 1: delete from lock_company

To me, this looks like the decryption worked, but there
was something else wrong.

My colleague decrypted the backup to a file, and then
used the load script to load the unencrypted dump, and
that worked fine.

Does anyone have any idea why this might have happened?

Under what circumstances (immediately after createdb)
might "CREATE SCHEMA public" result in "ERROR:  schema
"public" already exists"?

And why might it differ depending on whether the
unencrypted dump was read from stdin or from a file?

cheers,
raf





Re: Christopher Browne

2020-11-04 Thread raf
On Wed, Nov 04, 2020 at 06:29:18PM -0500, Steve Singer  
wrote:

> 
> It is with much sadness that I am letting the community know that Chris
> Browne passed away recently.
> 
> Chris had been a long time community member and was active on various
> Postgresql mailing lists.  He was a member of the funds committee, the PgCon
> program committee and served on the board of the PostgreSQL Community
> Association of Canada. Chris was a maintainer of the Slony replication
> system and worked on various PG related tools.
> 
> I worked with Chris for over 10 years and anytime someone had a problem he
> would jump at the chance to help and propose solutions. He always had time
> to listen to your problem and offer ideas or explain how something worked.
> 
> I will miss Chris
> 
> Steve

He sounds exactly like the kind of person you want to have around.





Re: How to set up a schema default date to '2020-01-01'?

2020-11-09 Thread raf
On Tue, Nov 10, 2020 at 10:51:02AM +0530, mark armon <1994hej...@gmail.com> 
wrote:

> On Mon, Nov 9, 2020 at 9:44 PM David G. Johnston 
> wrote:
> 
> > On Mon, Nov 9, 2020 at 9:08 AM mark armon <1994hej...@gmail.com> wrote:
> >
> >> How to set up a schema default date (now) to '2020-01-01'?  Whatever
> >> timezone would be OK.
> >
> > What is a "schema default" (date or otherwise)?
>
> > David J.
> >
> Like, I create a schema "test" and I want the default date to be 2020-01-01,
> so when I do
> 
> select test.now;
> 
> the result is 2020-01-01

I would recommend that you create a stored
function/procedure that returns the "default" that you
want it to return, and execute a call to that, rather
than executing "select test.now". The statement "select
test.now" doesn't even seem to be valid sql. Bear in
mind that I have no idea what you are talking about so
my advice might not be helpful.
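
But if a constant "default date" really is what you
want, a minimal sketch (assuming I've guessed the
intent) would be:

  create schema if not exists test;

  create or replace function test.now() returns date
  language sql immutable as $$ select date '2020-01-01' $$;

  select test.now(); -- 2020-01-01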

I have timestamped database backups and timestamped
schema update files so that my database load script
knows which schema updates to apply when loading an old
database backup (i.e. anything whose timestamp is later
than the timestamp of the backup), which seems like it
might be related to what you are after, but I don't
understand the idea of a "default" date. The "date" for
my schema is always the present so as to match the
corresponding software in its current state.

Perhaps you can explain in more detail what you are
after.

cheers,
raf





Re: Is the Halloween problem an issue in Postgres

2020-12-02 Thread raf
On Wed, Dec 02, 2020 at 11:02:07PM +0100, Thomas Kellerer  
wrote:

> guy...@icloud.com schrieb am 02.12.2020 um 21:27:
> > The Halloween problem is that it is a challenge for the database if
> > you’re updating a field that is also in the WHERE clause of the same
> > query.
> > 
> > I just saw a presentation from someone about how in SQL Server he
> > recommended writing changes to a temp table and then writing them to
> > the table as being much more efficient.
> 
> It sounds strange to me, that this _is_ actually a problem.
> 
> Why exactly is that a problem in SQL Server?
> And what are the consequences if you do it nevertheless.

According to wikipedia, this problem was discovered on
Halloween day, 1976. I find it hard to believe that any
database would still exhibit that behaviour 44 years
later.

cheers,
raf





Re: Is the Halloween problem an issue in Postgres

2020-12-02 Thread raf
On Wed, Dec 02, 2020 at 02:08:41PM -0800, Adrian Klaver 
 wrote:

> On 12/2/20 2:02 PM, Thomas Kellerer wrote:
> > guy...@icloud.com schrieb am 02.12.2020 um 21:27:
> > > The Halloween problem is that it is a challenge for the database if
> > > you’re updating a field that is also in the WHERE clause of the same
> > > query.
> > > 
> > > I just saw a presentation from someone about how in SQL Server he
> > > recommended writing changes to a temp table and then writing them to
> > > the table as being much more efficient.
> > 
> > It sounds strange to me, that this _is_ actually a problem.
> > 
> > Why exactly is that a problem in SQL Server?
> 
> Yeah that was a new one to me. A quick search found:
> 
> https://www.sqlshack.com/the-halloween-problem-in-sql-server-and-suggested-solutions/
> 
> > And what are the consequences if you do it nevertheless.

It looks like the answer is no (unless I've misunderstood the problem):

  create table a (id serial not null primary key, a integer not null, b integer not null);
  create index a_a on a(a);
  insert into a (a, b) values (1, 2);
  insert into a (a, b) values (2, 3);
  insert into a (a, b) values (3, 4);
  insert into a (a, b) values (4, 5);
  insert into a (a, b) values (5, 6);
  insert into a (a, b) values (6, 7);
  update a set a = a + 1 where a < 4;
  select * from a order by id;
  drop table a cascade;

results in:

  id | a | b 
  ----+---+---
   1 | 2 | 2
   2 | 3 | 3
   3 | 4 | 4
   4 | 4 | 5
   5 | 5 | 6
   6 | 6 | 7

It's the same with or without the index on a(a).

cheers,
raf





Re: How to keep format of views source code as entered?

2021-01-07 Thread raf
On Thu, Jan 07, 2021 at 05:33:48PM +, "Markhof, Ingolf" 
 wrote:

> So, it looks like PostgreSQL does support saving the original source code of 
> a view.
> 
> What's best practice to use as a code repository?
> 
> I would expect support of multi-user access, access-right management and 
> perhaps versioning as well…?
> 
> Thanks for your help!
> 
> Ingolf

Hi, I've only used stored functions (not views or
triggers), and I have tools for auditing, loading, and
dropping stored functions to match the code that is in
git (so updates can be easily applied to multiple
copies of the database), and Postgres has never altered
the code that it stores, otherwise, auditing the code
in the database against the code in git wouldn't work.

But since postgres does store a possibly altered parsed
version, you could alter your source to match
Postgres's parsed version of it. Maybe I encountered
this too long ago to remember having to adjust.

For version control, I'd recommend using git, or
whatever you are using for the rest of your code.

For multi-user access rights management, I'm not sure.
You can grant multiple users the right to create things
in the database. See the documentation on the grant
statement. e.g.:

  https://www.postgresql.org/docs/12/sql-grant.html

I'm guessing that you want:

  grant create on database ... to ...

But I don't know if it can be restricted to only
creating views. If not, it might grant too much access.
You'll also want to make sure that they all have write
access to the same git repository where the views are.

cheers,
raf

> From: Markhof, Ingolf [mailto:ingolf.mark...@de.verizon.com]
> Sent: 07 January 2021 17:19
> To: pgsql-general@lists.postgresql.org
> Subject: [E] How to keep format of views source code as entered?
> 
> Hi!
> 
> Switching from Oracle SQL to PostgreSQL I am facing the issue that
> the SQL code the system returns when I open a view's source code is
> different from the code I entered. The code is formatted differently,
> comments are gone and e.g. all text constants got an explicit cast to
> ::text added. (see sample below).
> 
> I want the SQL code of my views stored as I entered it. Is there any
> way to achieve this? Or will I be forced to maintain my views' SQL code
> outside of PostgreSQL views?
> 
> Any hints welcome!
> 
> Here is an example:
> 
> I enter this code to define a simple view:
> 
> create or replace view myview as
> select
>   product_id,
>   product_acronym
> from
>   products -- my comment here
> where
>   product_acronym = 'ABC'
> ;
> 
> However, when I open the view in my SQL client (DBeaver) again, this is what I
> get:
> 
> CREATE OR REPLACE VIEW myview
> AS SELECT product_id,
> product_acronym
>FROM products
>   WHERE product_acronym = 'ABC'::text;
> 
> So, the formatting changed, keywords are capitalized, the comment I
> added in the from-part has gone and the text constant 'ABC' changed to
> 'ABC'::text.
> 
> Verizon Deutschland GmbH - Sebrathweg 20, 44149 Dortmund, Germany - 
> Amtsgericht Dortmund, HRB 14952 - Geschäftsführer: Detlef Eppig - 
> Vorsitzender des Aufsichtsrats: Francesco de Maio




Re: How to keep format of views source code as entered?

2021-01-08 Thread raf
On Fri, Jan 08, 2021 at 08:38:29AM +, "Markhof, Ingolf" 
 wrote:

> Thanks for your comments and thoughts.
> [...]
> And you could not even store your source code on the UNIX
> system. Instead, you'd be forced to do so in a separate system, like
> GitHub. Stupid, isn't it? Right.
> [...] 
> Regards,
> Ingolf

I don't think your conclusions are correct. There is
nothing that can stop you from "storing your source
code on the UNIX system". You don't have to use github.
But even if you do you github, you would first need to
store your source code on a file system, so that a
local git repository could see it, and push it to
github. You don't even have to use git if you really
don't want to.

cheers,
raf





Re: How to keep format of views source code as entered?

2021-01-09 Thread raf
On Sat, Jan 09, 2021 at 02:22:25PM +, "Markhof, Ingolf" 
 wrote:

> Tom, all,
> 
> when I change a table's column name in Oracle SQL, SQLdeveloper (a SQL
> client) marks views that refer to the table using the old column name
> as erroneous. So, I can easily identify these cases. And of course
> I, as a user, I am acting in my context, i.e. my schema. So it is
> perfectly clear what table I am referring to.
> 
> Please note: I am not developing any PL/SQL code. I don't have big
> development projects. I have more the role of a data analyst. I
> just create rather complex SQL queries which, from time to time, may
> need to be adapted to some new requirements. Or peers want to (re-)
> use (part of) my SQL queries. There is not really much versioning
> required.
> 
> What I understood so far is: I can use e.g. DBeaver to interact with
> PostgreSQL, to develop my SQL code. But in the end I need to copy the SQL
> code into e.g. GitHub, which is a manual process. I'd mark the SQL
> code in the DBeaver editor window and copy&paste it into some file in
> e.g. GitHub. Using GitHub, I'd get version control and other enhanced
> collaboration features which I don't really need. At the price that
> code transfer from SQL (DBeaver) to the code repository and vice versa
> is complete manually?! This doesn't really look like an enhancement.
> 
> Most likely, there are more professional ways to do that. I'd be glad
> to get advice.
> 
> What I would like to have is something that would automatically update
> the SQL code in the software repository when I run a CREATE OR REPLACE
> VIEW.
> 
> Ingolf

Hi,

If there is a software repository, then I would
recommend considering not using tools like DBeaver to
develop your SQL views. Instead, develop them in files
managed by the repository, and use psql or
similar to load the views into the database(s).

Alternatively, if you do modify the views "live" in the
database, get (or have someone create) a tool to fetch
the code of the view from the database, and write it to
a file that can be committed into the repository.

It doesn't have to be copy and paste. A simple program
can be written to extract view source code and write it
to a file. Perhaps your colleagues that want to re-use
your source code can implement it. Such a tool would be
useful with or without a repository.

Here's an example of such a query but it's for
procedures/functions, and would need to be
very different for views.

  select
      p.proname,          -- name
      p.proretset,        -- returns setof?
      p.proisstrict,      -- strict 't' or 'f'
      p.provolatile,      -- volatile or stable 'v' or 's'
      p.prosecdef,        -- security definer 't' or 'f'
      p.pronargs,         -- number of in arguments
      p.prorettype,       -- return type
      p.proargtypes,      -- space-separated list of in arg types
      p.proallargtypes,   -- array of in/out arg types (iff there are out args)
      p.proargmodes,      -- array of in/out arg modes like {i,o,o} (iff there are out args)
      p.proargnames,      -- array of in/out arg names like {id,code,name}
      p.prosrc,           -- source code
      cast(cast(p.oid as regprocedure) as text) -- nice signature
  from
      pg_user u,
      pg_proc p
  where
      u.usename = current_user and
      p.proowner = u.usesysid and
      p.proname like 'myfunc_%' -- Your naming convention
  order by
      p.proname

The above returns enough information to construct a
corresponding create function statement (except for any
knowledge of precision and scale of numeric parameters).

Actually, I just had a look at the pg_views system
catalog where the source code for views is stored, and
it doesn't seem to contain enough information to
reconstruct a create view statement. It only contains
these columns:

  schemaname
  viewname
  viewowner
  definition

But definition is just the query itself.

There is no list of column names (like there is with
procedures in pg_proc).

You can tell the difference between a temporary and
non-temporary view because the schemaname is different
for temporary views (e.g. pg_temp_3, rather than
public).

I don't know if you could tell whether a view is
recursive or not.

And it doesn't look like you can determine if a view
has a local or cascaded check_option parameter, or the
security_barrier parameter.

Is all of that information stored somewhere else in the
system catalogs?

Without them, this query would only find the names and
query code of views:

  select
      v.viewname,
      v.definition
  from
      pg_views v
  where
      v.viewname like 'myview_%'; -- Your naming convention

Is there a query that can be used to obtain all of the
information needed to reconstruct the create view
statement that corresponds to a view in pg_views?

cheers,
raf





Re: How to keep format of views source code as entered?

2021-01-10 Thread raf
On Sat, Jan 09, 2021 at 05:26:04PM -0700, "David G. Johnston" 
 wrote:

> On Saturday, January 9, 2021, raf  wrote:
> 
> > Actually, I just had a look at the pg_views system
> > catalog where the source code for views is stored, and
> > it doesn't seem to contain enough information to
> > reconstruct a create view statement. It only contains
> > these columns:
> >
> >   schemaname
> >   viewname
> >   viewowner
> >   definition
> >
> > But definition is just the query itself.
> >
> > There is no list of column names (like there is with
> > procedures in pg_proc).
> >
> > Is all of that information stored somewhere else in the
> > system catalogs?
> >
> 
> Views are relation-like and thus are primarily recorded on pg_class.
> 
> David J.

Hi David,

Thanks. That's very helpful. The following query should
return enough information to re-construct create view
statements:

  select
      c.relname as "view_name",
      a.attname as "column_name",
      format_type(a.atttypid, a.atttypmod) as "column_type",
      v.definition as "view_sourcecode",
      c.reloptions as "view_options"
  from
      pg_class c,
      pg_attribute a,
      pg_views v
  where
      c.relkind = 'v' and
      c.relname like 'myview%' and -- Your naming convention
      a.attrelid = c.oid and
      v.viewname = c.relname
  order by
      c.relname,
      a.attnum;

Note that view_options can look like
"{check_option=local,security_barrier=false}".

Also, this won't find temporary views (I'm probably not
looking in the right place for them).

Also, it probably doesn't handle recursive views.

But for simple views, it should make a basis for
extracting views into files that can be added to a
source code repository.

I've attached a Python script that can output "create
or replace view" statements for the views in a database
that match a particular naming convention.

Note that:

  - Login details are hard-coded and must be changed.
  - View naming convention is hard-coded and must be changed.
  - It doesn't handle temporary or recursive views.
  - It does handle check_option and security_barrier options.
  - All views are output to stdout, not separate files.
  - Comments are not included (code is in post-parse state).
  - It should probably drop then create each view.

I hope it's useful to show how this could be done, but
the loss of comments would bother me. I'd want the
documentation in the source code repository. Having
access to the original source would be much nicer (and
enable auditing views in the database against the views
in the source code repository to see if they've been
changed).

cheers,
raf

#!/usr/bin/env python

# pg_extract_views.py - Sample script to extract views from postgres and
# construct equivalent "create or replace view" statements (doesn't handle
# temporary or recursive views).
#
# 20210110 raf 

import pgdb # pip install PyGreSQL

# You MUST change these and the password MUST be in ~/.pgpass (mode 600)
DBHOST = 'XXX_localhost'
DBNAME = 'XXX_database_name'
DBUSER = 'XXX_user_name'

# You MUST change this to match your view naming convention
VIEW_NAME_RE = '^XXX_myview_.*$'

def print_views(db):
	'''Load all of the views from the database.'''
	cursor = db.cursor()
	views = {}
	for view_name, view_options, view_sourcecode in select_views(cursor):
		print('create or replace view %s' % view_name)
		print('(')
		columns = select_columns(cursor, view_name)
		for i in range(len(columns)):
			column_name, column_type = columns[i]
			print('%s%s' % (column_name, ',' if i < len(columns) - 1 else ''))
		print(')')
		if view_options is not None:
			options = []
			if 'check_option=local' in view_options:
				options.append('check_option = "local"')
			if 'check_option=cascaded' in view_options:
				options.append('check_option = "cascaded"')
			if 'security_barrier=true' in view_options:
				options.append('security_barrier = true')
			if 'security_barrier=false' in view_options:
				options.append('security_barrier = false')
			print('with')
			print('(')
			for i in range(len(options)):
				print('%s%s' % (options[i], ',' if i < len(options) - 1 else ''))
			print(')')
		print('as')
		print(view_sourcecode)
		print('')

def select_views(cursor):
	'''Given a cursor object, return a list of view names.'''
	sql = '''
		select

Re: Do we need a way to moderate mailing lists?

2021-01-16 Thread raf
On Sat, Jan 16, 2021 at 11:28:37PM +0100, Paul Förster 
 wrote:

> Also, could it be possible to make messages plain text? I see a lot of
> varying fancy fonts and I hate that. I even hate it more when people
> post messages not properly trimmed or messages that need formatting
> preserved such as select output, i.e. table data, explain plans,
> etc. Proportional fonts (Outlook with its darn Arial) is one of the
> worst...
> 
> [...]
> 
> I think, an automatic conversion of incoming posts to plain text and
> dropping all non plain text attachments would help a lot already.
> 
> Cheers,
> Paul

I once wrote a program to do that very thing:

  http://raf.org/textmail/
  https://github.com/raforg/textmail/

It converts everything it can into plain text (using
lots of helper applications which also need to be
installed), and it deletes everything else, all highly
configurable, of course. It might be possible to
incorporate it into a mailing list, but perhaps that's
a bit draconian. You could probably incorporate it into
your own email flow as emails arrive before you see
them. I've used procmail for that, but imapfilter
(https://github.com/lefcha/imapfilter) might be more
appropriate if your email is in an imap account.

cheers,
raf





Re: psql behavior change on upgrade from version 12.x to 13.1

2021-02-09 Thread raf
On Tue, Feb 09, 2021 at 12:19:21PM -0800, Bryn Llewellyn  
wrote:

> 
> 
> On 09-Feb-2021, at 12:11, Tom Lane  wrote:
> 
> Bryn Llewellyn  writes:
> > Here’s what I get when I do "otool -L /usr/local/bin/psql";
> 
> > /usr/local/bin/psql:
> > /usr/local/lib/libpq.5.dylib (compatibility version 5.0.0, current 
> > version 5.13.0)
> > /usr/local/opt/readline/lib/libreadline.8.dylib (compatibility version 
> > 8.0.0, current version 8.0.0)
> > /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current 
> > version 1292.0.0)
> 
> Right, so that's using a version of libreadline that's supplied by
> Homebrew (the /usr/local/opt path is the giveaway on that).
> 
> I don't know whether these things represent an intentional change
> of libreadline's behavior in Homebrew's build, or a bug, but in
> either case you should take the issue to the Homebrew support forums.
> If it's intentional, I imagine there's a way to get the old behavior
> back.
> 
> Also, libreadline is fairly configurable, so maybe this boils down
> to some unintentional change in your ~/.inputrc ?
> 
>   regards, tom lane
> 
> —
> 
> Thank you very much, Tom. It seems, then, that we have the
> “microscopic” explanation. I’ll have to to a fair bit of research to
> find out what to do to fix this problem.

This sounds exactly like changes that happened in
debian a while ago. I was taken by surprise as well,
but it's actually much better than the previous
behaviour. It's nice to know that you have to confirm
the execution of a pasted shell command (especially
when pasting commands as root). It feels safer. You
might come to like it. But of course, the readline
library is probably configurable enough to change the
behaviour.

According to https://tiswww.case.edu/php/chet/readline/rluserman.html,
this could be what you're looking for:

  enable-bracketed-paste
  When set to `On', Readline will configure the
  terminal in a way that will enable it to insert each
  paste into the editing buffer as a single string of
  characters, instead of treating each character as if
  it had been read from the keyboard. This can prevent
  pasted characters from being interpreted as editing
  commands. The default is `On'.

So try putting this in your ~/.inputrc file:

  set enable-bracketed-paste off

cheers,
raf





Re: Are stored procedures/triggers common in your industry

2022-04-20 Thread raf
On Wed, Apr 20, 2022 at 12:18:23PM -0700, Guyren Howe  wrote:

> I’ve really only ever worked in web development. 90+% of web
> developers regard doing anything at all clever in the database with
> suspicion.
>
> I’m considering working on a book about implementing business logic in
> Postgres, and I’m curious about how common that actually is.

I'm used to putting all business logic in the database
(after choosing a great FLOSS database that you'll
never want to migrate away from - like Postgres). And
I've never regretted it (in decades of doing it).

One of the main reasons is speed. I once had a job
where a program selected data out of a database,
dragged it over a network, effectively grouped it into
summaries, sent the summaries back over the network,
and inserted them back into the database one at a
time(!). Replacing it with a stored procedure changed
it from taking 2-3 hours to 2 minutes. And that was a
place that already made heavy use of stored procedures,
so I don't know what went wrong there. The point is
that whenever a lot of data activity is needed, it's
much faster when it's done where the data lives.
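
i.e. something like this instead of round-tripping
every row (the table and column names are invented):

  create or replace function summarise_sales(_day date)
  returns void language sql as $$
      insert into sales_summary (day, total)
      select _day, coalesce(sum(amount), 0)
      from sales
      where sold_at >= _day and sold_at < _day + 1;
  $$;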

The other main reason is security. The database can
provide an effective "firewall" between the data and
the client. I never liked the idea of trusting
arbitrary SQL sent from the client. It means you have
to trust every single client application and every
single user (even the ones with good intentions that
produce bad queries in some reporting software and
throwing it at the database and bringing it to its
knees) and every single developer (who might not know
SQL and relies on ORMs that trick them into thinking
they don't need to). But when the clients are only
permitted to execute security defining stored
procedures that have been loaded by the privileged
database owner, you know exactly what code can run
inside the database. SQL injections become impossible
no matter how many bugs and flaws there are in the
client software or its supply chain.
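
The shape of that pattern is something like this (the
table, function and role names are invented):

  create or replace function order_create(_customer integer, _total decimal(10,2))
  returns integer
  language plpgsql
  security definer
  set search_path = public, pg_temp
  as $$
  declare
      _id integer;
  begin
      insert into orders (customer_id, total)
      values (_customer, _total)
      returning id into _id;
      return _id;
  end
  $$;

  revoke all on function order_create(integer, decimal) from public;
  grant execute on function order_create(integer, decimal) to webclient;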

Another good but less critical reason is that when you
use multiple languages, or you migrate partially or
completely from the old cool language to the new cool
language, you don't have to replicate the business
logic in the new language, and you can eliminate the
risk of introducing bugs into mission critical code.
The existing business logic and its test suite can stay
stable while all the bells and whistles on the outside
change however they like.

There are other nice benefits but that's enough.

I think it's safe to disregard the suspicions of the
90+% of web developers you mentioned. The requirements
that they have for a database might be quite
undemanding. Most individual actions on a website
probably don't result in a lot of data activity (or
rather activity that involves a lot of data). The CRUD
model is probably all they need. So their views are
understandable, but they are based on limited
requirements. However, I still use stored procedures
for everything on websites for security reasons.

Everyone's mileage varies. We're all in different places.

cheers,
raf





Re: Are stored procedures/triggers common in your industry

2022-04-21 Thread raf
On Thu, Apr 21, 2022 at 08:42:10AM -0500, Alex Aquino  
wrote:

> You mentioned testing, and reminds me of another benefit.  Way faster, more
> reliable, cheaper to test on the DB side.  Testing logic in SPs or SQL is
> much easier, especially when testing requires a sequence of calls for a use
> case.  It is easier because of the DBs support for transactions.  With
> transactions and state management built into the DB, a testing process can
> always revert to a reliable starting point and end point, thereby
> facilitating more dependable, automated test harnesses.  The alternative
> done mostly now is testing via UIs or APIs where there is no inherent
> transaction management, so a lot of work goes into preparing the test bed
> to be a known state and introspecting the results to verify.  This is
> usually done with some mix of manual and automated processes.

Actually, my full test suite at work takes ages to run (~40m).
I know that mocking the db to make unit tests fast is popular,
but that's not helpful when the most important code being tested
is in the database. :-) It's more important to me that the tests
actually test everything than that they be fast.

But yes, being able to do complex system testing with transaction
rollback is great.
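
i.e. the whole use case runs inside a transaction that
never commits (the function names are invented):

  begin;
  select test_setup();       -- put the data in a known state
  select place_order(42);    -- exercise the code under test
  select check_invariants(); -- assertions
  rollback;                  -- and the database is untouched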

cheers,
raf





Re: Stored procedure code no longer stored in v14 and v15, changed behaviour

2022-12-01 Thread raf
On Thu, Dec 01, 2022 at 04:38:57PM +0100, Dominique Devienne 
 wrote:

> On Thu, Dec 1, 2022 at 4:23 PM Tom Lane  wrote:
> > "David G. Johnston"  writes:
> > > On Thu, Dec 1, 2022 at 7:59 AM Martijn Tonies (Upscene Productions) <
> > > m.ton...@upscene.com> wrote:
> > >> Can you modify the server code to store the original body in proc.prosrc
> > >> again? It would be very helpful.
> >
> > > I seem to recall that this option had been discussed and rejected when 
> > > this
> > > feature went in.  The parsed body is a feature because its contents can be
> > > updated due to, e.g., renaming of objects.  The text source of the 
> > > original
> > > command would be unable to be updated in the same way and so it is 
> > > possible
> > > the text and the parsed contents would diverge over time, which is a
> > > situation we do not want to have.
> >
> > Indeed.  We used to have a similar situation with respect to column
> > default expressions and CHECK constraint expressions.  Eventually we got
> > rid of the textual storage of both, because it couldn't be maintained
> > in a reasonable way.
> >
> > I think the answer here is "don't use the new syntax if you want the
> > function body stored textually".  You can have one set of benefits,
> > or the other set, but not both at once.
> 
> FWIW, we have a custom schema introspection and diff'ing ad-hoc framework,
> and the fact the original SQL is not conserved as-is has also created
> issues for us.
> 
> On Oracle, our SQL was preserved as-is, so could be compared reliably. While 
> on
> PostgreSQL, some names-in-SQL are rewritten, the text reformatted,
> etc... So this
> byte-perfect comparison is no longer possible, and we must rely on heuristics
> (a.k.a, hacks...) to do a fuzzy-compare (since we lack a real PG-compatible 
> SQL
> parsers to do an infoset-comparison instead, at the AST level for example).
> 
> So it's not just a matter of browsing the schema. For us, it's a
> *functional* issue. --DD

Same here. Accessing the loaded stored procedure source
is how I audit the state of stored procedures in the
database against the code in the code repository.
Without the ability to make that comparison, there is
no way to audit the stored procedures, and the only way
to make sure that the database is up to date with the
stored procedures would be to reload every single
stored procedure. I might have to alter the procedure
loading program to store its own copy of the source code
in the database somewhere, and just hope that nobody
loads stored procedures using any other tool. Even so,
it will slow down loading old database backups and
bringing their schema and code up to date. But that's
probably OK compared with the time to load the backup
itself.

cheers,
raf





Re: Stored procedure code no longer stored in v14 and v15, changed behaviour

2022-12-02 Thread raf
On Fri, Dec 02, 2022 at 03:00:09PM +0200, Pasi Oja-Nisula  wrote:

> On Fri, 2 Dec 2022 at 02:24, raf  wrote:
> > Same here. Accessing the loaded stored procedure source
> > is how I audit the state of stored procedures in the
> > database against the code in the code repository.
> 
> Exactly. If our software is audited, how can I reliably prove to auditor
> that the running version of the procedure has not been tampered with
> either by customer, rogue developer or some malicious party?

> Suggested solution "store the text in user table" does not work, because
> it requires following a process. And the process overriding changes are those
> we are interested in.
> 
> I'm not so much concerned about whether the stored procedure text compiles
> anymore (because of object name changes etc.) I just really would like to
> have the exact source code stored with the database. So if release management
> goes horribly wrong, there is at least the possibility to compare procedure
> text manually.
> 
> Pasi

That's a different concern to mine. I just want to be
able to check that multiple test systems all have the
same schema and stored procedures, and to be able to
reload an old database, apply any schema updates made
since it was backed up, and load/drop any stored
procedures that were added/changed/removed since then
without having to reload all of them. So storing
procedure code in a user table would be OK. I just have
to update my db loading/auditing tools for v14+. It was
great to be able to make use of the source in pg_proc
for my needs, but that's not what it was for. I was
just lucky it was there.

If you're concerned about tampering by
customers/users/developers, you can either set
permissions to prevent it in some cases, and when you
can't prevent it, make it tamper-evident by logging
actions to somewhere remote and monitoring for what
concerns you. That should satisfy auditors.

cheers,
raf





Re: Stored procedure code no longer stored in v14 and v15, changed behaviour

2022-12-02 Thread raf
On Fri, Dec 02, 2022 at 11:02:18AM -0500, Tom Lane  wrote:

> Also, we've *never* had storage of the original text for views.
> I'm a little mystified by people claiming they use original
> text for vetting functions when they clearly have no equivalent
> ability for views ... or are your applications completely
> view-free?

In my case, yes, all access is via procedures.
No views or triggers. So it was OK.

cheers,
raf





Re: Maintaining blank lines in psql output?

2023-01-17 Thread raf
On Tue, Jan 17, 2023 at 02:22:22PM -0700, "David G. Johnston" 
 wrote:

> On Tue, Jan 17, 2023 at 1:48 PM Ron  wrote:
> 
> >
> > White space can of course make things easy to read, but psql seems to
> > ignore those blank lines.  Is there any way to retain them in psql output?
> >
> >
> Nope, there is no setting for psql to print all blank lines it encounters
> to stdout.  If you want to format your output with stuff other than query
> results it provides \echo
> 
> David J.

Perhaps the best you can do is something like adding:

  select '';

or

  raise notice '';

It won't result in just a blank line, but it will separate things.

cheers,
raf





Re: Maintaining blank lines in psql output?

2023-01-17 Thread raf
On Tue, Jan 17, 2023 at 04:10:50PM -0700, "David G. Johnston" 
 wrote:

> On Tue, Jan 17, 2023 at 4:07 PM raf  wrote:
> 
> > On Tue, Jan 17, 2023 at 02:22:22PM -0700, "David G. Johnston" <
> > david.g.johns...@gmail.com> wrote:
> >
> > > On Tue, Jan 17, 2023 at 1:48 PM Ron  wrote:
> > >
> > > >
> > > > White space can of course make things easy to read, but psql seems to
> > > > ignore those blank lines.  Is there any way to retain them in psql output?
> > > >
> > > >
> > > Nope, there is no setting for psql to print all blank lines it encounters
> > > to stdout.  If you want to format your output with stuff other than query
> > > results it provides \echo
> > >
> > > David J.
> >
> > Perhaps the best you can do is something like adding:
> >
> >   select '';
> >
> > or
> >
> >   raise notice '';
> >
> > It won't result in just a blank line, but it will separate things.
> >
> >
> Those both seem much more complicated than \echo for the same (or worse)
> effect.  You'd have to wrap the raise notice inside a do block which itself
> would then be executed by the server...
> 
> David J.

Good point. I didn't read it right. select/raise make more sense
out of the context of psql (i.e. in sql/plpgsql). \echo makes more
sense in psql.

cheers,
raf





Re: Is pg_restore in 10.6 working?

2018-11-12 Thread raf
Tom Lane wrote:

> David  writes:
> > I have some experience with different versions of Postgres, but I'm just
> > getting around to using pg_restore, and it's not working for me at all.
> > ...
> > But a matching pg_restore command does nothing.
> > pg_restore -U postgres -f predata.sql -v
> 
> This command expects to read from stdin and write to predata.sql, so
> it's not surprising that it's just sitting there.  What you want
> is something along the lines of
> 
> pg_restore -U postgres -d dbname -v <predata.sql
> 
>   regards, tom lane

Does that mean there's a bug in the usage message?

pg_restore --help says (admittedly on 9.5 but it's probably the same with 10.6):

  Usage:
pg_restore [OPTION]... [FILE]

...

If no input file name is supplied, then standard input is used.

To me, that says that a filename on the command line after the options
will be read as the source of the restore. Only if it is absent would
stdin be used.

Apologies if the usage message for 10.6 doesn't say the same thing.

cheers,
raf




Re: Is pg_restore in 10.6 working?

2018-11-12 Thread raf
raf wrote:

> Tom Lane wrote:
> 
> > David  writes:
> > > I have some experience with different versions of Postgres, but I'm just
> > > getting around to using pg_restore, and it's not working for me at all.
> > > ...
> > > But a matching pg_restore command does nothing.
> > > pg_restore -U postgres -f predata.sql -v
> > 
> > This command expects to read from stdin and write to predata.sql, so
> > it's not surprising that it's just sitting there.  What you want
> > is something along the lines of
> > 
> > pg_restore -U postgres -d dbname -v  > 
> > regards, tom lane
> 
> Does that mean there's a bug in the usage message?
> 
> pg_restore --help says (admittedly on 9.5 but it's probably the same with 
> 10.6):
> 
>   Usage:
> pg_restore [OPTION]... [FILE]
> 
> ...
> 
> If no input file name is supplied, then standard input is used.
> 
> To me, that says that a filename on the command line after the options
> will be read as the source of the restore. Only if it is absent would
> stdin be used.
> 
> Apologies if the usage message for 10.6 doesn't say the same thing.

Ah, I didn't notice the -f (output) option. Never mind.

cheers,
raf




Re: Invoking user of the function with SECURITY DEFINER

2018-11-24 Thread raf
Laurenz Albe wrote:

> Madan Kumar wrote:
> > How to get the user who is invoking the function with SECURITY DEFINER? 
> > When we define the function to be SECURITY DEFINER, it will execute in the
> > context of the user who created it. Let's say I've given execute permission
> > for this function to other users and wish to know who is executing it.
> > Is there a way to find that out?
> > I tried CURRENT_USER and SESSION_USER but they return the function owner
> > since they execute in that context. So is there any way to figure out the
> > user who is invoking the function?
> 
> It works for me:
> 
> As user "postgres":
> 
> CREATE OR REPLACE FUNCTION tellme() RETURNS text LANGUAGE plpgsql
>SECURITY DEFINER AS 'BEGIN RETURN session_user; END;';
> 
> As user "laurenz":
> 
> SELECT tellme();
>  tellme  
> -
>  laurenz
> (1 row)
> 
> Yours,
> Laurenz Albe

session_user has always worked for me.

cheers,
raf




Re: why not using a mountpoint as PGDATA?

2019-02-27 Thread raf
Peter J. Holzer wrote:

> On 2019-02-27 12:33:02 +0100, Julien Rouhaud wrote:
> > On Wed, Feb 27, 2019 at 12:22 PM Luca Ferrari  wrote:
> > >
> > > What's wrong with using a mountpoint?
> > 
> > You can see most obvious reasons at
> > https://bugzilla.redhat.com/show_bug.cgi?id=1247477
> 
> I see only one good reason there: The fact that pg_upgrade needs write
> access to the parent directory. Of course that alone might suffice.
> 
> The other reasons aren't good IMHO.
> 
> The first one (initdb checks for an empty directory) is more "We
> disallow it, therefore it is a bad idea" than a reason for disallowing
> it.
> 
> The second is just wrong: You can have a non-root owned mount-point on
> any Unixoid system I've worked with. (And I don't see why that would be
> a security problem)
> 
> The third is wrong at least on Debian: All server processes have
> /var/lib/postgresql/$version/$cluster as their working directory, so it
> cannot be unmounted while the database is up. Even if you could, the
> server would either immediately lose access to all files (in which case
> you could recover) or it would keep access to all files (so, not a
> problem). Plus being in a subdirectory wouldn't change that. Maybe it's
> a potential problem with other layouts.
> 
> hp
> 
> -- 
>    _  | Peter J. Holzer    | we build much bigger, better disasters now
> |_|_) |                    | because we have much more sophisticated
> | |   | h...@hjp.at         | management tools.
> __/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>

I didn't know you weren't supposed to do that.
I've been using ecryptfs mountpoints for my data
directories for ages without problem. Ecryptfs doesn't
create lost+found and the underlying directory is owned
by the postgres user and so the mountpoint is as well
(even though it is mounted by root).

However, the parent directory is root-owned so I guess
pg_upgrade wouldn't work. But I've never used
pg_upgrade so I've never encountered that problem.

Also, if it's not mounted, the underlying directory
isn't empty (it holds the encrypted files), but
postgresql can tell that something's not right and
won't start if told to do so prematurely.

But the next time I upgrade, I might put the data
directory in a sub-directory of the mountpoint in case
I ever want to start using pg_upgrade. It would also
mean I only need a single ecryptfs mountpoint. At the
moment, I create a new one for each major upgrade which
seems silly.
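
The layout I have in mind is something like this (a sketch only,
paths and version hypothetical, and from memory of how ecryptfs
mounts behave):

  # mount the encrypted filesystem once (prompts for the passphrase)
  mount -t ecryptfs /var/lib/postgresql/crypt /var/lib/postgresql/crypt

  # one subdirectory per major version/cluster, so the data
  # directory's parent is postgres-owned and pg_upgrade can write there
  mkdir -p /var/lib/postgresql/crypt/11/main
  chown -R postgres:postgres /var/lib/postgresql/crypt/11
  chmod 700 /var/lib/postgresql/crypt/11/main

  sudo -u postgres /usr/lib/postgresql/11/bin/initdb -D /var/lib/postgresql/crypt/11/main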

cheers,
raf




Re: Inserts restricted to a trigger

2019-06-17 Thread raf
Adrian Klaver wrote:

> On 6/17/19 4:54 PM, Miles Elam wrote:
> > Is there are way to restrict direct access to a table for inserts but
> > allow a trigger on another table to perform an insert for that user?
> > 
> > I'm trying to implement an audit table without allowing user tampering
> > with the audit information.
> 
> Would the below not work?:
> CREATE the table as superuser or other privileged user
> Have trigger function run as above user(use SECURITY DEFINER)

and make sure not to give any other users insert/update/delete
permissions on the audit table.
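
a minimal sketch of the whole pattern (table and column names made up):

  -- as a privileged user (e.g. postgres), who owns both the audit
  -- table and the trigger function:
  create table audit_log (
      id        bigserial primary key,
      tablename text not null,
      action    text not null,
      userid    text not null default session_user,
      at        timestamptz not null default now(),
      rowdata   jsonb
  );
  revoke all on audit_log from public;
  -- grant select to auditors if needed, but no insert/update/delete

  create function audit_thing() returns trigger
  language plpgsql security definer as $$
  begin
      if tg_op = 'DELETE' then
          insert into audit_log (tablename, action, rowdata)
          values (tg_table_name, tg_op, to_jsonb(old));
          return old;
      else
          insert into audit_log (tablename, action, rowdata)
          values (tg_table_name, tg_op, to_jsonb(new));
          return new;
      end if;
  end
  $$;

  create trigger thing_audit
      after insert or update or delete on thing
      for each row execute procedure audit_thing();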

> > Thanks in advance,
> > 
> > Miles Elam
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com




Re:

2019-07-02 Thread raf
Laurenz Albe wrote:

> On Tue, 2019-07-02 at 13:09 +0530, Prakash Ramakrishnan wrote:
> > We are getting below error while during import the csv file please do the 
> > needful.
> 
> I'd say the needful thing here is for you to read the documentation...
> 
> > -bash-4.2$ more ckr_sto.csv
> >  4937880 | 12 | 2015-01-05  || 2015-01-05 05:51:47 |
> >  || 5000128790679 |   |   |   |
> >  4939355 | 12 | 2015-01-05  || 2015-01-05 05:51:47 |
> >  || 5000128639345 |   |   |   |
> >  4939744 | 12 | 2015-01-05  || 2015-01-05 05:51:47 |
> >  || 5000128684510 |   |   |   |
> >  4939750 | 12 | 2015-01-05  || 2015-01-05 05:51:47 |
> >  || 5000128683100 |   |   |   |
> >  4936360 | 12 | 2015-01-05  || 2015-01-05 05:51:47 |
> >  || 5000128567527 |   |   |   |
> >  4940308 | 12 | 2015-01-05  || 2015-01-05 05:51:47 |
> >  || 5000128781329 |   |   |   |
> >  4938006 | 12 | 2015-01-05  || 2015-01-05 05:51:47 |
> >  || 4000128912554 |   |   |   |
> >  4937457 | 12 | 2015-01-05  || 2015-01-05 05:51:47 |
> >  || 5000128426574 |   |   |   |
> > 
> > error
> > --
> > postgres=# \copy ckr_sto from /data/ckr_sto.csv CSV HEADER;
> > ERROR:  invalid input syntax for integer: " 4939355 | 12 | 2015-01-05  
> > || 2015-01-05 05:51:47 | || 5000128639345 | 
> >   |   |   | "
> > CONTEXT:  COPY ckr_sto, line 2, column pod_id: " 4939355 | 12 | 
> > 2015-01-05  || 2015-01-05 05:51:47 | || 
> > 500012863934..."
> 
> >From the documentation of COPY:
> 
> DELIMITER
> 
> Specifies the character that separates columns within each row (line) of 
> the file.
> The default is a tab character in text format, a comma in CSV format.
> This must be a single one-byte character. This option is not allowed when 
> using binary format.
> 
> Yours,
> Laurenz Albe
> -- 
> Cybertec | https://www.cybertec-postgresql.com

in other words, ckr_sto.csv is not a csv file.
it just has .csv at the end of its name.
that's why psql tried to interpret the entire
line as the first column: there were no commas.

its contents should look something like:

4937880,12,2015-01-05,,2015-01-05 05:51:47,,,5000128790679
4939355,12,2015-01-05,,2015-01-05 05:51:47,,,5000128639345
4939744,12,2015-01-05,,2015-01-05 05:51:47,,,5000128684510
4939750,12,2015-01-05,,2015-01-05 05:51:47,,,5000128683100
4936360,12,2015-01-05,,2015-01-05 05:51:47,,,5000128567527
4940308,12,2015-01-05,,2015-01-05 05:51:47,,,5000128781329
4938006,12,2015-01-05,,2015-01-05 05:51:47,,,4000128912554
4937457,12,2015-01-05,,2015-01-05 05:51:47,,,5000128426574
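
alternatively, if the file really is consistently '|'-delimited,
copy can be told about the delimiter, something like this
(untested, and the whitespace padding visible above might still
trip up some column types, so reformatting the file is safer):

  \copy ckr_sto from /data/ckr_sto.csv with (format csv, delimiter '|')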





Re: plpgsql: How to modify a field in an array of records

2019-07-02 Thread raf
Dirk Mika wrote:

> Hi,
> 
> Another Oracle -> PostgreSQL Question. ☺
> 
> I try to migrate a package procedure. Local types were declared in the Oracle 
> package:
> 
>   TYPE t_class_record IS RECORD
>   (
>  id_class   classes.id_class%TYPE,
>  field1 number,
>  field2 number
>   );
> 
>   TYPE t_classes_table IS TABLE OF t_class_record
>  INDEX BY BINARY_INTEGER;
> 
>   l_classes_table t_classes_table;
> 
> l_classes_table is initialized by a SELECT statement where later single 
> fields of single array elements are modified like this:
> 
>   l_classes_table(i).field1 := l_value;
> 
> So far I have done the following in PostgreSQL:
> 
> 
>   *   Defined a composite type that corresponds to the structure listed above:
> CREATE TYPE t_class_record AS (id_class CHARACTER VARYING,
>field1   INTEGER,
>field2   INTEGER);
> 
>   *   Defined a procedure with a local variable of type Array of 
> t_class_record:
> l_classes_array   t_class_record [];
> 
> But when I try to modify a field of a record in the array I get a syntax 
> error.
> 
> l_classes_array[i].field1 := l_value;
> 
> The error is ERROR: syntax error at or near "." Position: 12414 where 
> position points to the . after the [i]. I've no idea what's causing this 
> syntax error.
> 
> My goal is to store an array of records, fetched via SELECT Statement, in a 
> variable in a way, that I am able to modify individual fields of individual 
> array elements later in the function.
> 
> Am I on the right track or should I approach the problem completely 
> differently?
> 
> Dirk
> 
> --
> Dirk Mika
> Software Developer

Hi Dirk,

I don't know the answer to this but I have encountered
a situation where the plpgsql syntax didn't support
something I would have liked (specifically, it
can't/couldn't select into array elements so I needed
to select into multiple scalar variables and then
assign their values to the array elements).

If the problem you are facing is similar in nature, you
might be able to work around it by having a variable of
the same type as the array elements, assign to it the
record that you want to modify, make the modification
in the single record variable and then assign that
single record variable back into the array element that
it came from.

i.e. something like:

  declare
      a rectype[];
      r rectype;
      i integer;
  begin
      ...
      r := a[i];
      r.field := newvalue;
      a[i] := r;
      ...
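
fleshed out with the composite type from your message, the workaround
would look something like this (a sketch, not tested here):

  do $$
  declare
      a t_class_record[];
      r t_class_record;
  begin
      a := array[row('c1', 1, 2)::t_class_record];
      r := a[1];
      r.field1 := 42;
      a[1] := r;
      raise notice '%', a[1];  -- prints (c1,42,2)
  end
  $$;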

I didn't even realise that you could have an array of records.
I've only ever used arrays of scalar values.

cheers,
raf





Re: SOC II Type 2 report.

2023-06-14 Thread raf
On Tue, Jun 13, 2023 at 07:54:21AM -0700, Christophe Pettus  
wrote:

> Hi,
> 
> > On Jun 12, 2023, at 11:57, Raj Kiran  wrote:
> > Prokopto is completing our annual vendor review process. Please share your 
> > most recent SOC II Type 2 report.
> 
> The PostgreSQL project isn't SOC2 certified, and will almost certainly
> never be.  If you require SOC2 compliance, you'll need to work with a
> commercial (that is to say, not free) vendor who is SOC2 certified in
> order to bring PostgreSQL under their certification.

And I doubt that Postgres could be described as a
vendor. By definition, a vendor is a person or
organisation that sells something.

  OED: vendor(n): One who disposes of a thing by sale; a seller

cheers,
raf





Help with "gpg -d ... | pg_restore ..." with unimportant pg_restore errors

2024-09-03 Thread raf
Hi,

I need help!

I'm upgrading an ancient (but still awesome) postgresql-9.6.24 (via 
EnterpriseDB)
to (a no doubt even more awesome) postgresql-15.8 (via debian (stable) packages)
but am unable to load database backups that were encrypted via gpg. Loading 
from 
unencrypted backups works fine (and the millions of tests all pass! Yay!).

I have a convenience program for handling loading called "load"
and the underlying commands that it executes look like this:

  dropdb -h payroll -p 5433 -U postgres payroll_tst
  createdb -h payroll -p 5433 -U postgres -T template0 -E utf8 -O admin payroll_tst
  gpg --decrypt 20240904-011254-payroll_tst.pgdump.gpg.aps24 | pg_restore -1 -h payroll -p 5433 -U postgres -d payroll_tst -Fc

(The ".aps24" is a label to indicate which gpg key was used)

Below is the output from the gpg | pg_restore pipeline:

  gpg: encrypted with 2048-bit RSA key, ID 1373FBE2D5B2229A, created 2024-01-15
"Payroll "
  pg_restore: [archiver (db)] Error while PROCESSING TOC:
  pg_restore: [archiver (db)] Error from TOC entry 1925; 1255 937975638 
FUNCTION xml_is_well_formed(text) postgres
  pg_restore: [archiver (db)] could not execute query: ERROR:  could not find 
function "xml_is_well_formed" in file "/usr/lib/postgresql/15/lib/pgxml.so"
  Command was: CREATE FUNCTION public.xml_is_well_formed(text) RETURNS 
boolean
  LANGUAGE c IMMUTABLE STRICT
  AS '$libdir/pgxml', 'xml...
  gpg: error writing to '-': Broken pipe
  gpg: error flushing '[stdout]': Broken pipe
  gpg: handle plaintext failed: Broken pipe
  pgrestore encountered errors

I'm not worried about the xml_is_well_formed error (or the xml_valid error that
would happen next). I think those functions are ancient and irrelevant and not
in use, and I'm happy for pg_restore to continue, like it does when gpg is not
involved.

But can anyone explain why gpg is getting SIGPIPE? Does pg_restore behave
differently in the face of sql errors when input is stdin rather than from a
named file? Is it closing stdin when this error happens (and -e is not 
supplied)?
If so, is there a way to stop it closing stdin when an sql error happens?

I could just tell the relevant people (or modify the load script) to
decrypt the file to disk and then load the decrypted database backup
from disk, and delete the temporary unencrypted database backup, but
I'd rather be able to continue to pipe the gpg output into pg_restore.

I assume this problem will go away for future backups, but there are many
old backups containing these xml functions that might need to be loaded
at any time.

Any advice?

Update: The load was using the 9.6 version of pg_restore. When using the
15.8 version of pg_restore, the output changes to:

  pg_restore: error: could not execute query: ERROR:  schema "public" already 
exists
  Command was: CREATE SCHEMA public;


  gpg: error writing to '-': Broken pipe
  gpg: error flushing '[stdout]': Broken pipe
  gpg: handle plaintext failed: Broken pipe

And it still doesn't load.

cheers,
raf





Re: Help with "gpg -d ... | pg_restore ..." with unimportant pg_restore errors

2024-09-03 Thread raf
On Tue, Sep 03, 2024 at 06:43:22PM -0700, "David G. Johnston" 
 wrote:

> On Tuesday, September 3, 2024, raf  wrote:
> 
> > Hi,
> >
> > I need help!
> >
> > I'm upgrading an ancient (but still awesome) postgresql-9.6.24 (via
> > EnterpriseDB)
> > to (a no doubt even more awesome) postgresql-15.8 (via debian (stable)
> > packages)
> > but am unable to load database backups that were encrypted via gpg.
> > Loading from
> > unencrypted backups works fine (and the millions of tests all pass! Yay!).
> >
> > I have a convenience program for handling loading called "load"
> > and the underlying commands that it executes look like this:
> >
> >   dropdb -h payroll -p 5433 -U postgres payroll_tst
> >   createdb -h payroll -p 5433 -U postgres -T template0 -E utf8 -O admin
> > payroll_tst
> 
> Given the following command
> 
> >   gpg --decrypt 20240904-011254-payroll_tst.pgdump.gpg.aps24 | pg_restore
> > -1 -h payroll -p 5433 -U postgres -d payroll_tst -Fc
> 
> And this error
> 
> >   pg_restore: [archiver (db)] could not execute query: ERROR:  could not
> > find function "xml_is_well_formed" in file "/usr/lib/postgresql/15/lib/
> > pgxml.so"
> >   Command was: CREATE FUNCTION public.xml_is_well_formed(text)
> > RETURNS boolean
> >   LANGUAGE c IMMUTABLE STRICT
> >   AS '$libdir/pgxml', 'xml...
> 
> This should be expected.  In particular…
> 
> >   gpg: error writing to '-': Broken pipe
> >   gpg: error flushing '[stdout]': Broken pipe
> >   gpg: handle plaintext failed: Broken pipe
> >   pgrestore encountered errors
> >
> > I'm not worried about the xml_is_well_formed error (or the xml_valid
> > error that would happen next). I think those functions are ancient
> > and irrelevant and not in use, and I'm happy for pg_restore to
> > continue, like it does when gpg is not involved.
> 
> You specified “-1” so I don’t get why you believe pg_restore should be
> continuing to execute in the face of the SQL error.

The reason I believe pg_restore should continue to execute in the face of
the SQL error is that I didn't supply the -e option, which is described
thusly in the pg_restore manual entry:

 -e
 --exit-on-error
 Exit if an error is encountered while sending SQL commands to the database.
 The default is to continue and to display a count of errors at the end of
 the restoration.

So, since I didn't specify the -e option, pg_restore should continue to
execute, and not close stdin. As I explained, when restoring from a file on
disk, the pg_restore command does continue and work fine. It's only when
restoring from stdin that I'm seeing this problem.

Ah, I see. The manual entry also says that -1 implies -e.

And when reading from a file on disk, my load script doesn't include -1.
Instead, it uses the -j option. Now it all makes sense.

Many thanks. It's working without the -1. I'll change the load
script so that it only uses the -1 option when restoring from
new backups taken after the upgrade to 15.8 (where these vestigial
xml functions won't be present in the backup).
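
i.e. something like this in the load script ($backup is hypothetical
shorthand; also, -j can't be used when reading from a pipe, so the
piped case just runs serially):

  # old (pre-15.8) backups: no -1, so the vestigial xml errors are
  # counted and skipped instead of aborting the whole restore
  gpg --decrypt "$backup" | pg_restore -h payroll -p 5433 -U postgres -d payroll_tst -Fc

  # new backups: safe to wrap in a single transaction again
  gpg --decrypt "$backup" | pg_restore -1 -h payroll -p 5433 -U postgres -d payroll_tst -Fc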

> David J.

cheers,
raf





Re: Help with "gpg -d ... | pg_restore ..." with unimportant pg_restore errors

2024-09-03 Thread raf
On Tue, Sep 03, 2024 at 10:28:44PM -0400, Tom Lane  wrote:

> raf  writes:
> > On Tue, Sep 03, 2024 at 06:43:22PM -0700, "David G. Johnston" 
> >  wrote:
> >> You specified “-1” so I don’t get why you believe pg_restore should be
> >> continuing to execute in the face of the SQL error.
> 
> > The reason I believe pg_restore should be continuing to execute in the face 
> > of
> > the SQL error is because I didn't supply the -e option which is described
> > thusly in the pg_restore manual entry:
> 
> But you'd better also read the para about -1:
> 
>   -1
>   --single-transaction
> Execute the restore as a single transaction (that is, wrap the
> emitted commands in BEGIN/COMMIT). This ensures that either all
> the commands complete successfully, or no changes are
> applied. This option implies --exit-on-error.
> 
>   regards, tom lane
> 

Yes, I saw that. Many thanks. All good now.

cheers,
raf





Re: Need help with clarification on stored procedure support in PostGreSQL database

2021-03-08 Thread raf
On Sun, Mar 07, 2021 at 11:01:07AM -0800, Adrian Klaver 
 wrote:

> On 3/7/21 10:48 AM, Bysani, Ram wrote:
> > PostgreSQL version: 11.11
> > 
> > Operating system:   Linux 64 bit
> > 
> > Description:
> > 
> > Greetings
> > 
> > We have found posts on several forums that Stored Procedures are not
> > supported in PostGreSQL database.
> > 
> > We looked at the documentation but it is not clear when Stored Procedure
> > support was added. We are aware that Functions are supported.
> > 
> > Please provide details / clarify if Stored Procedures are available in
> > versions 11.x, 12.x, 13.x Please also confirm when it was actually
> > added.
> > 
> > https://www.postgresql.org/docs/
> > 
> > Thanks for your time
> 
> Go to page David Johnson posted:
> 
> https://www.postgresql.org/docs/current/sql-createprocedure.html
> 
> and look at 'Supported Versions:' at top of page.
> 
> > Regards
> > Ram Bysani
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com

Is there any important difference between a stored procedure and
a stored function that returns void? I've been using functions
for everything but calling them procedures in my head (even though
most of them do return a value or result set).

cheers,
raf





Re: Need help with clarification on stored procedure support in PostGreSQL database

2021-03-08 Thread raf
On Mon, Mar 08, 2021 at 06:31:56PM -0500, Tom Lane  wrote:

> raf  writes:
> > Is there any important difference between a stored procedure and
> > a stored function that returns void? I've been using functions
> > for everything but calling them procedures in my head (even though
> > most of them do return a value or result set).
> 
> A procedure can commit or rollback a transaction (and then start
> a new one).
> 
>   regards, tom lane

thanks.
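
So, if I understand correctly, something like this would be legal in a
procedure but not in a function (a minimal sketch, names made up):

  create table t (n integer);

  create procedure fill_t(batches integer) language plpgsql as $$
  begin
      for i in 1 .. batches loop
          insert into t select generate_series(1, 1000);
          commit;  -- allowed in a procedure, not in a function
      end loop;
  end
  $$;

  call fill_t(10);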

cheers,
raf





Re: Recreating functions after starting the database server.

2018-02-01 Thread raf
Konrad Witaszczyk wrote:

> Hi,
> 
> I'm considering changing my database schema upgrade routines to recreate
> functions and all objects depending on them (e.g. triggers, views) just after
> the database server is started. It would make the routines much simpler since
> they wouldn't have to depend on the history of the schema.
> 
> Does anyone has any experience with recreating all functions and triggers to
> upgrade a database schema assuming that during the upgrade there are no client
> connections to the database?
> 
> Does anyone see any problems with such approach in terms of consistency and
> performance? I'm not familiar with PostgreSQL internals and I'm not sure how 
> it
> would affect the planner when using various function volatile categories.
> 
> Konrad

Hi,

I think the usual way to do this is to have a table containing a
schema version number that can be consulted to work out where
the schema is up to so as to know which schema migration scripts
still need to be run to bring the database up to date. This is
useful for upgrading a system on multiple database servers. All
the cool kids would be doing this. However, I think this is more
common with ORMs where all the code lives outside the database.
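
The version table itself can be trivial, e.g. a sketch:

  create table schema_version (version integer not null);
  insert into schema_version values (0);

  -- and each migration script can end with a guarded bump:
  update schema_version set version = 42 where version = 41;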

I often have to load old database backups that need to be
brought up to date so that the current software will work with
it. I have date/time stamps in the filename of the database
backup and my script to load the backup consults a directory
full of all schema migration scripts that are also date/time
stamped. It then applies the schema migration scripts that are
dated after the time that the backup was made.

After applying all the schema migration scripts, it runs another
script to audit the state of the database functions with respect
to the "source". It then reloads any functions that differ from
the current source. I don't use views or triggers but the same
idea would apply.

My main advice is to not call any functions (or rely on any
views or triggers) inside schema migration scripts because you
never know if they'll exist or not when they are needed.

Also, if any update scripts massage data rather than just change
the schema, don't assume that the data exists just because it
exists when you wrote the migration script. The script might run
on a very old backup taken before the data existed. Or at least
write migrations that avoid producing errors if the data isn't
present. Minimise the output so that errors don't go unnoticed.

I'd also recommend writing a tool that compares the functions,
triggers and views in the database against the code for them on
disk somewhere and reload the ones that differ from source. Once
you have such an audit tool, it's very easy to see what needs to
be reloaded and the tool can do that too.
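
The in-database side of that comparison can come straight out of the
catalogs, e.g. something along these lines (aggregates, if any, would
need to be filtered out first):

  select p.proname, pg_get_functiondef(p.oid)
  from pg_proc p
  join pg_namespace n on n.oid = p.pronamespace
  where n.nspname = 'public';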

A very important part of such an audit tool is that it
(probably?) needs to perform a topological sort on the
functions, triggers and views to load them in a sensible order.
Inspect the source for them all looking for references between
them and do a topological sort so you can load things that are
depended on before loading the things that depend on them. This
might not actually be necessary but I think it is (or I wouldn't
have bothered doing it unless it's cargo cult behaviour on my
part from Sybase days). :-)

I use the function auditing tool to load new or changed
functions on multiple servers all the time but I only apply old
schema migration scripts when loading backups.

One last thing, if you need to keep very old backups
indefinitely for legal reasons, make a point of reloading them
every year to apply the last year's worth of schema changes and
make a backup of that updated version. It makes it less likely
that bit rot will ruin your day in the distant future. But also
keep the original backup because sometimes you need the backup
as it was without any changes made after the fact. The ancient
version won't work with current application code but that can be
the problem of whoever has demanded the data in its original
form.

I've never encountered any problems with consistency and
performance issues in doing something like this but I only have
small databases so that doesn't mean there won't be any.

Good luck!

cheers,
raf