Onfly Query - cumulative sum the stock change values by articles

2021-01-07 Thread Durumdara
Hello!

I have a query where I can show the positive and negative future changes of
the articles.
For example:

select art.id, art.name, art.actual_stock, art.min_stock,
change.stock_change, change.date
from change left join art on art.id = change.art_id
order by art.id, change.id

Ok, I have a list of the changes.

I need to sum these changes to get the rows where the stock is negative or
below the minimum.

1, bla, 100, 20, +20,  2021-02-01, [120]
1, bla, 100, 20, -10,  2021-02-01, [110]
1, bla, 100, 20, -100, 2021-02-01, [10]  - below minimum
1, bla, 100, 20, -20,  2021-02-01, [-10] - below zero
2, bli, 20,  10, +10,  2021-02-01, [20]
2, bli, 20,  10, -15,  2021-02-01, [5]   - below minimum

How do I use a window function (can I use one at all?) to "reset" the running
stock at the first record of each article - that is, to cumulate the data per
article?

Or do I need to store the data in a temporary table and use a stored procedure
to fill in the cumulated stock values?

Thank you for the answer!

Best regards

ddd


RE: Keep needing to run manual analyze

2021-01-07 Thread Rob Northcott
From: Michael Lewis 
Sent: 06 January 2021 18:11
To: Rob Northcott 
Cc: pgsql-gene...@postgresql.org
Subject: Re: Keep needing to run manual analyze

On Wed, Jan 6, 2021 at 10:29 AM Rob Northcott
 wrote:
We have an application that uses a Postgres database (currently 9.6).  All the 
autovacuum/analyze settings are just left at the defaults.
We’ve had a few instances recently where users have complained of slow 
performance and running a manual Analyze (not vacuum) on their database has 
resolved it.
Should it be considered normal that this needs to be done manually periodically 
even when autovacuum is on?
It’s not a very heavy application by a lot of people’s standards (it’s just a 
point-of-sale/stock-control system) but there are a couple of hundred databases 
on the server, all in fairly constant use through the day.

>You may need more workers, and a higher cost limit before work is paused for
>cost_delay. Depending how many tables per database in the cluster, more
>workers would likely be ideal, or *maybe* a smaller naptime if there are tons
>of tables overall and all of them are relatively small/see little changes.

>It really depends on your workload and *why* the tables aren't getting
>analyzed as frequently as you need. If your cost limit/delay mean that the
>auto vacuum/analyze is rather throttled (and default settings would be that
>situation given today's I/O throughput on any decent production machine), and
>you have some large tables with many large indexes that are constantly in need
>of vacuuming, and you don't have sufficient maintenance work memory configured
>to avoid re-scanning the indexes repeatedly to get the work done... you may
>never be getting around to the other tables. If you have a table that is
>(nearly) all inserts, then a periodic vacuum/analyze done manually is prudent
>before PG13.

>Are you logging all autovacuums/analyzes and able to run a pgBadger or
>similar analysis on it? It would be helpful to see some stats on what is
>going on currently.

Thanks for the tips.  I don’t think it’s being logged, unfortunately (but we
could always turn it on if we need more info); what you’ve said at least
confirms that a manual analyze shouldn’t be necessary.  I’ll have to go and
read up on the autovacuum settings (it’s not something I’ve really looked into
in detail before - I just left the default settings, which look like they’re
not doing what we need).
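
For reference, a minimal sketch of the kind of per-table tuning being
discussed (the table name and thresholds are hypothetical):

-- Analyze after ~1% of rows change instead of the default 10%:
ALTER TABLE sales_lines
  SET (autovacuum_analyze_scale_factor = 0.01,
       autovacuum_analyze_threshold = 500);

-- See when autovacuum/autoanalyze last visited each table:
SELECT relname, last_autovacuum, last_autoanalyze, last_analyze
FROM pg_stat_user_tables
ORDER BY last_autoanalyze NULLS FIRST;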


RE: Keep needing to run manual analyze

2021-01-07 Thread Rob Northcott


-Original Message-
From: Laurenz Albe  
Sent: 07 January 2021 02:06
To: Rob Northcott ; pgsql-gene...@postgresql.org
Subject: Re: Keep needing to run manual analyze

On Wed, 2021-01-06 at 17:28 +, Rob Northcott wrote:
>> We have an application that uses a Postgres database (currently 9.6).  All
>> the autovacuum/analyze settings are just left at the defaults.
>> We’ve had a few instances recently where users have complained of slow
>> performance and running a manual Analyze (not vacuum) on their database has
>> resolved it.
>> Should it be considered normal that this needs to be done manually
>> periodically even when autovacuum is on?
>> It’s not a very heavy application by a lot of people’s standards (it’s
>> just a point-of-sale/stock-control system) but there are a couple of
>> hundred databases on the server, all in fairly constant use through the day.
>> Is it possible that the autovacuum/analyze isn’t running for some
>> reason?  If so, is there anything I should look at to make it work
>> better or should I just be looking at scheduling a manual vacuum/analyze
>> periodically on all databases?

>That is not necessary.

>You should figure out what tables need more frequent ANALYZE runs to get good
>execution plans and tune autoanalyze for these tables, for example by
>reducing the scale factor for them.

Thanks - that confirms what the previous poster said, that we should be able to 
set up autovacuum to do what we need.  I'll go away and read up on it.  Thanks 
for the tips.


Re: Using more than one LDAP?

2021-01-07 Thread Paul Förster
Hi Magnus,

> On 06. Jan, 2021, at 16:57, Magnus Hagander  wrote:
> 
> Yes. But you have a really hacky environment :P

actually not. We have an old LDAP which we want to retire this year. And we 
also have Windows AD, which offers LDAP. So the idea is to switch the LDAP 
environments in PostgreSQL. The old LDAP uses aaa-u1, aaa-u2, etc. which are 
also accounts in the database. But our Windows AD has bbb-u1, bbb-u2, etc. So 
just switching LDAPs doesn't work. I'd also have to rename all users. Though 
it's just a one-liner, it would mean that users have to use their new names 
from one second to the next. But we want a transition phase if that's possible.

> You could have a third LDAP instance that federates the other two.
> 
> Another option could be to proxy it through something like FreeRADIUS.
> I'm fairly certain it can also move on to a secondary server if the
> first one reports login failure.

I can't. I'm no sysadmin and have no rights on systems to install anything 
except the PostgreSQL software. Also, the network guys wouldn't be too happy. 
And then, there is a problem introducing new software, which is possible, but 
can take months for us to get the necessary permissions.

> I assume you're not using any of the standard packagings then, as I
> believe they all come with support for GSSAPI. Yet another reason why
> it's a good idea to use that :)

no, we always compile from source, and only what we need. I can build packages
with GSSAPI compiled in, but it does require me to have a small service
interruption if I install packages with the same PostgreSQL version number - a
situation which I'd like to avoid, if possible.

> And no, gssapi does not use certificates.

that's good news as I'm not really happy about all that certificate stuff. ;-)

> pg_ident only works for authentication methods where the username
> comes from the other system, such as with Kerberos. It does not work
> for LDAP, where the username is specified in PostgreSQL.

I don't understand that. The docs say it should work for all external
authentication services. Maybe I misread something?...

Cheers,
Paul



Re: LDAP(s) doc misleading

2021-01-07 Thread Paul Förster
Hi Stephen,

> On 06. Jan, 2021, at 18:14, Stephen Frost  wrote:
> 
> When in an Active Directory environment, it's far more secure to use
> Kerberos/GSSAPI and not LDAP (or LDAPS).  Using the ldap authentication
> method with PostgreSQL will result in the credentials of users being
> sent to the database server, such that if the database server is
> compromised so will all of those user accounts.

I understand. But users can't login on the database server, just on the 
database. Database servers and client machines are located in different network 
zones with firewalls between them.

Also, my point was not about using LDAP(S) versus Kerberos or GSSAPI. My point 
was, that I find the description of the ldapscheme entry misleading.

Cheers,
Paul



Re: LDAP(s) doc misleading

2021-01-07 Thread Magnus Hagander
On Wed, Jan 6, 2021 at 8:36 AM Paul Förster  wrote:
>
> Hi,
>
> I found what I believe to be misleading in the LDAP documentation:
>
> https://www.postgresql.org/docs/current/auth-ldap.html
>
> It says:
> "ldapscheme
> Set to ldaps to use LDAPS."...
>
> IMHO, it should say:
> "ldapscheme
> Set to ldapscheme to use LDAPS (ldapscheme=ldaps)."...

No, I think this is correct.

"Set to ldaps to use ldaps" means you set it to the value "ldaps" in
order to use ldaps.

I think you missed the "to" in the sentence -- without that one, your
reading of it would make more sense. See also the following parameter,
ldaptls, which uses similar language.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: LDAP(s) doc misleading

2021-01-07 Thread Paul Förster
Hi Magnus,

> On 07. Jan, 2021, at 11:04, Magnus Hagander  wrote:
> 
> No, I think this is correct.
> 
> "Set to ldaps to use ldaps" means you set it to the value "ldaps" in
> order to use ldaps.
> 
> I think you missed the "to" in the sentence -- without that one, your
> reading of it would make more sense. See also the following parameter,
> ldaptls, which uses similar language.

argh! Sorry, my bad. Yes, I missed the "to". :-( Thanks.

Cheers,
Paul




Re: How to convert escaped text column - force E prefix

2021-01-07 Thread Durumdara
Dear Members!



Pavel Stehule  wrote (on 6 Jan 2021, Wed, 12:03):

>
>
>
> it cannot work, because \ will be replaced by \\
>
> postgres=# CREATE OR REPLACE FUNCTION public.unistr(text)
>  RETURNS text
>  LANGUAGE plpgsql
>  IMMUTABLE STRICT
> AS $function$
> declare r text;
> begin
>   execute 'select ' || quote_literal($1) into r;
>   return r;
> end;
> $function$
> ;
> CREATE FUNCTION
> postgres=# select unistr('Az ad\u00f3kulcsonk\u00e9nti');
> ┌──┐
> │unistr│
> ╞══╡
> │ Az ad\u00f3kulcsonk\u00e9nti │
> └──┘
> (1 row)
>
>
>
> Gavan Schneider
>>
>
Thank you for the answer!

We will try your solution.

Only one question about it:
Could we use PG's JSON interpreter somehow? I don't know it, but here is pseudo-code:

select
  GET_JSON_FIELD_VALUE(
'name',
FROM_JSON_TEXT(   '{name:' || chr(39) || thistable.thisfield || chr(39)
|| '}' )
  ) from thistable

or use FORMAT instead of CONCAT.

Could this work? What do you think about the vulnerability?
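
(For illustration, a minimal jsonb-based sketch of that idea - it relies on
the \uXXXX escapes being valid JSON, and it fails if the text contains a raw
double quote or a backslash that doesn't start a valid JSON escape:)

select ('"' || thistable.thisfield || '"')::jsonb #>> '{}' from thistable;

-- e.g. select ('"' || 'Az ad\u00f3kulcsonk\u00e9nti' || '"')::jsonb #>> '{}';
-- yields: Az adókulcsonkénti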

Thank you!

dd


Re: How to convert escaped text column - force E prefix

2021-01-07 Thread Pavel Stehule
On Thu, 7 Jan 2021 at 12:13, Durumdara  wrote:

> Dear Members!
>
>
>
> Pavel Stehule  wrote (on 6 Jan 2021, Wed, 12:03):
>
>>
>>
>>
>> it cannot work, because \ will be replaced by \\
>>
>> postgres=# CREATE OR REPLACE FUNCTION public.unistr(text)
>>  RETURNS text
>>  LANGUAGE plpgsql
>>  IMMUTABLE STRICT
>> AS $function$
>> declare r text;
>> begin
>>   execute 'select ' || quote_literal($1) into r;
>>   return r;
>> end;
>> $function$
>> ;
>> CREATE FUNCTION
>> postgres=# select unistr('Az ad\u00f3kulcsonk\u00e9nti');
>> ┌──┐
>> │unistr│
>> ╞══╡
>> │ Az ad\u00f3kulcsonk\u00e9nti │
>> └──┘
>> (1 row)
>>
>>
>>
>> Gavan Schneider
>>>
>>
> Thank you for the answer!
>
> We will try your solution.
>
> Only one question about it:
> Could we use PG's JSON interpreter somehow? I don't know it, but here is pseudo-code:
>
> select
>   GET_JSON_FIELD_VALUE(
> 'name',
> FROM_JSON_TEXT(   '{name:' || chr(39) || thistable.thisfield ||
> chr(39) || '}' )
>   ) from thistable
>
> or use FORMAT instead of CONCAT.
>
> Could this work? What do you think about the vulnerability?
>

The vulnerability is almost the same although it is a little bit harder to
create attack strings.

Regards

Pavel


> Thank you!
>
> dd
>
>


Re: Using more than one LDAP?

2021-01-07 Thread Magnus Hagander
On Thu, Jan 7, 2021 at 10:40 AM Paul Förster  wrote:
>
> Hi Magnus,
>
> > On 06. Jan, 2021, at 16:57, Magnus Hagander  wrote:
> >
> > Yes. But you have a really hacky environment :P
>
> actually not. We have an old LDAP which we want to retire this year. And we 
> also have Windows AD, which offers LDAP. So the idea is to switch the LDAP 
> environments in PostgreSQL. The old LDAP uses aaa-u1, aaa-u2, etc. which are 
> also accounts in the database. But our Windows AD has bbb-u1, bbb-u2, etc. So 
> just switching LDAPs doesn't work. I'd also have to rename all users. Though 
> it's just a one-liner, it would mean that users have to use their new names 
> from one second to the next. But we want a transition phase if that's 
> possible.
>
> > You could have a third LDAP instance that federates the other two.
> >
> > Another option could be to proxy it through something like FreeRADIUS.
> > I'm fairly certain it can also move on to a secondary server if the
> > first one reports login failure.
>
> I can't. I'm no sysadmin and have no rights on systems to install anything 
> except the PostgreSQL software. Also, the network guys wouldn't be too happy. 
> And then, there is a problem introducing new software, which is possible, but 
> can take months for us to get the necessary permissions.

This would be the hacky part of the environment :)



> > I assume you're not using any of the standard packagings then, as I
> > believe they all come with support for GSSAPI. Yet another reason why
> > it's a good idea to use that :)
>
> no, we always compile from source, and only what we need. I can build packages
> with GSSAPI compiled in, but it does require me to have a small service
> interruption if I install packages with the same PostgreSQL version number - a
> situation which I'd like to avoid, if possible.

And this would be the second hacky part of the environment :)


> > And no, gssapi does not use certificates.
>
> that's good news as I'm not really happy about all that certificate stuff. ;-)
>
> > pg_ident only works for authentication methods where the username
> > comes from the other system, such as with Kerberos. It does not work
> > for LDAP, where the username is specified in PostgreSQL.
>
> I don't understand that. The docs say it should work for all external
> authentication services. Maybe I misread something?...

The docs say "When using an external authentication system such as
Ident or GSSAPI, the name of the operating system user that initiated
the connection might not be the same as the database user (role) that
is to be used."

I think that's a bit of a left-over from when it was really just ident.
First of all it should probably say peer rather than ident, and it's
not actually operating system users that are relevant here.

So I can understand you getting a bit confused by that. But the
property that matters is where the username comes from. With GSSAPI, or
peer, or certificate, etc., the username is provided by the external
system, and the mapping is applied *after* that.

With LDAP authentication, the username is provided by the client, and
is then passed to the external system.

Mapping applies *after* the authentication, which in the case of LDAP
would be too late to make any difference.

The references to "unix user" and "operating system users" are
probably a leftover from the old days and actually contribute to some
of the confusion, I think.
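
A minimal illustration of that difference, in pg_hba.conf / pg_ident.conf
terms (realm, map name and addresses are hypothetical):

# pg_hba.conf - GSSAPI: the username arrives from the external system,
# so a username map can translate it:
host  all  all  10.0.0.0/8  gss  include_realm=1 map=adusers

# pg_ident.conf - strip the realm: principal u1@EXAMPLE.COM -> role u1
adusers  /^(.*)@EXAMPLE\.COM$  \1

# pg_hba.conf - LDAP: the client supplies the PostgreSQL role name itself,
# so there is nothing left for a map to translate:
host  all  all  10.0.0.0/8  ldap  ldapserver=ad.example.com ldapbasedn="dc=example,dc=com" ldapsearchattribute=sAMAccountName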

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: Onfly Query - cumulative sum the stock change values by articles

2021-01-07 Thread Durumdara
Hello!


Durumdara  wrote (on 7 Jan 2021, Thu, 10:17):

> Hello!
>
> I have a query where I can show the positive and negative future changes
> of the articles.
> For example:
>
> select art.id, art.name, art.actual_stock, art.min_stock,
> change.stock_change, change.date
> from change left join art on art.id = change.art_id
> order by art.id, change.id
>
> Ok, I have a list of the changes.
>
> I need to sum these changes to get the rows where the stock is negative or
> below the minimum.
>
> 1, bla, 100, 20, +20,  2021-02-01, [120]
> 1, bla, 100, 20, -10,  2021-02-01, [110]
> 1, bla, 100, 20, -100, 2021-02-01, [10]  - below minimum
> 1, bla, 100, 20, -20,  2021-02-01, [-10] - below zero
> 2, bli, 20,  10, +10,  2021-02-01, [20]
> 2, bli, 20,  10, -15,  2021-02-01, [5]   - below minimum
>
> How do I use a window function (can I use one at all?) to "reset" the
> running stock at the first record of each article - that is, to cumulate
> the data per article?
>
> Or do I need to store the data in a temporary table and use a stored
> procedure to fill in the cumulated stock values?
>
>
>
Sorry... I solved it!

First I need to window-sum "change.stock_change" (partition by art.id), then
I need to add the starting stock value to every row.
In this case I don't need to check which row is the first one for an article
(where I would otherwise have to add the actual stock).
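
A minimal sketch of that, using the table and column names from the original
query:

select art.id, art.name, art.actual_stock, art.min_stock,
       change.stock_change, change.date,
       art.actual_stock
         + sum(change.stock_change)
             over (partition by art.id order by change.id) as running_stock
from change
join art on art.id = change.art_id
order by art.id, change.id;

An outer query can then filter the rows where running_stock is below
min_stock or below zero.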

Best wishes
  dd


Re: Using more than one LDAP?

2021-01-07 Thread Paul Förster
Hi Magnus,

> On 07. Jan, 2021, at 12:43, Magnus Hagander  wrote:
> 
> The docs say "When using an external authentication system such as
> Ident or GSSAPI, the name of the operating system user that initiated
> the connection might not be the same as the database user (role) that
> is to be used."
> 
> I think that's a bit of a left-over to when it was really just ident.
> First of all it should probably say peer rather than ident, and it's
> not actually operating systems that are relevant here.
> 
> So I can understand you getting ab it confused by that. but the
> property that matter is where the username comes from. In GSSAPI, or
> peer, or certificate, etc, the username is provided by the external
> system, and the mapping is applied *after* that.
> 
> With LDAP authentication, the username is provided by the client, and
> is then passed to the external system.
> 
> Mapping applies *after* the authentication, which inthe case of LDAP
> would be too late to make any difference.
> 
> The references to "unix user" and "operating system users" are
> probably a leftover from the old days and actually contribute to some
> of the confusion I think.

that explains it.

The use case in our company is: Developers connect with tools like DbVisualizer 
or SQL Developer (Oracle using the nasty PostgreSQL Hack :-( ) providing their 
username via JDBC to the database. Developers work on Windows, the databases 
run on Linux (SLES) and the AD obviously runs on Windows.

Ok, since LDAP doesn't work that way, I either need to build GSSAPI packages
and have the AD admins provide me with the keytab file, or make the
transition a "hard" one, i.e. no transition phase. Though I'd rather have
liked to see a transition phase where either account could have been used, I
personally can live with that. It's the developers who will have to change
quickly, not me. ;-)

Cheers,
Paul



Re: How to convert escaped text column - force E prefix

2021-01-07 Thread David G. Johnston
On Thursday, January 7, 2021, Pavel Stehule  wrote:

>
>
> The vulnerability is almost the same although it is a little bit harder to
> create attack strings.
>

Would making the function run as “security definer” and setting up a
minimal permissions user/owner help with mitigation?

David J.


Re: How to convert escaped text column - force E prefix

2021-01-07 Thread Pavel Stehule
On Thu, 7 Jan 2021 at 15:50, David G. Johnston
 wrote:

> On Thursday, January 7, 2021, Pavel Stehule 
> wrote:
>
>>
>>
>> The vulnerability is almost the same although it is a little bit harder
>> to create attack strings.
>>
>
> Would making the function run as “security definer” and setting up a
> minimal permissions user/owner help with mitigation?
>

yes. It is a very different usage of security definer functions, but it can
work.

Regards

Pavel


> David J.
>


How to keep format of views source code as entered?

2021-01-07 Thread Markhof, Ingolf
Hi!

Switching from Oracle SQL to PostgreSQL I am facing the issue that the SQL code
the system returns when I open a view's source code is different from the code
I entered. The code is formatted differently, comments are gone, and e.g. all
text constants got an explicit cast to ::text added (see sample below).

I want the SQL code of my views stored as I entered it. Is there any way to
achieve this? Or will I be forced to maintain my views' SQL code outside of
PostgreSQL views?

Any hints welcome!

Here is an example:

I enter this code to define a simple view:

create or replace view myview as
select
  product_id,
  product_acronym
from
  products -- my comment here
where
  product_acronym = 'ABC'
;

However, when I open the view in my SQL client (DBeaver) again, this is what I get:

CREATE OR REPLACE VIEW myview
AS SELECT product_id,
product_acronym
   FROM products
  WHERE product_acronym = 'ABC'::text;

So, the formatting changed, keywords are capitalized, the comment I added in 
the from-part has gone and the text constant 'ABC' changed to 'ABC'::text.






Verizon Deutschland GmbH - Sebrathweg 20, 44149 Dortmund, Germany - Amtsgericht 
Dortmund, HRB 14952 - Geschäftsführer: Detlef Eppig - Vorsitzender des 
Aufsichtsrats: Francesco de Maio


Re: How to keep format of views source code as entered?

2021-01-07 Thread Christophe Pettus



> On Jan 7, 2021, at 08:19, Markhof, Ingolf  
> wrote:
> I want the SLQ code of my views stored as I entered it. Is there any way to 
> achieve this? Or will I be forced to maintain my views SQL code outside of 
> PostgreSQL views?

The text that you get back from the PostgreSQL system catalogs is based on the 
parsed version of the view definition, rather than the literal text you 
entered.  Generally, you maintain your view definition separately in a source 
code control system in its original form.
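
For reference, that parsed form is what the server itself reconstructs on
request, e.g. for the view from this thread:

SELECT pg_get_viewdef('myview'::regclass, true);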

--
-- Christophe Pettus
   x...@thebuild.com





Re: How to keep format of views source code as entered?

2021-01-07 Thread Tom Lane
"Markhof, Ingolf"  writes:
> I want the SQL code of my views stored as I entered it. Is there any way to
> achieve this?

No.  Lots of people prefer to keep their SQL code in some sort of
source-code-control system, anyway.

regards, tom lane




RE: How to keep format of views source code as entered?

2021-01-07 Thread Markhof, Ingolf
So, it looks like PostgreSQL does support saving the original source code of a 
view.

What's best practice to use as a code repository?

I would expect support for multi-user access, access-rights management and
perhaps versioning as well…?

Thanks for your help!

Ingolf

From: Markhof, Ingolf [mailto:ingolf.mark...@de.verizon.com]
Sent: 07 January 2021 17:19
To: pgsql-general@lists.postgresql.org
Subject: [E] How to keep format of views source code as entered?

Hi!

Switching from Oracle SQL to PostgreSQL I am facing the issue that the SQL code
the system returns when I open a view's source code is different from the code
I entered. The code is formatted differently, comments are gone, and e.g. all
text constants got an explicit cast to ::text added (see sample below).

I want the SQL code of my views stored as I entered it. Is there any way to
achieve this? Or will I be forced to maintain my views' SQL code outside of
PostgreSQL views?

Any hints welcome!

Here is an example:

I enter this code to define a simple view:

create or replace view myview as
select
  product_id,
  product_acronym
from
  products -- my comment here
where
  product_acronym = 'ABC'
;

However, when I open the view in my SQL client (DBeaver) again, this is what I get:

CREATE OR REPLACE VIEW myview
AS SELECT product_id,
product_acronym
   FROM products
  WHERE product_acronym = 'ABC'::text;

So, the formatting changed, keywords are capitalized, the comment I added in 
the from-part has gone and the text constant 'ABC' changed to 'ABC'::text.




Re: How to keep format of views source code as entered?

2021-01-07 Thread Christophe Pettus
Hello,

> On Jan 7, 2021, at 09:33, Markhof, Ingolf  
> wrote:
> 
> So, it looks like PostgreSQL does support saving the original source code of 
> a view.

To be clear, PostgreSQL itself does not.  The suggestion is to use an external 
source code repository, such as GitHub, GitLab, or one of (many!) other tools 
or products to store the view definition.

This has benefits besides just retaining the original source code, as you 
mention below: Version control, tracking, issue management and commit merging, 
etc.
--
-- Christophe Pettus
   x...@thebuild.com





RE: Is there a good discussion of optimizations?

2021-01-07 Thread Kevin Brannen
From: Guyren Howe 

>Most folks, in my experience, who use relational databases don’t really 
>understand the basic theory or even more important the why - the philosophy - 
>of what a relational database is and how to get the most out of them. I see a 
>lot of folks trying to use SQL in an imperative manner - make this temp table, 
>then update it some, then make this other temp table, etc...

>Anyway, I’d like to put together something that explains this. I would go into 
>Codd’s original insight and how simple yet powerful it is. I’d like to discuss 
>how if you stick to SQL, rather than forcing it into this imperative 
>straight-jacket, the database can work out all the details so you don’t have 
>to do the bad things.

Be sure you point out where SQL either fails or comes up short to give a
balanced view. To be fair, sometimes the reason people perceive SQL as failing
them (and why they go with the imperative process) is because they don't know
everything SQL does, or even perhaps their tool doesn't implement the whole
standard.

An example of this is that we have a report we're trying to write that I'd
like to think can be done in SQL, but I can't think of a way to do it. Yet,
if I do the base query and pull the data back into my application, I can do
the last bit with 3 lines of Perl very easily. The problem here revolves
around comparing a row of data to the previous row to know whether the data
changed "significantly enough" to keep the new row.

Another example is doing running totals. A couple of years ago I would have
said SQL can't do that. Now I know about the OVER clause, something that I
would categorize as somewhat obscure, I can do it as needed.
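
(A minimal running-total sketch, with hypothetical table and column names:)

select taken_at, amount,
       sum(amount) over (order by taken_at) as running_total
from payments;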

As Michael Lewis pointed out, large datasets can also cause you to choose not
to use SQL in one big statement for everything (i.e. advocating the use of
temp tables). In some ways, using a CTE is a type of temp table, or at least I
view it as such. That allows a person to solve a problem in bite-sized chunks.
I will agree that the optimizer can do it better at times, but the code also
has to be maintained as well - a balancing act.

To continue the above, the idea of using a sub-select in a FROM clause is
generally not thought of by new learners. So experience is also a factor.

I think your idea is a good one, but I do hope you present that SQL can't
solve everything ... else why do we have plpgsql. :) You’re correct though,
SQL isn’t used as much as it should be in many places.

Kevin


RE: Is there a good discussion of optimizations?

2021-01-07 Thread Guyren Howe
On Jan 7, 2021, 13:07 -0800, Kevin Brannen , wrote:
> From: Guyren Howe 
>
> >Most folks, in my experience, who use relational databases don’t really 
> >understand the basic theory or even more important the why - the philosophy 
> >- of what a relational database is and how to get the most out of them. I 
> >see a lot of folks trying to use SQL in an imperative manner - make this 
> >temp table, then update it some, then make this other temp table, etc...

Actually, I’m mostly going to talk about the relational model, rather than SQL. 
Our industry seems to always settle for third-best, and SQL is the worst of all 
the examples of this. The world desperately needs a good relational database 
based on a better query language — datalog, for example.

I put up with SQL so I can use the relational model, and I think that 
understanding SQL has to start with that.

Anyhow.
> An example of this is that we have a report we're trying to write that I'd
> like to think can be done in SQL, but I can't think of a way to do it. Yet,
> if I do the base query and pull the data back into my application, I can do
> the last bit with 3 lines of Perl very easily. The problem here revolves
> around comparing a row of data to the previous row to know whether the data
> changed "significantly enough" to keep the new row.
>
> Another example is doing running totals. A couple of years ago I would have
> said SQL can't do that. Now I know about the OVER clause, something that I
> would categorize as somewhat obscure, I can do it as needed.

Actually, Window functions might be “advanced”, but are certainly not obscure. 
Your example sounds like it’s trivially solved with LAG().
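
(For instance, a minimal sketch of the keep-only-significant-changes report -
table, columns and threshold are hypothetical:)

select *
from (
  select r.*,
         lag(reading) over (order by taken_at) as prev_reading
  from readings r
) s
where s.prev_reading is null
   or abs(s.reading - s.prev_reading) > 5;  -- "significantly enough"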
> As Michael Lewis pointed out, large datasets can also cause you to choose not
> to use SQL in one big statement for everything (i.e. advocating the use of
> temp tables). In some ways, using a CTE is a type of temp table, or at least I
> view it as such. That allows a person to solve a problem in bite-sized chunks.
> I will agree that the optimizer can do it better at times, but the code also
> has to be maintained as well - a balancing act.

This appears to be good advice with SQL Server, which I’m coming to learn has a 
fairly poor query optimizer. But I would have thought Postgres’s optimizer 
would usually use a temporary table where appropriate.

Curious to hear if that’s wrong.
> I think your idea is a good one, but I do hope you present that SQL can't
> solve everything ... else why do we have plpgsql. :) You’re correct though,
> SQL isn’t used as much as it should be in many places.

An important consideration will be when relational is inappropriate. My biggest 
goal, though, is to get folks to understand how much relations *can* do — far 
too many devs in my experience don’t use the power of SQL because they don’t 
understand it.

Thanks for taking the time to give me this feedback.


Re: Is there a good discussion of optimizations?

2021-01-07 Thread Florents Tselai
Apologies for the shameless self-promotion :)

Around a year ago I collected my thoughts on this topic. You can read the
post here: Modern Data Practice and the SQL Tradition.
It looks like it resonated with a lot of folks in the community. HN
Discussion: https://news.ycombinator.com/item?id=21482114

I would specifically underline the fact that the newer generation of
programmers & data pros (my former self included) don't really appreciate
things like triggers and server-side programming. Triggers and DB-side
functions are considered something like Assembly code.

Not many neophytes have been shown, with use cases, why for example writing
2-3 lines of PL/SQL can save you a huge overhead of back-and-forth and
environment setup to write the same thing in, say, Pandas.

I would focus on triggers, indexes on expressions, and time-related
functions. Probably on generated columns too. They may be considered a new
feature, but the reasoning of successively building columns on top of a few
base ones is quite appealing nowadays, especially for ML purposes.

I also wouldn't hesitate to discuss advanced topics. They are usually
considered obscure because people don't present useful examples, but rather
toy and arbitrary ones.

In a recent O'Reilly training, I was skeptical of talking about triggers
for optimization but it looks like it was probably the most useful part of
my training as students could actually "steal and tweak" my code.



*Florents Tselai*

*Data Scientist & Engineer*

Jack of All Trades and Master of Some

http://tselai.com



If I'm late for a meeting: I'll be there in 5'. If not, reread this.

Sent from mobile excuse any...Nah... typos are due to sheer laziness or
neglect.


On Thu, Jan 7, 2021 at 11:21 PM Guyren Howe  wrote:

> On Jan 7, 2021, 13:07 -0800, Kevin Brannen , wrote:
>
> From: Guyren Howe 
>
>
>
> >Most folks, in my experience, who use relational databases don’t really
> understand the basic theory or even more important the why - the philosophy
> - of what a relational database is and how to get the most out of them. I
> see a lot of folks trying to use SQL in an imperative manner - make this
> temp table, then update it some, then make this other temp table, etc...
>
>
> Actually, I’m mostly going to talk about the relational model, rather than
> SQL. Our industry seems to always settle for third-best, and SQL is the
> worst of all the examples of this. The world desperately needs a good
> relational database based on a better query language — datalog, for example.
>
> I put up with SQL so I can use the relational model, and I think that
> understanding SQL has to start with that.
>
> Anyhow.
>
> An example of this is that we have a report we're trying to write that I'd
>
> like to think can be done in SQL, but I can't think of a way to do it. Yet,
>
> if I do the base query and pull the data back into my application, I can do
>
> the last bit with 3 lines of Perl very easily. The problem here revolves
>
> around comparing a row of data to the previous row to know whether the data
>
> changed "significantly enough" to keep the new row.
>
>
>
> Another example is doing running totals. A couple of years ago I would have
>
> said SQL can't do that. Now I know about the OVER clause, something that I
>
> would categorize as somewhat obscure, I can do it as needed.
>
>
> Actually, Window functions might be “advanced”, but are certainly not
> obscure. Your example sounds like it’s trivially solved with LAG().
>
> As Michael Lewis pointed out, large dataset can also cause you to choose
> not
>
> to use SQL in 1 big statemet for everything (i.e. advocating the use to
> temp
>
> tables). In some ways, using a CTE is a type of temp table, or at least I
>
> view it as such. That allows a person to solve a problem in bite-sized
> chunks.
>
> I will agree that optimization can do it better at times, but the code
> also has
>
> to be maintained as well – a balancing act.
>
>
> This appears to be good advice with SQL Server, which I’m coming to learn
> has a fairly poor query optimizer. But I would have thought Postgres’s
> optimizer would usually use a temporary table where appropriate.
>
> Curious to hear if that’s wrong.
>
> I think your idea is a good one, but I do hope you present that SQL can't
>
> solve everything ... else why do we have plpgsql. :) You’re correct though,
>
> SQL isn’t used as much as it should be in many places.
>
>
> An important consideration will be when relational is inappropriate. My
> biggest goal, though, is to get folks to understand how much relations
> *can* do — far too many devs in my experience don’t use the power of SQL
> because they don’t understand it.
>
> Thanks for taking the time to give me this 

Re: Is there a good discussion of optimizations?

2021-01-07 Thread Guyren Howe
On Jan 7, 2021, 13:42 -0800, Florents Tselai , wrote:
> Apologies for the shameless self-promotion :)
>
> Around a year ago I collected my thoughts on this topic. You can read the
> post here: Modern Data Practice and the SQL Tradition.
> It looks like it resonated with a lot of folks in the community. HN
> Discussion: https://news.ycombinator.com/item?id=21482114
>
> I would specifically underline the fact that the newer generation of 
> programmers & data pros (my former self included) don't really appreciate 
> things like triggers and server-side programming. Triggers and DB-side 
> functions are considered something like Assembly code.
>
> Not many neophytes have been shown with use cases why for example writing 2-3 
> lines of PL/SQL can save you a huge overhead of back and forth and 
> environment set up to write the same thing in say Pandas.
>
> I would focus on Triggers, Indices on expressions, and time-related 
> functions. Probably on generated columns too. They may be considered a new 
> feature, but the reasoning of building successively columns on top of a few 
> base ones is quite appealing nowadays, especially for ML purposes.
>
> I also wouldn't hesitate to discuss advanced topics. They are usually 
> considered obscure because people don't present useful examples, rather toy 
> and arbitrary ones.
>
> In a recent O'Reilly training, I was skeptical of talking about triggers for 
> optimization but it looks like it was probably the most useful part of my 
> training as students could actually "steal and tweak" my code.

Thanks for this. May I steal some of your examples if they prove useful? I’ll 
credit you of course.

I’m planning on somewhat emphasizing that a relational database is a logic 
engine. Viewed through this lens, a query or view is a “backward” implication 
and a trigger is a “forward” one. This leads to considering triggers (and the 
moral equivalent in external code) as requiring “truth maintenance”, and is a 
great way to think about when the database is the appropriate place for some 
bit of logic.
> On Thu, Jan 7, 2021 at 11:21 PM Guyren Howe  wrote:
> > On Jan 7, 2021, 13:07 -0800, Kevin Brannen , wrote:
> > > From: Guyren Howe 
> > >
> > > >Most folks, in my experience, who use relational databases don’t really 
> > > >understand the basic theory or even more important the why - the 
> > > >philosophy - of what a relational database is and how to get the most 
> > > >out of them. I see a lot of folks trying to use SQL in an imperative 
> > > >manner - make this temp table, then update it some, then make this other 
> > > >temp table, etc...
> >
> > Actually, I’m mostly going to talk about the relational model, rather than 
> > SQL. Our industry seems to always settle for third-best, and SQL is the 
> > worst of all the examples of this. The world desperately needs a good 
> > relational database based on a better query language — datalog, for example.
> >
> > I put up with SQL so I can use the relational model, and I think that 
> > understanding SQL has to start with that.
> >
> > Anyhow.
> > > An example of this is that we have a report we're trying to write that I'd
> > > like to think can be done in SQL, but I can't think of a way to do it. 
> > > Yet,
> > > if I do the base query and pull the data back into my application, I can 
> > > do
> > > the last bit with 3 lines of Perl very easily. The problem here revolves
> > > around comparing a row of data to the previous row to know whether the 
> > > data
> > > changed "significantly enough" to keep the new row.
> > >
> > > Another example is doing running totals. A couple of years ago I would 
> > > have
> > > said SQL can't do that. Now I know about the OVER clause, something that I
> > > would categorize as somewhat obscure, I can do it as needed.
> >
> > Actually, Window functions might be “advanced”, but are certainly not 
> > obscure. Your example sounds like it’s trivially solved with LAG().
> > > As Michael Lewis pointed out, large dataset can also cause you to choose 
> > > not
> > > to use SQL in 1 big statemet for everything (i.e. advocating the use to 
> > > temp
> > > tables). In some ways, using a CTE is a type of temp table, or at least I
> > > view it as such. That allows a person to solve a problem in bite-sized 
> > > chunks.
> > > I will agree that optimization can do it better at times, but the code 
> > > also has
> > > to be maintained as well – a balancing act.
> >
> > This appears to be good advice with SQL Server, which I’m coming to learn 
> > has a fairly poor query optimizer. But I would have thought Postgres’s 
> > optimizer would usually use a temporary table where appropriate.
> >
> > Curious to hear if that’s wrong.
> > > I think your idea is a good one, but I do hope you present that SQL can't
> > > solve everything ... else why do we have plpgsql. :) You’re correct 
> > > though,
> > > SQL isn’t used as much as it should be in many places.
> >
> > An important consideration will be

Re: Is there a good discussion of optimizations?

2021-01-07 Thread Rob Sargent


On 1/7/21 2:48 PM, Guyren Howe wrote:
> On Jan 7, 2021, 13:42 -0800, Florents Tselai wrote:
>> Apologies for the shameless self-promotion :)
>> [...]
>
> Thanks for this. May I steal some of your examples if they prove useful?
> I'll credit you of course.
>
> I'm planning on somewhat emphasizing that a relational database is a logic
> engine. Viewed through this lens, a query or view is a "backward"
> implication and a trigger is a "forward" one. This leads to considering
> triggers (and the moral equivalent in external code) as requiring "truth
> maintenance", and is a great way to think about when the database is the
> appropriate place for some bit of logic.

RE: Is there a good discussion of optimizations?

2021-01-07 Thread Kevin Brannen
From: Guyren Howe
>Actually, I’m mostly going to talk about the relational model, rather than 
>SQL. Our industry seems to always settle for third-best, and SQL is the worst 
>of all the examples of this. The world desperately needs a good relational 
>database based on a better query language — datalog, for example.

>I put up with SQL so I can use the relational model, and I think that 
>understanding SQL has to start with that.

I can understand that. :) Yet SQL is the interface/lens we most use to
interact with an RDBMS.


>>An example of this is that we have a report we're trying to write that I'd
>>like to think can be done in SQL, but I can't think of a way to do it. Yet,
>>if I do the base query and pull the data back into my application, I can do
>>the last bit with 3 lines of Perl very easily. The problem here revolves
>>around comparing a row of data to the previous row to know whether the data
>>changed "significantly enough" to keep the new row.
>>
>>Another example is doing running totals. A couple of years ago I would have
>>said SQL can't do that. Now I know about the OVER clause, something that I
>>would categorize as somewhat obscure, I can do it as needed.

>Actually, Window functions might be "advanced", but are certainly not obscure. 
>Your example sounds like it’s trivially solved with LAG().

Fair enough. Perhaps it would be better to say that many don't go to the
depths to learn window functions until forced to, and a lot of problems can
be solved without them. I can say that would be true for me and those I work
with.

Thanks for the tip on LAG()! It's not a function I've used before (which
makes it a great example for the previous para).


>>As Michael Lewis pointed out, large datasets can also cause you to choose not
>>to use SQL in one big statement for everything (i.e. advocating the use of
>>temp tables). In some ways, using a CTE is a type of temp table, or at least I
>>view it as such. That allows a person to solve a problem in bite-sized chunks.
>>I will agree that the optimizer can do it better at times, but the code also
>>has to be maintained as well - a balancing act.

>This appears to be good advice with SQL Server, which I’m coming to learn has 
>a fairly poor query optimizer. But I would have thought Postgres’s optimizer 
>would usually use a temporary table where appropriate.

I'm sure Pg's optimizer does its work better than I would manually, but
that's not always the point. For some I've talked to, thinking relationally
and in sets is hard. I'll even admit that while I think I do it pretty well,
there are times I have to stop and really consider what I'm doing. If you're
used to thinking about how to solve a problem with a computer language, most
of them are imperative and that mindset can be hard to get out of. So your
"philosophy" approach is a well-aimed arrow in many ways, IMO.

Also, don't forget about "maintenance". Solving something in bite-sized
chunks allows for easier changes in the future, as well as understanding by
newcomers -- generally speaking.

>Curious to hear if that’s wrong.

Depends on what version. In earlier versions of Pg, CTEs were a fence and
there was no crossing over. That fence now has gates in some places in the
latest versions, from what I read.
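
For reference, since PostgreSQL 12 the fence is explicit and opt-in (table
and column names hypothetical):

with w as materialized (   -- force the old fencing behaviour
  select * from big_table where flag
)
select * from w where id = 42;

Without MATERIALIZED (or with NOT MATERIALIZED), PostgreSQL 12 and later may
inline the CTE into the outer query.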


>>I think your idea is a good one, but I do hope you present that SQL can't
>>solve everything ... else why do we have plpgsql. :) You're correct though,
>>SQL isn't used as much as it should be in many places.

>An important consideration will be when relational is inappropriate. My 
>biggest goal, though, is to get folks to understand how much relations *can* 
>do — far too many devs in my experience don’t use the power of SQL because 
>they don’t understand it.

>Thanks for taking the time to give me this feedback.

I appreciate the efforts of teachers in all forms. I hope your project goes 
well and the feedback has some use.

Kevin


Re: How to keep format of views source code as entered?

2021-01-07 Thread raf
On Thu, Jan 07, 2021 at 05:33:48PM +0000, "Markhof, Ingolf"
 wrote:

> So, it looks like PostgreSQL does support saving the original source code of 
> a view.
> 
> What's best practice to use as a code repository?
> 
> I would expect support for multi-user access, access-rights management and 
> perhaps versioning as well…?
> 
> Thanks for your help!
> 
> Ingolf

Hi, I've only used stored functions (not views or
triggers), and I have tools for auditing, loading, and
dropping stored functions to match the code that is in
git (so updates can be easily applied to multiple
copies of the database), and Postgres has never altered
the code that it stores, otherwise, auditing the code
in the database against the code in git wouldn't work.

But since postgres does store a possibly altered parsed
version, you could alter your source to match
Postgres's parsed version of it. Maybe I encountered
this too long ago to remember having to adjust.

For version control, I'd recommend using git, or
whatever you are using for the rest of your code.

For multi-user access rights management, I'm not sure.
You can grant multiple users the right to create things
in the database. See the documentation on the grant
statement. e.g.:

  https://www.postgresql.org/docs/12/sql-grant.html

I'm guessing that you want:

  grant create on database ... to ...

But I don't know if it can be restricted to only
creating views. If not, it might grant too much access.
You'll also want to make sure that they all have write
access to the same git repository where the views are.

cheers,
raf

> From: Markhof, Ingolf [mailto:ingolf.mark...@de.verizon.com]
> Sent: 07 January 2021 17:19
> To: pgsql-general@lists.postgresql.org
> Subject: [E] How to keep format of views source code as entered?
> 
> Hi!
> 
> Switching from Oracle SQL to PostgreSQL I am facing the issue that
> the SQL code the system returns when I open a view's source code is
> different from the code I entered. The code is formatted differently,
> comments are gone, and e.g. all text constants got an explicit cast to
> ::text added (see sample below).
> 
> I want the SQL code of my views stored as I entered it. Is there any
> way to achieve this? Or will I be forced to maintain my views' SQL code
> outside of PostgreSQL views?
> 
> Any hints welcome!
> 
> Here is an example:
> 
> I enter this code to define a simple view:
> 
> create or replace view myview as
> select
>   product_id,
>   product_acronym
> from
>   products -- my comment here
> where
>   product_acronym = 'ABC'
> ;
> 
> However, when I open the view my SQL client (DBeaver) again, this is what I 
> get:
> 
> CREATE OR REPLACE VIEW myview
> AS SELECT product_id,
> product_acronym
>    FROM products
>   WHERE product_acronym = 'ABC'::text;
> 
> So, the formatting changed, keywords are capitalized, the comment I
> added in the from-part has gone and the text constant 'ABC' changed to
> 'ABC'::text.
> 




Re: How to keep format of views source code as entered?

2021-01-07 Thread David G. Johnston
On Thu, Jan 7, 2021 at 4:38 PM raf  wrote:

>
> Hi, I've only used stored functions (not views or
> triggers),


Extrapolating to these other types of objects based upon experiences with
functions isn't all that helpful.

> and Postgres has never altered
> the code that it stores,

Right, you use functions...

> But since postgres does store a possibly altered parsed
> version, you could alter your source to match
> Postgres's parsed version of it. Maybe I encountered
> this too long ago to remember having to adjust.
>

Now you've introduced PostgreSQL version dependency into the mix.


> For version control, I'd recommend using git, or
> whatever you are using for the rest of your code.
>

Yes, consider the original text as being official, not what is stored in
the database.  Don't allow changes to get pushed to the database unless
driven from the source code.

>
> For multi-user access rights management, I'm not sure.
> You can grant multiple users the right to create things
> in the database. See the documentation on the grant
> statement. e.g.:
>
>   https://www.postgresql.org/docs/12/sql-grant.html
>
> I'm guessing that you want:
>
>   grant create on database ... to ...
>
> But I don't know if it can be restricted to only
> creating views. If not, it might grant too much access.
>

It cannot.


> You'll also want to make sure that they all have write
> access to the same git repository where the views are.
>

Huh?

In short, one creates a function by writing:

CREATE FUNCTION ... $$ function body written as a text literal here $$ ...;

and a view:

CREATE VIEW AS SELECT ... (rest of a select statement here) ...;

The fact that a function is simply a body of text is why it is preserved -
it generally doesn't get validated at the time the CREATE statement is
executed, only when it is run.  CREATE VIEW takes in a fully functioning
select command, parses it, figures out its dependencies, and stores the
components and meta-data.  You get all this extra benefit at the cost of
not retaining the original text.

Admittedly, the system probably should be made to save the text, should
someone wish to write such a patch.  Given the generally better-accepted
version-control and migration-management method of maintaining one's
database structure, the need and desire to add such a capability to the core
server is quite low.

David J.