Re: Two Postgres master processes are showing - one is on and off

2019-08-21 Thread Laurenz Albe
chiru r wrote:
> I have observed one of our PostgreSQL DB instances showing two postgres 
> processes on the Linux server, as highlighted. The second postgres process is on and 
> off.
> We did not find any references in logs.
> 
> Please provide your advice and help on this issue.
> 
> DB version :  PostgreSQL  9.5.10
> Server version   :  RHEL 7.6
>  
> [postgres@PGSERVER data]$ ps -ef|grep postgres
> postgres  33438      1  0 12:41 ?        00:00:03 
> /u01/postgres/9.5/bin/postgres -D /u02/pgdata01/9.5/data
> [...]
> postgres 110181  33438  0 15:30 ?        00:00:00 
> /u01/postgres/9.5/bin/postgres -D /u02/pgdata01/9.5/data

Perhaps it is a new client connection, and the process title just hasn't been 
changed (yet).

What is your setting for "update_process_title"?
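
A quick way to check, as a minimal sketch in psql:

  SHOW update_process_title;
  -- or, equivalently, via the pg_settings view:
  SELECT name, setting FROM pg_settings WHERE name = 'update_process_title';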

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Databases and servers

2019-08-21 Thread Karl Martin Skoldebrand


On Tue, Aug 20, 2019 at 6:33 AM Karl Martin Skoldebrand 
<ks0c77...@techmahindra.com> wrote:

Hi,

I just discovered that a client has done this:

They have two web applications A1 and A2. They have separate hostnames/URLs. 
Both have a production and a test database: A1p and A1t / A2p and A2t.

What they've done is have both A1p and A2p on the same actual database server 
and A1t and A2t on the same server.

>Are these two PostgreSQL instances running on the same hardware, or two 
>databases within a single PostgreSQL instance?

They are two databases in a single PostgreSQL instance, accessed by different 
accounts with different permissions.


So, I'm thinking - if a bug in application A1 crashes the application and 
database badly it will risk bringing down both services A1 and A2.

>Is this a common occurrence?  Of all the occurrences of downtime in recent 
>memory (or better yet, from incidence documentation), what were the causes of 
>them?  Is this near the top of the list?

No it is not common.

>Also, are the two apps completely independent, or are they used together such 
>that one being down makes the other one not very useful?

Yes they are independent on the database level. They share the same Windows 
application server.

The same risk would be evident on a successful security breach.

>On the other hand, more servers means more moving parts, means more 
>opportunities for mistakes in configuration or maintenance that let breaches 
>happen.

That is true.

I would prefer to have A1p and A2p on separate servers, maybe keeping A1t and A2t on 
the same. (This is what seems to be happening when the database servers are 
being repladed).

>I don't know what that last part means.
repladed == replaced


What is the general thought on the current setup?

>In my experience, people acting on mere conjectures about what might cause 
>downtime in the future and how to prevent it have caused more downtime than 
>they have prevented.
>
>Cheers,
>
>Jeff

/M




Re: Databases and servers

2019-08-21 Thread Karl Martin Skoldebrand

From: Peter J. Holzer 
Sent: 20 August 2019 22:58
To: pgsql-general@lists.postgresql.org
Subject: Re: Databases and servers

On 2019-08-20 10:33:17 +, Karl Martin Skoldebrand wrote:
> I just discovered that a client has done this:
>
> They have two web applications A1 and A2. They have separate hostnames/URLs.
> Both have a production and a test database: A1p and A1t / A2p and A2t.
>
> What they've done is have both A1p and A2p on the same actual database server
> and A1t and A2t on the same server.
>
> So, I'm thinking - if a bug in application A1 crashes the application and
> database badly it will risk bringing down both services A1 and A2. The same
> risk would be evident on a successful security breach.
>
> I would prefer to have A1p and A2p on separate servers, maybe keeping A1t and A2t 
> on
> the same. (This is what seems to be happening when the database servers are
> being repladed).

On rereading this I notice that I'm not sure what that means. If you
propose replacing the two servers with three (two production, one test)
or even four (two production and two test), I agree.

===
I was proposing replacing the two servers with three (or four). Running 
production on two separate servers and possibly the tests on one (or possibly 
two servers).

/M.






Re: SELECT all the rows where id is children of other node.

2019-08-21 Thread Francisco Olarte
Pablo:

On Tue, Aug 20, 2019 at 6:49 PM pabloa98  wrote:
> Thank you for your responses Rob. Appreciated. The problem with recursive 
> queries is that they are executed several times and it has an impact on 
> performance.
> I need a subset of those rows and I want them in one pass.
> I discovered that ltree extension could be useful. I will play with it today. 
> I am sure there's a way to find all the nodes in O(n) time with n = size of 
> the result set ...

Unless you have some extra conditions in a table ( like
"autoincremented immutable primary key and parents are always created
before children" ) I think your problem of "get all the descendant ( i
do not like to call them children ) nodes of a given id" cannot be
solved in one pass.

I mean, if you are getting descendants of the node N1, you need to
read the last node, NL, of the table to know if it is a child of N1.
But then you have to read the table again to find children of NL.

Of course, if you have something like "hierarchical ids" you can
traverse ordering by it and know NL MUST be childless, and build the
tree rooted on node N1 as you go, but without some of these conditions
I do not think it can be done in an "ideal" db ( which lets you scan
in any order you can define from just a row without cost ) in one scan
( storing and pruning the whole tree as you go is cheating ).

Also, if your node ids come from a serial and are immutable, or you
take a little care when mutating them, you can do it traversing by id,
but you need a full scan; a recursive query with several index scans
may easily be faster in wide trees.
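
For comparison, a minimal sketch of the recursive approach, assuming a
hypothetical table nodes(id, parent_id) and a root id of 1:

  WITH RECURSIVE descendants AS (
      SELECT id FROM nodes WHERE id = 1          -- the root, N1
      UNION ALL
      SELECT n.id
      FROM nodes n
      JOIN descendants d ON n.parent_id = d.id   -- walk down one level per iteration
  )
  SELECT id FROM descendants WHERE id <> 1;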


Francisco Olarte.




Re: Retroactively adding send and recv functions to a type?

2019-08-21 Thread Johann 'Myrkraverk' Oskarsson
On Tue, Aug 20, 2019 at 2:46 AM Tom Lane  wrote:
>
> "Johann 'Myrkraverk' Oskarsson"  writes:
> > On Tue, Aug 20, 2019 at 1:32 AM Tom Lane  wrote:
> >> You could manually update the pg_type row, and then if you were
> >> being fussy, add pg_depend entries showing the type depends on
> >> the functions.
>
> > Can I do this in a future proof way?  That is, is there a way to make
> > that into an upgrade script, or will I make the extension
> > un-upgradable doing that?
>
> [ shrug... ]  Depends what you consider "future proof".  I should think
> that if pg_type.typsend goes away or changes meaning, for example,
> that would be reflective of changes large enough to break an extension
> dabbling in binary I/O in other ways anyway.
>
> Inserting new rows into pg_depend manually is a bit riskier, but I
> don't think that catalog has changed since its inception, so it's
> not all that risky.

I have updated the catalog, and the binary send and recv functions work.

The steps I took are

  create function sha1_send( sha1 ) returns bytea immutable
  language c strict as 'hashtypes', 'sha_send1';

  update pg_type set typsend = 'sha1_send'::regproc
  where typname = 'sha1';

  create function sha1_recv( internal ) returns sha1 immutable
  language c strict as 'hashtypes', 'sha_recv1';

  update pg_type set typreceive = 'sha1_recv'::regproc
  where typname = 'sha1';

Then, for completeness' sake, I added two rows into pg_depend with

  insert into pg_depend ( classid, objid, objsubid, refclassid,
refobjid, refobjsubid, deptype )
  values ( 'pg_type'::regclass::oid, 'sha1'::regtype::oid, 0,
'pg_proc'::regclass::oid, 'sha1_recv'::regproc::oid, 0, 'n' );

  insert into pg_depend ( classid, objid, objsubid, refclassid,
refobjid, refobjsubid, deptype )
  values ( 'pg_type'::regclass::oid, 'sha1'::regtype::oid, 0,
'pg_proc'::regclass::oid, 'sha1_send'::regproc::oid, 0, 'n' );

Before I roll all of that into an upgrade script for the other sha
types, is there something else I should be doing?

I did not dare to try before adding to pg_depend, but here's what
happens when I try to drop function sha1_recv;

  ERROR:  cannot drop function sha1_recv(internal) because other
objects depend on it
  DETAIL:  extension hashtypes depends on function sha1_recv(internal)
  column passwd of table pwned depends on type sha1
  function sha1_send(sha1) depends on type sha1

Does this look correct?

> In any case, you could limit the lifespan of the upgrade script,
> if you roll it up into a new base install script ASAP.

I am not the maintainer of the extension, and I'll see what I can do.

-- 
Johann

 I'm not from the internet, I just work there.




Re: Retroactively adding send and recv functions to a type?

2019-08-21 Thread Tom Lane
"Johann 'Myrkraverk' Oskarsson"  writes:
> The steps I took are

>   create function sha1_send( sha1 ) returns bytea immutable
>   language c strict as 'hashtypes', 'sha_send1';

>   update pg_type set typsend = 'sha1_send'::regproc
>   where typname = 'sha1';

>   create function sha1_recv( internal ) returns sha1 immutable
>   language c strict as 'hashtypes', 'sha_recv1';

>   update pg_type set typreceive = 'sha1_recv'::regproc
>   where typname = 'sha1';

Those updates don't look very safe: for instance, what if there's
another type named sha1 in some other schema?  I'd do it like

-- create the functions
update pg_type set
  typsend = 'sha1_send(sha1)'::regprocedure,
  typreceive = 'sha1_recv(internal)'::regprocedure
where oid = 'sha1'::regtype;

This formulation only relies on your schema being frontmost in
the search path, which it should be during CREATE/ALTER EXTENSION.

> Then, for completeness' sake, I added two rows into pg_depend with

>   insert into pg_depend ( classid, objid, objsubid, refclassid,
> refobjid, refobjsubid, deptype )
>   values ( 'pg_type'::regclass::oid, 'sha1'::regtype::oid, 0,
> 'pg_proc'::regclass::oid, 'sha1_recv'::regproc::oid, 0, 'n' );

>   insert into pg_depend ( classid, objid, objsubid, refclassid,
> refobjid, refobjsubid, deptype )
>   values ( 'pg_type'::regclass::oid, 'sha1'::regtype::oid, 0,
> 'pg_proc'::regclass::oid, 'sha1_send'::regproc::oid, 0, 'n' );

You could skip the explicit casts to oid, and again I think use of
regprocedure would be safer than regproc.  Seems fine otherwise.
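
For instance, the first of those inserts might look like this (a sketch of
the suggested change only — same rows, using regprocedure and letting the
reg* types coerce to oid implicitly):

  insert into pg_depend ( classid, objid, objsubid, refclassid,
                          refobjid, refobjsubid, deptype )
  values ( 'pg_type'::regclass, 'sha1'::regtype, 0,
           'pg_proc'::regclass, 'sha1_recv(internal)'::regprocedure, 0, 'n' );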

> I did not dare to try before adding to pg_depend, but here's what
> happens when I try to drop function sha1_recv;

>   ERROR:  cannot drop function sha1_recv(internal) because other
> objects depend on it
>   DETAIL:  extension hashtypes depends on function sha1_recv(internal)
>   column passwd of table pwned depends on type sha1
>   function sha1_send(sha1) depends on type sha1

> Does this look correct?

It looks a bit odd, but I think that just indicates that you created the
two functions manually rather than inside an extension update script,
so they're not known to be part of the extension.  You could experiment
with ALTER EXTENSION ADD to see if this output changes when they are
part of the extension.  (But you don't need ALTER EXTENSION ADD when
you create them in an update script.)

regards, tom lane




RE: Rename a column if not already renamed.?

2019-08-21 Thread Day, David
I agree the function could be improved to deal with both old and new name 
existing simultaneously.
That is almost certainly the root  cause, and one that I would confirm if the 
tester and site were currently available to me.  

Our workflow for this scenario is something like:

1.  9.6 pg_dump takes a snapshot of our 9.6 database.
2.  Postgres is upgraded/freshly installed to 11.3.
3.  The 9.6 database is restored using the version 11 pg_restore tool.

4. Once our application process starts up, it sees there is a patch available 
in its old branch that is one greater than its restored 9.6 content.
That happens to be a merge patch which resets the expectations.
It attempts to apply all patches in the new branch since the point of 
divergence and runs into my current issue.
 
It occurs to me I could simply put an exception handler in the rename column 
function and I would likely proceed merrily along.
But curiosity is killing me and the cat: what is causing the old name to 
persist in the pg_attribute table after the rename?

Would a stale function referencing the old column name be a contributor?
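
For what it's worth, a minimal sketch of a guarded rename that tolerates both
names existing (table and column names here are hypothetical, not our real
sys.rename_column signature):

  DO $$
  BEGIN
      -- only rename if the old column is still there and the new one is not
      IF EXISTS (SELECT 1 FROM information_schema.columns
                 WHERE table_name = 'my_table' AND column_name = 'old_name')
         AND NOT EXISTS (SELECT 1 FROM information_schema.columns
                         WHERE table_name = 'my_table' AND column_name = 'new_name')
      THEN
          ALTER TABLE my_table RENAME COLUMN old_name TO new_name;
      END IF;
  END
  $$;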


Regards


Dave Day




-Original Message-
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: Tuesday, August 20, 2019 4:57 PM
To: Day, David 
Cc: Luca Ferrari ; pgsql-gene...@postgresql.org
Subject: Re: Rename a column if not already renamed.?

"Day, David"  writes:
> The error is something like column already exists and

Not sure about the workflow this function is used within, but maybe you need to 
consider what to do when both the old and new column names exist.
Because that sure sounds like what is happening.

regards, tom lane






Re: Rename a column if not already renamed.?

2019-08-21 Thread Adrian Klaver

On 8/21/19 7:52 AM, Day, David wrote:

I agree the function could be improved to deal with both old and new name 
existing simultaneously.
That is almost certainly the root  cause, and one that I would confirm if the 
tester and site were currently available to me.

Our workflow for this scenario is something like:

1.  9.6 pg_dump takes a snapshot of our 9.6 database.
2.  Postgres is upgraded/freshly installed to 11.3.
3.  The 9.6 database is restored using the version 11 pg_restore tool.


In 3) you are restoring to the new 11.3 instance, correct?



4. Once our application process starts up, it sees there is a patch available 
in its old branch that is one greater than its restored 9.6 content.
That happens to be a merge patch which resets the expectations.
It attempts to apply all patches in the new branch since the point of 
divergence and runs into my current issue.
  
It occurs to me I could simply put an exception handler in the rename column 
function and I would likely proceed merrily along.

But curiosity is killing me and the cat: what is causing the old name to 
persist in the pg_attribute table after the rename?


If you are indeed working on the new instance, pg_attribute would have no 
'memory' of the dropped column.  It would seem to me to come down to 
what is passed into sys.rename_column() as old_name_, new_name.




Would a stale function referencing the old column name be a contributor?


Regards


Dave Day




-Original Message-
From: Tom Lane [mailto:t...@sss.pgh.pa.us]
Sent: Tuesday, August 20, 2019 4:57 PM
To: Day, David 
Cc: Luca Ferrari ; pgsql-gene...@postgresql.org
Subject: Re: Rename a column if not already renamed.?

"Day, David"  writes:

The error is something like column already exists and


Not sure about the workflow this function is used within, but maybe you need to 
consider what to do when both the old and new column names exist.
Because that sure sounds like what is happening.

regards, tom lane








--
Adrian Klaver
adrian.kla...@aklaver.com




Complex filters -> Bad row estimates -> bad query plan

2019-08-21 Thread Mathieu Fenniak
Hi all,

I have a database query where I have a number of "simple" where clauses, a
number of "complex" subquery based where clauses, and one NOT EXISTS where
clause; it looks something like this:

SELECT ...some fields... FROM Table1
WHERE
   Field1 IN (1, 2, 3, 4, 21, 24) AND -- simple filter
   Field2  <> 1 AND -- simple filter
   Field3 >= '2019-07-08' AND -- simple filter
   Field3 <= '2019-08-18' AND -- simple filter
   NOT EXISTS (SELECT 1 FROM Table2 WHERE Table2.Status = 2 AND
Table2.SomeId = Table1.Id) AND -- anti-join
   COALESCE((SELECT Status FROM Table3 WHERE Table3.SomeId = Table1.Id),
(SELECT Status FROM Table4 WHERE Table4.SomeId = Table1.Id)) = 2 --
"complex" condition

The problem I'm encountering is that I've observed degraded performance in
some cases where the Anti Join merge for the NOT EXISTS clause is planned
based upon poor row estimates for Table1.  All of the other filters, and
the SubPlan filter(s) for the complex clauses, result in the query planner
estimating that only 1 row from Table1 will result, so a Nested Loop
Anti Join is used and the RHS of that nested loop is a seqscan on Table2.
The reality is that many thousands of records match all the conditions; a
Merge Anti Join or Hash Anti Join would be a better query plan.

I've tested the query planner with just the simpler conditions, and it
makes pretty reasonable estimates about the row count (+/- 10%).  Adding
the NOT EXISTS results in a Merge Anti Join, and performance is great.

Adding the more "complex" conditions (there are potentially multiple of
these subquery plan searches) results in the estimated row count dropping
to 1, and, performance dives.

I know there are no "query hints" in PostgreSQL... any thoughts on
alternative approaches here?  The only option I've used in the past for
this is creating specialized indexes, which can provide more targeted
statistics; but it's not applicable here since the "complex" conditions use
data from another table in a subquery.

Appreciate any thoughts, theories, or directions. :-)  Thanks,

Mathieu


Re: SELECT all the rows where id is children of other node.

2019-08-21 Thread Rob Sargent


> On Aug 21, 2019, at 3:35 AM, Francisco Olarte  wrote:
> 
> Pablo:
> 
> On Tue, Aug 20, 2019 at 6:49 PM pabloa98  wrote:
>> Thank you for your responses Rob. Appreciated. The problem with recursive 
>> queries is that they are executed several times and it has an impact on 
>> performance.
>> I need a subset of those rows and I want them in one pass.
>> I discovered that ltree extension could be useful. I will play with it 
>> today. I am sure there's a way to find all the nodes in O(n) time with n = 
>> size of the result set ...
> 
> Unless you have some extra conditions in a table ( like
> "autoincremented immutable primary key and parents are always created
> before children" ) I think your problem of "get all the descendant ( i
> do not like to call them children ) nodes of a given id" cannot be
> solved in one pass.
> 
> I mean, if you are getting descendants of the node N1, you need to
> read the last node, NL, of the table to know if it is a child of N1.
> But then you have to read the table again to find children of NL.
> 
> Of course, if you have something like "hierarchical ids" you can
> traverse ordering by it and know NL MUST be childless, and build the
> tree rooted on node N1 as you go, but without some of these conditions
> I do not think it can be done in an "ideal" db ( which lets you scan
> in any order you can define from just a row without cost ) in one scan
> ( storing and pruning the whole tree as you go is cheating ).
> 
> Also, if your node ids come from a serial and are immutable, or you
> take a little care when mutating them, you can do it traversing by id,
> but you need a full scan; a recursive query with several index scans
> may easily be faster in wide trees.
> 
> 
> Francisco Olarte.
If you accept Francisco’s thesis then you may be interested in this:

  with recursive descendants (last, children) as
  (select c.c, array[null::int] from kids c
   where not exists (select 1 from kids p where c.c = p.p)
   union all
   select k.p, array[k.c] || l.children
   from kids k, descendants l
   where k.c = l.last)
  select a.last, array_agg(distinct(a.kids)) as clan
  from (select last, unnest(array_remove(children, null)) as kids
        from descendants
        where children[1] is not null) as a
  group by last
  order by last;

   last |           clan
  ------+--------------------------
      1 | {2,3,4,21,22,23,221,222}
      2 | {21,22,23,221,222}
     22 | {221,222}
  (3 rows)
No comment on performance other than to say that if you are interested in the 
result for a given seed parent then performance would likely correlate with the 
average depth of your lineages.

I believe the ascending order of the members of each clan is completely 
fortuitous unless it’s a consequence of distinct?





Re: Complex filters -> Bad row estimates -> bad query plan

2019-08-21 Thread Michael Lewis
If those conditions that are throwing off the stats are expected to be
minimally impactful/filtering few rows, then you can use the one
tried-and-true optimizer hint (aside from materialized CTEs, stylized
indexes, etc) --- OFFSET 0 at the end of a sub-query.

SELECT * FROM ( [your existing query without the sub-selects that are
complicated and produce bad estimates] OFFSET 0 ) WHERE [your other
conditions that don't produce good estimates]

If there is correlation between field1 and field2, you might also look at
CREATE STATISTICS assuming you are on PG 10 or 11.

Before I do any of that, I would try LEFT JOIN for Table3 and Table4 then
use the where condition "AND 2 = COALESCE(Table3.Status, Table4.Status)"
and see if the optimizer likes that option better.
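
A minimal sketch of that rewrite, assuming (as the original scalar subqueries
imply) that Table3 and Table4 have at most one row per SomeId:

  SELECT t1.*
  FROM Table1 t1
  LEFT JOIN Table3 ON Table3.SomeId = t1.Id
  LEFT JOIN Table4 ON Table4.SomeId = t1.Id
  WHERE t1.Field1 IN (1, 2, 3, 4, 21, 24)
    AND t1.Field2 <> 1
    AND t1.Field3 BETWEEN '2019-07-08' AND '2019-08-18'
    AND NOT EXISTS (SELECT 1 FROM Table2
                    WHERE Table2.Status = 2 AND Table2.SomeId = t1.Id)
    AND 2 = COALESCE(Table3.Status, Table4.Status);  -- now a plain join qual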


RE: Rename a column if not already renamed.?

2019-08-21 Thread Day, David
Restoring the 9.6 dump into the 11.3 instance? -> Yes.

For the upgrade scenario, I confirmed that both the old column name and the new 
name are in the pg_attribute table at the time the patch attempts to rename it.
Why both is the big question.
However,
It is easy enough to re-write the column rename function to deal with the 
simultaneous possibility.  I will include the redefined function in the merge 
patch and see how it goes.

I'll update the thread after some further exploration.

Thanks all for your assistance.


-Original Message-
From: Adrian Klaver [mailto:adrian.kla...@aklaver.com] 
Sent: Wednesday, August 21, 2019 11:47 AM
To: Day, David ; Tom Lane 
Cc: Luca Ferrari ; pgsql-gene...@postgresql.org
Subject: Re: Rename a column if not already renamed.?

On 8/21/19 7:52 AM, Day, David wrote:
> I agree the function could be improved to deal with both old and new name 
> existing simultaneously.
> That is almost certainly the root  cause, and one that I would confirm if the 
> tester and site were currently available to me.
> 
> Our workflow for this scenario is something like:
> 
> 1.  9.6 pg_dump takes a snapshot of our 9.6 database.
> 2.  Postgres is upgraded/freshly installed to 11.3.
> 3.  The 9.6 database is restored using the version 11 pg_restore tool.

In 3) you are restoring to the new 11.3 instance, correct?

> 
> 4. Once our application process starts up, it sees there is a patch available 
> in its old branch that is one greater than its restored 9.6 content.
> That happens to be a merge patch which resets the expectations.
> It attempts to apply all patches in the new branch since the point of 
> divergence and runs into my current issue.
>   
> It occurs to me I could simply put an exception handler in the rename column 
> function and I would likely proceed merrily along.
> But curiosity is killing me and the cat: what is causing the old name to 
> persist in the pg_attribute table after the rename?

If you are indeed working on the new instance, pg_attribute would have no 
'memory' of the dropped column.  It would seem to me to come down to what is 
passed into sys.rename_column() as old_name_, new_name.

> 
> Would a stale function referencing the old column name be a contributor?
> 
> 
> Regards
> 
> 
> Dave Day
> 
> 
> 
> 
> -Original Message-
> From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> Sent: Tuesday, August 20, 2019 4:57 PM
> To: Day, David 
> Cc: Luca Ferrari ; pgsql-gene...@postgresql.org
> Subject: Re: Rename a column if not already renamed.?
> 
> "Day, David"  writes:
>> The error is something like column already exists and
> 
> Not sure about the workflow this function is used within, but maybe you need 
> to consider what to do when both the old and new column names exist.
> Because that sure sounds like what is happening.
> 
>   regards, tom lane
> 
> 
> 
> 
> 


--
Adrian Klaver
adrian.kla...@aklaver.com


Importing from CSV, auto creating table?

2019-08-21 Thread stan
I have a situation where we need to import data, as an interim measure,
from spreadsheets. 

I have read up on \copy and COPY, but I do not see that either of these can
use the header from a CSV file to define a new table. Am I missing
something?

Also downloaded something called pgfutter, which I thought would do this,
but have not had any success with it. After I thought I had figured out
the arguments, it just seems to hang forever.


-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin




Re: Importing from CSV, auto creating table?

2019-08-21 Thread David G. Johnston
On Wed, Aug 21, 2019 at 2:15 PM stan  wrote:

> I have a situation where we need to import data, as an interim measure,
> from spreadsheets.
>
> I have read up on \copy and COPY, but I do not see that either of these can
> use the header from a CSV file to define a new table. Am I missing
> something?
>
>
Nope.  You need to create the table separately - or find a tool that will
do that creation for you.
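
For example, a minimal sketch (hypothetical column names taken from your
header; everything text, to be cast later):

  CREATE TABLE import_raw (col_a text, col_b text, col_c text);
  \copy import_raw FROM 'data.csv' WITH (FORMAT csv, HEADER true)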

David J.


Re: Complex filters -> Bad row estimates -> bad query plan

2019-08-21 Thread Mathieu Fenniak
Thanks Michael.

I'll give some join alternatives a shot first... but, that's cool.

What about OFFSET 0 makes this approach work?  I'm thinking the OFFSET 0
creates an optimization barrier that prevents the planner from collapsing
that sub-query into the top query, and enforces ordering in the query?

I appreciate your thoughts, thank-you very much for the feedback.

Mathieu


On Wed, Aug 21, 2019 at 12:08 PM Michael Lewis  wrote:

> If those conditions that are throwing off the stats are expected to be
> minimally impactful/filtering few rows, then you can use the one
> tried-and-true optimizer hint (aside from materialized CTEs, stylized
> indexes, etc) --- OFFSET 0 at the end of a sub-query.
>
> SELECT * FROM ( [your existing query without the sub-selects that are
> complicated and produce bad estimates] OFFSET 0 ) WHERE [your other
> conditions that don't produce good estimates]
>
> If there is correlation between field1 and field2, you might also look at
> CREATE STATISTICS assuming you are on PG 10 or 11.
>
> Before I do any of that, I would try LEFT JOIN for Table3 and Table4 then
> use the where condition "AND 2 = COALESCE(Table3.Status, Table4.Status)"
> and see if the optimizer likes that option better.
>


Re: Importing from CSV, auto creating table?

2019-08-21 Thread DiasCosta

Hi Stan,

I uploaded to the database (PostgreSQL 9.6), monthly and for several 
years, over 50 000 000 CSV records
using a version of pgfutter that I compiled (with some changes, if I 
remember correctly), and the tables
were created during the loading process from the column titles.

Dias Costa




On 21-08-2019 22:15, stan wrote:

I have a situation where we need to import data, as an interim measure,
from spreadsheets.

I have read up on \copy and COPY, but I do not see that either of these can
use the header from a CSV file to define a new table. Am I missing
something?

Also downloaded something called pgfutter, which I thought would do this,
but have not had any success with it. After I thought I had figured out
the arguments, it just seems to hang forever.





--
J. M. Dias Costa
Telef. 214026948




Re: Importing from CSV, auto creating table?

2019-08-21 Thread Ron

On 8/21/19 4:15 PM, stan wrote:

I have a situation where we need to import data, as an interim measure,
from spreadsheets.

I have read up on \copy and COPY, but I do not see that either of these can
use the header from a CSV file to define a new table. Am I missing
something?


Data types.  Sure, you've got column names from the header line, but what 
types do you assign them?




Also downloaded something called pgfutter, which I thought would do this,
but have not had any success with it. After I thought I had figured out
the arguments, it just seems to hang forever.




--
Angular momentum makes the world go 'round.




Re: Importing from CSV, auto creating table?

2019-08-21 Thread David G. Johnston
On Wednesday, August 21, 2019, Ron  wrote:

> On 8/21/19 4:15 PM, stan wrote:
>
>> I have a situation where we need to import data, as an interim measure,
>> from spreadsheets.
>>
>> I have read up on \copy and COPY, but I do not see that either of these
>> can
>> use the header from a CSV file to define a new table. Am I missing
>> something?
>>
>
> Data types.  Sure, you've got column names from the header line, but what
> types do you assign them?


Text.  I’m gonna post-process anyway, casting to better types isn’t a
problem.
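
For instance, a sketch of that post-processing (table, column names, and
target types are all hypothetical):

  ALTER TABLE import_raw
      ALTER COLUMN col_b TYPE integer USING col_b::integer,
      ALTER COLUMN col_c TYPE date    USING col_c::date;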

David J.


Re: Rename a column if not already renamed.?

2019-08-21 Thread Adrian Klaver

On 8/21/19 11:58 AM, Day, David wrote:

Restoring into 11.3 instance the 9.6 dump  ? ->  yes.

For the upgrade scenario, I  confirmed that both old column name and new name 
are in the pg_attribute table at the time the patch attempts to rename it.
Why both  is a big question.


I've been playing around with that and the only way I can see it happening 
is that both columns actually exist in the table.


This leads to:

1) Examine the output of the pg_dump file to see if the table comes over 
with both columns.


2) Examine the table in the 11.3 instance to see if it has both columns (a 
quick check is sketched below). An indication to me that the patch process is 
adding back the old column.


3) If neither 1) nor 2), examine the patch process to see if it has both 
columns in the table at some point in time and then removes the old column.
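
A minimal sketch of that check (table name is hypothetical):

  SELECT attname
  FROM pg_attribute
  WHERE attrelid = 'my_table'::regclass
    AND attnum > 0          -- skip system columns
    AND NOT attisdropped;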



However,
It is easy enough to re-write the column rename function to deal with the 
simultaneous possibility.  I will include the redefined function in the merge 
patch and see how it goes.

I'll update the thread after some further exploration.

Thanks all for your assistance.






--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Complex filters -> Bad row estimates -> bad query plan

2019-08-21 Thread Michael Lewis
-- I'm thinking the OFFSET 0 creates an optimization barrier that prevents
the planner from collapsing that sub-query into the top query, and enforces
ordering in the query?

That's my understanding. I think it is an optimizer hint by another name. I
used to put things in a CTE (which is always materialized until v12, which
will change it to inlined unless keyword MATERIALIZED is included) or I
would create a temp table if the dataset is expected to contain many rows
such that I can do ANALYZE pg_temp.table_table; so the optimizer has stats
to make good decisions.
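
A sketch of the materialized-CTE form (the explicit keyword is v12+ syntax;
table and filters below are placeholders from the earlier example):

  WITH t1_filtered AS MATERIALIZED (
      SELECT * FROM Table1
      WHERE Field1 IN (1, 2, 3, 4, 21, 24) AND Field2 <> 1
  )
  SELECT * FROM t1_filtered
  WHERE NOT EXISTS (SELECT 1 FROM Table2
                    WHERE Table2.Status = 2 AND Table2.SomeId = t1_filtered.Id);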

Note- Replying to messages with a full quote of the conversation below your
comment (aka top-posting) is discouraged on these mailing lists. Please
quote the portion you are responding to and that's it.