RE: Need support on tuning at the time of index creation

2020-01-31 Thread Sandip Pradhan
Hi Ron,

Thanks for your time.

We are using the version 9.5.9.14.

Regards,


 
Sandip Pradhan 
Tech Lead
 
BDGS SD IN BSS EOC_ECM 1
Mobile: 9830880856
sandip.prad...@ericsson.com
 
Ericsson
DLF IT Park - II, DLF SEZ Block 1A, Rajarhat
700156, Kolkata, West Bengal
India
ericsson.com

-Original Message-
From: Ron  
Sent: Monday, January 27, 2020 4:57 PM
To: pgsql-general@lists.postgresql.org
Subject: Re: Need support on tuning at the time of index creation

On 1/27/20 5:10 AM, Sandip Pradhan wrote:
> Dear Sir/Madam,
>
> One of our Ericsson products uses PostgreSQL 9 as its backend database. We are
> facing the following performance issue where we need some support from your side.
> We have 10 tables and we are inserting around 150 million to 250
> million records into each of those tables. After that we need to create
> 29 indexes, including the primary key index and other types of indexes.
> Currently this takes almost 3.5 to 4 hrs.
>
> Please let us know how we can tune the performance so that we can
> complete this task in 30-40 mins.
>
> Note: We cannot change the db version.

There are seven versions of "postgresql 9", and multiple patch releases of 
each.  Please specify which version you're running.

--
Angular momentum makes the world go 'round.






Re: Need support on tuning at the time of index creation

2020-01-31 Thread Jayadevan M
On Fri, Jan 31, 2020 at 3:22 PM Sandip Pradhan 
wrote:

> Hi Ron,
>
> Thanks for your time.
>
> We are using the version 9.5.9.14.
>


Maybe you could try tweaking maintenance_work_mem?
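
For illustration, a minimal sketch of that kind of tweak, applied per session
around a single index build (the index, table and the 4GB figure are only
assumptions, not values from this thread):

-- Raise maintenance_work_mem for this session only, so more of the index
-- build's sort can happen in memory; placeholder names and size.
SET maintenance_work_mem = '4GB';
CREATE INDEX my_big_index ON my_big_table USING btree (some_column);
RESET maintenance_work_mem;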

Regards,
Jayadevan


Re: ERROR: too many dynamic shared memory segments

2020-01-31 Thread Nicola Contu
Hi Thomas,
unfortunately I can't find any core dump to help you more.
Thanks for the fix. We are in the process of installing 12.1 in production,
so we can still wait for that release and go live with 12.2.

If I still get this after installing 12.2, I will let you know and will try
to get a core dump file at that point.

Do you still recommend increasing max_conn?


On Thu, 30 Jan 2020 at 23:41, Thomas Munro <
thomas.mu...@gmail.com> wrote:

> On Wed, Jan 29, 2020 at 11:53 PM Thomas Munro 
> wrote:
> > On Wed, Jan 29, 2020 at 10:37 PM Nicola Contu 
> wrote:
> > > This is the error on postgres log of the segmentation fault :
> > >
> > > 2020-01-21 14:20:29 GMT [] [4]: [108-1] db=,user= LOG:  server
> process (PID 2042) was terminated by signal 11: Segmentation fault
> > > 2020-01-21 14:20:29 GMT [] [4]: [109-1] db=,user= DETAIL:  Failed
> process was running: select pid from pg_stat_activity where query ilike
> 'REFRESH MATERIALIZED VIEW CONCURRENTLY matview_vrs_request_stats'
> > > 2020-01-21 14:20:29 GMT [] [4]: [110-1] db=,user= LOG:
> terminating any other active server processes
>
> > That gave me an idea...  I hacked my copy of PostgreSQL to flip a coin
> > to decide whether to pretend there are no slots free (see below), and
> > I managed to make it crash in the regression tests when doing a
> > parallel index build.  It's late here now, but I'll look into that
> > tomorrow.  It's possible that the parallel index code needs to learn
> > to cope with that.
>
> Hi Nicola,
>
> Without more information I can't know if I found the same bug you
> experienced, but I think it's likely.  I have committed a fix for
> that, which will be available in the next release (mid-February).
>
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=74618e77b43cfce670b4725d5b9a300a2afd12d1
>


RE: Need support on tuning at the time of index creation

2020-01-31 Thread Sandip Pradhan
Hi Peter,

Thanks for your time.

Our PostgreSQL version: 9.5.9.14.

We are using the COPY command to insert rows into the tables. While running the
COPY command, all indexes are dropped.

After successfully inserting all the rows, we recreate all of those
indexes.

Example of index creation script:
CREATE INDEX cwt_sr_assocact_entityref
  ON cwt_sr_associationact
  USING btree
  (entityapplicationcontext COLLATE pg_catalog."default", entitytype COLLATE 
pg_catalog."default", entitydn COLLATE pg_catalog."default");

CREATE INDEX ix_sr_assoc_customerrelateddn
  ON cwt_sr_associationact
  USING btree
  (customerrelateddn COLLATE pg_catalog."default");

We run the index creation statements in parallel.

Please find the hardware detail of the server:
OS: Red Hat Enterprise Linux 7.4
RAM: 125 GB
CPU Core: 36

maintenance_work_mem is set to 25 GB.
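
A rough sanity check on those figures (a sketch built only from the numbers
quoted in this thread, not a recommendation): with 29 index builds running in
parallel and 25 GB of maintenance_work_mem each, the builds could in principle
ask for about 29 * 25 GB = 725 GB, far beyond the 125 GB of RAM, so a smaller
per-session setting sized to the number of concurrent builds may be safer, e.g.:

-- Per-session setting for one of the concurrent CREATE INDEX sessions;
-- the 3GB figure is only an assumption (roughly RAM / number of builds).
SET maintenance_work_mem = '3GB';
CREATE INDEX cwt_sr_assocact_entityref
  ON cwt_sr_associationact
  USING btree
  (entityapplicationcontext, entitytype, entitydn);
-- (collation clauses from the original script omitted for brevity)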

Please let me know if you need any further information.

Regards,


 
Sandip Pradhan 
Tech Lead
 
BDGS SD IN BSS EOC_ECM 1
Mobile: 9830880856
sandip.prad...@ericsson.com
 
Ericsson
DLF IT Park - II, DLF SEZ Block 1A, Rajarhat
700156, Kolkata, West Bengal
India
ericsson.com

-Original Message-
From: Peter J. Holzer  
Sent: Tuesday, January 28, 2020 4:16 AM
To: pgsql-general@lists.postgresql.org
Subject: Re: Need support on tuning at the time of index creation

On 2020-01-27 11:10:36 +, Sandip Pradhan wrote:
> One of our Ericsson products uses PostgreSQL 9 as its backend database. We are
> facing the following performance issue where we need some support from your side.
> We have 10 tables and we are inserting around 150 million to 250
> million records into each of those tables. After that we need to create
> 29 indexes, including the primary key index and other types of indexes.
> Currently this takes almost 3.5 to 4 hrs.

It is hard to give advice on what you could do differently if you don't tell us 
what you do.

How are you inserting those rows? 
How are you creating the indexes?

Especially: Are you doing things serially or in parallel?

Also performance depends a lot on hardware, so faster CPUs (or more
cores) and faster disks/ssds might help, too.

hp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | h...@hjp.at        |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"




RE: Need support on tuning at the time of index creation

2020-01-31 Thread Sandip Pradhan
Currently maintenance_work_mem is set to 25 GB.

Regards,


Sandip Pradhan
Tech Lead

BDGS SD IN BSS EOC_ECM 1
Mobile: 9830880856
sandip.prad...@ericsson.com

Ericsson
DLF IT Park - II, DLF SEZ Block 1A, Rajarhat
700156, Kolkata, West Bengal
India
ericsson.com

From: Jayadevan M 
Sent: Friday, January 31, 2020 3:30 PM
To: Sandip Pradhan 
Cc: Ron ; pgsql-general@lists.postgresql.org
Subject: Re: Need support on tuning at the time of index creation



On Fri, Jan 31, 2020 at 3:22 PM Sandip Pradhan
<sandip.prad...@ericsson.com> wrote:
Hi Ron,

Thanks for your time.

We are using the version 9.5.9.14.


Maybe you could try tweaking maintenance_work_mem?

Regards,
Jayadevan



Re: Add column with default value in big table - splitting of updates can help?

2020-01-31 Thread Durumdara
Hello!

a.)
The PG version is mainly 9.6, but some of the servers are 10.x or 11.x.

b.)
We have a semi-automatic tool which collects the new modifications for the
databases and executes them at once, database by database.
So there is one SQL script per database, run under one transaction - all or
nothing. If something fails, we know where to start again by hand. It
is not possible to execute only the first half, because we would not know
which statements had been executed and which had not.

The main problem is that sometimes we have to modify tables which have
too many records in some customer databases.

---

I think the best way to solve this is:

1.)
Future goal: upgrade to PG 11.

2.)
We need to write preprocessor code. If it finds an "alter table" with "add
column" and "default" on a "critical database", the whole operation
halts at that point and warns us to "do it yourself"... :-)

---

After the same problem previously hit the tk table, I tried to write a client
app which updates records one by one, with a commit each time - but it was very slow.

   update tk set field1 = 'MUCH' where id in (
  select id from tk where field1 is NULL limit 1
   )

I think this is because the target field has no index (many
times the fields with default values are not searchable, just row-level
fields), and the client/server round trips are slower than I would like.

Earlier I thought I could speed this up with a stored proc - but as I
read it, stored procs can't use inner transactions - so I would have to make a
client program use begin/commit... (pgAdmin?).
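
As a side note on the missing-index point above, a sketch (using the tk/field1
names from the snippet): a partial index on the not-yet-updated rows lets the
batching subquery find them without repeatedly scanning the whole table.

-- Partial index covering only rows still to be updated; it shrinks as the
-- batched update progresses.  CONCURRENTLY avoids blocking writers.
CREATE INDEX CONCURRENTLY tk_field1_null_idx ON tk (id) WHERE field1 IS NULL;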

Thanks for the prior infos!

Best regards
   dd


hubert depesz lubaczewski  wrote (on Thu, 30 Jan
2020 at 17:20):

> On Thu, Jan 30, 2020 at 04:51:02PM +0100, Durumdara wrote:
> > Is there any advance when I split updates? I'm in one transaction.
>
> In this case - no. The benefit of split updates is when you're not in
> single transaction.
>
> Why would you want to have it all done in single transaction?
>
> Best regards,
>
> depesz
>
>


combination join against multiple tables

2020-01-31 Thread Geoff Winkless
Hi

I have a query involving multiple tables that I would like to return in a
single query. That means returning multiple sets of the data from the first
base table, but that's acceptable for the simplicity in grabbing all the
data in one hit.

An example set:

CREATE TABLE t1 (a int, b int, c int);
CREATE TABLE t2 (a int, b int, c int);
CREATE TABLE base (a int);
INSERT INTO t1 (a, b, c) VALUES (1, 1, 111), (1,2,112), (2,1,121),
(4,1,141);
INSERT INTO t2 (a, b, c) VALUES (1, 1, 211), (2, 2, 222), (5,3,253);
INSERT INTO base(a) SELECT * FROM GENERATE_SERIES(1,10);

Now the problem is that I would like to return all the rows from a, but
with a single row where t2.b and t1.b match.

So the results I would like:

 a  |  c  |  c
+-+-
  1 | 111 | 211
  1 | 112 |
  2 | 121 |
  2 | | 222
  3 | |
  4 | 141 |
  5 | | 253
  6 | |
  7 | |
  8 | |
  9 | |
 10 | |

At the moment I'm doing

SELECT base.a, t1.c, t2.c
FROM base
CROSS JOIN (SELECT b FROM t1 UNION SELECT b FROM t2 UNION SELECT -1) tmpset
LEFT JOIN t1 ON t1.a=base.a AND t1.b=tmpset.b
LEFT JOIN t2 ON t2.a=base.a AND t2.b=tmpset.b
WHERE t1.a IS NOT NULL
OR t2.a IS NOT NULL
OR (tmpset.b=-1
  AND NOT EXISTS (SELECT FROM t1 WHERE t1.a=base.a)
  AND NOT EXISTS (SELECT FROM t2 WHERE t2.a=base.a)
);


but this seems like a really convoluted way to do it.

Is there a join style that will return the set I want without the pain?

I should be clear that the real world data is much more complex than this,
but it describes the basic problem.

Thanks

Geoff


Re: combination join against multiple tables

2020-01-31 Thread David G. Johnston
On Friday, January 31, 2020, Geoff Winkless  wrote:

> Now the problem is that I would like to return all the rows from a, but
> with a single row where t2.b and t1.b match.
>

So, the final number of rows for each “a” is the larger row count of “b”
and “c” having the same “a”.  Furthermore, for the first “n” rows, “b” and
“c” should be paired together by position.  The smaller-count column just
gets nulls for the extra rows.

Probably the easiest way is to combine the matches for “b” and “c” into
arrays, then jointly unnest those arrays in the final result - within the
select list or maybe as part of a lateral join; not sure without
experimentation.

Otherwise you can add “row_number” to “b” and “c” and then left join on (a,
row_number).
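
For illustration against the example tables earlier in the thread, a sketch of
the row_number variant could look like the query below. Note that it pairs rows
strictly by position within each “a”, which is the reading described above and
differs slightly from pairing only where “b” matches:

-- Sketch only: pair t1 and t2 rows by position within each value of a.
WITH x1 AS (
    SELECT a, c, row_number() OVER (PARTITION BY a ORDER BY b) AS rn FROM t1
), x2 AS (
    SELECT a, c, row_number() OVER (PARTITION BY a ORDER BY b) AS rn FROM t2
), paired AS (
    SELECT COALESCE(x1.a, x2.a) AS a, x1.c AS c1, x2.c AS c2
    FROM x1
    FULL OUTER JOIN x2 ON x1.a = x2.a AND x1.rn = x2.rn
)
SELECT base.a, paired.c1, paired.c2
FROM base
LEFT JOIN paired ON paired.a = base.a
ORDER BY base.a;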

David J.


Re: combination join against multiple tables

2020-01-31 Thread Geoff Winkless
On Fri, 31 Jan 2020 at 15:25, David G. Johnston 
wrote:

> On Friday, January 31, 2020, Geoff Winkless  wrote:
>
>> Now the problem is that I would like to return all the rows from a, but
>> with a single row where t2.b and t1.b match.
>>
>
> So, the final number of rows for each “a” is the larger row count of “b”
> and “c” having the same “a”.  Furthermore, for the first “n” rows, “b” and
> “c” should be paired together by position.  The smaller-count column just
> gets nulls for the extra rows.
>
> Probably the easiest way is to combine the matches for “b” and “c” into
> arrays, then jointly unnest those arrays in the final result - within the
> select list or maybe as part of a lateral join; not sure without
> experimentation.
>
> Otherwise you can add “row_number” to “b” and “c” and then left join on
> (a, row_number).
>
>
Thanks for the reply. Using array() hadn't occurred to me, I'll look at
that.

I actually came up with this:

SELECT base.a, t1.c, t2.c
FROM base
LEFT JOIN (t1 FULL OUTER JOIN t2 ON t1.b=t2.b AND t1.a=t2.a)
ON COALESCE(t1.a, base.a)=base.a AND COALESCE(t2.a, base.a)=base.a;

which does solve the described problem; sadly I realise that I'd
oversimplified my question: I haven't fully described the problem because
in reality "t2" is joined to "base" with a different field, and I can't
seem to get the join to do what I want without joining them together like
this.

Geoff


Re: How to restore to empty database

2020-01-31 Thread Adrian Klaver

On 1/30/20 10:21 PM, Andrus wrote:

Hi!

These days 37 GB is relatively small, so you may be getting into the
realm of premature optimization. Do the 24 databases represent an
entire cluster you are trying to transfer?


Basically yes.
The cluster also contains a small test database which does not actually need
to be transferred, but I can delete it manually after the transfer.
Also, postgres, template0 and template1 are not used directly by
applications and probably do not need to be transferred.


They are created with a new cluster anyway. That being said, you may want
to recreate them if you have installed objects in them. For that, see the
--clean option to pg_dumpall. If you do use it, make sure to fully
read the Notes and Examples here:


https://www.postgresql.org/docs/11/app-pg-dumpall.html




If so, have you looked at pg_dumpall?:
https://www.postgresql.org/docs/12/app-pg-dumpall.html
It is a text based backup, but it will include all the databases and 
the globals.


The source cluster is on old Debian 6 Squeeze running Postgres 9.1.
Should I create a pipe using pg_dumpall and restore everything from the old
cluster through the pipe, instead of using pg_dump/pg_restore?


That would depend on how stable and fast the connection is between the 
two machines.


What sort of time frame are you working with?



Andrus.




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Add column with default value in big table - splitting of updates can help?

2020-01-31 Thread Adrian Klaver

On 1/31/20 5:43 AM, Durumdara wrote:

Hello!

a.)
PG version is mainly 9.6, but some of the servers are 10.x or 11.x.

b.)
We have semi-automatic tool which get the a new modifications on 
databases, and execute them at once by database.
So one SQL script by one database, under one transaction - whole or 
nothing. If something failed, we know where to we start again by hand. 
It is impossible to execute only the first half, and we don't know which 
one executed or not


The main problem that sometimes we have to modify some tables which have 
too much records in some customer databases.


---

As I think the best way to solve this:

1.)
Future goal: upgrade to PG 11.

2.)
We need to write a preprocessor code. If it finds "alter table" with 
"add column" and "default", and it is "critical database", the whole 
operation halts on that point, and warn us to "do it yourself"... :-)


---

After the previous same problem on tk table I tried to write a client 
app, which update records by 1 with commit - but it was very slow.


    update tk set field1 = 'MUCH' where id in (
       select id from tk where field1 is NULL limit 1
    )

I think this caused that in the goal field haven't got index (because 
many times the fields with default values are not searchable, just row 
level fields), and the client/server communication is slower than I like.


Formerly I thought I could speeding up this with stored proc - but as I 
read the stored procs can't use inner transactions - so I must make 
client programme to use begin/commit... (PGAdmin?).



Up to version 11 Postgres only had stored functions. With 11+ there are 
stored procedures and inner transactions:


https://www.postgresql.org/docs/11/plpgsql-transactions.html

In stored functions the best you can do is:

https://www.postgresql.org/docs/11/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING
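
For 11+, a batched version of the update from earlier in the thread could look
roughly like the sketch below (the tk/field1 names follow that example; the
procedure name and the 10000-row batch size are arbitrary assumptions):

-- PostgreSQL 11+ only: COMMIT is allowed inside a procedure invoked with CALL.
CREATE PROCEDURE fill_field1_in_batches()
LANGUAGE plpgsql
AS $$
DECLARE
    rows_done bigint;
BEGIN
    LOOP
        UPDATE tk SET field1 = 'MUCH'
        WHERE id IN (SELECT id FROM tk WHERE field1 IS NULL LIMIT 10000);
        GET DIAGNOSTICS rows_done = ROW_COUNT;
        EXIT WHEN rows_done = 0;
        COMMIT;   -- releases locks and makes each batch durable
    END LOOP;
END;
$$;

CALL fill_field1_in_batches();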



Thanks for the prior infos!

Best regards
    dd


hubert depesz lubaczewski <dep...@depesz.com>
wrote (on Thu, 30 Jan 2020 at 17:20):


On Thu, Jan 30, 2020 at 04:51:02PM +0100, Durumdara wrote:
 > Is there any advance when I split updates? I'm in one transaction.

In this case - no. The benefit of split updates is when you're not in
single transaction.

Why would you want to have it all done in single transaction?

Best regards,

depesz




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: combination join against multiple tables

2020-01-31 Thread Steven Lembark
On Fri, 31 Jan 2020 14:01:17 +
Geoff Winkless  wrote:

> Hi
> 
> I have a query involving multiple tables that I would like to return
> in a single query. That means returning multiple sets of the data
> from the first base table, but that's acceptable for the simplicity
> in grabbing all the data in one hit.
> 
> An example set:
> 
> CREATE TABLE t1 (a int, b int, c int);
> CREATE TABLE t2 (a int, b int, c int);
> CREATE TABLE base (a int);
> INSERT INTO t1 (a, b, c) VALUES (1, 1, 111), (1,2,112), (2,1,121),
> (4,1,141);
> INSERT INTO t2 (a, b, c) VALUES (1, 1, 211), (2, 2, 222), (5,3,253);
> INSERT INTO base(a) SELECT * FROM GENERATE_SERIES(1,10);
> 
> Now the problem is that I would like to return all the rows from a,
> but with a single row where t2.b and t1.b match.
> 
> So the results I would like:
> 
>  a  |  c  |  c
> +-+-
>   1 | 111 | 211
>   1 | 112 |
>   2 | 121 |
>   2 | | 222
>   3 | |
>   4 | 141 |
>   5 | | 253
>   6 | |
>   7 | |
>   8 | |
>   9 | |
>  10 | |
> 
> At the moment I'm doing
> 
> SELECT base.a, t1.c, t2.c
> FROM base
> CROSS JOIN (SELECT b FROM t1 UNION SELECT b FROM t2 UNION SELECT -1)
> tmpset LEFT JOIN t1 ON t1.a=base.a AND t1.b=tmpset.b
> LEFT JOIN t2 ON t2.a=base.a AND t2.b=tmpset.b
> WHERE t1.a IS NOT NULL
> OR t2.a IS NOT NULL
> OR (tmpset.b=-1
>   AND NOT EXISTS (SELECT FROM t1 WHERE t1.a=base.a)
>   AND NOT EXISTS (SELECT FROM t2 WHERE t2.a=base.a)
> );
> 
> 
> but this seems like a really convoluted way to do it.
> 
> Is there a join style that will return the set I want without the
> pain?
> 
> I should be clear that the real world data is much more complex than
> this, but it describes the basic problem.
> 
> Thanks
> 
> Geoff

The join on T1 & t2 seems to just be a full outer join of 
t1 & t2 on a & b.

Note that you cannot have two columns in the output with the
same name (e.g., a+c+c, above, is not a valid table). Call them
"c1" & "c2":

A full outer join of t1 & t2 on a & b seems to give you all of
the necessary combinations of c; at which point an
outer join on a associates base values with anything that
matches on a:

select
  distinct
base.a
  , z.c1
  , z.c2
from
  base
  left join
  (
select
  distinct
t1.a
  , t1.c  "c1"
  , t2.c  "c2"
from
  t1
  full outer join
  t2
  on
t1.a= t2.a
and
t1.b= t2.b
  ) z
  on
  base.a  = z.a
;

No idea what the real data looks like, but distinct is likely to be
helpful if the real t's have more than three cols.

-- 
Steven Lembark                                      3646 Flora Place
Workhorse Computing                             St. Louis, MO 63110
lemb...@wrkhors.com                                 +1 888 359 3508




Re: combination join against multiple tables

2020-01-31 Thread Steven Lembark
On Fri, 31 Jan 2020 14:01:17 +
Geoff Winkless  wrote:

>  a  |  c  |  c
> +-+-
>   1 | 111 | 211
>   1 | 112 |
>   2 | 121 |
>   2 | | 222
>   3 | |
>   4 | 141 |
>   5 | | 253
>   6 | |
>   7 | |
>   8 | |
>   9 | |
>  10 | |

The c's look just like a full outer join of t1 & t2 on a & b.

Giving them saner names to avoid duplicate output col's, let's
call them "c1" & "c2".

At that point a left outer join on a gives you all of the base.a
values with any t{1,2} rows that have a matching a:

No idea what your data really looks like, but if t1 or t2 has more
than three col's, distinct can save some annoying cross-products:

select
  distinct
base.a
  , z.c1
  , z.c2
from
  base
  left join
  (
select
t1.a
  , t1.c  "c1"
  , t2.c  "c2"
from
  t1
  full outer join
  t2
  on
t1.a= t2.a
and
t1.b= t2.b
  ) z
  on
  base.a  = z.a
;

-- 
Steven Lembark                                      3646 Flora Place
Workhorse Computing                             St. Louis, MO 63110
lemb...@wrkhors.com                                 +1 888 359 3508




performance of loading CSV data with COPY is 50 times faster than Perl::DBI

2020-01-31 Thread Matthias Apitz


Hello,

For ages we have transferred data between different DBS (Informix, Sybase,
Oracle, and now PostgreSQL) with our own tool, based on
Perl::DBI, which produces a CSV-like export in a common format, i.e. an
export from Oracle can be loaded into Sybase and vice versa. Export and
import are done row by row, for some tables millions of rows.

We produced a special version of the tool to export the rows into a
format which PostgreSQL's COPY command understands, and found
that importing the same data into PostgreSQL with COPY is 50 times
faster than with Perl::DBI: 2.5 minutes vs. 140 minutes for around 6
million rows into an empty table without indexes.

How can COPY do this so fast?

matthias

-- 
Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub




Re: performance of loading CSV data with COPY is 50 times faster than Perl::DBI

2020-01-31 Thread Adrian Klaver

On 1/31/20 10:24 AM, Matthias Apitz wrote:


Hello,

Since ages, we transfer data between different DBS (Informix, Sybase,
Oracle, and now PostgreSQL) with our own written tool, based on
Perl::DBI which produces a CSV like export in a common way, i.e. an
export of Oracle can be loaded into Sybase and vice versa. Export and
Import is done row by row, for some tables millions of rows.

We produced a special version of the tool to export the rows into a
format which understands the PostgreSQL's COPY command and got to know
that the import into PostgreSQL of the same data with COPY is 50 times
faster than with Perl::DBI, 2.5 minutes ./. 140 minutes for around 6
million rows into an empty table without indexes.

How can COPY do this so fast?


Well for one thing COPY does everything in a single transaction, which 
is both good and bad. The good is that it is fast, the bad is that a 
single error will rollback the entire operation.


COPY also uses its own method for transferring data. For all the
details see:


https://www.postgresql.org/docs/12/protocol-flow.html#PROTOCOL-COPY
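
As a rough illustration of what the COPY path looks like at the SQL level
(table, columns and file path are placeholders, not the poster's actual schema):

-- Server-side bulk load: the whole file streams in one command, one transaction.
COPY import_rows (id, name, amount)
FROM '/srv/export/import_rows.csv'
WITH (FORMAT csv, DELIMITER ',', NULL '');

-- A client-side load (psql's \copy, or a driver's COPY API) streams the same
-- data over the wire using COPY ... FROM STDIN.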



matthias




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: performance of loading CSV data with COPY is 50 times faster than Perl::DBI

2020-01-31 Thread Pavel Stehule
On Fri, 31 Jan 2020 at 19:25, Matthias Apitz  wrote:

>
> Hello,
>
> Since ages, we transfer data between different DBS (Informix, Sybase,
> Oracle, and now PostgreSQL) with our own written tool, based on
> Perl::DBI which produces a CSV like export in a common way, i.e. an
> export of Oracle can be loaded into Sybase and vice versa. Export and
> Import is done row by row, for some tables millions of rows.
>
> We produced a special version of the tool to export the rows into a
> format which understands the PostgreSQL's COPY command and got to know
> that the import into PostgreSQL of the same data with COPY is 50 times
> faster than with Perl::DBI, 2.5 minutes ./. 140 minutes for around 6
> million rows into an empty table without indexes.
>
> How can COPY do this so fast?
>

There are probably several reasons:

1. the DBI implementation is probably not too efficient (for Postgres), maybe
because COPY is not fault tolerant
2. Postgres has no implicit plan cache, so every INSERT is planned again and
again (see the sketch after this list)
3. COPY bypasses the planner and executor and has a very efficient network
protocol
4. with COPY you can be sure autocommit is effectively disabled (the whole
load runs in one transaction).
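
To illustrate point 2, a sketch with made-up table and column names: a prepared
statement at least avoids re-planning each row-by-row INSERT, although it still
will not come close to COPY:

-- Hypothetical table; only the repeated planning is avoided here.
PREPARE ins (int, text) AS
    INSERT INTO import_rows (id, name) VALUES ($1, $2);
EXECUTE ins (1, 'first row');
EXECUTE ins (2, 'second row');
DEALLOCATE ins;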

Regards

Pavel



>
> matthias
>
> --
> Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/
> +49-176-38902045
> Public GnuPG key: http://www.unixarea.de/key.pub
>
>
>


Re: performance of loading CSV data with COPY is 50 times faster than Perl::DBI

2020-01-31 Thread Brian Dunavant
You can use COPY over DBI.

https://metacpan.org/pod/DBD::Pg#COPY-support

On Fri, Jan 31, 2020 at 2:03 PM Pavel Stehule 
wrote:

>
>
> On Fri, 31 Jan 2020 at 19:25, Matthias Apitz 
> wrote:
>
>>
>> Hello,
>>
>> Since ages, we transfer data between different DBS (Informix, Sybase,
>> Oracle, and now PostgreSQL) with our own written tool, based on
>> Perl::DBI which produces a CSV like export in a common way, i.e. an
>> export of Oracle can be loaded into Sybase and vice versa. Export and
>> Import is done row by row, for some tables millions of rows.
>>
>> We produced a special version of the tool to export the rows into a
>> format which understands the PostgreSQL's COPY command and got to know
>> that the import into PostgreSQL of the same data with COPY is 50 times
>> faster than with Perl::DBI, 2.5 minutes ./. 140 minutes for around 6
>> million rows into an empty table without indexes.
>>
>> How can COPY do this so fast?
>>
>
> Probably there are more reasons
>
> 1. probably DBI implementation is not too effective (for Postgres), maybe
> because COPY is not fault tolerant
> 2. postgres has not implicit plan cache, so every INSERT planned again and
> again
> 3. COPY bypass planner and executor and it has very effective network
> communication
> 4. with COPY you have a sure so autocommit is disabled.
>
> Regards
>
> Pavel
>
>
>
>>
>> matthias
>>
>> --
>> Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/
>> +49-176-38902045
>> Public GnuPG key: http://www.unixarea.de/key.pub
>>
>>
>>


PostgreSQL Installer issue

2020-01-31 Thread George Weaver

Good afternoon,

I am updating a PostgreSQL 10.3 installation to PostgreSQL 10.11 on
Windows 10 using the EDB installer.


On the Select Components dialog, all of the components are selected but
not enabled: no component can be de-selected.


George




Re: PostgreSQL Installer issue

2020-01-31 Thread Adrian Klaver

On 1/31/20 12:27 PM, George Weaver wrote:

Good afternoon,

I am updating a PostgreSQL 10.3 installation to PostgreSQL 10.11 on
Windows 10 using the EDB installer.


On the Select Components dialog, all of the components are selected but
not enabled: no component can be de-selected.


My guess is that it's because this is just a minor version upgrade and all it
is doing is upgrading the binaries.




George





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: ERROR: too many dynamic shared memory segments

2020-01-31 Thread Thomas Munro
On Thu, Jan 30, 2020 at 12:26 AM Thomas Munro  wrote:
> On Wed, Jan 29, 2020 at 11:24 PM Julian Backes  wrote:
> > we only had the "too many dynamic shared memory segments"
> > error but no segmentation faults. The error started occurring after 
> > upgrading from postgres 10 to postgres 12 (server has 24 cores / 48 
> > threads, i.e. many parallel workers). The error itself was not that much of 
> > a problem but /dev/shm started filling up with orphaned files which 
> > probably (?) had not been cleaned up by postgres after the parallel workers 
> > died. In consequence, after some time, /dev/shm was full and everything 
> > crashed.
>
> Oh, thanks for the report.  I think I see what was happening there, and
> it's a third independent problem.  The code in dsm_create() does
> DSM_OP_DESTROY (ie cleans up) in the DSM_CREATE_NULL_IF_MAXSEGMENTS
> case, but in the case where you see "ERROR: too many dynamic shared
> memory segments" it completely fails to clean up after itself.  I can
> reproduce that here.  That's a terrible bug, and has been sitting in
> the tree for 5 years.

I committed a fix for that.  It'll be in the new releases that are due out
in a couple of weeks.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=93745f1e019543fe7b742d0c5e971aad8d08fd56

> > Unfortunately, the only "solution" we found so far was to increase max 
> > connections from 100 to 1000. After that (about 2 months ago I think), the 
> > error had gone.
>
> I'll take that as a vote for increasing the number of slots.

I committed something to do this for 13 (due out end of year), but I'm
not game to back-patch it to the release branches.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=d061ea21fc1cc1c657bb5c742f5c4a1564e82ee2




Re: ERROR: too many dynamic shared memory segments

2020-01-31 Thread Thomas Munro
On Fri, Jan 31, 2020 at 11:05 PM Nicola Contu  wrote:
> Do you still recommend increasing max_conn?

Yes, as a workaround of last resort.  The best thing would be to
figure out why you are hitting the segment limit, and see if there is
something we could tune to fix that. If you EXPLAIN your queries, do
you see plans that have a lot of "Gather" nodes in them, perhaps
involving many partitions?  Or are you running a lot of parallel
queries at the same time?  Or are you running queries that do very,
very large parallel hash joins?  Or something else?
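
For example (a sketch; the query and table name are placeholders), something
along these lines shows whether a plan uses Gather nodes and which settings
bound the number of parallel workers, and therefore DSM segments, a query can
request:

-- Look for "Gather" / "Gather Merge" nodes and the "Workers Planned" counts.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM some_large_table;

-- Settings that limit per-query parallelism.
SHOW max_parallel_workers_per_gather;
SHOW max_parallel_workers;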