Any way to speed up INSERT INTO

2022-03-04 Thread aditya desai
Hi,
One of our service-layer applications is inserting millions of records into a
table, but one row at a time. Although COPY is the fastest way to import a
file into a table, the application has a requirement to process each row and
insert it into the table. Is there any way this INSERT can be tuned by
adjusting configuration parameters? It is taking almost 10 hours for just 2.2
million rows. The table does not have any indexes or triggers.

Regards,
Aditya.


Re: Any way to speed up INSERT INTO

2022-03-04 Thread Bruce Momjian
On Sat, Mar  5, 2022 at 12:01:52AM +0530, aditya desai wrote:
> Hi,
> One of our service-layer applications is inserting millions of records into
> a table, but one row at a time. Although COPY is the fastest way to import a
> file into a table, the application has a requirement to process each row and
> insert it into the table. Is there any way this INSERT can be tuned by
> adjusting configuration parameters? It is taking almost 10 hours for just
> 2.2 million rows. The table does not have any indexes or triggers.

Well, sections 14.4 and 14.5 might help:

https://www.postgresql.org/docs/14/performance-tips.html

Your time seems very slow --- are the rows very wide?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Any way to speed up INSERT INTO

2022-03-04 Thread Tom Lane
aditya desai  writes:
> One of our service-layer applications is inserting millions of records into
> a table, but one row at a time. Although COPY is the fastest way to import a
> file into a table, the application has a requirement to process each row and
> insert it into the table. Is there any way this INSERT can be tuned by
> adjusting configuration parameters? It is taking almost 10 hours for just
> 2.2 million rows. The table does not have any indexes or triggers.

Using a prepared statement for the INSERT would help a little bit.
What would help more, if you don't expect any insertion failures,
is to group multiple inserts per transaction (ie put BEGIN ... COMMIT
around each batch of 100 or 1000 or so insertions).  There's not
going to be any magic bullet that lets you get away without changing
the app, though.
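
As a rough sketch of combining the two suggestions (the table, columns, and
values here are hypothetical, not from the original post):

    -- One batch: a single transaction around ~1000 single-row inserts,
    -- reusing a server-side prepared statement for each row.
    PREPARE ins (int, text) AS
        INSERT INTO mytable (id, payload) VALUES ($1, $2);

    BEGIN;
    EXECUTE ins(1, 'first row');
    EXECUTE ins(2, 'second row');
    -- ... remaining rows of the batch ...
    COMMIT;  -- one WAL flush amortized over the whole batch

Most client drivers expose the same idea through their prepared-statement
and transaction APIs, so this need not be done in raw SQL.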

It's quite possible that network round trip costs are a big chunk of your
problem, in which case physically grouping multiple rows into each INSERT
command (... or COPY ...) is the only way to fix it.  But I'd start with
trying to reduce the transaction commit overhead.
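
Both groupings might look like this, again against a hypothetical table
(the CSV form of COPY ... FROM STDIN is convenient from psql):

    INSERT INTO mytable (id, payload) VALUES
        (1, 'first row'),
        (2, 'second row'),
        (3, 'third row');

    COPY mytable (id, payload) FROM STDIN WITH (FORMAT csv);
    1,first row
    2,second row
    \.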

regards, tom lane




Re: Any way to speed up INSERT INTO

2022-03-04 Thread aditya desai
Hi Bruce,
Correct, the rows are wider. One of the columns is text and one is bytea.

Regards,
Aditya.

On Sat, Mar 5, 2022 at 12:08 AM Bruce Momjian  wrote:

> On Sat, Mar  5, 2022 at 12:01:52AM +0530, aditya desai wrote:
> > Hi,
> > One of our service-layer applications is inserting millions of records
> > into a table, but one row at a time. Although COPY is the fastest way to
> > import a file into a table, the application has a requirement to process
> > each row and insert it into the table. Is there any way this INSERT can be
> > tuned by adjusting configuration parameters? It is taking almost 10 hours
> > for just 2.2 million rows. The table does not have any indexes or
> > triggers.
>
> Well, sections 14.4 and 14.5 might help:
>
> https://www.postgresql.org/docs/14/performance-tips.html
>
> Your time seems very slow --- are the rows very wide?
>
> --
>   Bruce Momjian  https://momjian.us
>   EDB  https://enterprisedb.com
>
>   If only the physical world exists, free will is an illusion.
>
>


Re: Any way to speed up INSERT INTO

2022-03-04 Thread Bruce Momjian
On Fri, Mar  4, 2022 at 01:42:39PM -0500, Tom Lane wrote:
> aditya desai  writes:
> > One of our service-layer applications is inserting millions of records
> > into a table, but one row at a time. Although COPY is the fastest way to
> > import a file into a table, the application has a requirement to process
> > each row and insert it into the table. Is there any way this INSERT can be
> > tuned by adjusting configuration parameters? It is taking almost 10 hours
> > for just 2.2 million rows. The table does not have any indexes or
> > triggers.
> 
> Using a prepared statement for the INSERT would help a little bit.

Yeah, I thought about that but it seems it would only minimally help.

> What would help more, if you don't expect any insertion failures,
> is to group multiple inserts per transaction (ie put BEGIN ... COMMIT
> around each batch of 100 or 1000 or so insertions).  There's not
> going to be any magic bullet that lets you get away without changing
> the app, though.

Yeah, he/she could insert multiple rows per statement too:

CREATE TABLE test (x int);
INSERT INTO test VALUES (1), (2), (3);

> It's quite possible that network round trip costs are a big chunk of your
> problem, in which case physically grouping multiple rows into each INSERT
> command (... or COPY ...) is the only way to fix it.  But I'd start with
> trying to reduce the transaction commit overhead.

Agreed, turning off synchronous_commit for those queries would be my
first approach.
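
For illustration, that is a one-line, session-level change:

    -- Commits return before the WAL is flushed to disk.  A crash can
    -- lose the most recently reported commits, but cannot corrupt data.
    SET synchronous_commit = off;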

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Any way to speed up INSERT INTO

2022-03-04 Thread Andres Freund
Hi, 

On March 4, 2022 10:42:39 AM PST, Tom Lane  wrote:
>aditya desai  writes:
>> One of our service-layer applications is inserting millions of records
>> into a table, but one row at a time. Although COPY is the fastest way to
>> import a file into a table, the application has a requirement to process
>> each row and insert it into the table. Is there any way this INSERT can be
>> tuned by adjusting configuration parameters? It is taking almost 10 hours
>> for just 2.2 million rows. The table does not have any indexes or triggers.
>
>Using a prepared statement for the INSERT would help a little bit.
>What would help more, if you don't expect any insertion failures,
>is to group multiple inserts per transaction (ie put BEGIN ... COMMIT
>around each batch of 100 or 1000 or so insertions).  There's not
>going to be any magic bullet that lets you get away without changing
>the app, though.
>
>It's quite possible that network round trip costs are a big chunk of your
>problem, in which case physically grouping multiple rows into each INSERT
>command (... or COPY ...) is the only way to fix it.  But I'd start with
>trying to reduce the transaction commit overhead.

Pipelining could also help.
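
Pipeline mode is new in the version 14 libpq (PQenterPipelineMode and
friends). As a rough sketch, pgbench in version 14 can drive it from a
script file (the table name here is hypothetical):

    -- pipeline.sql; run with: pgbench -n -f pipeline.sql
    \startpipeline
    INSERT INTO mytable VALUES (1);
    INSERT INTO mytable VALUES (2);
    INSERT INTO mytable VALUES (3);
    \endpipeline
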
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




RES: Any way to speed up INSERT INTO

2022-03-04 Thread Edson Richter

From: Andres Freund
Sent: Friday, March 4, 2022 15:52
To: [email protected]; Tom Lane; aditya desai
Cc: Pgsql Performance
Subject: Re: Any way to speed up INSERT INTO

Hi,

On March 4, 2022 10:42:39 AM PST, Tom Lane  wrote:
>aditya desai  writes:
>> One of our service-layer applications is inserting millions of records
>> into a table, but one row at a time. Although COPY is the fastest way to
>> import a file into a table, the application has a requirement to process
>> each row and insert it into the table. Is there any way this INSERT can be
>> tuned by adjusting configuration parameters? It is taking almost 10 hours
>> for just 2.2 million rows. The table does not have any indexes or triggers.
>
>Using a prepared statement for the INSERT would help a little bit.
>What would help more, if you don't expect any insertion failures,
>is to group multiple inserts per transaction (ie put BEGIN ... COMMIT
>around each batch of 100 or 1000 or so insertions).  There's not
>going to be any magic bullet that lets you get away without changing
>the app, though.
>
>It's quite possible that network round trip costs are a big chunk of your
>problem, in which case physically grouping multiple rows into each INSERT
>command (... or COPY ...) is the only way to fix it.  But I'd start with
>trying to reduce the transaction commit overhead.

Pipelining could also help.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Sorry for disturbing – I had a similar problem storing logs for an e-commerce
service mesh producing millions of records per day. To avoid losing anything,
I record every log entry in Apache ActiveMQ Artemis, and another microservice
then collects the data from the MQ and stores it in PostgreSQL. Since our
logs arrive in waves, ActiveMQ Artemis reduces the “impedance” between the
two systems.
Just my 2c.

Regards,

ER.


Re: Any way to speed up INSERT INTO

2022-03-04 Thread Imre Samu
> Correct, the rows are wider. One of the columns is text and one is bytea.

With PG14, the LZ4 TOAST compression is worth checking.

via
https://www.postgresql.fastware.com/blog/what-is-the-new-lz4-toast-compression-in-postgresql-14

*"""INSERT statements with 16 clientsAnother common scenario that I tested
was accessing the database from multiple clients - 16 in this case.What I
found out, as can be seen below, is that compression performance of single
large files (HTML, English text, source code, executable binary, pictures)
using LZ4 was 60% to 70% faster compared to PGLZ, and that there was also a
small improvement while inserting multiple small files (PostgreSQL
document).*
*"""*

kind regards,
  Imre

aditya desai  wrote (on Fri, Mar 4, 2022 at 19:42):

> Hi Bruce,
> Correct, the rows are wider. One of the columns is text and one is bytea.
>
> Regards,
> Aditya.
>
> On Sat, Mar 5, 2022 at 12:08 AM Bruce Momjian  wrote:
>
>> On Sat, Mar  5, 2022 at 12:01:52AM +0530, aditya desai wrote:
>> > Hi,
>> > One of our service-layer applications is inserting millions of records
>> > into a table, but one row at a time. Although COPY is the fastest way to
>> > import a file into a table, the application has a requirement to process
>> > each row and insert it into the table. Is there any way this INSERT can
>> > be tuned by adjusting configuration parameters? It is taking almost 10
>> > hours for just 2.2 million rows. The table does not have any indexes or
>> > triggers.
>>
>> Well, sections 14.4 and 14.5 might help:
>>
>> https://www.postgresql.org/docs/14/performance-tips.html
>>
>> Your time seems very slow --- are the rows very wide?
>>
>> --
>>   Bruce Momjian  https://momjian.us
>>   EDB  https://enterprisedb.com
>>
>>   If only the physical world exists, free will is an illusion.
>>
>>


XA transactions much slower on 14.2 than on 13.5

2022-03-04 Thread Mladen Gogala
My boss asked me to upgrade one of the development databases from 13.5
--> 14.2. One thing that we've noticed right away is that XA
transactions (2-phase commit) are much slower on 14.2 than on 13.5. Were
there any significant changes to the XA protocol in version 14? Did
anybody else encounter this problem?


When I say "XA transactions are much slower", I mean that commit and/or
rollback take much longer. The SQL execution takes the same time and the
plans are identical to the 13.5 version. The application code is the
same, using IBM WebSphere 9.0.4.


Regards

--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com


Re: XA transactions much slower on 14.2 than on 13.5

2022-03-04 Thread Tom Lane
Mladen Gogala  writes:
> My boss asked me to upgrade one of the development databases from 13.5
> --> 14.2. One thing that we've noticed right away is that XA
> transactions (2-phase commit) are much slower on 14.2 than on 13.5. Were
> there any significant changes to the XA protocol in version 14? Did
> anybody else encounter this problem?

There were a bunch of changes around the 2PC code to support logical
replication of 2PC transactions, but I don't think they should have
made for any particular performance difference in non-replicated
servers.  Can you put together a self-contained test case that
demonstrates what you're seeing?
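
For anyone putting such a test together, the bare two-phase-commit sequence
in SQL is just this (table name hypothetical; the server needs
max_prepared_transactions > 0):

    BEGIN;
    INSERT INTO t2pc VALUES (1);
    PREPARE TRANSACTION 'gid1';  -- phase one: persist the transaction state
    COMMIT PREPARED 'gid1';      -- phase two: commit it

Timing a loop of these on 13.5 versus 14.2 would isolate the commit path
from WebSphere and the JDBC driver.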

regards, tom lane




Re: Any way to speed up INSERT INTO

2022-03-04 Thread aditya desai
Thanks all for your inputs. We will try to implement the inserts in a single
transaction. I feel that is the best approach.

Thanks,
AD.

On Saturday, March 5, 2022, Bruce Momjian  wrote:

> On Fri, Mar  4, 2022 at 01:42:39PM -0500, Tom Lane wrote:
> > aditya desai  writes:
> > > One of our service-layer applications is inserting millions of records
> > > into a table, but one row at a time. Although COPY is the fastest way
> > > to import a file into a table, the application has a requirement to
> > > process each row and insert it into the table. Is there any way this
> > > INSERT can be tuned by adjusting configuration parameters? It is taking
> > > almost 10 hours for just 2.2 million rows. The table does not have any
> > > indexes or triggers.
> >
> > Using a prepared statement for the INSERT would help a little bit.
>
> Yeah, I thought about that but it seems it would only minimally help.
>
> > What would help more, if you don't expect any insertion failures,
> > is to group multiple inserts per transaction (ie put BEGIN ... COMMIT
> > around each batch of 100 or 1000 or so insertions).  There's not
> > going to be any magic bullet that lets you get away without changing
> > the app, though.
>
> Yeah, he/she could insert multiple rows per statement too:
>
> CREATE TABLE test (x int);
> INSERT INTO test VALUES (1), (2), (3);
>
> > It's quite possible that network round trip costs are a big chunk of your
> > problem, in which case physically grouping multiple rows into each INSERT
> > command (... or COPY ...) is the only way to fix it.  But I'd start with
> > trying to reduce the transaction commit overhead.
>
> Agreed, turning off synchronous_commit for those queries would be my
> first approach.
>
> --
>   Bruce Momjian  https://momjian.us
>   EDB  https://enterprisedb.com
>
>   If only the physical world exists, free will is an illusion.
>
>