Re: UPDATE on 20 Million Records Transaction or not?

2020-06-23 Thread luis . roberto



Thanks Adrian,
> You could break it down into multiple transactions if there is a way to specify 
> ranges of records.
Say I couldn't break it up, would it be faster in or out of the transaction?


It depends on whether you have concurrent transactions using the table. If you do, 
I think it would be better to split the update across smaller transactions.
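
For illustration, a sketch of what range-batched updates could look like (the
table, column, and range bounds here are hypothetical):

    -- big_table with an indexed id column; each batch commits separately,
    -- keeping locks held and WAL produced per transaction small.
    BEGIN;
    UPDATE big_table SET processed = true WHERE id >= 1 AND id < 1000000;
    COMMIT;

    BEGIN;
    UPDATE big_table SET processed = true WHERE id >= 1000000 AND id < 2000000;
    COMMIT;
    -- ...and so on, one range at a time.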




Join optimization

2020-07-11 Thread luis . roberto
Hi! 

Recently, in a blog post [1], I saw that Oracle was able to "optimize" a join 
strategy by completely eliminating access to a table. 

Here's the execution on Oracle 18c [2] and PostgreSQL 13 (beta) [3]. 

Is there a fundamental reason why PG can't do the same? 

Thanks! 
[1] https://blog.dbi-services.com/the-myth-of-nosql-vs-rdbms-joins-dont-scale/
[2] https://dbfiddle.uk/?rdbms=oracle_18&fiddle=a5afeba2fdb27dec7533545ab0a6eb0e
[3] https://dbfiddle.uk/?rdbms=postgres_13&fiddle=87c78b4e97fdbd87255efc2fc909ee62


Re: Join optimization

2020-07-11 Thread luis . roberto
I'm sorry for the bad example. 

Here is another, with some data, on PG:
https://dbfiddle.uk/?rdbms=postgres_13&fiddle=ccfd1c4fa291e74a6db9db1772e2b5ac
and Oracle:
https://dbfiddle.uk/?rdbms=oracle_18&fiddle=21a98f499065ad4e2c35ff4bd1487e14

I don't understand Oracle's execution plan very well, but it doesn't seem to be 
hitting the users table... 


De: "Fabrízio de Royes Mello"  
Para: "luis.roberto"  
Cc: "pgsql-general"  
Enviadas: Sábado, 11 de julho de 2020 15:24:04 
Assunto: Re: Join optimization 


On Sat, Jul 11, 2020 at 14:20, luis.robe...@siscobra.com.br wrote: 



Hi! 

Recently, in a blog post [1], I saw that Oracle was able to "optimize" a join 
strategy by completely eliminating access to a table. 

Here's the execution on Oracle 18c [2] and PostgreSQL 13 (beta) [3]. 

Is there a fundamental reason why PG can't do the same? 




It does... did you see the “never executed” notice on the Postgres explain 
output? 

Regards, 




-- 
Fabrízio de Royes Mello 
Timbira - http://www.timbira.com.br/ 
PostgreSQL: Consulting, Development, 24x7 Support, and Training 



Re: Join optimization

2020-07-12 Thread luis . roberto


- Original message -
From: "David Rowley" 
To: "luis.roberto" 
Cc: "Fabrízio de Royes Mello", "pgsql-general" 
Sent: Sunday, July 12, 2020 5:29:08
Subject: Re: Join optimization

On Sun, 12 Jul 2020 at 06:59,  wrote:
>
> I'm sorry for the bad example.
>
> Here is another, with some data on PG:  
> https://dbfiddle.uk/?rdbms=postgres_13&fiddle=ccfd1c4fa291e74a6db9db1772e2b5ac
>   and Oracle:  
> https://dbfiddle.uk/?rdbms=oracle_18&fiddle=21a98f499065ad4e2c35ff4bd1487e14.

I believe what you're talking about is join removals.  It appears as
though Oracle is able to remove the inner join to the users table as
the join only serves to check the user record exists. No columns are
being selected.  The record must exist due to the foreign key
referencing users.

PostgreSQL currently can only remove left joins. Likely what you could
do here is just change your inner join into a left join. If you're
happy enough that the user record will always exist then that should
allow it to work.

The reason PostgreSQL does not currently support inner join removal is
that, by default, foreign key constraints are only triggered at the end of
the query (or, if deferred, at the end of the transaction). With
PostgreSQL, it's possible for a SELECT query to see a violated foreign
key constraint.  This can happen if your SELECT query calls a function
which updates a referenced record.  The cascade of the foreign key
won't occur until the end of the statement, so the select may stumble
upon a violated foreign key.

Here's a quick example of this case:

drop table if exists t1, t2;
create table t1 (a int primary key);
create table t2 (a int references t1 on update cascade);

insert into t1 values (1), (2);
insert into t2 values (2), (2);

create or replace function update_t1 (p_a int) returns int as $$
begin
  update t1 set a = a + 1 where a = p_a;
  return p_a;
end;
$$ language plpgsql volatile;

-- In theory, this should never return any rows, as we're asking for all
-- rows that don't exist in the referenced table. However, we do get a
-- result, since the function call updates t1, setting the row with a=2 to
-- a=3. The cascade to t2 does not occur until the end of the statement.
select update_t1(a+1), t1.a
from t1
where not exists (select 1 from t2 where t1.a = t2.a);

 update_t1 | a
-----------+---
         2 | 1
(1 row)

If you're happy that you don't have any functions like that which
could momentarily cause the foreign key to appear violated, then there
shouldn't be any harm in changing the INNER JOIN on users to a LEFT
JOIN.  PostgreSQL will be able to remove the join in that case.

David

-

Thanks for the reply David! I understand it better now.
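
For illustration, a minimal sketch of the LEFT JOIN removal described above
(hypothetical tables; the unique key on users.id plus the fact that no users
column is referenced is what permits the removal):

    create table users  (id int primary key, name text);
    create table orders (id int primary key,
                         user_id int not null references users(id));

    -- users.id is unique and no users column is selected, so the planner
    -- can drop the join entirely: EXPLAIN shows only a scan on orders.
    explain select o.id
    from orders o
    left join users u on u.id = o.user_id;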




Re: Performance of "distinct with limit"

2020-08-28 Thread luis . roberto
Hi, 

If "n" is indexed, it should run quickly. Can you share the execution plan for 
your query? 
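
If the plan shows a full scan even with the index, one common workaround is to
emulate a loose index scan with a recursive CTE (a sketch, assuming an index
on bigtable(n)):

    with recursive distinct_n as (
        (select n from bigtable order by n limit 1)
      union all
        select (select n from bigtable where n > d.n order by n limit 1)
        from distinct_n d
        where d.n is not null
    )
    select n from distinct_n where n is not null limit 2;

Each step fetches only the next distinct value through the index, so the outer
LIMIT stops the work early instead of scanning the whole table.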



De: "Klaudie Willis"  
Para: "pgsql-general"  
Enviadas: Sexta-feira, 28 de agosto de 2020 8:29:58 
Assunto: Performance of "distinct with limit" 

Hi, 

Ran into this under-optimized query execution. 

select distinct n from bigtable; -- Let's say this takes 2 minutes 
select distinct n from bigtable limit 2; -- This takes approximately the same time 

However, the latter should have the potential to be much quicker. I checked 
the same query on MSSQL (with 'top 2'), and it seems to do exactly the 
optimization I would expect. 

Is there any way to achieve a similar speedup in Postgresql? 

Klaudie 




Effective IO Concurrency

2020-09-14 Thread luis . roberto
Hi! 

In PostgreSQL 13, the way of using effective_io_concurrency has changed. Until 
v12, I used 200 for this value (using SSD drives). Using the new formula 
described in https://www.postgresql.org/docs/13/release-13.html gives me 1176. 
However, the documentation 
(https://www.postgresql.org/docs/13/runtime-config-resource.html#GUC-EFFECTIVE-IO-CONCURRENCY) 
says that the maximum value allowed is 1000. 




Re: Effective IO Concurrency

2020-09-14 Thread luis . roberto
De: "Ron"  
Para: "pgsql-general"  
Enviadas: Segunda-feira, 14 de setembro de 2020 13:10:23 
Assunto: Re: Effective IO Concurrency 

On 9/14/20 11:03 AM, Laurenz Albe wrote: 
> On Mon, 2020-09-14 at 10:39 -0300, luis.robe...@siscobra.com.br wrote: 
>> In PostgreSQL 13, the way of using effective_io_concurrency has changed. 
>> Until v12, 
>> I used 200 for this value (using SSD drives). Using the new formula 
>> described in 
>> https://www.postgresql.org/docs/13/release-13.html gives me 1176. 
>> However, in the documentation 
>> https://www.postgresql.org/docs/13/runtime-config-resource.html#GUC-EFFECTIVE-IO-CONCURRENCY
>>  
>> it says that the maximum value allowed is 1000. 
> Then use the value 1000... 

I think he means that the formula should take that into account. 

-- 
Angular momentum makes the world go 'round. 

Exactly. 

For future reference, and maybe an updated documentation note: 

SELECT least(round(sum(OLD / n::float)),1000) FROM generate_series(1, OLD) s(n) 
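
For example, plugging in the old value of 200 from this thread: the raw sum
comes to 1176 (the number quoted above), which least() then caps at the
documented maximum:

    SELECT least(round(sum(200 / n::float)), 1000)
    FROM generate_series(1, 200) s(n);

     least
    -------
      1000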

I don't know how to write patches, so maybe someone can do that. 

Thanks. 


Table sizes

2020-09-30 Thread luis . roberto
Hi! 

I'm trying to use this query to get table sizes; however, I'm getting a strange 
error: 

select tablename, pg_relation_size(tablename::text) 
from pg_tables; 

In PG 13: 

SQL Error [42P01]: ERROR: relation "sql_implementation_info" does not exist 

In PG 12: 

SQL Error [42P01]: ERROR: relation "sql_parts" does not exist 



Re: Table sizes

2020-09-30 Thread luis . roberto



De: "Charles Clavadetscher"  
Para: "luis.roberto"  
Cc: "pgsql-general"  
Enviadas: Quarta-feira, 30 de setembro de 2020 10:46:39 
Assunto: Re: Table sizes 

Hello 

On 2020-09-30 14:11, luis.robe...@siscobra.com.br wrote: 
> Hi! 
> 
> I'm trying to use this query to get table sizes; however, I'm getting a 
> strange error: 
> 
> select tablename,pg_relation_size(tablename::text) 
> from pg_tables; 
> 
> In PG 13: 
> 
> SQL Error [42P01]: ERROR: relation "sql_implementation_info" does not 
> exist 
> 
> In PG 12: 
> 
> SQL Error [42P01]: ERROR: relation "sql_parts" does not exist 

Try like this: 

select schemaname,
       tablename,
       pg_relation_size((schemaname || '.' || '"' || tablename || '"')::regclass)
from pg_tables;

You need to schema-qualify the tables. Additionally, if you happen to have 
table names with a mix of upper- and lowercase letters, or names that contain 
other characters that might be problematic, you need to enclose the table 
name in double quotes. 
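
A variant sketch of the same query, letting format('%I.%I') take care of all
the identifier quoting (schema name included):

    select schemaname,
           tablename,
           pg_relation_size(format('%I.%I', schemaname, tablename)::regclass)
    from pg_tables;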

Regards 
Charles 

-- 
Charles Clavadetscher 
Swiss PostgreSQL Users Group 
Treasurer 
Spitzackerstrasse 9 
CH - 8057 Zürich 

http://www.swisspug.org 






Thanks, this worked. 

I wonder, though, why calling pg_relation_size('users') works (I don't need to 
specify the schema). 


Analyze and Statistics

2020-11-05 Thread luis . roberto
Hi! 

Analyzing a table which has an extended statistics object raises the message: 

statistics object "public.new_statistic" could not be computed for relation 
"public.client" 


Re: How does Postgres decide if to use additional workers?

2021-02-09 Thread luis . roberto
De: "Thorsten Schöning"  
Para: "pgsql-general"  
Enviadas: Terça-feira, 9 de fevereiro de 2021 12:52:02 
Assunto: How does Postgres decide if to use additional workers? 
So, based on which facts does Postgres decide whether to use additional 
workers or not? Can I see those decisions explained somewhere? I don't 
see anything in the query plan. Thanks! 

Kind regards, 

Thorsten Schöning 



Hi! 

I think you should look into the min_parallel_table_scan_size and 
min_parallel_index_scan_size GUCs. 
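
For example (a sketch; some_table is hypothetical):

    show min_parallel_table_scan_size;   -- default 8MB
    show min_parallel_index_scan_size;   -- default 512kB

    -- Lowering the threshold for the session can coax a parallel plan out
    -- of a smaller table; EXPLAIN then shows "Workers Planned: N" on the
    -- Gather node.
    set min_parallel_table_scan_size = '1MB';
    explain select count(*) from some_table;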




Re: pg_stat_progress_vacuum empty when running vacuum full

2021-02-12 Thread luis . roberto
> De: "Luca Ferrari" 
> Para: "pgsql-general" 
> Enviadas: Sexta-feira, 12 de fevereiro de 2021 8:00:46
> Assunto: pg_stat_progress_vacuum empty when running vacuum full

> Hi all, 
> I'm running 11.5 and I'm monitoring pg_stat_progress_vacuum every 2 
> seconds, while doing per-table VACUUMs from another connection. Every 
> vacuum lasts 30 or more seconds, so I was expecting to see a record 
> within pg_stat_progress_vacuum, but nothing appears if the vacuum is 
> full. I suspect this is due to vacuum full performing a side-by-side 
> table rewrite, rather than in-place actions against the original 
> table, but I'm not sure if this is the real reason or if I'm missing 
> something. 
> 
> Thanks, 
> Luca 


Hi!

I believe VACUUM FULL progress can be monitored via the 
pg_stat_progress_cluster view: 
https://www.postgresql.org/docs/current/progress-reporting.html#VACUUM-PROGRESS-REPORTING
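
For example, while the VACUUM FULL runs in another session, a query like this
(a sketch using a few of the view's columns) shows its progress:

    select pid, command, phase,
           heap_tuples_scanned, heap_tuples_written
    from pg_stat_progress_cluster;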




Re: Slow while inserting and retrieval (compared to SQL Server)

2021-02-17 Thread luis . roberto
- Original message -
> From: sivapostg...@yahoo.com
> To: "pgsql-general"
> Sent: Wednesday, February 17, 2021 9:01:15
> Subject: Re: Slow while inserting and retrieval (compared to SQL Server)
> 
> Hello, 
> 
> Using Postgres 11 on Windows Server 2012 & PowerBuilder. 
> Working from the same machine where Postgres 11 is installed, so no chance 
> of any network issue, I feel. 
> No setup/config change done. Just working with all the default settings, 
> with no data in any of the 179 tables in that database. 
> 
> To populate some basic data we try to insert a few records (max 4 records) in 
> a few tables (around 6 tables) from one window. We feel that the insert time 
> taken is longer than the time taken while using SQL Server. We tested an 
> almost similar window that updated the similar table(s) in SQL Server, which 
> was faster. With the Postgres database, we need to wait for a couple of 
> seconds before the insert/update is over, which we didn't feel in SQL Server. 
> 
> I feel that some changes in settings might improve this speed, but with 
> little knowledge of Postgres I struggle to find those setup values. 
> 
> Any help in improving the speed is really appreciated. 
> 
> Happiness Always 
> BKR Sivaprakash 

Can you run EXPLAIN (ANALYZE, BUFFERS) on the INSERT command and post the 
results here? Inserting this few records should normally be near-instantaneous.
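
For example (hypothetical table and values; note that EXPLAIN ANALYZE actually
performs the insert):

    explain (analyze, buffers)
    insert into some_table (id, name) values (1, 'test');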




Re: Slow while inserting and retrieval (compared to SQL Server)

2021-02-17 Thread luis . roberto


- Original message -
> From: "sivapostgres"
> To: "luis.roberto"
> Cc: "pgsql-general"
> Sent: Wednesday, February 17, 2021 9:54:18
> Subject: Re: Slow while inserting and retrieval (compared to SQL Server)

> Hello,
> Yes, that's what I feel. With no records in any tables, working from the same
> machine where PG has been installed, with only one user working, inserting a few
> records (10 records in total, in all 6 tables) should not take this much time.

> I'll be inserting records from PowerBuilder applications; how do I catch the
> result of EXPLAIN? Or should I run all insert commands in pgAdmin or so and
> catch those results?

> As I'm new to PG, any documentation/help in this direction will be useful.

> Happiness Always
> BKR Sivaprakash

> On Wednesday, 17 February, 2021, 05:35:43 pm IST,  wrote:


You can run it wherever you prefer. I'm not familiar with PowerBuilder, so I'd 
say pgAdmin.
BTW, if you run the INSERTs in pgAdmin, do you still see slow timings?




Re: Slow while inserting and retrieval (compared to SQL Server)

2021-02-19 Thread luis . roberto
> De: "sivapostgres" 
> Para: "Benedict Holland" 
> Cc: "pgsql-general" , "Thomas Kellerer" 
> 
> Enviadas: Quarta-feira, 17 de fevereiro de 2021 11:09:38
> Assunto: Re: Slow while inserting and retrieval (compared to SQL Server)

> So far no performance tuning has been done for SQL Server. It works fine for
> the load. Even the Express edition, which is free, works better. I don't
> think Postgres should be so slow to insert such a low number of records into
> an empty database. I'll prepare the required SQL statements to insert those
> records in pgAdmin and check the timings tomorrow.
> 
> On Wednesday, 17 February, 2021, 07:29:29 pm IST, Benedict Holland wrote:
> 
> SQL Server is a $10k-to-$1M (or more) application that is specifically
> optimized for Windows and has limited to no support anywhere else. Postgres
> is free and, from my experience, comes within 5% of any other DBMS.
> Inserting one row at a time with autocommit on will be a bit slow, but it
> shouldn't be noticeable. What times are you seeing if you do this with
> pgAdmin 4 compared to SQL Server? Also, have you done any performance tuning
> for the Postgres server? There are many documents detailing how to
> performance-tune your servers, as you probably did, at some point, with your
> SQL Server.
> Thanks, Ben
> On Wed, Feb 17, 2021, 8:28 AM sivapostg...@yahoo.com wrote:
> 
> We use DataWindows. DataWindows will send the required DML statements to the
> database. And they are sent in format 1.
> 
> At the start of the application, autocommit is set to True. Before update of
> any table(s):
> - Autocommit is set to False
> - Insert/Update/Delete records
> - If success, commit; else, rollback
> - Autocommit is set to True
> 
> This has been followed for decades and it's working fine with SQL Server.
> Here we are trying to insert just 10 records spread across 6 tables, which is
> taking more time... that's what we feel. The similar work in SQL Server takes
> much less time <as if no wait is there>.
> On Wednesday, 17 February, 2021, 06:48:35 pm IST, Thomas Kellerer wrote:
> 
> sivapostg...@yahoo.com wrote on 17.02.2021 at 13:01:
>> To populate some basic data we try to insert few records (max 4
>> records) in few tables (around 6 tables) from one window.  We feel
>> that the insert time taken is longer than the time taken while using
>> Sql Server.  We tested almost a similar window that updated the
>> similar table(s) in SQL server, which was faster.  With Postgres
>> database, we need to wait for a couple of seconds before the
>> insert/update is over, which we didn't feel in Sql Server.
> 
> 
> Are you doing single-row inserts like:
> 
>    insert into ... values (..);
>    insert into ... values (..);
>    insert into ... values (..);
>    insert into ... values (..);
> 
> or are you doing multi-row inserts like this:
> 
>    insert into ... values (..), (..), (..), (..);
> 
> Typically the latter will perform much better (especially if autocommit is
> enabled)

Please provide the EXPLAIN ANALYZE plans so we can take a look at what is 
causing these 'slow' inserts.
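
On the autocommit point above: wrapping the single-row INSERTs in one explicit
transaction avoids a commit (and WAL flush) per statement, which is often what
such pauses come down to. A sketch with a hypothetical table:

    begin;
    insert into t (a) values (1);
    insert into t (a) values (2);
    insert into t (a) values (3);
    commit;  -- one flush at commit time instead of one per statement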




Re: rollback previous commit if the current one fails

2021-04-13 Thread luis . roberto
- Original message -

> Any idea how to approach it?


Hi!

https://www.postgresql.org/docs/current/sql-savepoint.html
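
In short: keep all the steps in a single transaction and use a savepoint, so
nothing is committed until everything succeeds. A sketch (table t is
hypothetical):

    begin;
    insert into t values (1);               -- the "previous" work, uncommitted
    savepoint s1;
    insert into t values (2);               -- the step that might fail
    -- on error: rollback to savepoint s1;  -- undoes only the work after s1
    commit;                                 -- nothing is visible until here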



Luis R. Weck 




DB size

2021-04-26 Thread luis . roberto
Hi!

I've looked around but could only find very old answers to this question, and 
maybe it has changed since then...

I'm struggling to identify the cause of the difference in size between the sum 
of all relations (via pg_total_relation_size) and pg_database_size:

SELECT sum(pg_total_relation_size(relid)),
       pg_size_pretty(sum(pg_total_relation_size(relid)))
FROM pg_catalog.pg_stat_all_tables;

     sum      | pg_size_pretty
--------------+----------------
 518549716992 | 483 GB


SELECT pg_database_size('mydb'),
       pg_size_pretty(pg_database_size('mydb'));

 pg_database_size | pg_size_pretty
------------------+----------------
     869150909087 | 809 GB

There are three databases in the cluster apart from 'mydb' (one of them is the 
'postgres' database); the other two are about 8 MB each.


We run pg_repack weekly and recently had crashes related to disk running out of 
space, so my guess is something got 'lost' during repack.

What can I do to recover the wasted space?

Thanks!

Luis R. Weck 




Re: DB size

2021-04-26 Thread luis . roberto



- Original message -
> From: "Josef Šimánek"
> To: "luis.roberto"
> Cc: "pgsql-general"
> Sent: Monday, April 26, 2021 17:40:05
> Subject: Re: DB size

> Do you have any indexes in mydb database?


Yes, I do. I believe pg_total_relation_size accounts for these. These are the 
results for my biggest table:

SELECT pg_table_size('my_table'),
       pg_size_pretty(pg_table_size('my_table')) AS tb_pretty,
       pg_indexes_size('my_table'),
       pg_size_pretty(pg_indexes_size('my_table')) AS idx_pretty,
       pg_total_relation_size('my_table'),
       pg_size_pretty(pg_total_relation_size('my_table')) AS total_pretty;

 pg_table_size | tb_pretty | pg_indexes_size | idx_pretty | pg_total_relation_size | total_pretty
---------------+-----------+-----------------+------------+------------------------+--------------
   82016485376 | 76 GB     |     88119033856 | 82 GB      |           170135519232 | 158 GB


Luis R. Weck




Re: DB size

2021-04-26 Thread luis . roberto


- Original message -
> From: "Alvaro Herrera"
> To: "luis.roberto"
> Cc: "pgsql-general"
> Sent: Monday, April 26, 2021 17:45:34
> Subject: Re: DB size

> I would guess that there are leftover files because of those crashes you
> mentioned. You can probably look for files in the database subdir in
> the data directory that do not appear in the pg_class.relfilenode
> listing for the database.

> --
> Álvaro Herrera 39°49'30"S 73°17'W


Thanks Alvaro! 

That's what I read in an old thread, back in the 8.3 days... Can you point me 
in the right direction as to where I should look, and how do I know exactly 
which files are safe to remove?


Luis R. Weck




Re: DB size

2021-04-27 Thread luis . roberto
- Original message -
> From: "Magnus Hagander"
> To: "Laurenz Albe"
> Cc: "Alvaro Herrera", "luis.roberto", "pgsql-general"
> Sent: Tuesday, April 27, 2021 4:05:42
> Subject: Re: DB size

> Yeah, you want to use pg_relation_filenode(oid) rather than looking
> directly at relfilenode.
> 
> When compared to the filesystem, it's probably even easier to use
> pg_relation_filepath(oid).
> 
> --
> Magnus Hagander
> Me: https://www.hagander.net/
> Work: https://www.redpill-linpro.com/

Thanks all! 

Looks like pg_orphaned is what I need! I'll use pg_relation_filepath too, to 
double-check.
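
For reference, a sketch of the catalog side of that comparison, to be matched
against the files under base/<database oid>/ in the data directory:

    select pg_relation_filenode(oid) as filenode,
           relname
    from pg_class
    where pg_relation_filenode(oid) is not null;

    -- or, for a single relation, its path relative to the data directory:
    select pg_relation_filepath('my_table');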



Luis R. Weck 




Re: idle_in_transaction_session_timeout

2021-05-07 Thread luis . roberto


- Original message -
> From: "Atul Kumar"
> To: "pgsql-general"
> Sent: Friday, May 7, 2021 3:34:44
> Subject: idle_in_transaction_session_timeout

> Hi,

> I have Postgres version 9.5 running on my machine.

> When I try to find the parameter
> idle_in_transaction_session_timeout, it shows me the error below:

> postgres=# show idle_in_transaction_session_timeout;
> ERROR: unrecognized configuration parameter
> "idle_in_transaction_session_timeout"

> I also checked postgresql.conf but even in this file there is no such 
> parameter.

> Please help me to find this parameter.

> Regards,
> Atul

idle_in_transaction_session_timeout first appeared in v9.6 [1].

[1] https://www.postgresql.org/docs/9.6/runtime-config-client.html
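
On 9.6 or later, for example:

    show idle_in_transaction_session_timeout;          -- default 0 (disabled)
    set idle_in_transaction_session_timeout = '5min';  -- terminate sessions idle
                                                       -- in a transaction > 5 min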

Luis R. Weck