Re: UPDATE on 20 Million Records Transaction or not?
Thanks Adrian,

> You could break it down into multiple transactions if there is a way to specify
> ranges of records.

Say I couldn't break it up: would it be faster in or out of the transaction?

It depends on whether you have concurrent transactions using the table. If you do, I think it would be better to split the update into smaller transactions.
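For readers wondering what the batched approach looks like in practice, here is a minimal sketch, assuming a hypothetical table bigtable with an indexed integer key id and a column some_column to update; transaction control inside DO blocks requires PostgreSQL 11 or later:

DO $$
DECLARE
    batch_start bigint := 0;
    batch_size  bigint := 100000;
    max_id      bigint;
BEGIN
    SELECT max(id) INTO max_id FROM bigtable;
    WHILE batch_start <= max_id LOOP
        UPDATE bigtable
           SET some_column = some_column + 1
         WHERE id >= batch_start
           AND id <  batch_start + batch_size;
        COMMIT;  -- each range commits on its own, so locks are held per batch only
        batch_start := batch_start + batch_size;
    END LOOP;
END $$;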
Join optimization
Hi!

Recently on a blog post [1] I saw that Oracle was able to "optimize" a join strategy by completely eliminating access to a table. Here's the execution on Oracle 18c [2] and PostgreSQL 13 (beta) [3]. Is there a fundamental reason why PG can't do the same?

Thanks!

[1] https://blog.dbi-services.com/the-myth-of-nosql-vs-rdbms-joins-dont-scale/
[2] https://dbfiddle.uk/?rdbms=oracle_18&fiddle=a5afeba2fdb27dec7533545ab0a6eb0e
[3] https://dbfiddle.uk/?rdbms=postgres_13&fiddle=87c78b4e97fdbd87255efc2fc909ee62
Re: Join optimization
I'm sorry for the bad example. Here is another, with some data, on PG:
https://dbfiddle.uk/?rdbms=postgres_13&fiddle=ccfd1c4fa291e74a6db9db1772e2b5ac
and Oracle:
https://dbfiddle.uk/?rdbms=oracle_18&fiddle=21a98f499065ad4e2c35ff4bd1487e14

I don't understand Oracle's execution plan very well, but it doesn't seem to be hitting the Users table...

From: "Fabrízio de Royes Mello"
To: "luis.roberto"
Cc: "pgsql-general"
Sent: Saturday, July 11, 2020 15:24:04
Subject: Re: Join optimization

On Sat, Jul 11, 2020 at 14:20, luis.robe...@siscobra.com.br wrote:

> Hi! Recently on a blog post [1] I saw that Oracle was able to "optimize" a join
> strategy by completely eliminating access to a table. Here's the execution on
> Oracle 18c [2] and PostgreSQL 13 (beta) [3]. Is there a fundamental reason why
> PG can't do the same?

It does... did you see the "never executed" notice on the Postgres explain output?

Regards,

--
Fabrízio de Royes Mello
Timbira - http://www.timbira.com.br/
PostgreSQL: Consulting, Development, 24x7 Support and Training
Re: Join optimization
- Original message -
From: "David Rowley"
To: "luis.roberto"
Cc: "Fabrízio de Royes Mello", "pgsql-general"
Sent: Sunday, July 12, 2020 5:29:08
Subject: Re: Join optimization

On Sun, 12 Jul 2020 at 06:59, wrote:
>
> I'm sorry for the bad example.
>
> Here is another, with some data on PG:
> https://dbfiddle.uk/?rdbms=postgres_13&fiddle=ccfd1c4fa291e74a6db9db1772e2b5ac
> and Oracle:
> https://dbfiddle.uk/?rdbms=oracle_18&fiddle=21a98f499065ad4e2c35ff4bd1487e14.

I believe what you're talking about is join removal. It appears as though Oracle is able to remove the inner join to the users table, as the join only serves to check that the user record exists. No columns are being selected, and the record must exist due to the foreign key referencing users.

PostgreSQL currently can only remove left joins. Likely what you could do here is just change your inner join into a left join. If you're happy enough that the user record will always exist, then that should allow it to work.

The reason PostgreSQL does not currently support inner join removal is that, by default, foreign key constraints are only triggered at the end of the statement (or, if deferred, at the end of the transaction). With PostgreSQL, it's possible for a SELECT query to see a violated foreign key constraint. This can happen if your SELECT query calls a function which updates a referenced record. The cascade of the foreign key won't occur until the end of the statement, so the select may stumble upon a violated foreign key. Here's a quick example of this case:

drop table t1, t2;
create table t1 (a int primary key);
create table t2 (a int references t1 on update cascade);
insert into t1 values (1), (2);
insert into t2 values (2), (2);

create or replace function update_t1 (p_a int) returns int as $$
begin
  update t1 set a = a + 1 where a = p_a;
  return p_a;
end;
$$ language plpgsql volatile;

-- In theory, this should never return any rows, as we're asking for all rows
-- that don't exist in the referenced table. However, we do get a result, since
-- the function call updates t1, setting the row with a=2 to a=3. The cascade to
-- t2 does not occur until the end of the statement.
select update_t1(a+1), t1.a
from t1
where not exists (select 1 from t2 where t1.a = t2.a);

 update_t1 | a
-----------+---
         2 | 1
(1 row)

If you're happy that you don't have any functions like that which could momentarily cause the foreign key to appear violated, then there shouldn't be any harm in changing the INNER JOIN on users to a LEFT JOIN. PostgreSQL will be able to remove the join in that case.

David

-

Thanks for the reply David! I understand it better now.
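To make David's suggested rewrite concrete, here is a minimal sketch with hypothetical users/orders tables; the LEFT JOIN variant qualifies for join removal because users is joined on its primary key and none of its columns are referenced:

create table users  (user_id int primary key);
create table orders (order_id int primary key,
                     user_id  int not null references users);

-- With an INNER JOIN, PostgreSQL must still visit users. As a LEFT JOIN on a
-- unique column, with no users columns in the select list, the join can be
-- removed: the plan will only show a scan of orders.
explain
select o.order_id
from orders o
left join users u on u.user_id = o.user_id;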
Re: Performance of "distinct with limit"
Hi, If "n" is indexed, it should run quickly. Can you share the execution plan for your query? De: "Klaudie Willis" Para: "pgsql-general" Enviadas: Sexta-feira, 28 de agosto de 2020 8:29:58 Assunto: Performance of "distinct with limit" Hi, Ran into this under-optimized query execution. select distinct n from bigtable; -- Lets say this takes 2 minutes select distinct n from bigtable limit 2 -- This takes approximately the same time However, the latter should have the potential to be so much quicker. I checked the same query on MSSQL (with 'top 2'), and it seems to do exactly the optimization I would expect. Is there any way to achieve a similar speedup in Postgresql? Klaudie
Effective IO Concurrency
Hi!

In PostgreSQL 13, the way effective_io_concurrency is interpreted has changed. Until v12, I used 200 for this value (using SSD drives). Using the new formula described in https://www.postgresql.org/docs/13/release-13.html gives me 1176. However, the documentation at https://www.postgresql.org/docs/13/runtime-config-resource.html#GUC-EFFECTIVE-IO-CONCURRENCY says that the maximum value allowed is 1000.
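For reference, this is the conversion query from the v13 release notes, with the old value of 200 plugged in; it returns the 1176 mentioned above:

SELECT round(sum(200 / n::float)) FROM generate_series(1, 200) s(n);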
Re: Effective IO Concurrency
De: "Ron" Para: "pgsql-general" Enviadas: Segunda-feira, 14 de setembro de 2020 13:10:23 Assunto: Re: Effective IO Concurrency On 9/14/20 11:03 AM, Laurenz Albe wrote: > On Mon, 2020-09-14 at 10:39 -0300, luis.robe...@siscobra.com.br wrote: >> In PostgreSQL 13, the way of using effective_io_concurrency has changed. >> Until v12, >> I used 200 for this value (using SSD drives). Using the new formula >> described in >> https://www.postgresql.org/docs/13/release-13.html gives me 1176. >> However, in the documentation >> https://www.postgresql.org/docs/13/runtime-config-resource.html#GUC-EFFECTIVE-IO-CONCURRENCY >> >> it says that the maximum value allowed is 1000. > Then use the value 1000... I think he means that the formula should take that into account. -- Angular momentum makes the world go 'round. Exactly For future reference, and maybe a updated documentation: SELECT least(round(sum(OLD / n::float)),1000) FROM generate_series(1, OLD) s(n) I don't know how to write patches, so maybe someone can do that. Thanks.
Table sizes
Hi!

I'm trying to use this query to get table sizes, however I'm getting a strange error:

select tablename, pg_relation_size(tablename::text)
from pg_tables;

In PG 13:

SQL Error [42P01]: ERROR: relation "sql_implementation_info" does not exist

In PG 12:

SQL Error [42P01]: ERROR: relation "sql_parts" does not exist
Re: Table sizes
De: "Charles Clavadetscher" Para: "luis.roberto" Cc: "pgsql-general" Enviadas: Quarta-feira, 30 de setembro de 2020 10:46:39 Assunto: Re: Table sizes Hello On 2020-09-30 14:11, luis.robe...@siscobra.com.br wrote: > Hi! > > I'm trying to use this query to get table sizes, however I'm getting a > strange error: > > select tablename,pg_relation_size(tablename::text) > from pg_tables; > > In PG 13: > > SQL Error [42P01]: ERROR: relation "sql_implementation_info" does not > exist > > In PG 12: > > SQL Error [42P01]: ERROR: relation "sql_parts" does not exist Try like this: select schemaname, tablename, pg_relation_size((schemaname || '.' || '"' || tablename || '"')::regclass) from pg_tables; You need to schema qualify the tables. Additionally, if you happen to have table names that have a mix of capital and non capital letters or contain other characters that might be problematic, you need to enclose the table name in double quotes. Regards Charles -- Charles Clavadetscher Swiss PostgreSQL Users Group Treasurer Spitzackerstrasse 9 CH - 8057 Zürich http://www.swisspug.org +---+ | __ ___ | | / )/ \/ \ | | ( / __ _\ ) | | \ (/ o) ( o) ) | | \_ (_ ) \ ) _/ | | \ /\_/ \)/ | | \/ | | _| | | | \|_/ | | | | Swiss PostgreSQL | | Users Group | | | +---+ Thanks, this worked. I wonder though, why calling pg_relation_size('users') work (I don't need to specify the schema).
Analyze and Statistics
Hi!

Analyzing a table which has a statistics object raises the message:

statistics object "public.new_statistic" could not be computed for relation "public.client"
Re: How does Postgres decide if to use additional workers?
De: "Thorsten Schöning" Para: "pgsql-general" Enviadas: Terça-feira, 9 de fevereiro de 2021 12:52:02 Assunto: How does Postgres decide if to use additional workers? So, based on which facts does Postgres decide if to use aadditional workers or not? Can I see those decisions explained somewhere? I don't see anything in the query plan. Thanks! Mit freundlichen Grüßen Thorsten Schöning Hi! I think you should look into min_parallel_table_scan_size and min_parallel_index_scan_size GUCs.
Re: pg_stat_progress_vacuum empty when running vacuum full
> De: "Luca Ferrari" > Para: "pgsql-general" > Enviadas: Sexta-feira, 12 de fevereiro de 2021 8:00:46 > Assunto: pg_stat_progress_vacuum empty when running vacuum full > Hi all, > I'm running 11.5 and I'm monitoring pg_stat_progress_vacuum every 2 > seconds, while doing from another connection per-table VACUUMs. Every > vacuum last 30 or more seconds, so I was expecting to see a record > within pg_stat_progress_vacuum, but nothing appears if the vacuum is > full. I suspect this is due to vacuum full performing a side-by-side > table rewriting, rather than in-place actions against the original > table, but I'm not sure if this is real reason or if I'm missing > something. > > Thanks, > Luca Hi! I believe VACUUM FULL progress can be monitored via the pg_stat_progress_cluster command: https://www.postgresql.org/docs/current/progress-reporting.html#VACUUM-PROGRESS-REPORTING
Re: Slow while inserting and retrieval (compared to SQL Server)
- Original message -
> From: sivapostg...@yahoo.com
> To: "pgsql-general"
> Sent: Wednesday, February 17, 2021 9:01:15
> Subject: Re: Slow while inserting and retrieval (compared to SQL Server)
>
> Hello,
>
> Using Postgres 11 on Windows Server 2012 & PowerBuilder.
> Working from the same machine where Postgres 11 is installed, so no chance
> of any network issue, I feel.
> No setup/config change done. Just working with all the default settings,
> with no data in any of the 179 tables in that database.
>
> To populate some basic data we try to insert a few records (max 4 records)
> into a few tables (around 6 tables) from one window. We feel that the insert
> takes longer than it did while using SQL Server. We tested an almost
> identical window that updated the similar table(s) in SQL Server, which was
> faster. With the Postgres database, we need to wait a couple of seconds
> before the insert/update is over, which we didn't feel in SQL Server.
>
> I feel that some changes in settings might improve this speed, but with not
> much knowledge of Postgres I struggle to find those setup values.
>
> Any help in improving the speed is really appreciated.
>
> Happiness Always
> BKR Sivaprakash

Can you run EXPLAIN (ANALYZE, BUFFERS) on the INSERT command and post the results here? Usually inserting this many records should be instantaneous.
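For clarity, the requested diagnostic looks like this (the table and values are hypothetical); the output includes the actual time spent and the buffer activity of the insert:

EXPLAIN (ANALYZE, BUFFERS)
INSERT INTO customer (id, name) VALUES (1, 'test');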
Re: Slow while inserting and retrieval (compared to SQL Server)
- Original message -
> From: "sivapostgres"
> To: "luis.roberto"
> Cc: "pgsql-general"
> Sent: Wednesday, February 17, 2021 9:54:18
> Subject: Re: Slow while inserting and retrieval (compared to SQL Server)
>
> Hello,
> Yes, that's what I feel. With no records in any tables, working from the same
> machine where PG has been installed, with only one user working, inserting a
> few records (10 records in total, in all 6 tables) should not take this much
> time. I'll be inserting records from PowerBuilder applications; how do I catch
> the result of EXPLAIN? Or should I run all insert commands in pgAdmin or so
> and catch the results there?
> As I'm new to PG, any documentation/help in this direction will be useful.
> Happiness Always
> BKR Sivaprakash

You can run it wherever you prefer; since I'm not familiar with PowerBuilder, I'd say pgAdmin. BTW, if you run the INSERTs in pgAdmin, do you still see slow timings?
Re: Slow while inserting and retrieval (compared to SQL Server)
> De: "sivapostgres" > Para: "Benedict Holland" > Cc: "pgsql-general" , "Thomas Kellerer" > > Enviadas: Quarta-feira, 17 de fevereiro de 2021 11:09:38 > Assunto: Re: Slow while inserting and retrieval (compared to SQL Server) > So far no performance tuning done for sql server. It works fine for the load. > Even the express edition which is free, works better. I don't think postgres > will be so slow to insert such a low number of records in an empty database. > I'll be preparing the required sql statements to insert those records in > pgadmin > and see the timings, tomorrow. > >On Wednesday, 17 February, 2021, 07:29:29 pm IST, Benedict Holland > wrote: > > Sql server is a 10k dollar to 1 million dollar application (or more) that is > specifically optimized for windows and had limited to no support anywhere > else. > Postgres is free and from my experience, comes within 5% of any other dbms. > Inserting 1 row at a time with auto commit on will be a bit slow but it > shouldn't be noticeable. What times are you seeing if you do this with > pgadmin4 > compared to sql server? Also, have you done any performance tuning for > postgres > server? There are many documents detailing performance tuning your servers, > like you probably did, at some point, with your sql server. > Thanks,Ben > On Wed, Feb 17, 2021, 8:28 AM sivapostg...@yahoo.com > wrote: > > We use datawindows. Datawindows will send the required DML statements to the > database. And it sent in format 1 . > IN start of the application, Autocommit set to True.Before update of any > table(s)Autocommit is set to FalseInsert/Update/Delete recordsIf success > commit > else rollbackAutocommit is set to True > This has been followed for decades and it's working fine with Sql server. > Here we are trying to insert just 10 records spread across 6 tables, which is > taking more time.. that's what we feel. The similar work in SQL Server takes > much less time < as if no wait is there >. >On Wednesday, 17 February, 2021, 06:48:35 pm IST, Thomas Kellerer > wrote: > > sivapostg...@yahoo.com schrieb am 17.02.2021 um 13:01: >> To populate some basic data we try to insert few records (max 4 >> records) in few tables (around 6 tables) from one window. We feel >> that the insert time taken is longer than the time taken while using >> Sql Server. We tested almost a similar window that updated the >> similar table(s) in SQL server, which was faster. With Postgres >> database, we need to wait for a couple of seconds before the >> insert/update is over, which we didn't feel in Sql Server. > > > Are you doing single-row inserts like: > > insert into ... values (..); > insert into ... values (..); > insert into ... values (..); > insert into ... values (..); > > or are you doing multi-row inserts like this: > > insert into ... values (..), (..), (..), (..); > > Typically the latter will perform much better (especially if autocommit is > enabled) Please provide the EXPLAIN ANALYZE plans so we can take a look at what is causing these 'slow' inserts.
Re: rollback previous commit if the current one fails
- Original message -
> Any idea how to approach it?

Hi!

Take a look at savepoints:

https://www.postgresql.org/docs/current/sql-savepoint.html

Luis R. Weck
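A committed transaction can't be rolled back afterwards, so the usual pattern is to keep both steps inside one transaction and fall back to a savepoint if the second step fails. A minimal sketch with a hypothetical table t:

BEGIN;
INSERT INTO t VALUES (1);                  -- the "previous" work
SAVEPOINT before_second_step;
INSERT INTO t VALUES (2);                  -- the step that might fail
-- on error: ROLLBACK TO SAVEPOINT before_second_step;
COMMIT;                                    -- commits whatever survived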
DB size
Hi!

I've looked around but could only find very old answers to this question, and maybe things have changed since then...

I'm struggling to identify the cause of the difference between the sum of all relation sizes (via pg_total_relation_size) and pg_database_size:

SELECT sum(pg_total_relation_size(relid)),
       pg_size_pretty(sum(pg_total_relation_size(relid)))
FROM pg_catalog.pg_stat_all_tables;

     sum      | pg_size_pretty
--------------+----------------
 518549716992 | 483 GB

SELECT pg_database_size('mydb'),
       pg_size_pretty(pg_database_size('mydb'));

 pg_database_size | pg_size_pretty
------------------+----------------
     869150909087 | 809 GB

There are three databases in the cluster apart from 'mydb' (one of them is the 'postgres' database); the other two are about 8 MB each.

We run pg_repack weekly and recently had crashes related to the disk running out of space, so my guess is something got 'lost' during a repack. What can I do to recover the wasted space?

Thanks!

Luis R. Weck
Re: DB size
- Original message -
> From: "Josef Šimánek"
> To: "luis.roberto"
> Cc: "pgsql-general"
> Sent: Monday, April 26, 2021 17:40:05
> Subject: Re: DB size
>
> Do you have any indexes in the mydb database?

Yes, I do. I believe pg_total_relation_size accounts for those. These are the results for my biggest table:

SELECT pg_table_size('my_table'),
       pg_size_pretty(pg_table_size('my_table')) AS tb_pretty,
       pg_indexes_size('my_table'),
       pg_size_pretty(pg_indexes_size('my_table')) AS idx_pretty,
       pg_total_relation_size('my_table'),
       pg_size_pretty(pg_total_relation_size('my_table')) AS total_pretty;

 pg_table_size | tb_pretty | pg_indexes_size | idx_pretty | pg_total_relation_size | total_pretty
---------------+-----------+-----------------+------------+------------------------+--------------
   82016485376 | 76 GB     |     88119033856 | 82 GB      |           170135519232 | 158 GB

Luis R. Weck
Re: DB size
- Original message -
> From: "Alvaro Herrera"
> To: "luis.roberto"
> Cc: "pgsql-general"
> Sent: Monday, April 26, 2021 17:45:34
> Subject: Re: DB size
>
> I would guess that there are leftover files because of those crashes you
> mentioned. You can probably look for files in the database subdir in
> the data directory that do not appear in the pg_class.relfilenode
> listing for the database.
>
> --
> Álvaro Herrera 39°49'30"S 73°17'W

Thanks Alvaro! That's what I read in an old thread, back in the 8.3 days... Can you point me in the right direction as to where I should look, and how do I know which files exactly are safe to remove?

Luis R. Weck
Re: DB size
- Original message -
> From: "Magnus Hagander"
> To: "Laurenz Albe"
> Cc: "Alvaro Herrera", "luis.roberto", "pgsql-general"
> Sent: Tuesday, April 27, 2021 4:05:42
> Subject: Re: DB size
>
> Yeah, you want to use pg_relation_filenode(oid) rather than looking
> directly at relfilenode.
>
> When compared to the filesystem, it's probably even easier to use
> pg_relation_filepath(oid).
>
> --
> Magnus Hagander
> Me: https://www.hagander.net/
> Work: https://www.redpill-linpro.com/

Thanks all! It looks like pg_orphaned is what I need! I'll use pg_relation_filepath too, to double-check.

Luis R. Weck
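For anyone without the pg_orphaned extension, a hedged sketch of the manual check (superuser required): list main-fork files in the current database's directory whose filenode no pg_class entry reports. Files created by still-in-flight transactions can show up here too, so verify every candidate by hand, ideally with no concurrent DDL, before deleting anything:

SELECT f.filename
FROM pg_ls_dir('base/' || (SELECT oid
                           FROM pg_database
                           WHERE datname = current_database())) AS f(filename)
WHERE f.filename ~ '^[0-9]+$'   -- main segments only; skips .1, _fsm, _vm files
  AND f.filename::oid NOT IN (SELECT pg_relation_filenode(c.oid::regclass)
                              FROM pg_class c
                              WHERE pg_relation_filenode(c.oid::regclass)
                                    IS NOT NULL);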
Re: idle_in_transaction_session_timeout
- Original message -
> From: "Atul Kumar"
> To: "pgsql-general"
> Sent: Friday, May 7, 2021 3:34:44
> Subject: idle_in_transaction_session_timeout
>
> Hi,
>
> I have Postgres version 9.5 running on my machine. When I try to look up the
> parameter idle_in_transaction_session_timeout, it shows me the error below:
>
> postgres=# show idle_in_transaction_session_timeout;
> ERROR: unrecognized configuration parameter
> "idle_in_transaction_session_timeout"
>
> I also checked postgresql.conf, but even in this file there is no such
> parameter. Please help me to find this parameter.
>
> Regards,
> Atul

idle_in_transaction_session_timeout first appeared in v9.6 [1].

[1] https://www.postgresql.org/docs/9.6/runtime-config-client.html

Luis R. Weck
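For reference, once on 9.6 or later the parameter can be set per session (or globally in postgresql.conf); a quick sketch:

-- abort any transaction that sits idle for more than five minutes
SET idle_in_transaction_session_timeout = '5min';
SHOW idle_in_transaction_session_timeout;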