RE: Barman

2019-11-01 Thread Daulat Ram
Thanks, Tomas, for your input. Suppose we have a multi-terabyte database with OLTP 
applications; what would be a suitable backup strategy?


-Original Message-
From: Tomas Vondra  
Sent: Friday, November 1, 2019 12:27 AM
To: Daulat Ram 
Cc: pgsql-general@lists.postgresql.org; pgsql-performa...@lists.postgresql.org
Subject: Re: Barman

On Thu, Oct 31, 2019 at 05:29:34PM +, Daulat Ram wrote:
>Hi All,
>
>We have configured Postgres 11.2 in streaming replication (primary &
>standby) on Docker, and I am looking to set up Postgres backups
>using barman. As I understand it, there are a few options for taking
>backups with barman:
>
>RSYNC backup
>Incremental Backups
>Streaming Backup with continuous WAL streaming
>Centralized and Catalogued Backups
>
>Which is the best option for backup using barman, so that we can keep
>the database safe in case of disaster? I feel the Incremental Backups
>are the most useful for PITR, but I want to know the experts'
>suggestions.
>

You're mixing a number of topics here. Firstly, all backups done by barman are 
centralized and catalogued; that's pretty much one of the main purposes of 
barman.

When it comes to backup methods, there are two basic ones: rsync and 
postgres (which means pg_basebackup). This is about creating the initial base 
backup. With either method, WAL can then be shipped either by streaming or by 
archive_command.

So first you need to decide whether to use rsync or pg_basebackup; rsync 
allows advanced features like incremental backup, parallel backup and 
deduplication.

Then you need to decide whether to use archive_command or streaming (i.e. 
pg_receivexlog).

The "right" backup method very much depends on the size of your database, 
activity, and so on. By default you should probably go with the default option, 
described as "scenario 1" in the barman docs, i.e.
pg_basebackup (backup_method = postgres) and WAL streaming.
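
FWIW, a minimal barman server configuration for that scenario might look
something like the sketch below; the server name, hosts and users are just
placeholders, so check the barman docs for the details:

[myserver]
description = "Example PostgreSQL 11 server"
conninfo = host=pghost user=barman dbname=postgres
streaming_conninfo = host=pghost user=streaming_barman
backup_method = postgres
streaming_archiver = on
slot_name = barman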

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Getting following error in using cursor to fetch the records from a large table in c language(current transaction is aborted, commands ignored until end of transaction block)

2019-11-01 Thread M Tarkeshwar Rao
Hi all,



Getting the following error when using a cursor to fetch records from a large table 
in c language.

Can you please suggest why it occurs and what the remedy is?



Error Details

-
Nov  1 13:21:54 sprintstd2 postgres[18604]: [10-1] < 2019-11-01 13:21:54.212 
CET > ERROR:  current transaction is aborted, commands ignored until end of 
transaction block
Nov  1 13:21:54 sprintstd2 postgres[18604]: [10-2] < 2019-11-01 13:21:54.212 
CET > STATEMENT:  BEGIN
Nov  1 13:21:54 sprintstd2 postgres[18604]: [11-1] < 2019-11-01 13:21:54.324 
CET > ERROR:  current transaction is aborted, commands ignored until end of 
transaction block
Nov  1 13:21:54 sprintstd2 postgres[18604]: [11-2] < 2019-11-01 13:21:54.324 
CET > STATEMENT:  BEGIN
Nov  1 13:21:54 sprintstd2 postgres[18604]: [12-1] < 2019-11-01 13:21:54.356 
CET > ERROR:  current transaction is aborted, commands ignored until end of 
transaction block
Nov  1 13:21:54 sprintstd2 postgres[18604]: [12-2] < 2019-11-01 13:21:54.356 
CET > STATEMENT:  BEGIN
Nov  1 13:21:54 sprintstd2 postgres[18604]: [13-1] < 2019-11-01 13:21:54.360 
CET > ERROR:  current transaction is aborted, commands ignored until end of 
transaction block
Nov  1 13:21:54 sprintstd2 postgres[18604]: [13-2] < 2019-11-01 13:21:54.360 
CET > STATEMENT



Sample Code snippet used



theCursorDec = (RWCString)"DECLARE " + mySqlCursor + " CURSOR FOR " + theSql;
myFetchSql = "FETCH " + fetchStr + " IN " + mySqlCursor;

// Begin the cursor
PQexec(connection, ,"BEGIN"))
PQexec(connection, ,"myFetchSql")

// Fetch records from the cursor. Getting First N tuples
mySqlResultsPG = PQexec(connection,myFetchSql);
if(PQresultStatus(mySqlResultsPG) == PGRES_TUPLES_OK)
{
    myNumColumns = PQnfields(mySqlResultsPG);
    ntuples = PQntuples(mySqlResultsPG);
    myTotalNumberOfRowsInQueryResult = ntuples;
    myCurrentRowNum = 0;
}



Regards

Tarkeshwar



RE: Declarative Range Partitioning Postgres 11

2019-11-01 Thread Shatamjeev Dewan
Hi Michael,

I want to partition by year and subpartition by month on a timestamp column in 
Postgres 11. Please advise on the syntax.

Thanks,
Shatamjeev

From: Michael Lewis 
Sent: October-08-19 1:33 PM
To: Shatamjeev Dewan 
Cc: pgsql-general 
Subject: Re: Declarative Range Partitioning Postgres 11

On Tue, Oct 8, 2019 at 8:00 AM Shatamjeev Dewan <sde...@nbsps.com> wrote:
Hi Michael,

In this case, I always need to include the partition key (date) in the primary key 
if I have a primary key defined on a non-partition-key column, e.g. id (in my 
case), to make it a composite primary key (id, date). This would allow 
duplicate ids with different dates, which is not desirable.

If you are generating the ID with a sequence, there isn't any real world 
likelihood of conflict, but I do understand your concern in terms of enforcing 
data integrity. Other than creating a custom stored procedure that functions as 
a primary key constraint, I don't know of any way around that.

Let's take a step back... why do you think you need to partition at all? And 
why partition by the date/timestamp/timestamptz field? Also, from what I have 
seen, PG12 is when partitioning really gets performant in terms of more than 10 
to 100 partitions, and you can then create FKeys to the partitioned table (not 
possible in PG11). Also, if your frequent access of the table is by 
date/timestamptz field, then you might consider a BRIN index if you have high 
correlation between physical storage and values in that field. That can 
mitigate the need for partitioning.
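
For example (table and column names are placeholders), such an index is as
simple as:

CREATE INDEX ON my_big_table USING brin (created_at);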

Our organization will be waiting until next quarter to upgrade to PG12 and then 
partitioning a few of our largest tables. That is to say, I don't have 
experience with partitioning in production yet so others may chime in with 
better advice.


Re: Getting following error in using cursor to fetch the records from a large table in c language(current transaction is aborted, commands ignored until end of transaction block)

2019-11-01 Thread Francisco Olarte
On Fri, Nov 1, 2019 at 1:44 PM M Tarkeshwar Rao
 wrote:
> Getting the following error when using a cursor to fetch records from a large
> table in c language.

Regarding this, "c language", I'll comment later


> Can you please suggest why it occurs and what the remedy is?

> Nov  1 13:21:54 sprintstd2 postgres[18604]: [10-1] < 2019-11-01 13:21:54.212 
> CET > ERROR:  current transaction is aborted, commands ignored until end of 
> transaction block
> Nov  1 13:21:54 sprintstd2 postgres[18604]: [10-2] < 2019-11-01 13:21:54.212 
> CET > STATEMENT:  BEGIN
> Nov  1 13:21:54 sprintstd2 postgres[18604]: [11-1] < 2019-11-01 13:21:54.324 
> CET > ERROR:  current transaction is aborted, commands ignored until end of 
> transaction block
> Nov  1 13:21:54 sprintstd2 postgres[18604]: [11-2] < 2019-11-01 13:21:54.324 
> CET > STATEMENT:  BEGIN
> Nov  1 13:21:54 sprintstd2 postgres[18604]: [12-1] < 2019-11-01 13:21:54.356 
> CET > ERROR:  current transaction is aborted, commands ignored until end of 
> transaction block
> Nov  1 13:21:54 sprintstd2 postgres[18604]: [12-2] < 2019-11-01 13:21:54.356 
> CET > STATEMENT:  BEGIN
> Nov  1 13:21:54 sprintstd2 postgres[18604]: [13-1] < 2019-11-01 13:21:54.360 
> CET > ERROR:  current transaction is aborted, commands ignored until end of 
> transaction block
> Nov  1 13:21:54 sprintstd2 postgres[18604]: [13-2] < 2019-11-01 13:21:54.360 
> CET > STATEMENT

This seems incomplete, but it's telling you the cause. You had an
error; you need to terminate the transaction before issuing a new one,
i.e., do a COMMIT (which, IIRC, will roll back if the transaction is
in error) or a ROLLBACK.

> Sample Code snippet used

As you stated C, I cannot comment too much, but notice:


> theCursorDec = (RWCString)"DECLARE " +  mySqlCursor + " CURSOR FOR " 
> + theSql;
> myFetchSql = "FETCH " + fetchStr + " IN " + mySqlCursor;

Neither of these is C; they COULD be C++.

> // Begin the cursor
Same as this comment.

> PQexec(connection, ,"BEGIN"))
> PQexec(connection, ,"myFetchSql")

And these are definitely not C (no ';') and, if you generated them by
editing, myFetchSql is quoted, which smells fishy.

I won't comment more until you confirm that is the real code, but
anyway it seems to me you issue transaction-start queries without
properly terminating them with a transaction-end one.
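
FWIW, a minimal sketch of that flow in plain libpq C would be something like
the following; the conninfo, table and cursor names are placeholders, not
taken from your code:

/* Hypothetical sketch: cursor fetch loop with explicit transaction handling. */
#include <stdio.h>
#include <libpq-fe.h>

/* Run a command and report whether it succeeded. */
static int exec_cmd(PGconn *conn, const char *sql)
{
    PGresult *res = PQexec(conn, sql);
    int ok = (PQresultStatus(res) == PGRES_COMMAND_OK);
    if (!ok)
        fprintf(stderr, "%s failed: %s", sql, PQerrorMessage(conn));
    PQclear(res);
    return ok;
}

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=test");   /* placeholder conninfo */
    if (PQstatus(conn) != CONNECTION_OK)
        return 1;

    if (exec_cmd(conn, "BEGIN") &&
        exec_cmd(conn, "DECLARE mycur CURSOR FOR SELECT * FROM big_table"))
    {
        for (;;)
        {
            PGresult *res = PQexec(conn, "FETCH 1000 FROM mycur");
            if (PQresultStatus(res) != PGRES_TUPLES_OK)
            {
                fprintf(stderr, "FETCH failed: %s", PQerrorMessage(conn));
                PQclear(res);
                break;
            }
            int n = PQntuples(res);
            /* ... process the n rows here ... */
            PQclear(res);
            if (n == 0)          /* no more rows */
                break;
        }
    }

    /* Always end the transaction; ROLLBACK works whether or not the
       transaction is already in the aborted state. */
    exec_cmd(conn, "ROLLBACK");
    PQfinish(conn);
    return 0;
}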

Francisco Olarte




Are my autovacuum settings too aggressive for this table?

2019-11-01 Thread Jason Ralph
Hello list,

DB1=# select version();
-[ RECORD 1 ]
version | PostgreSQL 11.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 
20120313 (Red Hat 4.4.7-23), 64-bit


I am sure this question has come up before. I have scoured the documentation 
and I think I have a good grasp of autovacuum. I wanted to ask the team whether 
my settings for this particular table are too aggressive; I have the following 
settings, which are producing a vacuum analyze multiple times a day.

I think the defaults out of the box were not aggressive enough, so I went with 
the following at the global level; I will possibly move to table-level settings if 
needed. I tried to show the stats below over a 10-minute interval during peak 
time. Any push in the right direction is appreciated; I want my tables analyzed 
and vacuumed but do not want to overdo it. The rest of the autovacuum settings 
are default.

I know the stats are estimates, so here are my calculations.

Live tuples = 19,766,480
Analyze scale factor = 0.001
Analyze thresh = 5000
Thresh + live_tuples * factor = 24,766

So an autovacuum analyze should trigger at around 24K tuples modified; is this too 
little or too much? The same goes for autovacuum vacuum: is it too aggressive?
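(By the same formula, the vacuum side would trigger at roughly
1 + 19,766,480 * 0.002 = 39,534 dead tuples.)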

#--
# AUTOVACUUM
#--
autovacuum_naptime = 1h
autovacuum_vacuum_threshold = 1
autovacuum_analyze_threshold = 5000
autovacuum_vacuum_scale_factor = 0.002
autovacuum_analyze_scale_factor = 0.001
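
If I do end up moving these to the table level instead, my understanding is it 
would be something like the following (autovacuum_naptime has no per-table 
equivalent, as far as I know):

ALTER TABLE members SET (
    autovacuum_vacuum_threshold = 1,
    autovacuum_vacuum_scale_factor = 0.002,
    autovacuum_analyze_threshold = 5000,
    autovacuum_analyze_scale_factor = 0.001
);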




DB1=# \x
Expanded display is on.
DB1=# select now();
-[ RECORD 1 ]--
now | 2019-11-01 14:35:51.893328-04

DB1=# select * from pg_stat_user_tables where relname = 'members';
-[ RECORD 1 ]---+--
relid   | 18583
schemaname  | public
relname | members
seq_scan| 129
seq_tup_read| 2036932707
idx_scan| 546161742
idx_tup_fetch   | 1670607103
n_tup_ins   | 46742
n_tup_upd   | 35704112
n_tup_del   | 0
n_tup_hot_upd   | 31106485
n_live_tup  | 19766480
n_dead_tup  | 1844251
n_mod_since_analyze | 15191
last_vacuum | 2019-10-13 15:42:06.043385-04
last_autovacuum | 2019-11-01 12:24:45.575283-04
last_analyze| 2019-10-13 15:42:17.370086-04
last_autoanalyze| 2019-11-01 12:25:17.181133-04
vacuum_count| 2
autovacuum_count| 15
analyze_count   | 2
autoanalyze_count   | 17

DB1=# select now();
-[ RECORD 1 ]--
now | 2019-11-01 14:45:10.845269-04

DB1=# select * from pg_stat_user_tables where relname = 'members';
-[ RECORD 1 ]---+--
relid   | 18583
schemaname  | public
relname | members
seq_scan| 129
seq_tup_read| 2036932707
idx_scan| 546171120
idx_tup_fetch   | 1670615505
n_tup_ins   | 46742
n_tup_upd   | 35705068
n_tup_del   | 0
n_tup_hot_upd   | 31107303
n_live_tup  | 19766480
n_dead_tup  | 1844881
n_mod_since_analyze | 16147
last_vacuum | 2019-10-13 15:42:06.043385-04
last_autovacuum | 2019-11-01 12:24:45.575283-04
last_analyze| 2019-10-13 15:42:17.370086-04
last_autoanalyze| 2019-11-01 12:25:17.181133-04
vacuum_count| 2
autovacuum_count| 15
analyze_count   | 2
autoanalyze_count   | 17

Thanks for your time,
Jason Ralph





Re: Declarative Range Partitioning Postgres 11

2019-11-01 Thread Michael Lewis
On Fri, Nov 1, 2019 at 9:22 AM Shatamjeev Dewan  wrote:

> Hi Michael,
>
>
>
> I want to partition by year and subpartition by month on a timestamp column
> in Postgres 11. Please advise on the syntax.
>


https://www.postgresql.org/docs/11/ddl-partitioning.html

The documentation is rather clear with examples like-

CREATE TABLE measurement_y2006m02 PARTITION OF measurement
FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
CREATE TABLE measurement_y2006m03 PARTITION OF measurement
FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');

Note- Don't partition on function results like date_part because
performance will likely suffer greatly. Also note that the top end is
always exclusive, so the above gives a continuous range for those two months.
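
For year partitions with month subpartitions specifically, a sketch of the
syntax (table and column names are placeholders) would be:

CREATE TABLE measurement (
    id      bigint,
    logdate timestamp NOT NULL,
    payload text
) PARTITION BY RANGE (logdate);

CREATE TABLE measurement_y2019 PARTITION OF measurement
    FOR VALUES FROM ('2019-01-01') TO ('2020-01-01')
    PARTITION BY RANGE (logdate);

CREATE TABLE measurement_y2019m01 PARTITION OF measurement_y2019
    FOR VALUES FROM ('2019-01-01') TO ('2019-02-01');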

That said, I would hesitate to partition by more than year alone before upgrading
to PG v12. The speed improvement for more than 10-100 partitions (the max
recommended for PG11) is huge in 12.


Re: Are my autovacuum settings too aggressive for this table?

2019-11-01 Thread Michael Lewis
My thinking is the opposite of yours. I consider it important to be very
aggressive on autovacuum because it only ever does the required amount of
work. If a tiny amount of work is needed, it does only that and is done.
Assuming it doesn't cause I/O concerns, do it as often as possible to
minimize the growth of bloat and maximize the reuse of the space already
allocated to that relation.

On the subject of analyze, the statistics are unlikely to be impacted
significantly by inserting about 24k rows into a table with almost 20 million
already. With default_statistics_target at 100, what are the chances those
new rows will even be included in the sample? I don't know the math, but
given that each run of analyze does ALL the work each and every time it
runs, it seems prudent to do them a little less often than autovacuum
anyway. Regardless though, autoanalyze is a small amount of work each time.
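
(For what it's worth, if I recall correctly ANALYZE samples roughly
300 * default_statistics_target rows, i.e. about 30,000 rows here out of
almost 20 million, so any individual new row has well under a 1% chance of
landing in the sample.)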


RE: Are my autovacuum settings too aggressive for this table?

2019-11-01 Thread Jason Ralph
Michael Lewis  writes:
>My thinking is the opposite of yours. I consider it important to be very
>aggressive on autovacuum because it only ever does the required amount of
>work. If a tiny amount of work is needed, it does only that and is done.
>Assuming it doesn't cause I/O concerns, do it as often as possible to minimize
>the growth of bloat and maximize the reuse of the space already allocated to
>that relation.

Excellent, I am not seeing any I/O concerns, and it seems to be keeping up now, 
so I will keep this setting unless someone else points out another suggestion.

>On the subject of analyze, the statistics are unlikely to be impacted
>significantly by inserting about 24k rows into a table with almost 20 million
>already. With default_statistics_target at 100, what are the chances those
>new rows will even be included in the sample? I don't know the math, but given
>that each run of analyze does ALL the work each and every time it runs,
>it seems prudent to do them a little less often than autovacuum anyway.
>Regardless though, autoanalyze is a small amount of work each time.

I agree, this is excellent advice. I overlooked the fact that this is a sample 
and the new rows may not even be included in it. I will adjust 
accordingly.


-Original Message-
From: Jason Ralph 
Sent: Friday, November 1, 2019 2:59 PM
To: pgsql-general@lists.postgresql.org
Cc: Jason Ralph 
Subject: Are my autovacuum settings too aggressive for this table?

Hello list,

DB1=# select version();
-[ RECORD 1 ]
version | PostgreSQL 11.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 
20120313 (Red Hat 4.4.7-23), 64-bit


I am sure this question has come up before. I have scoured the documentation 
and I think I have a good grasp of autovacuum. I wanted to ask the team whether 
my settings for this particular table are too aggressive; I have the following 
settings, which are producing a vacuum analyze multiple times a day.

I think the defaults out of the box were not aggressive enough, so I went with 
the following at the global level; I will possibly move to table-level settings if 
needed. I tried to show the stats below over a 10-minute interval during peak 
time. Any push in the right direction is appreciated; I want my tables analyzed 
and vacuumed but do not want to overdo it. The rest of the autovacuum settings 
are default.

I know the stats are estimates, so here are my calculations.

Live tuples = 19,766,480
Analyze scale factor = 0.001
Analyze thresh = 5000
Thresh + live_tuples * factor = 24,766

So an autovacuum analyze should trigger at around 24K tuples modified; is this too 
little or too much? The same goes for autovacuum vacuum: is it too aggressive?

#--
# AUTOVACUUM
#--
autovacuum_naptime = 1h
autovacuum_vacuum_threshold = 1
autovacuum_analyze_threshold = 5000
autovacuum_vacuum_scale_factor = 0.002
autovacuum_analyze_scale_factor = 0.001




DB1=# \x
Expanded display is on.
DB1=# select now();
-[ RECORD 1 ]--
now | 2019-11-01 14:35:51.893328-04

DB1=# select * from pg_stat_user_tables where relname = 'members';
-[ RECORD 1 ]---+--
relid   | 18583
schemaname  | public
relname | members
seq_scan| 129
seq_tup_read| 2036932707
idx_scan| 546161742
idx_tup_fetch   | 1670607103
n_tup_ins   | 46742
n_tup_upd   | 35704112
n_tup_del   | 0
n_tup_hot_upd   | 31106485
n_live_tup  | 19766480
n_dead_tup  | 1844251
n_mod_since_analyze | 15191
last_vacuum | 2019-10-13 15:42:06.043385-04
last_autovacuum | 2019-11-01 12:24:45.575283-04
last_analyze| 2019-10-13 15:42:17.370086-04
last_autoanalyze| 2019-11-01 12:25:17.181133-04
vacuum_count| 2
autovacuum_count| 15
analyze_count   | 2
autoanalyze_count   | 17

DB1=# select now();
-[ RECORD 1 ]--
now | 2019-11-01 14:45:10.845269-04

DB1=# select * from pg_stat_user_tables where relname = 'members';
-[ RECORD 1 ]---+--
relid   | 18583
schemaname  | public
relname | members
seq_scan| 129
seq_tup_read| 2036932707
idx_scan| 546171120
idx_tup_fetch   | 1670615505
n_tup_ins   | 46742
n_tup_upd   | 35705068
n_tup_del   | 0
n_tup_hot_upd   | 31107303
n_live_tup  | 19766480
n_dead_tup  | 1844881
n_mod_since_analyze | 16147
last_vacuum | 2019-10-13 15:42:06.043385-04
last_autovacuum | 2019-11-01 12:24:45.575283-04
last_analyze| 2019-10-13 15:42:17.370086-04
last_autoanalyze| 2019-11-01 12:25:17.181133-04
vacuum_count

RE: Declarative Range Partitioning Postgres 11

2019-11-01 Thread Shatamjeev Dewan
Thanks a ton Michael

From: Michael Lewis 
Sent: November-01-19 3:20 PM
To: Shatamjeev Dewan 
Cc: pgsql-general 
Subject: Re: Declarative Range Partitioning Postgres 11

On Fri, Nov 1, 2019 at 9:22 AM Shatamjeev Dewan <sde...@nbsps.com> wrote:
Hi Michael,

I want to partition by year and subpartition by month on a timestamp column in 
Postgres 11. Please advise on the syntax.


https://www.postgresql.org/docs/11/ddl-partitioning.html

The documentation is rather clear with examples like-

CREATE TABLE measurement_y2006m02 PARTITION OF measurement
FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
CREATE TABLE measurement_y2006m03 PARTITION OF measurement
FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');

Note- Don't partition on function results like date_part because performance 
will likely suffer greatly. Also note that the top end is always exclusive, so 
the above gives a continuous range for those two months.

I would hesitate to partition by more than year alone before upgrading to PG 
v12. The speed improvement for more than 10-100 partitions (the max recommended 
for PG11) is huge in 12.


QUERY: autovacuum: VACUUM ANALYZE table versus QUERY: autovacuum: VACUUM table

2019-11-01 Thread Jason Ralph
Hello list,
I am trying to find out whether the queries autovacuum logs do what their 
command-line equivalents do, or at least what I think they do.

QUERY: autovacuum: VACUUM ANALYZE table versus QUERY: autovacuum: VACUUM table.

I have been tuning my autovacuum and now have settings where I see two 
types of queries running and doing their jobs. I know vacuum alone will 
mark dead tuples as ready for reuse, and analyze alone will update statistics 
for the query planner.

1. QUERY: autovacuum: VACUUM ANALYZE table
2. QUERY: autovacuum: VACUUM table

My question is: do we need both? Why wouldn't "autovacuum: VACUUM ANALYZE 
table" perform both? I always thought that issuing VACUUM ANALYZE table would 
do both, so why wouldn't we just want vacuum analyze to run?
Or is "autovacuum: VACUUM ANALYZE table" different from the command line, 
so that it only analyzes and "autovacuum: VACUUM table" only vacuums, which 
would make sense?


Thanks as always and hope this is clear.


Jason Ralph
