Re: Ways to deal with large amount of columns;

2018-08-31 Thread Ben Madin
Hi - this is a spreadsheet model, not a database model, and could be
modelled with three columns.

The aggregate functions are an analytic issue, not a data issue.

cheers

Ben


On 30 August 2018 at 17:13, a <372660...@qq.com> wrote:

> Hi all:
>
> I need to make a table that contains projected monthly cashflow for
> multiple agents (around 10,000).
>
> Therefore, the column count would be 1000+.
>
> I would need to perform simple aggregate functions such as count, sum or
> average on each projected cashflow.
>
> Is there any way of doing this? Is there anything like defining a macro
> in C, so that I could manipulate multiple columns with a simple word that
> represents them?
>
> Thanks so much!
>
> Shore
>



-- 


Dr Ben Madin

BVMS MVPHMgmt PhD MANZCVS GAICD
Managing Director
Mobile: +61 448 887 220
E-mail: b...@ausvet.com.au
Website: www.ausvet.com.au
Skype: benmadin
Address: 5 Shuffrey Street, Fremantle, WA 6160, Australia


Re: Ways to deal with large amount of columns;

2018-08-31 Thread Brent Wood
You need to normalise your data model... the basic dataset you describe will 
probably fit in one table with three columns, but that might be simplistic...

All your aggregate queries are pretty basic SQL statements, which could be 
represented as views or made simpler to invoke by turning them into functions.

So, one table describing agents, with columns for agent, date and projected 
cashflow will store all your data...
(index agent & date for performance)
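
A minimal DDL sketch of that layout (the table and column names here are only
placeholders -- the example queries below just write the table name as
"table"):

create table agent_cashflow (
    agent    text    not null,  -- or a foreign key into an agents table
    date     date    not null,  -- month the projection applies to
    cashflow numeric not null   -- projected cashflow for that month
);
create index on agent_cashflow (agent, date);
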
eg: for a summary for all agents for the last 12 months:

select agent, sum(cashflow)
from table
where date >= now() - interval '1 year'
group by agent
order by agent;

or a cashflow aggregate summary for the latest month:

select sum(cashflow), avg(cashflow), min(cashflow), max(cashflow)
from table
where date = (select max(date) from table);

or get the agent with highest projected cashflow for the latest month:

select agent
from table
where date = (select max(date) from table)
  and cashflow = (select max(cashflow) from table
                  where date = (select max(date) from table));


  From: a <372660...@qq.com>
 To: pgsql-general  
 Sent: Thursday, August 30, 2018 9:14 PM
 Subject: Ways to deal with large amount of columns;
   
Hi all:
I need to make a table that contains projected monthly cashflow for multiple
agents (around 10,000).
Therefore, the column count would be 1000+.
I would need to perform simple aggregate functions such as count, sum or
average on each projected cashflow.
Is there any way of doing this? Is there anything like defining a macro in C,
so that I could manipulate multiple columns with a simple word that represents
them?

Thanks so much!
Shore

   

very slow largeobject transfers through JDBC

2018-08-31 Thread Mate Varga
Hi,

we're fetching binary data from the pg_largeobject table. The data is not very
large, but we ended up storing it there. If I'm copying the data to a file
from the psql console, then it takes X time (e.g. a second); fetching it
through the JDBC driver takes at least 10x more. We don't see this
difference between JDBC and 'native' performance for anything except
largeobjects (and bytea columns, for the record).
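
For reference, the 'native' timing I'm comparing against is just an export
from the psql console with \timing enabled, along these lines (the OID here
is only a placeholder):

\timing on
\lo_export 16401 /tmp/blob.bin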

Does anyone have any advice about whether this can be tuned or what the
cause is?

Thanks,
Mate


Re: very slow largeobject transfers through JDBC

2018-08-31 Thread Dmitry Igrishin
Fri, 31 Aug 2018 at 16:35, Mate Varga :
>
> Hi,
>
> we're fetching binary data from pg_largeobject table. The data is not very 
> large, but we ended up storing it there. If I'm copying the data to a file 
> from the psql console, then it takes X time (e.g. a second), fetching it 
> through the JDBC driver takes at least 10x more. We don't see this difference 
> between JDBC and 'native' performance for anything except largeobjects (and 
> bytea columns, for the record).
>
> Does anyone have any advice about whether this can be tuned or what the cause 
> is?
I don't know the reason for that, but I think it's reasonable and
quite simple to call lo_import()/lo_export() via JNI.



Re: pg_upgrade fails saying function unaccent(text) doesn't exist

2018-08-31 Thread Adrian Klaver

On 08/30/2018 09:03 AM, Gunnlaugur Thor Briem wrote:

\dFd unaccent
List of text search dictionaries
 Schema |   Name   | Description
--------+----------+-------------
 public | unaccent |
(1 row)

\dFd+ unaccent
             List of text search dictionaries
 Schema |   Name   |    Template     |    Init options    | Description
--------+----------+-----------------+--------------------+-------------
 public | unaccent | public.unaccent | rules = 'unaccent' |
(1 row)

Cheers,
Gulli



I could not replicate with simple case:

select version();
                                   version
------------------------------------------------------------------------------
 PostgreSQL 10.5 on x86_64-pc-linux-gnu, compiled by gcc (SUSE Linux) 4.8.5,
 64-bit


CREATE OR REPLACE FUNCTION public.semantic_normalize(title text)
 RETURNS text
 LANGUAGE sql
AS $function$
SELECT lower(unaccent(btrim(regexp_replace($1, '\s+', ' ', 'g'), ' "')))
$function$

CREATE TABLE unaccent_test(title text);

INSERT INTO unaccent_test values ('Hukić'), ('Böttcher'), ('ÀÁÂÃÄÅ'), 
('électro');


CREATE INDEX ix_semantic_normalize_title on unaccent_test(title);

VACUUM ANALYZE unaccent_test;
VACUUM

vacuumdb -U postgres -z -t unaccent_test test
vacuumdb: vacuuming database "test"


The only thing I can think of is that you have an older version of
vacuumdb that is not aware of the schema specification changes in the
newer versions of Postgres.
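
If it is the restricted search_path that newer client tools set, the effect
is easy to reproduce by hand (my own illustration, not something taken from
your report; note that the fully qualified two-argument form keeps working):

SET search_path = pg_catalog;
SELECT unaccent('électro');
-- ERROR:  function unaccent(unknown) does not exist
SELECT public.unaccent('public.unaccent'::regdictionary, 'électro');  -- ok
RESET search_path;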




--
Adrian Klaver
adrian.kla...@aklaver.com



Re: very slow largeobject transfers through JDBC

2018-08-31 Thread Mate Varga
I see -- we could try that, though we're mostly using an ORM (Hibernate) to
do this. Thanks!

On Fri, Aug 31, 2018 at 3:57 PM Dmitry Igrishin  wrote:

> Fri, 31 Aug 2018 at 16:35, Mate Varga :
> >
> > Hi,
> >
> > we're fetching binary data from pg_largeobject table. The data is not
> very large, but we ended up storing it there. If I'm copying the data to a
> file from the psql console, then it takes X time (e.g. a second), fetching
> it through the JDBC driver takes at least 10x more. We don't see this
> difference between JDBC and 'native' performance for anything except
> largeobjects (and bytea columns, for the record).
> >
> > Does anyone have any advice about whether this can be tuned or what the
> cause is?
> I don't know the reason for that, but I think it's reasonable and
> quite simple to call lo_import()/lo_export() via JNI.
>


Re: pg_upgrade fails saying function unaccent(text) doesn't exist

2018-08-31 Thread Adrian Klaver

On 08/30/2018 09:03 AM, Gunnlaugur Thor Briem wrote:

\dFd unaccent
List of text search dictionaries
 Schema |   Name   | Description
--------+----------+-------------
 public | unaccent |
(1 row)

\dFd+ unaccent
             List of text search dictionaries
 Schema |   Name   |    Template     |    Init options    | Description
--------+----------+-----------------+--------------------+-------------
 public | unaccent | public.unaccent | rules = 'unaccent' |
(1 row)

Cheers,
Gulli



Forgot to add to previous post:

\dx unaccent
                     List of installed extensions
   Name   | Version | Schema |                 Description
----------+---------+--------+----------------------------------------------
 unaccent | 1.1     | public | text search dictionary that removes accents



--
Adrian Klaver
adrian.kla...@aklaver.com



locate DB corruption

2018-08-31 Thread Dave Peticolas
Hello, I'm running into the following error running a large query on a
database restored from WAL replay:

could not access status of transaction 330569126
DETAIL: Could not open file "pg_clog/0C68": No such file or directory

Searches of this mailing list seem to indicate this means a database row
has been corrupted. How would I go about locating the point of corruption
in order to fix it?

thanks,
dave


Re: locate DB corruption

2018-08-31 Thread Adrian Klaver

On 08/31/2018 08:02 AM, Dave Peticolas wrote:
Hello, I'm running into the following error running a large query on a 
database restored from WAL replay:


could not access status of transaction 330569126
DETAIL: Could not open file "pg_clog/0C68": No such file or directory



Postgres version?

Where is the replay coming from?



Searches of this mailing list seem to indicate this means a database row 
has been corrupted. How would I go about locating the point of 
corruption in order to fix?


To me it looks like just what it says: the transaction status file could not
be found. From Postgres 9.6 --> 10, pg_clog became pg_xact.


Are you sure you are not working across versions?

If not, do pg_clog/ and 0C68 actually exist?



thanks,
dave




--
Adrian Klaver
adrian.kla...@aklaver.com



Re: locate DB corruption

2018-08-31 Thread Dave Peticolas
On Fri, Aug 31, 2018 at 8:14 AM Adrian Klaver 
wrote:

> On 08/31/2018 08:02 AM, Dave Peticolas wrote:
> > Hello, I'm running into the following error running a large query on a
> > database restored from WAL replay:
> >
> > could not access status of transaction 330569126
> > DETAIL: Could not open file "pg_clog/0C68": No such file or directory
>
>
> Postgres version?
>

Right! Sorry, that original email didn't have a lot of info. This is 9.6.9
restoring a backup from 9.6.8.


> Where is the replay coming from?
>

From a snapshot and WAL files stored in Amazon S3.



> >
> > Searches of this mailing list seem to indicate this means a database row
> > has been corrupted. How would I go about locating the point of
> > corruption in order to fix?
>
> To me it looks like what it says, the transaction file could not be
> found. From Postgres 9.6 --> 10 pg_clog became pg_xact.
>
> Are you sure you are not working across versions?
>

I am sure, they are all 9.6.


> If not do pg_clog/ and 0C68 actually exist?
>

pg_clog definitely exists, but 0C68 does not. I think I have subsequently
found the precise row in the specific table that seems to be the problem.
Specifically I can select * from TABLE where id = BADID - 1 or id = BADID +
1 and the query returns. I get the error if I select the row with the bad
ID.

Now what I'm not sure of is how to fix.
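
In case it helps anyone searching later, the probing above can be scripted as
a rough plpgsql sketch (table and column names are placeholders; it assumes an
indexed id column so min()/max() don't have to touch the damaged page, and it
will be slow if the ids are sparse):

DO $$
DECLARE
    i bigint;
BEGIN
    FOR i IN SELECT generate_series((SELECT min(id) FROM bad_table),
                                    (SELECT max(id) FROM bad_table)) LOOP
        BEGIN
            PERFORM 1 FROM bad_table WHERE id = i;  -- forces a visibility check
        EXCEPTION WHEN OTHERS THEN
            RAISE NOTICE 'cannot read id %: %', i, SQLERRM;
        END;
    END LOOP;
END $$;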


Autovacuum degrades all other operations by keeping all buffers dirty?

2018-08-31 Thread David Pacheco
Hello,

We've been struggling with some major performance issues related to
autovacuum.
(I know this is a common problem.)  For a while, we believed this was
primarily
related to I/O contention, but recent observations from our monitoring make
me
wonder if there's a deeper issue here, so I'm looking for some better
understanding.

From reading the 9.6.3 source, it looks like the autovacuum process itself is
single-threaded, and it reads pages essentially linearly from the relation
(possibly skipping some).  When the autovacuum process needs to modify a page,
it doesn't write it directly, but rather marks the buffer dirty.  The page
will be written later, either by the checkpointer (according to its
configuration, in terms of time and WAL), the bgwriter (according to its
configuration, in terms of write count and sleep time), or else some other
backend process that requires a free buffer (if the count of non-dirty buffers
reaches zero).  Is this accurate?

In our case, we found that when autovacuum runs, the number of dirty buffers
written by regular backends shoots from a handful to as much as 300 buffers
per second.  (We have 200 backends on most databases.)  More specifically:
prior to recent autovacuums starting, databases did under 1000 buffer
allocations per second.  Most of those were done by the checkpointer.
Individual backends did just a handful.  After autovacuum started, buffer
allocations rose to between 2,000 and 8,000 per second.  The checkpointer
handled many of these, but so did individual backends (see above).  The
bgwriter peaked around 25 buffer writes per second.  So it seems like the
spike in buffers written by normal backends could explain the significant
degradation that we see in average latency and overall throughput (which can
be as much as 50%).
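
Those per-writer counts are the kind of counters pg_stat_bgwriter exposes; for
anyone wanting to watch the same breakdown, sampling it periodically and
diffing the counters gives per-second rates:

SELECT now(),
       buffers_checkpoint,   -- buffers written by the checkpointer
       buffers_clean,        -- buffers written by the bgwriter
       buffers_backend,      -- buffers written by ordinary backends
       buffers_alloc         -- buffer allocations
FROM pg_stat_bgwriter;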

It looks to me like the autovacuum process is effectively generating work (in
the form of async writes) that's being distributed implicitly to the various
backend processes, creating latency for any other query that happens to
require a buffer (including read-only queries).  Maybe we can improve this by
tuning the bgwriter.  But if that's single-threaded, presumably there's an
upper bound to how many buffer writes it can handle?  Is there anything to
guarantee that the bgwriter will keep up with the work that's being generated?
That is, for some workloads, wouldn't it be possible that autovacuum could
always generate work faster than the bgwriter can do it, and you'd always have
some pretty major degradation to all other queries?

I've drawn a lot of inferences here, and I'm sure there are some mistakes.
I'd appreciate any clarification, correction, or confirmation!

Thanks in advance,
Dave


Re: Autovacuum degrades all other operations by keeping all buffers dirty?

2018-08-31 Thread Alvaro Herrera
On 2018-Aug-31, David Pacheco wrote:

> From reading the 9.6.3 source, it looks like the autovacuum process
> itself is single-threaded, and it reads pages essentially linearly
> from the relation (possibly skipping some).  When the autovacuum
> process needs to modify a page, it doesn't write it directly, but
> rather marks the buffer dirty.  The page will be written later,

Unless there's some bug, there is a BufferAccessStrategy that only lets
a few dozen buffers go unwritten before the autovac worker process
itself is forced to write some.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Autovacuum degrades all other operations by keeping all buffers dirty?

2018-08-31 Thread Andres Freund
On 2018-08-31 19:31:47 -0300, Alvaro Herrera wrote:
> On 2018-Aug-31, David Pacheco wrote:
> 
> > From reading the 9.6.3 source, it looks like the autovacuum process
> > itself is single-threaded, and it reads pages essentially linearly
> > from the relation (possibly skipping some).  When the autovacuum
> > process needs to modify a page, it doesn't write it directly, but
> > rather marks the buffer dirty.  The page will be written later,
> 
> Unless there's some bug, there is a BufferAccessStrategy that only lets
> a few dozen buffers go unwritten before the autovac worker process
> itself is forced to write some.

I've not re-checked, but I'm not sure that's true if the buffer is
already in s_b, which it'll be for many workloads.

Greetings,

Andres Freund



Re: locate DB corruption

2018-08-31 Thread Adrian Klaver

On 08/31/2018 08:51 AM, Dave Peticolas wrote:
On Fri, Aug 31, 2018 at 8:14 AM Adrian Klaver wrote:


On 08/31/2018 08:02 AM, Dave Peticolas wrote:
 > Hello, I'm running into the following error running a large query
on a
 > database restored from WAL replay:
 >
 > could not access status of transaction 330569126
 > DETAIL: Could not open file "pg_clog/0C68": No such file or directory


Postgres version?


Right! Sorry, that original email didn't have a lot of info. This is 
9.6.9 restoring a backup from 9.6.8.


Where is the replay coming from?


 From a snapshot and WAL files stored in Amazon S3.


Seems the process is not creating a consistent backup.

How are they being generated?



Are you sure you are not working across versions?


I am sure, they are all 9.6.

If not do pg_clog/ and 0C68 actually exist?


pg_clog definitely exists, but 0C68 does not. I think I have 
subsequently found the precise row in the specific table that seems to 
be the problem. Specifically I can select * from TABLE where id = BADID 
- 1 or id = BADID + 1 and the query returns. I get the error if I select 
the row with the bad ID.


Now what I'm not sure of is how to fix.


One thing I can think of is to rebuild from a later version of your S3 
data and see if it has all the necessary files.


There is also pg_resetxlog:

https://www.postgresql.org/docs/9.6/static/app-pgresetxlog.html

I have not used it, so I cannot offer much in the way of tips. Just
from reading the docs I would suggest stopping the server and then
creating a backup of $PG_DATA (if possible) before using pg_resetxlog.



--
Adrian Klaver
adrian.kla...@aklaver.com



Re: using a plpgsql function argument as a table column.

2018-08-31 Thread ss

I changed the data from having all the years in one row to having one row per
year:

create table tvalues as
select t.key, year, value from values t
  join lateral (values
    (2021,t.y2021),
    (2020,t.y2020),
    (2019,t.y2019),
    (2018,t.y2018),
    (2017,t.y2017),
    (2016,t.y2016),
    (2015,t.y2015),
    (2014,t.y2014),
    (2013,t.y2013),
    (2012,t.y2012),
    (2011,t.y2011),
    (2010,t.y2010),
    (2009,t.y2009),
    (2008,t.y2008),
    (2007,t.y2007),
    (2006,t.y2006),
    (2005,t.y2005),
    (2004,t.y2004),
    (2003,t.y2003),
    (2002,t.y2002),
    (2001,t.y2001),
    (2000,t.y2000),
    (1999,t.y1999),
    (1998,t.y1998),
    (1997,t.y1997),
    (1996,t.y1996),
    (1995,t.y1995),
    (1994,t.y1994),
    (1993,t.y1993),
    (1992,t.y1992),
    (1991,t.y1991),
    (1990,t.y1990),
    (1989,t.y1989),
    (1988,t.y1988),
    (1987,t.y1987),
    (1986,t.y1986),
    (1985,t.y1985),
    (1984,t.y1984),
    (1983,t.y1983),
    (1982,t.y1982),
    (1981,t.y1981),
    (1980,t.y1980),
    (1979,t.y1979),
    (1978,t.y1978),
    (1977,t.y1977),
    (1976,t.y1976))
  s(year, value) ON TRUE;
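
With the data in that shape, the per-year aggregates become plain GROUP BY
queries, e.g.:

select year, count(*), sum(value), avg(value)
from tvalues
group by year
order by year;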

On 08/29/2018 06:40 AM, Adrian Klaver wrote:

On 08/28/2018 10:40 PM, ss wrote:


CREATE OR REPLACE FUNCTION test(year VARCHAR)
RETURNS TABLE (agencycode INT, bureaucode INT, acctname VARCHAR, 
beacat VARCHAR, onoffbudget VARCHAR, val INT)

AS $$
BEGIN
RETURN QUERY SELECT t1.agencycode, t1.bureaucode, t1.acctcode, 
t2.beacat, t2.onoffbudget, t2.XX  FROM allnames AS t1
JOIN total AS t2 on t1.agencycode = t2.agencycode and t1.bureaucode = 
t2.bureaucode and t1.acctcode = t2.acctcode
WHERE t2.XXX != 0  ORDER BY t1.agencycode, t1.bureaucode, 
t1.acctcode;

END; $$
LANGUAGE 'plpgsql';

I want to replace XXX with the argument year.
I hard coded the XX with 'y2016' and I get back a table


So something along lines of:

DO $$
    DECLARE
    col_name varchar;
    out_str  varchar;
    BEGIN
    col_name = 'y'||'2018';
    out_str = format('SELECT %I FROM some_table where %I = 0', 
col_name, col_name );

    RAISE NOTICE '%', out_str;
    END;
$$ LANGUAGE plpgsql;


NOTICE:  SELECT y2018 FROM some_table where y2018 = 0



Would something like 't2.'||(year)|| work?
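
Putting those pieces together, a dynamic version of the original function
would look roughly like this (an untested sketch that keeps the original
table and column names, and assumes the caller passes the column name, e.g.
test('y2016'); plpgsql variable substitution does not reach inside the
EXECUTEd string, so the parameter is handed to format() instead):

CREATE OR REPLACE FUNCTION test(year varchar)
RETURNS TABLE (agencycode int, bureaucode int, acctname varchar,
               beacat varchar, onoffbudget varchar, val int)
AS $func$
BEGIN
    RETURN QUERY EXECUTE format(
        'SELECT t1.agencycode, t1.bureaucode, t1.acctcode,
                t2.beacat, t2.onoffbudget, t2.%1$I
           FROM allnames AS t1
           JOIN total AS t2 ON t1.agencycode = t2.agencycode
                           AND t1.bureaucode = t2.bureaucode
                           AND t1.acctcode = t2.acctcode
          WHERE t2.%1$I != 0
          ORDER BY t1.agencycode, t1.bureaucode, t1.acctcode', year);
END;
$func$ LANGUAGE plpgsql;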

On 08/28/2018 10:37 PM, Tim Cross wrote:
Off the top of my head, I think you could do this using dynamic 
(execute) SQL in a function. However, it is going to be messy, 
possibly slow and likely fragile. You would need to query the 
catalogue to get the column names in the table and then build the 
SQL dynamically 'on the fly'.


Without having more detail, my spider sense tells me you have the 
wrong table/relationship design. While you may be able to get it to 
work, it is likely you will run into constant problems and 
additional complexity that could be avoided with a different design. 
You really want a design where your queries are driven by the data 
in your tables and not by the names of columns. I would seriously 
consider re-examining your schema design, look at how your design 
fits in with the normal forms and adapt as necessary.


Tim

On Wed, 29 Aug 2018 at 15:10, ss > wrote:


    I have a table with many years as columns. y1976, y2077, .. ,
    y2019,y2020 I want to dynamically return a column from a function.


    select * from FUNCTION('y2016') .

    select t1.cola t1.colb, t1.colc, t2.y2016 from . Where
    t2.y2016 != 0;

    or if I select year y2012 I want FUNCTION('y2012')

    select t1.cola t1.colb, t1.colc, t2.y2012 from . Where
    t2.y2012 != 0;


    to generalize

    select * from FUNCTION( year_column )

    select t1.cola t1.colb, t1.colc, t2.year_column from . Where
    t2.year_column != 0;

    is it possible? if so how?




--
regards,

Tim

--
Tim Cross











Re: using a plpgsql function argument as a table column.

2018-08-31 Thread Shaun Savage
I fixed it by changing the data: all the years were in one row, so I
created a separate row for each year.


create table tvalues as
select t.key, year, value from values t
  join lateral (values
    (2021,t.y2021),
    (2020,t.y2020),
    (2019,t.y2019),
    (2018,t.y2018),
    (2017,t.y2017),
    (2016,t.y2016),
    (2015,t.y2015),
    (2014,t.y2014),
    (2013,t.y2013),
    (2012,t.y2012),
    (2011,t.y2011),
    (2010,t.y2010),
    (2009,t.y2009),
    (2008,t.y2008),
    (2007,t.y2007),
    (2006,t.y2006),
    (2005,t.y2005),
    (2004,t.y2004),
    (2003,t.y2003),
    (2002,t.y2002),
    (2001,t.y2001),
    (2000,t.y2000),
    (1999,t.y1999),
    (1998,t.y1998),
    (1997,t.y1997),
    (1996,t.y1996),
    (1995,t.y1995),
    (1994,t.y1994),
    (1993,t.y1993),
    (1992,t.y1992),
    (1991,t.y1991),
    (1990,t.y1990),
    (1989,t.y1989),
    (1988,t.y1988),
    (1987,t.y1987),
    (1986,t.y1986),
    (1985,t.y1985),
    (1984,t.y1984),
    (1983,t.y1983),
    (1982,t.y1982),
    (1981,t.y1981),
    (1980,t.y1980),
    (1979,t.y1979),
    (1978,t.y1978),
    (1977,t.y1977),
    (1976,t.y1976))
  s(year, value) ON TRUE;


On 08/29/2018 06:40 AM, Adrian Klaver wrote:

On 08/28/2018 10:40 PM, ss wrote:


CREATE OR REPLACE FUNCTION test(year VARCHAR)
RETURNS TABLE (agencycode INT, bureaucode INT, acctname VARCHAR, 
beacat VARCHAR, onoffbudget VARCHAR, val INT)

AS $$
BEGIN
RETURN QUERY SELECT t1.agencycode, t1.bureaucode, t1.acctcode, 
t2.beacat, t2.onoffbudget, t2.XX  FROM allnames AS t1
JOIN total AS t2 on t1.agencycode = t2.agencycode and t1.bureaucode = 
t2.bureaucode and t1.acctcode = t2.acctcode
WHERE t2.XXX != 0  ORDER BY t1.agencycode, t1.bureaucode, 
t1.acctcode;

END; $$
LANGUAGE 'plpgsql';

I want to replace XXX with the argument year.
I hard coded the XX with 'y2016' and I get back a table


So something along lines of:

DO $$
    DECLARE
    col_name varchar;
    out_str  varchar;
    BEGIN
    col_name = 'y'||'2018';
    out_str = format('SELECT %I FROM some_table where %I = 0', 
col_name, col_name );

    RAISE NOTICE '%', out_str;
    END;
$$ LANGUAGE plpgsql;


NOTICE:  SELECT y2018 FROM some_table where y2018 = 0



Would something like 't2.'||(year)|| work?

On 08/28/2018 10:37 PM, Tim Cross wrote:
Off the top of my head, I think you could do this using dynamic 
(execute) SQL in a function. However, it is going to be messy, 
possibly slow and likely fragile. You would need to query the 
catalogue to get the column names in the table and then build the 
SQL dynamically 'on the fly'.


Without having more detail, my spider sense tells me you have the 
wrong table/relationship design. While you may be able to get it to 
work, it is likely you will run into constant problems and 
additional complexity that could be avoided with a different design. 
You really want a design where your queries are driven by the data 
in your tables and not by the names of columns. I would seriously 
consider re-examining your schema design, look at how your design 
fits in with the normal forms and adapt as necessary.


Tim

On Wed, 29 Aug 2018 at 15:10, ss > wrote:


    I have a table with many years as columns. y1976, y2077, .. ,
    y2019,y2020 I want to dynamically return a column from a function.


    select * from FUNCTION('y2016') .

    select t1.cola t1.colb, t1.colc, t2.y2016 from . Where
    t2.y2016 != 0;

    or if I select year y2012 I want FUNCTION('y2012')

    select t1.cola t1.colb, t1.colc, t2.y2012 from . Where
    t2.y2012 != 0;


    to generalize

    select * from FUNCTION( year_column )

    select t1.cola t1.colb, t1.colc, t2.year_column from . Where
    t2.year_column != 0;

    is it possible? if so how?




--
regards,

Tim

--
Tim Cross











Re: locate DB corruption

2018-08-31 Thread Dave Peticolas
On Fri, Aug 31, 2018 at 5:19 PM Adrian Klaver 
wrote:

> On 08/31/2018 08:51 AM, Dave Peticolas wrote:
> > On Fri, Aug 31, 2018 at 8:14 AM Adrian Klaver wrote:
> >
> > On 08/31/2018 08:02 AM, Dave Peticolas wrote:
> >  > Hello, I'm running into the following error running a large query
> > on a
> >  > database restored from WAL replay:
> >  >
> >  > could not access status of transaction 330569126
> >  > DETAIL: Could not open file "pg_clog/0C68": No such file or
> directory
> >
> >
> > Postgres version?
> >
> >
> > Right! Sorry, that original email didn't have a lot of info. This is
> > 9.6.9 restoring a backup from 9.6.8.
> >
> > Where is the replay coming from?
> >
> >
> >  From a snapshot and WAL files stored in Amazon S3.
>
> Seems the process is not creating a consistent backup.
>

This time, yes. This setup has been working for almost two years with
probably hundreds of restores in that time. But nothing's perfect I guess :)


> How are they being generated?
>

The snapshots are sent to S3 via a tar process after calling the start
backup function. I am following the postgres docs here. The WAL files are
just copied to S3.
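
Roughly this pattern (the exclusive form from the 9.6 docs; the label is
arbitrary):

SELECT pg_start_backup('nightly_snapshot');
-- ... tar the data directory and upload it to S3; WAL is copied separately ...
SELECT pg_stop_backup();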


>
> > Are you sure you are not working across versions?
> >
> >
> > I am sure, they are all 9.6.
> >
> > If not do pg_clog/ and 0C68 actually exist?
> >
> >
> > pg_clog definitely exists, but 0C68 does not. I think I have
> > subsequently found the precise row in the specific table that seems to
> > be the problem. Specifically I can select * from TABLE where id = BADID
> > - 1 or id = BADID + 1 and the query returns. I get the error if I select
> > the row with the bad ID.
> >
> > Now what I'm not sure of is how to fix.
>
> One thing I can think of is to rebuild from a later version of your S3
> data and see if it has all the necessary files.
>

Yes, I think that's a good idea, I'm trying that.


> There is also pg_resetxlog:
>
> https://www.postgresql.org/docs/9.6/static/app-pgresetxlog.html
>
> I have not used it, so I can not offer much in the way of tips. Just
> from reading the docs I would suggest stopping the server and then
> creating a backup of $PG_DATA(if possible) before using pg_resetxlog.
>

Thanks, I didn't know about that. The primary DB seems OK so hopefully it
won't be needed.