This may also be of interest to you, but checking official docs as Adrian
recommended is best.
https://why-upgrade.depesz.com/show?from=9.4.2&to=9.6.12&keywords=
I have not personally used this, but the write-up seems solid for minimizing
downtime by shifting data gradually. Be sure you understand the limitations
of partitioning, particularly while you are still on 10.x and not yet on v11,
where updates will shift a row to a new partition if the partition key
changes.
On Tue, Jul 23, 2019 at 1:36 PM Adrian Klaver
wrote:
> On 7/23/19 12:20 PM, PegoraroF10 wrote:
> > We have in a single database 190 identical schemas. Now, when we create a
> > new one, with exactly same structure as the previous ones, it takes 20 or
> > 30 minutes to finish. Usual time to fini
As a note to the original poster, you might want to check out-
https://www.postgresql.org/docs/current/citext.html
Are you autovacuuming and analyzing aggressively enough? Is there bloat or
other concern for these system tables? I expect it may be a concern based
on what I remember about your environment having thousands of tables and
perhaps autovacuum not keeping up (re: Too slow to create new schema and
thei
If those conditions that are throwing off the stats are expected to be
minimally impactful/filtering few rows, then you can use the one
tried-and-true optimizer hint (aside from materialized CTEs, stylized
indexes, etc) - OFFSET 0 at the end of a sub-query.
SELECT * FROM ( [your existing sub-query] OFFSET 0 ) AS sub ...
-- I'm thinking the OFFSET 0 creates an optimization barrier that prevents
the planner from collapsing that sub-query into the top query, and enforces
ordering in the query?
That's my understanding. I think it is an optimizer hint by another name. I
used to put things in a CTE (which is always materialized prior to v12).
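For a concrete sketch of the OFFSET 0 pattern (untested; table and column
names here are made up, not from the original thread):
select *
from (
    select o.*
    from orders o
    where o.status = 'open'
      and o.flagged  -- the selective condition the planner misestimates
    offset 0         -- keeps the planner from flattening this sub-query
) sub
join order_lines l on l.order_id = sub.order_id;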
Why can I use an alias for group by and order by and not for having? I am
just wondering if there is some barrier to it being implemented, or if it
just hasn't been.
select
table_schema || '.' || table_name AS schema_qualified_table_name,
count( column_name ) as column_count
from
information_schema.columns
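As far as I know, HAVING only sees input columns and aggregate expressions,
not the SELECT-list aliases, so the usual workaround is to repeat the
expression (or wrap the whole thing in a sub-select). A sketch completing
the query above, with a made-up HAVING condition:
select
    table_schema || '.' || table_name as schema_qualified_table_name,
    count( column_name ) as column_count
from information_schema.columns
group by schema_qualified_table_name
having count( column_name ) > 50;  -- the alias column_count is not visible here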
What is default_statistics_target set to? Or is there a custom statistics
target set for this table/column? Perhaps the planner is making sub-optimal
choices because it only has a vague idea about the data and the histogram
is not very informative. Planning time will increase when the statistics
target is raised.
Might the issue be with pg_index or pg_attribute rather than pg_class? Is
the query still slow? Do you have the output of explain analyze for use on
https://explain.depesz.com/ or the output of EXPLAIN (ANALYZE, COSTS,
VERBOSE, BUFFERS, FORMAT JSON) for use on
http://tatiyants.com/pev/#/plans/new w
Can you manually execute vacuum analyze on all three involved tables and
again share the plan for the same query? If it is significantly improved,
it would seem like following the recommendation to tune autovacuum (and
analyze) to be more frequent would be prudent.
You haven't seemed to change fro
On Tue, Aug 27, 2019 at 9:45 PM Laurenz Albe
wrote:
> Holtgrewe, Manuel wrote:
> > Switching off fsync leads to a drastic time improvement but still
> > higher wall-clock time for four threads.
>
> Don't do that unless you are ready to start from scratch with a new
> "initdb" in the case of a crash.
You need the wildcard character at front and back.
select * from fish_counts where stream_tribs ilike '%Nehalem%';
>
> CREATE STATISTICS MT_ReservationDepartureUtc (ndistinct) on "AccountCode",
> "DepartureUtc" from "Reservation"."Reservation";
> CREATE STATISTICS MT_ReservationArrivalUtc (ndistinct) on "AccountCode",
> "ArrivalUtc" from "Reservation"."Reservation";
> CREATE STATISTICS MT_ReservationNoShowFeeId
I think I would expect this behavior with how you have defined the
constraints as function results rather than as direct comparisons on the
timestamp field.
Instead of this-
Check constraints:
> "y2018_mis_ora_check" CHECK (date_part('year'::text, mis_ora) =
> 2018::double precision)
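If I am reading the intent right, a constraint comparing the timestamp
column directly would look something like this (sketch; I'm assuming the
partition table is named y2018):
alter table y2018 drop constraint y2018_mis_ora_check;
alter table y2018 add constraint y2018_mis_ora_check
    check (mis_ora >= '2018-01-01' and mis_ora < '2019-01-01');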
>My postgres instance is based on docker image postgres:11 and runs on my
MacBook Pro i7 16GB.
How much RAM and such did you give to this VM?
>To my surprise, postgres 11 is extremely slow when creating a full text
index. I added a column of tsvector type and tried to create an index on
that column.
You can also look at citext type to avoid the casting.
customer_key integer DEFAULT
nextval('customer_key_serial') PRIMARY KEY ,
cust_no smallint NOT NULL UNIQUE ,
name varchar UNIQUE ,
Why do you have a surrogate primary key g
>I am trying to import some data from spreadsheets. Included in the data
>sets are US monetary values. These appear in the CSV file, like this: $1.00
>The column is defined like this: NUMERIC(5,2) NOT NULL.
1) remove all $ characters from csv before import
OR
2) import into text field (perhaps in a staging table) and strip the $
before casting to numeric
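A sketch of option 2, with made-up table, column, and file names (the \copy
step is run in psql):
-- staging table with the money column as plain text
create table import_staging (descr text, amount_text text);

-- in psql
\copy import_staging from 'data.csv' with (format csv, header true)

-- strip the $ and cast while moving into the real table
insert into target_table (descr, amount)
select descr, replace(amount_text, '$', '')::numeric(5,2)
from import_staging;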
Just a side note: the below can be written more simply-
CASE WHEN UserSkills.craftsmanship_id IS NULL THEN FALSE
ELSE TRUE END as has
If you want to be direct-
UserSkills.craftsmanship_id IS NOT NULL as has
>
> Partition key: LIST (date_part('year'::text, mis_ora))
>
As an aside, you may benefit from switching to range partitioning,
depending on how your queries are written. If you have conditions such as
"WHERE mis_ora BETWEEN CURRENT_DATE - 30 AND CURRENT_DATE" or similar, then
the fact that your partition key is an expression on mis_ora means the
planner cannot prune partitions for those conditions.
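A sketch of the range-partitioned alternative (hypothetical table and
column names), which lets the planner prune partitions for plain range
conditions on mis_ora:
create table readings (
    id bigint not null,
    mis_ora timestamptz not null,
    payload text
) partition by range (mis_ora);

create table readings_2018 partition of readings
    for values from ('2018-01-01') to ('2019-01-01');
create table readings_2019 partition of readings
    for values from ('2019-01-01') to ('2020-01-01');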
> - For tables that receive only INSERTs, schedule a regular VACUUM
> with "cron" or similar. Unfortunately, PostgreSQL isn't very smart
> about vacuuming insert-only tables.
>
What is the need to vacuum on an insert only table? Does that just maintain
the freespace map?
>
> If the data is held in common tables(bills, vouchers, etc)then the only
> thing I see happening is changing the PK values to an unused value. That
> could turn into a nightmare though. Not only that you lose the connection
> to the original data source. If the data can be broken out into separa
>
> db=# select name from table_name_ds_tmp where ARRAY[categoryid] = ANY (
> select string_to_array( '200,400', ',')::bigint[] );
>
Using either of the below instead, I get the proper result. Why doesn't ANY
work? I do not know.
select name from table_name_ds_tmp where ARRAY[categoryid] <@ ( select
string_to_array( '200,400', ',' )::bigint[] );
Much of indexing strategy depends on knowing the data like how many
distinct values and what the distribution is like. Is JsonBField->>'status'
always set? Are those three values mentioned in this query common or rare?
Can you re-write this query to avoid using an OR in the where clause? Are
you ju
Both of the below visualizers can help, but require some knowledge about
comparing estimated vs actual row estimates, disk sorts vs in memory,
etc. Drawing implications about whether your schema needs to change or just
the query will take time to master as well.
http://tatiyants.com/pev/#/plans/new
"I would like to convert a table with a primary key into a partitioned
setup by a column which is not part of the primary key"
That isn't possible. The partition key must be contained in the primary
key. That is, the primary key could be (site_id, id) and you can create a
hash partition on id or site_id.
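In other words, something like this is allowed (sketch, v11 or later),
because the partition key column is part of the primary key:
create table items (
    site_id int not null,
    id bigint not null,
    payload text,
    primary key (site_id, id)
) partition by hash (id);

create table items_p0 partition of items for values with (modulus 4, remainder 0);
create table items_p1 partition of items for values with (modulus 4, remainder 1);
create table items_p2 partition of items for values with (modulus 4, remainder 2);
create table items_p3 partition of items for values with (modulus 4, remainder 3);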
>
> My Json has always a status
>
Why declare the "where" clause when creating the index? It would not seem
needed if status is always set and so your index will reference all rows in
the table.
Thanks for sharing the trick of having the second column in the index
determine the key based on the f
>
> Hi
>>
>> By inserting data in a JSONB type column I got the following error
>> message:
>>
>> ERROR: string too long to represent as jsonb string
>>
>> DETAIL: Due to an implementation restriction, jsonb strings cannot
>> exceed 268435455 bytes.
>>
>> could anyo
No, what you want is not possible and probably won't ever be I would
expect. Scanning every partition to validate the primary key isn't scalable.
On Mon, Oct 7, 2019 at 5:56 PM Ron wrote:
> On 10/7/19 6:17 PM, Michael Lewis wrote:
> > No, what you want is not possible and probably won't ever be I would
> expect.
>
> Sure it is. Maybe not the (weird) way that Postgres does partitioning,
> but
> the legacy RD
On Tue, Oct 8, 2019 at 8:00 AM Shatamjeev Dewan wrote:
> Hi Michael,
>
>
>
> In this case , I always need to include partition key(date) in primary
> key ( if I have a primary key defined on non partition key column e.g id
> (in my case), to make it a composite primary key (id, date). This would
On Tue, Oct 15, 2019 at 8:25 AM Geoff Winkless wrote:
> On Tue, 15 Oct 2019 at 14:35, Ray O'Donnell wrote:
> >
> > On 15/10/2019 14:28, stan wrote:
> > > I used to be able to return a constant value in a SELECT statement in
> > > ORACLE. I need to populate a table for testing, and I was going to
It sounds like you want row level security-
https://www.postgresql.org/docs/9.6/ddl-rowsecurity.html
But you will need to define separate roles in the database and ensure that
the users are connecting with those separate roles. The DB can't magically
know about the permissions on the OS side.
>
"pg_rewind is a tool for synchronizing a PostgreSQL cluster with another
copy of the same cluster, after the clusters' timelines have diverged. A
typical scenario is to bring an old master server back online after
failover as a standby that follows the new master."
-https://www.postgresql.org/docs/
>
> We must know it to avoid disk out problems, and too much off-time.
>
You may be interested in this extension- https://github.com/reorg/pg_repack
Your plan to loop over tables and truncate them seems great if you are
worried. It seems simple to verify that space is being freed as you go, and
also easy to change tactics if the need arises.
>
>
> > CREATE TABLE books (
> > id SERIAL PRIMARY KEY,
> >
> > Which has the advantage of not having to manually create the sequences.
> > Will this also enforce that the "internally created sequence" will be
> > initialized to a value above the maximum key in use on a pg_restore?
>
>
> But it prints too many records: all moves cross-multiplied with each other.
>
> As if I have forgotten to add 1 more condition to the JOIN LATERAL
>
LIMIT 1 inside your lateral should resolve that. Personally, I'd move that
condition to an EXISTS condition inside the WHERE clause instead. Just a
style preference though.
On Thu, Oct 24, 2019 at 3:35 AM Laiszner Tamás
wrote:
> Actually, this is not such a unique idea:
> https://instagram-engineering.com/sharding-ids-at-instagram-1cf5a71e5a5c
>
> Thanks for the suggestion to split up the primary key into components. But
> even going down this way, packing the compo
On Fri, Oct 25, 2019 at 2:20 PM Alexander Farber
wrote:
> Thank you Michael -
>
> On Fri, Oct 25, 2019 at 7:28 PM Michael Lewis wrote:
>
>> But it prints too many records: all moves cross-multiplied with each
>>> other.
>>>
>>> As if I have forgotte
On Fri, Nov 1, 2019 at 9:22 AM Shatamjeev Dewan wrote:
> Hi Michael,
>
>
>
> I want to create a partition by year and subpartition by month in postgres
> 11 timestamp column. Please advise syntax.
>
https://www.postgresql.org/docs/11/ddl-partitioning.html
The documentation is rather clear, with examples.
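Roughly, in v11 the yearly range partitions are themselves partitioned by
month (table and column names here are placeholders):
create table audit_p (
    id bigint not null,
    created_at timestamp not null
) partition by range (created_at);

-- yearly partition that is itself partitioned by month
create table audit_p_2020 partition of audit_p
    for values from ('2020-01-01') to ('2021-01-01')
    partition by range (created_at);

create table audit_p_2020_01 partition of audit_p_2020
    for values from ('2020-01-01') to ('2020-02-01');
create table audit_p_2020_02 partition of audit_p_2020
    for values from ('2020-02-01') to ('2020-03-01');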
My thinking is opposite from what you have. I consider it important to be
very aggressive with autovacuum because it only ever does the required
amount of work. If a tiny amount of work is needed, it does only that and is
done. Assuming it doesn't cause I/O concerns, run it as often as possible to
minimize the work needed each time.
You certainly could choose to store as tstzrange, but why not use two
fields?
https://www.postgresql.org/docs/current/rangetypes.html
> CREATE TYPE po_dates AS (
> po_isssued_date timestamptz,
> discount_last_date timestamptz,
> net_date timestamptz
> );
>
What advantage does combining these three values into a custom composite
type give you rather than just storing directly?
On Mon, Nov 18, 2019 at 10:10 AM Shatamjeev Dewan wrote:
> I am trying to create a foreign key constraint on a table : audit_param in
> postgres 12 which references partitioned table audit_p. is there anyway to
> get rid of this error.
>
>
>
ERROR: there is no unique constraint matching given keys for referenced table
I try to avoid DISTINCT and use GROUP BY when feasible, as well as avoiding
OR conditions. If you combined anon1 and anon2 with UNION ALL, and did an
(inner) join instead of left, or even moved all of that to EXISTS, perhaps
that gives you more consistent performance. Something like this-
SELECT co
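A rough reconstruction of that shape (hypothetical table/column names;
anon1 and anon2 stand in for your two sub-selects):
select c.*
from contacts c
where exists (
    select 1
    from (
        select contact_id from anon1
        union all
        select contact_id from anon2
    ) a
    where a.contact_id = c.contact_id
);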
On Fri, Dec 6, 2019, 3:21 AM Mladen Marinović
wrote:
> Is there a way to detect why the planning is taking this long?
>
> The database is a 9.6.1 with 32GB of shared_buffers, and 1GB of
> maintenance_work_mem, and machine CPU is below 80% all the time.
>
What is default_statistics_target set to f
I'd suggest re-writing your query to avoid ORs whenever possible. Is this
generated by an ORM or subject to change with filters selected in the
application, or can you totally control it on the DB side?
It may be hugely more performant to simply rewrite this as (almost) the
same query twice, UNION ALL'd together.
On Wed, Dec 11, 2019 at 1:54 PM Rich Shepard
wrote:
> A sampling location table has 28 distinct sites, each site being sampled
> from 1 to 67 times. I'm trying to obtain the number of sites having 1
> sample, 2 samples, ... 67 samples and am not seeing the solution despite
> several alternative q
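In case it helps, one common shape for this kind of counts-of-counts
question is an aggregate over an aggregate (a sketch only; I'm guessing at
the table and column names):
select sample_count, count(*) as number_of_sites
from (
    select site_id, count(*) as sample_count
    from samples
    group by site_id
) per_site
group by sample_count
order by sample_count;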
It looks like you are creating a partition for each minute of the day (when
logs get inserted for a given minute at least). Would it be at all
reasonable to have an hourly or daily job which creates the partitions
ahead of when they are actually needed? If partitions went unused in the
recent past,
I don't recall the details, but I know v12 included significant
enhancements that lock relations later in the process, such that when
targeting relatively few of the partitions it can be a major performance
boost.
>
> LEFT join mfg_vendor_relationship on
> mfg_vendor_relationship.mfg_key = mfg_part.mfg_key
> AND
> mfg_vendor_relationship.project_key = bom_item.project_key
> LEFT join vendor on
> mfg_vendor_relationship.vendor_key = vendor.vendor_key
>
Perhaps I am missing som
I would not think that behavior would extend to lock_timeout based on the
explanation on stackexchange. I would assume that the potentially long
runtime in this function is mostly in acquiring the lock and not doing the
update given the implied primary key in the where clause, so perhaps
lock_timeo
>
> I’m thinking it might be worth it to do a “quick” test on 1,000 or so
> records (or whatever number can run in a minute or so), watching the
> processor utilization as it runs. That should give me a better feel for
> where the bottlenecks may be, and how long the entire update process would
> t
On Mon, Jan 6, 2020 at 2:34 PM Mark Zellers
wrote:
> Just out of curiosity, what kind of trigger are you using, a row level
> trigger or a statement level trigger? If you are using a row level
> trigger, see if you can achieve your requirements using a statement level
> trigger instead. I’m rel
On Wed, Jan 8, 2020 at 7:20 AM github kran wrote:
> Sorry for the confusion, I tried to install this extension on 9.6 and it
> seems to be not working. Does 9.6 PostgreSQL support logical replication?
>
No. See the top of this page with supported versions listed. Prior to v10,
pg_logical exte
On Wed, Jan 8, 2020 at 8:52 PM github kran wrote:
> You are right on RDS but I believe the problem is on Aurora PostgreSQL
> where the pglogical throws an error during installation. Are you aware if
> this works on Aurora PostGreSQL
>
It seems like this question should be sent to AWS support fo
>
> when I manually mocking the data into both tables are fine and when I run
> the procedure, I get errorcode: 42P10 MSG: there is no unique or exclusion
> constraint matching on the CONFLICT specification
>
> the procedure is
>
...
> INSERT INTO ecisdrdm.bnft_curr_fact AS prod (bnft_fact_id,
>
"handle this aggregated data later in code"
What is your end goal though? Also, approx how many rows in these tables?
Can you share an example query and plan? What version are you using?
>
On Wed, Jan 15, 2020 at 9:41 AM Werner Kuhnle wrote:
> I've tried to using the newer definition:
> id int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
> but that does not solve the problem.
>
> Is there a way to define id columns to that when the database provides
> values,
> it recognizes alread
In terms of preventing this happening again, you might consider setting
some reasonable temp_file_limit as a default. Alternatively or additionally,
you can set up another volume and direct temp files to use that, to avoid
the server crashing if excessive temp files are created all at once.
Then o
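For example, in postgresql.conf (the values and the tablespace name are
placeholders, not recommendations):
temp_file_limit = '10GB'            # a query exceeding this much temp file space is cancelled with an error
temp_tablespaces = 'scratch_space'  # a tablespace created beforehand on the separate volume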
Other than creating a functional index on the keys that you most want to
access and include that function call in your query, I don't know that you
can. Functional indexes get statistics while partial indexes do not. If a
GIN index using path operator collected stats on the frequency of the keys,
t
If the new value does not overlap with any existing, allow. If it does
overlap, then it must be fully contained by the existing element, or the
existing element must be fully contained by it. Else, reject. Is that right?
On Fri, Feb 7, 2020 at 6:29 AM Justin wrote:
> work_mem is the biggest consumer of resources; let's say it's set to 5 megs
> per connection, at 1000 connections that's 5,000 megs that can be allocated.
>
Clarification- work_mem is used per operation (sort, hash, etc) and could be
allocated many times within a single query.
It may also be worth noting that it is possible to make autovacuum/analyze
more aggressive, perhaps only on the tables that see large changes in data
that might result in a statistics issue. If you could share a query,
explain analyze output, and pseudo code or at least description of what
sort of
There is more than one type of statistics though. Stats on the distribution
of data is easily recreated with analyze table_name or analyzing the whole
database. What about the stats on how many rows have been inserted or
updated since the last (auto)vacuum and that will be used to trigger
autovacuu
Given the explicit begin before drop/create, this is a bit of an unexpected
gotcha to have any impact on other transactions. Are there other known
limitations of DDL in a transaction?
I am very interested in this discussion. We settled on a table with a single
timestamp field that a script updates every minute with NOW(). We can then
check the timestamp of that table on the replica and, assuming the clocks
are synced, compute the lag.
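The moving parts are small; roughly (our approach, with a made-up table
name):
-- once, on the primary
create table replication_heartbeat (ts timestamptz not null);
insert into replication_heartbeat values (now());

-- every minute, from cron or similar, on the primary
update replication_heartbeat set ts = now();

-- on the replica, assuming clocks are in sync
select now() - ts as approximate_lag from replication_heartbeat;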
>
> We have seen some deadlocks and tempfile count in pg_stat_database view.
> We are trying to reset the stats.
> Can we use the pg_stat_reset() function to reset these stats without any
> impact on the stats of the databases?
> Please advise the process to reset the stats.
>
Yes, you could reset stats every mo
work_mem can be used many times per connection given it is per sort, hash,
or other operations and as mentioned that can be multiplied if the query is
handled with parallel workers. I am guessing the server has 16GB memory
total given shared_buffers and effective_cache_size, and a more reasonable
w
Have you tried with JIT turned off in PG12? The long-running node
is a ProjectSet type, which is related to set-returning functions. If that is
getting evaluated differently in PG12, that may be the issue.
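To rule JIT in or out quickly, it can be toggled per session; for example:
set jit = off;
-- then re-run EXPLAIN (ANALYZE, BUFFERS) on the slow query and compare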
I don't know the answer to your stated question. I am curious if you have
set wal_level = minimal and if not, if that would be appropriate for your
use case and might render your concern a non-issue.
On Fri, Sep 20, 2019 at 8:19 AM Tom Lane wrote:
> There is a restriction on how many distinct GRANTs you can
> issue against any one object --- performance will get bad if the ACL
> list gets too large.
>
Any ballpark numbers here? Are we talking 50 or 8000?
Do you have default fillfactor set on this table? If not, I would wonder if
reducing it to 50% or even 20% would allow many more HOT updates that would
reduce bloat.
Also, is there any period of lower activity on your system that you could
schedule a vacuum freeze for daily or weekly? I believe ha
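If you do experiment with fillfactor, note it only applies to newly written
pages; a sketch (table name is a placeholder):
alter table hot_table set (fillfactor = 50);
-- existing pages keep their old layout until they are rewritten,
-- e.g. by VACUUM FULL, CLUSTER, or pg_repack during a quiet window
vacuum full hot_table;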
work_type.work_type_key = work_type.work_type_key
You've done a cross join.
On Wed, Mar 18, 2020, 12:55 PM Michael Lewis wrote:
> work_type.work_type_key = work_type.work_type_key
>
> You've done a cross join.
>
You meant to do permitted_work.work_type_key = work_type.work_type_key I
expect
>
On Thu, Mar 19, 2020 at 9:31 AM Justin King wrote:
> On Wed, Mar 18, 2020 at 1:40 PM Michael Lewis wrote:
> >
> > Do you have default fillfactor set on this table? If not, I would wonder
> if reducing it to 50% or even 20% would allow many more HOT updates that
> would redu
On Thu, Mar 19, 2020, 5:48 PM David G. Johnston
wrote:
> However, one other consideration with sequences: do you care that
> PostgreSQL will cache/pin (i.e., no release) every single sequence you
> touch for the lifetime of the session? (I do not think DISCARD matters here
> but I'm just guessing
>
> We haven't isolated *which* table it is blocked on (assuming it is),
> but all autovac's cease running until we manually intervene.
>
> When we get into this state again, is there some other information
> (other than what is in pg_stat_statement or pg_stat_activity) that
> would be useful for f
>
> select * from test where id between client_id-10 and client_id+10 and
> client_id=?;
>
> does not (it scans all partitions in parallel) .
> Is it expected?
>
Yes. But the below would work fine I expect since the planner would know a
constant range for id. I would be very surprised if the opti
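Something along these lines is what I had in mind (sketch; assumes the
table is partitioned on id, with 123 standing in for the client_id value):
select *
from test
where client_id = 123
  and id between 123 - 10 and 123 + 10;  -- constant bounds, so the planner can prune partitions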
Chris,
Does it actually need to be a different server and database, or would it be
possible to have another storage device added to your existing database and
make use of tablespaces to accomplish pseudo-archive of older partitions?
Just a thought.
>
If you didn't turn it off, you have parallel workers on by default with
v12. If work_mem is set high, memory use may be much higher as each node in
a complex plan could end up executing in parallel.
Also, do you use a connection pooler such as pgbouncer or pgpool? What is
max_connections set to?
There is nothing native to compute this. This was asked a month or so ago.
The best the poster came up with was a regular query on master DB which
updates a timestamptz field. Assuming master and slave have clocks in sync,
it is simple to compute the lag from that.
>
Yes, the system will do a full table rewrite to compute the value and store
it. Unfortunately, I believe it is an access exclusive lock during that
entire time.
You don't want recursion, you want pivot table (Excel) behavior to reformat
rows into columns. The easiest way to get this data in its raw form would
be to group by date and hour of day and compute the count.
If you have the option to add extensions in your environment, then you
should be able to
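The raw form might look roughly like this (hypothetical table and column
names); an extension such as tablefunc's crosstab could then pivot the
hours into columns:
select date_trunc('day', created_at) as day,
       extract(hour from created_at) as hour_of_day,
       count(*) as total
from events
group by 1, 2
order by 1, 2;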
>
> My other thought was to range partition by pixelID + brin index.
>>>
I would expect brin index to be INSTEAD of partitioning. You didn't share
buffer hits, which I expect were 100% on the subsequent explain analyze
runs, but the index scan may still be faster if the planner knows it only
n
You say 12.2 is in testing but what are you using now? Have you tuned
configs much? Would you be able to implement partitioning such that your
deletes become truncates or simply a detaching of the old partition?
Generally if you are doing a vacuum full, you perhaps need to tune
autovacuum to be mor
Reviewing pg_stat_user_tables will give you an idea of how often autovacuum
is cleaning up those tables that "need" that vacuum full on a quarterly
basis. You can tune individual tables to have a lower threshold ratio of
dead tuples so the system isn't waiting until you have 20% dead rows before
vacuuming.
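Per-table tuning is just a storage parameter; for example (values are
illustrative only):
alter table big_busy_table set (
    autovacuum_vacuum_scale_factor = 0.01,   -- vacuum after ~1% dead rows instead of the default 20%
    autovacuum_analyze_scale_factor = 0.005
);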
What do the statistics look like for an example table where the index is
used vs one where it is not? Is ((instance_id)::text = 'test01'::text) rare
for the tables where an index scan is happening and common for the tables
where a sequential scan is chosen? How many rows are in these tables
generally?
If you use a connection pooler, this would likely be expected behavior
since the connection is getting reused many times. Else, some app is
connected and not closing their connection between queries. At least they
aren't idle in transaction though.
Indexes larger than the table may be expected if there are many. It may be
prudent to check if they are being used in pg_stat_all_indexes.
If there are just a few indexes that are becoming bloated quickly, you'd
want to ensure your autovacuum settings are tuned more aggressively, and
consider lowe
https://www.postgresql.org/docs/release/11.7/
It doesn't seem like it. Always best to run the most current minor version
though.
https://www.postgresql.org/docs/11/runtime-config-replication.html
>
On Thu, May 7, 2020 at 8:50 AM Ashish Chugh <
ashish.ch...@lavainternational.in> wrote
> To improve performance and release index space from database, We are
> running FULL Vacuum on monthly basis.
>
> On PostgreSQL website it is not recommended to run FULL Vacuum on
> Production Database and this
It is trying to do a vacuum freeze. Do you have autovacuum turned off? Any
settings changed from default related to autovacuum?
https://www.postgresql.org/docs/9.6/routine-vacuuming.html
Read 24.1.5. Preventing Transaction ID Wraparound Failures
These may also be of help-
https://info.crunchydata
autovacuum_naptime being only 5 seconds seems too frequent. A lock_timeout
might be 1-5 seconds depending on your system. Usually, DDL can fail and
wait a little time rather than lock the table for minutes and have all
reads back up behind the DDL.
Given you have autovacuum_vacuum_cost_limit set t
The documentation shows it is just a modulus operation. If you partition on
object_key % 3 then you will create three partitions for remainder values
0-2 for instance.
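For example (sketch), three hash partitions on object_key:
create table objects (
    object_key bigint not null
) partition by hash (object_key);

create table objects_p0 partition of objects for values with (modulus 3, remainder 0);
create table objects_p1 partition of objects for values with (modulus 3, remainder 1);
create table objects_p2 partition of objects for values with (modulus 3, remainder 2);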
Afaik, hash partition doesn't have real world expected use cases just yet.
List or range is probably what you want to use.
On Mon, May 11, 2020 at 3:13 PM Alvaro Herrera
wrote:
> On 2020-May-11, Michael Lewis wrote:
>
> > Afaik, hash partition doesn't have real world expected use cases just
> yet.
>
> I don't think I agree with this assertion.
>
I didn't mean to be critical a
On Thu, May 14, 2020 at 2:20 PM Eduard Rozenberg
wrote:
> I did verify postgresql.conf has always been properly configured re:
> autovacuum: 'autovacuum = on'and 'track_counts = on'
>
This may be insufficient to keep up if you have large tables. The default
scale factor allows for 20% of the rows to become dead before autovacuum
processes the table.
Just wonder, have you compared these on the two servers?
select * from pg_settings where name = 'DateStyle';