Case Insensitive Comparison with Postgres 12
I am trying to test a simple case insensitive comparison. Most likely the collation that I chose is wrong, but I'm not sure how to choose the correct one (for English/US?). Here is my snippet:

create collation case_insensitive(
    provider=icu, locale='en-US-x-icu', deterministic=false
);
select 'Abc' = 'abc' collate case_insensitive;

I expected true but am getting false.

Any thoughts?

Thanks,

Igal
Re: Case Insensitive Comparison with Postgres 12
On Fri, Oct 11, 2019 at 1:09 AM stan wrote:
> On Thu, Oct 10, 2019 at 05:41:47AM -0700, Igal @ Lucee.org wrote:
> > On 10/9/2019 12:34 AM, Laurenz Albe wrote:
> > > Igal Sapir wrote:
> > > > I am trying to test a simple case insensitive comparison. Most likely the
> > > > collation that I chose is wrong, but I'm not sure how to choose the correct
> > > > one (for English/US?). Here is my snippet:
> > > >
> > > > create collation case_insensitive(
> > > >     provider=icu, locale='en-US-x-icu', deterministic=false
> > > > );
> > > > select 'Abc' = 'abc' collate case_insensitive;
> > > >
> > > > I expected true but am getting false.
> > > >
> > > > Any thoughts?
> > >
> > > Yes, the LOCALE is wrong. Use
> > >
> > > create collation case_insensitive (
> > >     provider=icu, locale='en-US-u-ks-level2', deterministic=false
> > > );
> > >
> > > The name of the locale defines it.
> > >
> > > My blog post can give a simple introduction:
> > > https://www.cybertec-postgresql.com/en/icu-collations-against-glibc-2-28-data-corruption/
> >
> > Thank you all for replying. I tried to use the locale suggested by both
> > Laurenz and Pavel, en-US-u-ks-level2, but I'm still getting false for a
> > simple comparison of 'Abc' = 'abc'. I tried the locale both as a 'string'
> > and as an "identifier":
> >
> > > select version();
> >
> > version                                                                                                 |
> > ---------------------------------------------------------------------------------------------------------|
> > PostgreSQL 12.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit |
> >
> > > drop collation if exists case_insensitive;
> >
> > > create collation case_insensitive (
> >       provider=icu, locale="en-US-u-ks-level2", deterministic=false
> >   );
> >
> > > select 'Abc' = 'abc' collate case_insensitive as is_equal;
> >
> > is_equal|
> > --------|
> > false   |
> >
> > What am I doing wrong here?
>
> Out of curiosity is there a reason not to use the citext type for this?
>
Using the collation seems like a much cleaner approach, and I trust ICU to do a better job at comparing strings according to language rules etc.

Igal
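For reference, the citext alternative that stan asks about looks roughly like this — a minimal sketch only, assuming a throwaway table named t (citext ships as a contrib extension and compares case-insensitively by default):

CREATE EXTENSION IF NOT EXISTS citext;
CREATE TEMP TABLE t (val citext);
INSERT INTO t VALUES ('Abc');
SELECT val = 'abc' AS is_equal FROM t;  -- returns true

It sidesteps collations entirely, at the cost of changing the column type rather than the comparison rule.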
Unable to Vacuum Large Defragmented Table
I have a table for which pg_relation_size() shows only 31MB, but pg_total_relation_size() shows a whopping 84GB.

The database engine is running inside a Docker container, with the data mounted as a volume from a partition on the host's file system.

When I try to run `VACUUM FULL`, the disk usage goes up until it reaches the full capacity of the partition (about 27GB of free space), at which point it fails.

How can I reclaim the disk space here, other than writing the data to a new table and dropping the old one?

Thank you,

Igal
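A quick way to see where that 84GB actually lives is to break the total down into heap, index, and TOAST portions — a minimal sketch, assuming the table is named my_table:

SELECT pg_size_pretty(pg_relation_size('my_table'))       AS heap_size,
       pg_size_pretty(pg_indexes_size('my_table'))        AS index_size,
       -- remainder is mostly the TOAST table (plus FSM/visibility map)
       pg_size_pretty(pg_total_relation_size('my_table')
                      - pg_relation_size('my_table')
                      - pg_indexes_size('my_table'))      AS toast_size,
       pg_size_pretty(pg_total_relation_size('my_table')) AS total_size;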
Re: Unable to Vacuum Large Defragmented Table
David,

On Sun, Apr 7, 2019 at 6:20 PM David Rowley wrote:
> On Mon, 8 Apr 2019 at 10:09, Igal Sapir wrote:
> >
> > I have a table for which pg_relation_size() shows only 31MB, but
> > pg_total_relation_size() shows a whopping 84GB.
> >
> > The database engine is running inside a Docker container, with the data
> > mounted as a volume from a partition on the host's file system.
> >
> > When I try to run `VACUUM FULL`, the disk usage goes up until it reaches
> > the full capacity of the partition (about 27GB of free space), at which
> > point it fails.
>
> That sort of indicates that the table might not be as bloated as you
> seem to think it is. Remember that variable length attributes can be
> toasted and stored in the relation's toast table.

I think that you're on to something here. The table has a JSONB column which is likely being toasted.

I have deleted many rows from the table itself though, and still fail to reclaim disk space. Is there something else I should do to delete the toasted data?

Thanks,

Igal
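To confirm where the space is, the table's TOAST relation and its size can be looked up directly — a minimal sketch, again assuming the table is named my_table:

SELECT c.relname                                     AS table_name,
       t.relname                                     AS toast_table,
       pg_size_pretty(pg_total_relation_size(t.oid)) AS toast_size
FROM pg_class c
JOIN pg_class t ON t.oid = c.reltoastrelid
WHERE c.relname = 'my_table';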
Re: Unable to Vacuum Large Defragmented Table
David, On Sun, Apr 7, 2019 at 7:28 PM David Rowley wrote: > On Mon, 8 Apr 2019 at 14:19, Igal Sapir wrote: > > > > On Sun, Apr 7, 2019 at 6:20 PM David Rowley < > david.row...@2ndquadrant.com> wrote: > >> > >> On Mon, 8 Apr 2019 at 10:09, Igal Sapir wrote: > >> > > >> > I have a table for which pg_relation_size() shows only 31MB, but > pg_total_relation_size() shows a whopping 84GB. > >> > > >> > The database engine is running inside a Docker container, with the > data mounted as a volume from a partition on the host's file system. > >> > > >> > When I try to run `VACUUM FULL`, the disk usage goes up until it > reaches the full capacity of the partition (about 27GB of free space), at > which point it fails. > >> > >> That sort of indicates that the table might not be as bloated as you > >> seem to think it is. Remember that variable length attributes can be > >> toasted and stored in the relation's toast table. > > > > > > I think that you're on to something here. The table has a JSONB column > which has possibly toasted. > > > > I have deleted many rows from the table itself though, and still fail to > reclaim disk space. Is there something else I should do to delete the > toasted data? > > The toast data is part of the data. It's just stored out of line since > there's a hard limit of just under 8k per tuple and since tuples > cannot span multiple pages, PostgreSQL internally breaks them into > chunks, possibly compresses them and stores them in the toast table. > This can occur for any variable length type. > > This means if you want to remove the toast data, then you'll need to > remove the data from the main table, either the form of deleting rows > or updating them to remove the toasted values. > However, I have now deleted about 50,000 rows more and the table has only 119,688 rows. The pg_relation_size() still shows 31MB and pg_total_relation_size() still shows 84GB. It doesn't make sense that after deleting about 30% of the rows the values here do not change. Attempting to copy the data to a different table results in the out of disk error as well, so that is in line with your assessment. But it actually just shows the problem. The new table to which the data was copied (though failed due to out of disk) shows 0 rows, but pg_total_relation_size() for that table shows 27GB. So now I have an "empty" table that takes 27GB of disk space. This is mostly transient data, so I don't mind deleting rows, but if some day this could happen in production then I have to know how to deal with it without losing all of the data. Thanks, Igal
Re: Unable to Vacuum Large Defragmented Table
David,

On Sun, Apr 7, 2019 at 8:11 PM David Rowley wrote:
> On Mon, 8 Apr 2019 at 14:57, Igal Sapir wrote:
> > However, I have now deleted about 50,000 rows more and the table has
> > only 119,688 rows. The pg_relation_size() still shows 31MB and
> > pg_total_relation_size() still shows 84GB.
> >
> > It doesn't make sense that after deleting about 30% of the rows the
> > values here do not change.
>
> deleting rows does not make the table any smaller, it just creates
> dead rows in the table. VACUUM tries to release the space used by
> those dead rows and turns it back into free space. Normal vacuum (not
> FULL) can only shrink the table if completely empty pages are found at
> the end of the table.

ACK

> > Attempting to copy the data to a different table results in the out of
> > disk error as well, so that is in line with your assessment. But it
> > actually just shows the problem. The new table to which the data was
> > copied (though failed due to out of disk) shows 0 rows, but
> > pg_total_relation_size() for that table shows 27GB. So now I have an
> > "empty" table that takes 27GB of disk space.
>
> I think the best thing to do is perform a normal VACUUM on the table

Running VACUUM on the newly created table cleared the 27GB so that's good (I was planning to run normal VACUUM but ran FULL).

> then CREATE EXTENSION IF NOT EXISTS pgstattuple; then do; SELECT *
> FROM pgstattuple('); and the same again on the toast table.
> If your table still contains many dead rows then perhaps an open
> transaction is stopping rows from being turned into free space.

I am not sure how to read the below. I see a lot of "free_space" but not "dead":

-[ RECORD 1 ]------+------------
?column?           | primary
table_len          | 32055296
tuple_count        | 120764
tuple_len          | 9470882
tuple_percent      | 29.55
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 20713580
free_percent       | 64.62
-[ RECORD 2 ]------+------------
?column?           | toast
table_len          | 88802156544
tuple_count        | 15126830
tuple_len          | 30658625743
tuple_percent      | 34.52
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 57653329312
free_percent       | 64.92

> Once pgstattuples reports that "tuple_len" from the table, its toast table
> and all its indexes has been reduced to an acceptable value then you
> should try a VACUUM FULL. Remember that VACUUM FULL must also write
> WAL, so if WAL is on the same volume, then you'll need to consider
> space required for that when deciding how much data to remove from the
> table.

WAL is on the same volume. The PGDATA directory is mounted in a Docker container.

Isn't there any way to do an in-place VACUUM, or pause the WAL at the risk of losing some data if recovery is required?

There is a catch-22 here. I can't reclaim the disk space because that requires disk space. Surely I'm not the first one to have encountered that problem with Postgres.

> > This is mostly transient data, so I don't mind deleting rows, but if
> > some day this could happen in production then I have to know how to deal
> > with it without losing all of the data.
>
> For the future, it would be better to delete more often than waiting
> until the table grows too large. A normal VACUUM will turn space used
> by dead tuples back into free space, so if done often enough there
> won't be a need to vacuum full.

ACK. This issue came up while implementing a retention policy that will be enforced regularly.

Thank you for all of your help,

Igal
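The pgstattuple check being suggested looks roughly like this — a sketch only, assuming the table is named my_table (the TOAST relation is found via pg_class.reltoastrelid):

CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- the main table
SELECT 'primary' AS which, * FROM pgstattuple('my_table');

-- its TOAST table
SELECT 'toast' AS which, *
FROM pgstattuple((SELECT reltoastrelid
                  FROM pg_class
                  WHERE relname = 'my_table')::regclass);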
Re: Unable to Vacuum Large Defragmented Table
Pavel, On Sun, Apr 7, 2019 at 11:22 PM Pavel Stehule wrote: > > po 8. 4. 2019 v 7:57 odesílatel Igal Sapir napsal: > >> David, >> >> On Sun, Apr 7, 2019 at 8:11 PM David Rowley >> wrote: >> >>> On Mon, 8 Apr 2019 at 14:57, Igal Sapir wrote: >>> > However, I have now deleted about 50,000 rows more and the table has >>> only 119,688 rows. The pg_relation_size() still shows 31MB and >>> pg_total_relation_size() still shows 84GB. >>> > >>> > It doesn't make sense that after deleting about 30% of the rows the >>> values here do not change. >>> >>> deleting rows does not make the table any smaller, it just creates >>> dead rows in the table. VACUUM tries to release the space used by >>> those dead rows and turns it back into free space. Normal vacuum (not >>> FULL) can only shrink the table if completely empty pages are found at >>> the end of the table. >>> >> >> ACK >> >> >>> >>> > Attempting to copy the data to a different table results in the out of >>> disk error as well, so that is in line with your assessment. But it >>> actually just shows the problem. The new table to which the data was >>> copied (though failed due to out of disk) shows 0 rows, but >>> pg_total_relation_size() for that table shows 27GB. So now I have an >>> "empty" table that takes 27GB of disk space. >>> >>> I think the best thing to do is perform a normal VACUUM on the table >>> >> >> Running VACUUM on the newly created table cleared the 27GB so that's good >> (I was planning to run normal VACUUM but ran FULL). >> > > you can drop some indexes, then you can run vacuum full, and create > dropped indexes again. > The table does not have any indexes. It is mostly an append-only table. > > > >> >>> then CREATE EXTENSION IF NOT EXISTS pgstattuple; then do; SELECT * >>> FROM pgstattuple('); and the same again on the toast table. >>> If your table still contains many dead rows then perhaps an open >>> transaction is stopping rows from being turned into free space. >> >> >> I am not sure how to read the below. I see a lot of "free_space" but not >> "dead": >> >> -[ RECORD 1 ]--+ >> ?column? | primary >> table_len | 32055296 >> tuple_count| 120764 >> tuple_len | 9470882 >> tuple_percent | 29.55 >> dead_tuple_count | 0 >> dead_tuple_len | 0 >> dead_tuple_percent | 0 >> free_space | 20713580 >> free_percent | 64.62 >> -[ RECORD 2 ]--+ >> ?column? | toast >> table_len | 88802156544 >> tuple_count| 15126830 >> tuple_len | 30658625743 >> tuple_percent | 34.52 >> dead_tuple_count | 0 >> dead_tuple_len | 0 >> dead_tuple_percent | 0 >> free_space | 57653329312 >> free_percent | 64.92 >> > > > it say, so your table can be reduced about 60% > That's what I thought, and releasing 65% of 84GB would be major here, but unfortunately I am unable to release it because VACUUM FULL requires more space than I currently have available. Perhaps disabling the WAL, if possible, could help VACUUM FULL complete. Or some way to do an in-place VACUUM so that it doesn't write all the data to a new table. Thank you, Igal > > >> >> >>> Once pgstattuples reports that "tuple_len" from the table, its toast >>> table >>> and all its indexes has been reduced to an acceptable value then you >>> should try a VACUUM FULL. Remember that VACUUM FULL must also write >>> WAL, so if WAL is on the same volume, then you'll need to consider >>> space required for that when deciding how much data to remove from the >>> table. >>> >> >> WAL is on the same volume. The PGDATA directory is mounted in a Docker >> container. 
>> >> Isn't there any way to do an in-place VACUUM or pause the WAL at the risk >> of losing some data if recovery is required? >> >> There is a catch-22 here. I can't reclaim the disk space because that >> requires disk space. Surely I'm not the first one to have encountered that >> problem with Postgres. >> >> >>> >>> > This is mostly transient data, so I don't mind deleting rows, but if >>> some day this could happen in production then I have to know how to deal >>> with it without losing all of the data. >>> >>> For the future, it would be better to delete more often than waiting >>> until the table grows too large. A normal VACUUM will turn space used >>> by dead tuples back into free space, so if done often enough there >>> won't be a need to vacuum full. >>> >> >> ACK. This issue came up while implementing a retention policy that will >> be enforced regularly. >> >> Thank you for all of your help, >> >> Igal >> >> >
Re: Unable to Vacuum Large Defragmented Table
Pavel, On Mon, Apr 8, 2019 at 8:29 AM Pavel Stehule wrote: > > > po 8. 4. 2019 v 17:22 odesílatel Igal Sapir napsal: > >> Pavel, >> >> On Sun, Apr 7, 2019 at 11:22 PM Pavel Stehule >> wrote: >> >>> >>> po 8. 4. 2019 v 7:57 odesílatel Igal Sapir napsal: >>> >>>> David, >>>> >>>> On Sun, Apr 7, 2019 at 8:11 PM David Rowley < >>>> david.row...@2ndquadrant.com> wrote: >>>> >>>>> On Mon, 8 Apr 2019 at 14:57, Igal Sapir wrote: >>>>> > However, I have now deleted about 50,000 rows more and the table has >>>>> only 119,688 rows. The pg_relation_size() still shows 31MB and >>>>> pg_total_relation_size() still shows 84GB. >>>>> > >>>>> > It doesn't make sense that after deleting about 30% of the rows the >>>>> values here do not change. >>>>> >>>>> deleting rows does not make the table any smaller, it just creates >>>>> dead rows in the table. VACUUM tries to release the space used by >>>>> those dead rows and turns it back into free space. Normal vacuum (not >>>>> FULL) can only shrink the table if completely empty pages are found at >>>>> the end of the table. >>>>> >>>> >>>> ACK >>>> >>>> >>>>> >>>>> > Attempting to copy the data to a different table results in the out >>>>> of disk error as well, so that is in line with your assessment. But it >>>>> actually just shows the problem. The new table to which the data was >>>>> copied (though failed due to out of disk) shows 0 rows, but >>>>> pg_total_relation_size() for that table shows 27GB. So now I have an >>>>> "empty" table that takes 27GB of disk space. >>>>> >>>>> I think the best thing to do is perform a normal VACUUM on the table >>>>> >>>> >>>> Running VACUUM on the newly created table cleared the 27GB so that's >>>> good (I was planning to run normal VACUUM but ran FULL). >>>> >>> >>> you can drop some indexes, then you can run vacuum full, and create >>> dropped indexes again. >>> >> >> The table does not have any indexes. It is mostly an append-only table. >> >> >>> >>> >>> >>>> >>>>> then CREATE EXTENSION IF NOT EXISTS pgstattuple; then do; SELECT * >>>>> FROM pgstattuple('); and the same again on the toast table. >>>>> If your table still contains many dead rows then perhaps an open >>>>> transaction is stopping rows from being turned into free space. >>>> >>>> >>>> I am not sure how to read the below. I see a lot of "free_space" but >>>> not "dead": >>>> >>>> -[ RECORD 1 ]--+ >>>> ?column? | primary >>>> table_len | 32055296 >>>> tuple_count| 120764 >>>> tuple_len | 9470882 >>>> tuple_percent | 29.55 >>>> dead_tuple_count | 0 >>>> dead_tuple_len | 0 >>>> dead_tuple_percent | 0 >>>> free_space | 20713580 >>>> free_percent | 64.62 >>>> -[ RECORD 2 ]--+ >>>> ?column? | toast >>>> table_len | 88802156544 >>>> tuple_count| 15126830 >>>> tuple_len | 30658625743 >>>> tuple_percent | 34.52 >>>> dead_tuple_count | 0 >>>> dead_tuple_len | 0 >>>> dead_tuple_percent | 0 >>>> free_space | 57653329312 >>>> free_percent | 64.92 >>>> >>> >>> >>> it say, so your table can be reduced about 60% >>> >> >> That's what I thought, and releasing 65% of 84GB would be major here, but >> unfortunately I am unable to release it because VACUUM FULL requires more >> space than I currently have available. >> >> Perhaps disabling the WAL, if possible, could help VACUUM FULL complete. >> Or some way to do an in-place VACUUM so that it doesn't write all the data >> to a new table. >> > > maybe this article can be interesting for you > > https://www.depesz.com/2013/06/21/bloat-removal-by-tuples-moving/ > > It's very interesting, thank you. I'm trying now to figure out
Starting Postgres when there is no disk space
I have Postgres running in a Docker container with PGDATA mounted from the host. Postgres consumed all of the disk space, 130GB [1], and cannot be started [2]. The database has a lot of bloat due to many deletions. The problem is that now I cannot start Postgres at all.

I mounted an additional partition with 100GB, hoping to fix the bloat with a TABLESPACE in the new mount, but how can I do anything if Postgres will not start in the first place?

I expected there to be a tool that can defrag the database files, e.g. a "vacuumdb" utility that can run without Postgres. Or maybe run Postgres and disable the WAL so that no new disk space will be required.

Surely, I'm not the first one to experience this issue. How can I fix this?

Thank you,

Igal

[1]
root@ff818ff7550a:/# du -h --max-depth=1 /pgdata
625M    /pgdata/pg_wal
608K    /pgdata/global
0       /pgdata/pg_commit_ts
0       /pgdata/pg_dynshmem
8.0K    /pgdata/pg_notify
0       /pgdata/pg_serial
0       /pgdata/pg_snapshots
16K     /pgdata/pg_subtrans
0       /pgdata/pg_twophase
16K     /pgdata/pg_multixact
130G    /pgdata/base
0       /pgdata/pg_replslot
0       /pgdata/pg_tblspc
0       /pgdata/pg_stat
0       /pgdata/pg_stat_tmp
7.9M    /pgdata/pg_xact
4.0K    /pgdata/pg_logical
0       /pgdata/tmp
130G    /pgdata

[2]
postgres@1efd26b999ca:/$ /usr/lib/postgresql/11/bin/pg_ctl start
waiting for server to start
2019-05-01 20:43:59.301 UTC [34] LOG: listening on IPv4 address "0.0.0.0", port 5432
2019-05-01 20:43:59.301 UTC [34] LOG: listening on IPv6 address "::", port 5432
2019-05-01 20:43:59.303 UTC [34] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2019-05-01 20:43:59.322 UTC [35] LOG: database system shutdown was interrupted; last known up at 2019-05-01 19:37:32 UTC
2019-05-01 20:43:59.863 UTC [35] LOG: database system was not properly shut down; automatic recovery in progress
2019-05-01 20:43:59.865 UTC [35] LOG: redo starts at 144/4EFFFC18
...2019-05-01 20:44:02.389 UTC [35] LOG: redo done at 144/74FFE060
2019-05-01 20:44:02.389 UTC [35] LOG: last completed transaction was at log time 2019-04-28 05:05:24.687581+00
.2019-05-01 20:44:03.474 UTC [35] PANIC: could not write to file "pg_logical/replorigin_checkpoint.tmp": No space left on device
2019-05-01 20:44:03.480 UTC [34] LOG: startup process (PID 35) was terminated by signal 6: Aborted
2019-05-01 20:44:03.480 UTC [34] LOG: aborting startup due to startup process failure
2019-05-01 20:44:03.493 UTC [34] LOG: database system is shut down
 stopped waiting
pg_ctl: could not start server
Examine the log output.
Re: Starting Postgres when there is no disk space
Thank you both. The symlink sounds like a very good idea.

My other disk is 100GB and the database is already 130GB, so moving the whole thing will require provisioning that will take more time. I will try the symlinks first, possibly moving some tables to a tablespace on the other partition to make more room.

I have a scheduled process that runs daily to delete old data and do a full vacuum. Not sure why this happened (again).

Thanks,

Igal

On Wed, May 1, 2019 at 6:02 PM Michael Loftis wrote:
> Best option: Copy/move the entire pgdata to a larger space. It may also
> be enough to just move the WAL (leaving a symlink) freeing up the 623M but
> I doubt it since VACUUM FULL occurs in the same table space and can need an
> equal amount of space (130G) depending on how much it can actually free up.
>
> You may also get away with just moving (and leaving a symlink) for the
> base but I don't recall if that works or not.
>
> On Wed, May 1, 2019 at 18:07 Igal Sapir wrote:
>
>> I have Postgres running in a Docker container with PGDATA mounted from
>> the host. Postgres consumed all of the disk space, 130GB [1], and cannot
>> be started [2]. The database has a lot of bloat due to many
>> deletions. The problem is that now I cannot start Postgres at all.
>>
>> I mounted an additional partition with 100GB, hoping to fix the bloat
>> with a TABLESPACE in the new mount, but how can I do anything if Postgres
>> will not start in the first place?
>>
>> I expected there to be a tool that can defrag the database files, e.g. a
>> "vacuumdb" utility that can run without Postgres. Or maybe run Postgres
>> and disable the WAL so that no new disk space will be required.
>>
>> Surely, I'm not the first one to experience this issue. How can I fix
>> this?
>>
>> Thank you,
>>
>> Igal
>>
>> [1]
>> root@ff818ff7550a:/# du -h --max-depth=1 /pgdata
>> 625M    /pgdata/pg_wal
>> 608K    /pgdata/global
>> 0       /pgdata/pg_commit_ts
>> 0       /pgdata/pg_dynshmem
>> 8.0K    /pgdata/pg_notify
>> 0       /pgdata/pg_serial
>> 0       /pgdata/pg_snapshots
>> 16K     /pgdata/pg_subtrans
>> 0       /pgdata/pg_twophase
>> 16K     /pgdata/pg_multixact
>> 130G    /pgdata/base
>> 0       /pgdata/pg_replslot
>> 0       /pgdata/pg_tblspc
>> 0       /pgdata/pg_stat
>> 0       /pgdata/pg_stat_tmp
>> 7.9M    /pgdata/pg_xact
>> 4.0K    /pgdata/pg_logical
>> 0       /pgdata/tmp
>> 130G    /pgdata
>>
>> [2]
>> postgres@1efd26b999ca:/$ /usr/lib/postgresql/11/bin/pg_ctl start
>> waiting for server to start
>> 2019-05-01 20:43:59.301 UTC [34] LOG: listening on IPv4 address "0.0.0.0", port 5432
>> 2019-05-01 20:43:59.301 UTC [34] LOG: listening on IPv6 address "::", port 5432
>> 2019-05-01 20:43:59.303 UTC [34] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
>> 2019-05-01 20:43:59.322 UTC [35] LOG: database system shutdown was interrupted; last known up at 2019-05-01 19:37:32 UTC
>> 2019-05-01 20:43:59.863 UTC [35] LOG: database system was not properly shut down; automatic recovery in progress
>> 2019-05-01 20:43:59.865 UTC [35] LOG: redo starts at 144/4EFFFC18
>> ...2019-05-01 20:44:02.389 UTC [35] LOG: redo done at 144/74FFE060
>> 2019-05-01 20:44:02.389 UTC [35] LOG: last completed transaction was at log time 2019-04-28 05:05:24.687581+00
>> .2019-05-01 20:44:03.474 UTC [35] PANIC: could not write to file "pg_logical/replorigin_checkpoint.tmp": No space left on device
>> 2019-05-01 20:44:03.480 UTC [34] LOG: startup process (PID 35) was terminated by signal 6: Aborted
>> 2019-05-01 20:44:03.480 UTC [34] LOG: aborting startup due to startup process failure
>> 2019-05-01 20:44:03.493 UTC [34] LOG: database system is shut down
>> stopped waiting
>> pg_ctl: could not start server
>> Examine the log output.
>>
>> --
>
> "Genius might be described as a supreme capacity for getting its possessors
> into trouble of all kinds."
> -- Samuel Butler
>
Re: Starting Postgres when there is no disk space
Right. I managed to start up Postgres by symlinking the following directories to a new mount: pg_logical, pg_subtrans, pg_wal, pg_xact.

I then created a new tablespace on the new mount, set it to be the default tablespace, and moved some of the smaller (about 30GB) tables to it. That allowed me to delete old data, do a full vacuum, and move the data back to the original disk.

This is time-series data and there is a daily process that deletes the old stuff, but vacuum full still fails to return the space to the OS. Perhaps I will get better results with table partitioning, or with TimescaleDB.

Thank you for your help,

Igal

On Wed, May 1, 2019 at 11:08 PM Michael Nolan wrote:
>
> Assuming you get the database back online, I would suggest you put a
> procedure in place to monitor disk space and alert you when it starts to
> get low.
> --
> Mike Nolan
>
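The tablespace shuffle described above boils down to something like the following — a sketch only, assuming the new partition is mounted at /mnt/extra, a database named mydb, and a table named events (all hypothetical names):

-- the directory must already exist, be empty, and be owned by the postgres OS user
CREATE TABLESPACE extra LOCATION '/mnt/extra/pg_tblspc';

-- make it the default tablespace for new objects in this database
ALTER DATABASE mydb SET default_tablespace = extra;

-- move an existing table there (this rewrites the table, so the new mount needs enough free space)
ALTER TABLE events SET TABLESPACE extra;

-- ... delete old rows, VACUUM FULL, then move it back
ALTER TABLE events SET TABLESPACE pg_default;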
Re: Starting Postgres when there is no disk space
Jeff, On Fri, May 3, 2019 at 6:56 AM Jeff Janes wrote: > On Wed, May 1, 2019 at 10:25 PM Igal Sapir wrote: > >> >> I have a scheduled process that runs daily to delete old data and do full >> vacuum. Not sure why this happened (again). >> > > If you are doing a regularly scheduled "vacuum full", you are almost > certainly doing something wrong. Are these "vacuum full" completing, or > are they failing (probably due to transient out of space errors)? > > A ordinary non-full vacuum will make the space available for internal > reuse. It will not return the space to filesystem (usually), so won't get > you out of the problem. But it should prevent you from getting into the > problem in the first place. If it is failing to reuse the space > adequately, you should figure out why, rather than just blindly jumping to > regularly scheduled "vacuum full". For example, what is it that is > bloating, the tables themselves, their indexes, or their TOAST tables? Or > is there any bloat in the first place? Are you sure your deletions are > equal to your insertions, over the long term average? If you are doing > "vacuum full" and you are certain it is completing successfully, but it > doesn't free up much space, then that is strong evidence that you don't > actually have bloat, you just have more live data than you think you do. > (It could also mean you have done something silly with your "fillfactor" > settings.) > > If you don't want the space to be reused, to keep a high correlation > between insert time and physical order of the rows for example, then you > should look into partitioning, as you have already noted. > > Now that you have the system up again and some space freed up, I'd create > a "ballast" file with a few gig of random (to avoid filesystem-level > compression, should you have such a filesystem) data on the same device > that holds your main data, that can be deleted in an emergency to give you > enough free space to at least start the system. Of course, monitoring is > also nice, but the ballast file is more robust and there is no reason you > can't have both. > Thank you for the tips. I stand corrected. These are regular VACUUM calls after the deletion, not VACUUM FULL. It's a daily process that deletes records from N days ago, and then performs VACUUM, so yes, all of the inserted records should be deleted after N days. The bloat is in a TOAST table. The primary table has a JSONB column which can get quite large. The fillfactor setting was not modified from its default value (does the primary table fillfactor affect the toast table? either way they are both default in this case). Ballast file is a great idea. I was just thinking about that a couple of days ago, but instead of one file I think that I will have a bunch of them at 1GB each. That will give me more flexibility in clearing space as needed and keeping more "safety buffers" for when I make space. Thanks for your help, Igal
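The daily retention job described here amounts to something like this — a minimal sketch, assuming a table named events with a created_at timestamp column and a 30-day window (both names are hypothetical):

DELETE FROM events
WHERE created_at < now() - interval '30 days';

-- plain VACUUM (not FULL): marks the dead rows, including their TOAST data,
-- as reusable free space so the table stops growing
VACUUM ANALYZE events;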
Re: Starting Postgres when there is no disk space
If anyone ever needs it, I wrote this 1-liner bash loop to create 16 temp files of 640MB each (well, 2-liner if you count the "config" line):

$ COUNT=16; TMPDIR=/pgdata/tmp/
$ for ((i=1; i<=$COUNT; i++)); do dd if=/dev/zero of="/pgdata/tmp/$(cat /dev/urandom | tr -cd 'a-f0-9' | head -c 20).tmp" count=81920 bs=8192; done;

Which produces about 10GB of unusable space that I can free up in the event that I run out of disk (10GB might be excessive, but it works for me for the time being):

$ ls -lh $TMPDIR
total 10G
-rw-r--r-- 1 root root 640M May 3 12:42 0a81845a5de0d926572e.tmp
-rw-r--r-- 1 root root 640M May 3 12:42 1800a815773f34b8be98.tmp
-rw-r--r-- 1 root root 640M May 3 12:42 1b182057d9b764d3b2a8.tmp
-rw-r--r-- 1 root root 640M May 3 12:42 40f7b4cab222699d121a.tmp
-rw-r--r-- 1 root root 640M May 3 12:42 498e9bc0852ed83af04f.tmp
-rw-r--r-- 1 root root 640M May 3 12:42 49e84e5189e424c012be.tmp
-rw-r--r-- 1 root root 640M May 3 12:42 7c984b156d11b5817aa5.tmp
-rw-r--r-- 1 root root 640M May 3 12:42 7d1195b03906e3539495.tmp
-rw-r--r-- 1 root root 640M May 3 12:42 9677ff969c7add0e7f92.tmp
-rw-r--r-- 1 root root 640M May 3 12:42 9ae9d483adddf3317d7c.tmp
-rw-r--r-- 1 root root 640M May 3 12:42 a546f3f363ca733427e7.tmp
-rw-r--r-- 1 root root 640M May 3 12:42 a965856cb1118d98f66a.tmp
-rw-r--r-- 1 root root 640M May 3 12:42 c162da7ecdb8824e3baf.tmp
-rw-r--r-- 1 root root 640M May 3 12:42 d7c97019ce658b90285b.tmp
-rw-r--r-- 1 root root 640M May 3 12:42 e76fc603ffe2c977c826.tmp
-rw-r--r-- 1 root root 640M May 3 12:42 fed72361b202f9492d7f.tmp

Best,

Igal

On Fri, May 3, 2019 at 9:09 AM Igal Sapir wrote:
> Jeff,
>
> On Fri, May 3, 2019 at 6:56 AM Jeff Janes wrote:
>
>> On Wed, May 1, 2019 at 10:25 PM Igal Sapir wrote:
>>
>>> I have a scheduled process that runs daily to delete old data and do
>>> full vacuum. Not sure why this happened (again).
>>
>> If you are doing a regularly scheduled "vacuum full", you are almost
>> certainly doing something wrong. Are these "vacuum full" completing, or
>> are they failing (probably due to transient out of space errors)?
>>
>> An ordinary non-full vacuum will make the space available for internal
>> reuse. It will not return the space to the filesystem (usually), so won't get
>> you out of the problem. But it should prevent you from getting into the
>> problem in the first place. If it is failing to reuse the space
>> adequately, you should figure out why, rather than just blindly jumping to
>> regularly scheduled "vacuum full". For example, what is it that is
>> bloating, the tables themselves, their indexes, or their TOAST tables? Or
>> is there any bloat in the first place? Are you sure your deletions are
>> equal to your insertions, over the long term average? If you are doing
>> "vacuum full" and you are certain it is completing successfully, but it
>> doesn't free up much space, then that is strong evidence that you don't
>> actually have bloat, you just have more live data than you think you do.
>> (It could also mean you have done something silly with your "fillfactor"
>> settings.)
>>
>> If you don't want the space to be reused, to keep a high correlation
>> between insert time and physical order of the rows for example, then you
>> should look into partitioning, as you have already noted.
>> >> Now that you have the system up again and some space freed up, I'd create >> a "ballast" file with a few gig of random (to avoid filesystem-level >> compression, should you have such a filesystem) data on the same device >> that holds your main data, that can be deleted in an emergency to give you >> enough free space to at least start the system. Of course, monitoring is >> also nice, but the ballast file is more robust and there is no reason you >> can't have both. >> > > Thank you for the tips. I stand corrected. These are regular VACUUM > calls after the deletion, not VACUUM FULL. It's a daily process that > deletes records from N days ago, and then performs VACUUM, so yes, all of > the inserted records should be deleted after N days. > > The bloat is in a TOAST table. The primary table has a JSONB column which > can get quite large. The fillfactor setting was not modified from its > default value (does the primary table fillfactor affect the toast table? > either way they are both default in this case). > > Ballast file is a great idea. I was just thinking about that a couple of > days ago, but instead of one file I think that I will have a bunch of them > at 1GB each. That will give me more flexibility in clearing space as > needed and keeping more "safety buffers" for when I make space. > > Thanks for your help, > > Igal > >
Connecting to NOTIFY with telnet
Is it possible to connect to Postgres for notifications via telnet? This is obviously more for learning/experimenting purposes. I expected a simple way to connect and consume notifications but can not find any example or documentation on how to do that. Any ideas? Thanks, Igal
Re: Connecting to NOTIFY with telnet
Thank you both (Michel replied off-list, it seems).

On Fri, May 3, 2019 at 1:10 PM Andres Freund wrote:
> Hi,
>
> On 2019-05-03 11:06:09 -0700, Igal Sapir wrote:
> > Is it possible to connect to Postgres for notifications via telnet? This
> > is obviously more for learning/experimenting purposes.
>
> No. The protocol is too complicated to make that realistically doable /
> useful.
>
> > I expected a simple way to connect and consume notifications but can not
> > find any example or documentation on how to do that.
>
> If you really wanted to go down that road, you'd have to read the
> protocol specs. It'd not make sense to document how-to steps for
> something as low-level as this.

My main "issue" is that the official pgjdbc driver does not support asynchronous notifications with LISTEN, and I was trying to figure out why. I know that the pgjdbc-ng [1] driver does support that feature, but I'm not sure how many people use it etc. It looks like it's being maintained, with a recent release from a couple of months ago, so that's good.

I thought that given the fact that it's just a socket connection, it might be fairly simple to open a connection and listen.

Best,

Igal

[1] http://impossibl.github.io/pgjdbc-ng/
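For context, the server side of what is being discussed is just LISTEN/NOTIFY — a minimal sketch, assuming a channel named events and two separate psql sessions:

-- session 1: subscribe to a channel
LISTEN events;

-- session 2: send a notification with an optional payload
NOTIFY events, 'hello';
-- or equivalently:
SELECT pg_notify('events', 'hello');

In session 1, psql prints something like: Asynchronous notification "events" with payload "hello" received from server process with PID 1234. How promptly a client sees that message is exactly the driver question debated below.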
Re: Connecting to NOTIFY with telnet
Christoph, On Sat, May 4, 2019 at 10:44 AM Christoph Moench-Tegeder wrote: > ## Igal Sapir (i...@lucee.org): > > > My main "issue" is that the official pgjdbc driver does not support the > > notifications with listen and I was trying to figure out why. > > https://jdbc.postgresql.org/documentation/head/listennotify.html > > I should have been more explicit. My main issue is with the following statement from the link that you posted: > A key limitation of the JDBC driver is that it cannot receive asynchronous notifications and must poll the backend to check if any notifications were issued Polling is much less efficient than event handling and I'm sure that there's a major performance hit with that. Thanks, Igal
Re: Connecting to NOTIFY with telnet
Jeff, On Sat, May 4, 2019 at 11:34 AM Jeff Janes wrote: > On Sat, May 4, 2019 at 1:49 PM Igal Sapir wrote: > >> Christoph, >> >> On Sat, May 4, 2019 at 10:44 AM Christoph Moench-Tegeder < >> c...@burggraben.net> wrote: >> >>> ## Igal Sapir (i...@lucee.org): >>> >>> > My main "issue" is that the official pgjdbc driver does not support the >>> > notifications with listen and I was trying to figure out why. >>> >>> https://jdbc.postgresql.org/documentation/head/listennotify.html >>> >>> >> I should have been more explicit. My main issue is with the following >> statement from the link that you posted: >> >> > A key limitation of the JDBC driver is that it cannot receive >> asynchronous notifications and must poll the backend to check if any >> notifications were issued >> >> Polling is much less efficient than event handling and I'm sure that >> there's a major performance hit with that. >> > > Isn't that addressed here?: > > // If this thread is the only one that uses the > connection, a timeout can be used to > // receive notifications immediately: > // org.postgresql.PGNotification notifications[] = > pgconn.getNotifications(1); > > It "helps", but it's still not the same as keeping the connection open and receiving messages in real time. Looking at the documentation of pgjdbc-ng [1] it looks quite impressive. I'm thinking to try it out and possibly even try to benchmark it for performance against the official JDBC driver. Igal [1] http://impossibl.github.io/pgjdbc-ng/docs/current/user-guide/#extensions-notifications > Cheers, > > Jeff > >>
Re: Connecting to NOTIFY with telnet
Jeff, On Sat, May 4, 2019 at 2:10 PM Jeff Janes wrote: > On Sat, May 4, 2019 at 3:04 PM Igal Sapir wrote: > >> Jeff, >> >> On Sat, May 4, 2019 at 11:34 AM Jeff Janes wrote: >> >>> On Sat, May 4, 2019 at 1:49 PM Igal Sapir wrote: >>> >>>> >>>> >>>> >>>> > A key limitation of the JDBC driver is that it cannot receive >>>> asynchronous notifications and must poll the backend to check if any >>>> notifications were issued >>>> >>>> Polling is much less efficient than event handling and I'm sure that >>>> there's a major performance hit with that. >>>> >>> >>> Isn't that addressed here?: >>> >>> // If this thread is the only one that uses the >>> connection, a timeout can be used to >>> // receive notifications immediately: >>> // org.postgresql.PGNotification notifications[] = >>> pgconn.getNotifications(1); >>> >>> >> It "helps", but it's still not the same as keeping the connection open >> and receiving messages in real time. >> > > But it is the same as that. It keeps the connection open, and receives > the messages in real time. It blocks either until it gets a NOTIFY, or for > 10 seconds, whichever occurs first. You can use 0 to block forever until a > NOTIFY arrives. Maybe you didn't remove the sleep a few lines further > down? Or is there some other problem here? > I didn't try it out, but I know from experience with other systems that long polling is not as efficient as asynchronous events over an open socket. For example, Comet vs. WebSockets in web servers. The only way to find out how big, or small, the impact is, is to benchmark both implementations, but I have no doubt that the open socket will be more efficient. Best, Igal
Postgres for SQL Server users
Next month I'll be making a presentation about Postgres to a SQL Server crowd in L.A. at their SQL Saturday event. I was wondering if anyone has any tips that are specific for SQL Server users? Best features? Known issues? Common rebuttals? Thanks, Igal
Re: Postgres for SQL Server users
Ravi,

On Mon, May 6, 2019 at 12:28 PM Ravi Krishna wrote:
>
> > I was wondering if anyone has any tips that are specific for SQL Server
> > users? Best features? Known issues? Common rebuttals?
>
> Are you talking about SS to PG migration?
>
> Generally SQL Server shops use SS-specific functions and T-SQL heavily
> since they provide very good functionality. For example:
>
> 1. The convert function is used heavily in SQL to format various datetime
>    formats.
>
> 2. SS does case insensitive searches. Converting it to PG involves
>    modifying the SQL. Not an easy job to do.
>
> 3. SS has 3-part structures, DB.SCHEMA.TABLE, which can be used in SQL.
>    PG can go only up to 2 parts, SCHEMA.TABLE. 3 parts are possible, but
>    only via FDW or another cumbersome mechanism.

Thank you. I am referring more to features, functionality, scalability, etc. I want to get them interested in giving Postgres a try instead of SQL Server. Surely the savings on licenses can be substantial (I have no idea how much, TBH), but I want to instill confidence in them that anything they do with SQL Server can be done with Postgres.

Igal
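For the talk itself, each of Ravi's points has a fairly direct Postgres counterpart — a sketch only, with hypothetical table and column names (orders, created_at, customer_name):

-- 1. datetime formatting: to_char() instead of CONVERT()
SELECT to_char(created_at, 'MM/DD/YYYY HH24:MI') FROM orders;

-- 2. case-insensitive matching: ILIKE (or citext, or a non-deterministic ICU collation)
SELECT * FROM orders WHERE customer_name ILIKE 'acme%';

-- 3. cross-database references: postgres_fdw in place of DB.SCHEMA.TABLE
CREATE EXTENSION IF NOT EXISTS postgres_fdw;
-- then CREATE SERVER / CREATE USER MAPPING / IMPORT FOREIGN SCHEMA to expose the other database's tables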
Re: Postgres for SQL Server users
Brent, On Mon, May 6, 2019 at 1:44 PM Brent Wood wrote: > Hi Igal, > > One relevant comment I found interesting a couple of years ago... > > A New Zealand Govt agency was installing an institutional GIS system > (several thousand potential users). It supported different back-end spatial > databases. Previous installs of this system for other clients had used MS > SQL Server, this was the first that was installed on a Postgres/Postgis > back end. > > As it was about to go into action as a live, production system, I was > speaking to one of the team doing the installation, and asked how they had > found working with Postgres instead of SQL Server. The reply was worth a > few formal benchmarks: > > "It's so easy to connect to and sh*t it's fast!" > > Might be a useful comment for your presentation :-) A few years on & it is > still working fine... > This is a great quote, but can I quote a real source? Otherwise it's just something I heard from Brent on the mailing list. Best, Igal
Re: Postgres for SQL Server users
Ron,

On Mon, May 6, 2019 at 12:54 PM Ron wrote:
> On 5/6/19 2:47 PM, Igal Sapir wrote:
> > but I want to instill confidence in them that anything they do with SQL
> > Server can be done with Postgres.
>
> Right off the top of my head, here are some things you can't (easily and
> trivially) do in Postgres:
> - Transparent Data Encryption
> - Block level full, differential and log backups that do more than
>   same-version DR.

I'll be sure to try and avoid these topics then lol. I am trying to make the opposite case.

Thanks,

Igal

> --
> Angular momentum makes the world go 'round.
>
Re: Postgres for SQL Server users
Tony,

On Mon, May 6, 2019 at 10:35 PM Tony Shelver wrote:
> I have to agree on the geospatial (GIS) features.
> I converted from SQL Server to Postgresql for our extended tracking
> database. The SS geospatial feature set doesn't seem nearly as robust or
> complete or performant as that supplied by PostGIS.
> The PostGIS ecosystem of open source / 3rd party tools is also far bigger,
> for anything to do with mapping. Openstreetmaps.org stores their world
> dataset on Postgresql / PostGIS, and there are a ton of mostly open
> source-based tools and organizations that work with it or any other PostGIS
> data to provide a complete GIS solution.
>
> My first SS implementation had me backing out of storing geographic points
> in the relevant SQL Server datatype as the performance hit during loading
> was just too big. Doing the same thing in Postgresql / PostGIS is hardly
> noticeable.
>
> Another feature in Postgres is that you are not restricted to just plpgsql
> as an internal procedural language.
>
> I am not an expert, but it also seems far easier to create, install and
> work with major extensions to Postgresql than SQL Server. I found
> installing the GIS featureset in SS to be a bit of a pain back in the
> day.

GIS is a good feature, but it's a niche feature, so while I'll mention it along with extensions, I am looking for more general-purpose comparisons and areas where Postgres is as good as or better than SQL Server.

Thanks,

Igal
Re: Postgres for SQL Server users
Brent, On Tue, May 7, 2019 at 12:42 PM Brent Wood wrote: > I have not used SS for spatial data, but I don't have a Postgres database > without Postgis installed. The OSGEO ecosystem and synergies with other > FOSS GIS tools is fantastic. > > And it does not stop with the Postgis extension. For time series data > (anything from fleet management to sensor data) Postgres has the (new) > TimescaleDB extension. I ran this very effectively with a 600,000,000 > record database of sensor readings from a research vessel - on a $400 > laptop (with an SSD) for testing/prototyping. The sensor data was stored in > Timescaledb Hypertables & the location data in Postgis geometry columns in > those tables. Significantly better performance than native Postgres. > Excellent information with impressive numbers. > > Also consider language support for database functions... pl/R supports > some very nice capabilities, especially supporting websites. Instead if > running a Postgres query to return the data to plot via the web page, or > storing static plots in your CMS that need updating when you get new data, > you can use Postgres functions in pl/R to render the plot of the data in a > file, and return the name of the file. The web site does no rendering, just > invokes the SQL & displays the file that is returned. So the DB can return > the data and/or the graphic. Back up your database & back up your > functions. This paradigm can work very effectively... > Is there any tutorial/example code for rendering pl/R images on a website? Cool feature > > Generally, the FOSS ecosystem around Postgres offers an incredible array > of tools and capabilities that I don't think any other db - FOSS or not - > can provide. I have had limited exposure to Oracle, SQL Server, Sybase, > Empress, Teradata, Netezza, DB2, Sqlite/Spatialite, Interbase & Informix. > Of these, Postgres & Sqlite3 (which one depends on use cases) are all I use > these days. > Yep. agreed. Thanks, Igal > > > On Tuesday, May 7, 2019, 5:36:00 PM GMT+12, Tony Shelver < > tshel...@gmail.com> wrote: > > > I have to agree on the geospatial (GIS) features. > I converted from SQL Server to Postgresql for our extended tracking > database. The SS geospatial feature set doesn't seem nearly as robust or > complete or perfoirmant as that supplied by PostGIS. > The PostGIS ecosystem of open source / 3rd party tools is also far bigger, > for anything to do with mapping. Openstreetmaps.org stores their world > dataset on Postgresql / PostGIS, and there a ton of mostly open > source-based tools and organizations that work with it or any other PostGIS > data to provide a complete GIS solution. > > My first sS implementation had me backing out of storing geographic points > in the relevant SQL Server datatype as the performance hit during loading > was just too big. Doing the same thing in Postgresql / PostGIS is nardly > noticeable. > > Another feature in Postgres is that you are not restricted to just plpgsql > as an internal procedural language. > > I am not an expert, but it also seems far easier to create, install and > work with major extensions to Postgresql than SQL Server. I found > installing the GIS featureset in SS to be a bit of a pain back oin the > day.. > > On Tue, 7 May 2019 at 00:53, Michel Pelletier > wrote: > > On Mon, May 6, 2019 at 2:49 PM Adam Brusselback > wrote: > > I think the main "gotcha" when I moved from SQL Server to Postgres was I > didn't even realize the amount of in-line t-sql I would use to just get > stuff done for ad-hoc analysis. 
Postgres doesn't have a good way to emulate > this. DO blocks cannot return resultsets, so short of creating a function > and dropping it, it's not possible to get the same workflow. > > > Just ruminating here, and this has probably already been discussed in the > past, but I've always wanted something like a 'SELECT DO [LANGUAGE ...] > RETURNS rettype | TABLE (...) $$ RETURN [NEXT | QUERY] ... $$; but haven't > had any serious problem with creating/dropping functions like you mentioned. > > -Michel > > > The lack of GUI tooling was also a huge "whoa" moment for me, which I > still grapple with. > >
Featured Big Name Users of Postgres
I'm doing a presentation about Postgres to SQL Server users this weekend, and I want to showcase some of the big names that use Postgres, e.g. MasterCard, Government agencies, Banks, etc. There used to be a Wiki page of Featured Users but that link is broken now. I also "found" a page about MasterCard at the EnterpriseDB website, but that one throws an error with too many redirects. Any suggestions on where to find a recent user list? Thanks!
Re: Featured Big Name Users of Postgres
On Tue, Jun 11, 2019 at 2:11 PM Chris Travers wrote: > At Adjust GmbH we have 5-10 PB data in Postgres. > > > On Tue, Jun 11, 2019 at 9:28 PM Ireneusz Pluta/wp.pl wrote: > >> W dniu 2019-06-11 o 19:45, Igal Sapir pisze: >> > I'm doing a presentation about Postgres to SQL Server users this >> weekend, and I want to showcase >> > some of the big names that use Postgres, e.g. MasterCard, Government >> agencies, Banks, etc. >> > >> You might be interested in this: >> https://www.theguardian.com/info/2018/nov/30/bye-bye-mongo-hello-postgres >> >> Andreas - unfortunately I do not recognize any of the names so it's probably European entities that are not very popular here. Ireneusz - I love the article! Will definitely use this. Chris - very impressive! I will mention it in my talk. Thank you all, Igal
Passing a dynamic interval to generate_series()
Hello,

I am trying to pass a dynamic interval to generate_series() with a date range.

This works as expected, and generates a series with an interval of 1 month:

SELECT generate_series(
    date_trunc('month', current_date),
    date_trunc('month', current_date + interval '7 month'),
    interval '1 month'
)

This works as expected and returns an interval of 1 month:

SELECT ('1 ' || 'month')::interval;

But this throws an error (SQL Error [42601]: ERROR: syntax error at or near "'1 '"):

SELECT generate_series(
    date_trunc('month', current_date),
    date_trunc('month', current_date + interval '7 month'),
    interval ('1 ' || 'month')::interval
)

And this returns a series with an interval of 1 second??

SELECT generate_series(
    date_trunc('month', current_date),
    date_trunc('month', current_date + interval '7 month'),
    (interval '1 ' || 'month')::interval
)

Because this returns an interval of 1 second:

SELECT (interval '1 ' || 'month')::interval;

Is that a bug? I am able to work around the issue using a CASE statement, but shouldn't it work simply by concatenating the string with the || operator?

Thank you,

Igal
Re: Passing a dynamic interval to generate_series()
On Sun, Jun 30, 2024 at 3:51 PM Tom Lane wrote:
> Igal Sapir writes:
> > But this throws an error (SQL Error [42601]: ERROR: syntax error at or near
> > "'1 '"):
>
> > SELECT generate_series(
> >     date_trunc('month', current_date),
> >     date_trunc('month', current_date + interval '7 month'),
> >     interval ('1 ' || 'month')::interval
> > )
>
> You're overthinking it.
>
> SELECT generate_series(
>     date_trunc('month', current_date),
>     date_trunc('month', current_date + interval '7 month'),
>     ('1 ' || 'month')::interval
> );
>     generate_series
> ------------------------
>  2024-06-01 00:00:00-04
>  2024-07-01 00:00:00-04
>  2024-08-01 00:00:00-04
>  2024-09-01 00:00:00-04
>  2024-10-01 00:00:00-04
>  2024-11-01 00:00:00-04
>  2024-12-01 00:00:00-05
>  2025-01-01 00:00:00-05
> (8 rows)

Thank you, Tom. I thought that I tried that too, but apparently I did not, because it works the way you wrote it.

> It might help to read this:
>
> https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-CONSTANTS-GENERIC
>
> and to experiment with what you get from the constituent elements
> of what you tried, rather than trying to guess what they are from
> generate_series's behavior. For example,
>
> select (interval '1 ');
>  interval
> ----------
>  00:00:01
> (1 row)
>
> select (interval '1 ' || 'month');
>    ?column?
> ---------------
>  00:00:01month
> (1 row)

I actually did test the expression that I posted, but it might be casting it twice. While the examples you wrote behave as you showed:

SELECT (interval '1 ' || 'month');

   ?column?    |
---------------+
 00:00:01month |

SELECT ('1 ' || 'month')::interval;

 interval |
----------+
 1 mon    |

when the expression includes the "::interval" suffix, as in the example that I posted, it returns 1 second, possibly because it is casting to interval twice (at least on PostgreSQL 16.2 (Debian 16.2-1.pgdg120+2)):

SELECT (interval '1 ' || 'month')::interval;

 interval |
----------+
 00:00:01 |

Anyway, you solved my issue, so thank you very much as always,

Igal

> regards, tom lane
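As a side note, when only the count is dynamic and the unit is fixed, make_interval() avoids building the interval from strings at all — a minimal sketch of the same query:

SELECT generate_series(
    date_trunc('month', current_date),
    date_trunc('month', current_date + interval '7 month'),
    make_interval(months => 1)  -- the 1 could come from a parameter or variable
);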
Re: [OT] subscribtion with no-mail option?
I don't think that it's possible, but why don't you open a free Gmail account, or an account with another web-based email service that's available in your region/language, and subscribe to all of the mailing lists through that account?

Igal

On Mon, Dec 25, 2017 at 2:49 AM, Michelle Konzack <linux4miche...@tamay-dogan.net> wrote:
> Hello anyone,
>
> is there an option for the pgsql lists to subscribe only for posting
> and not receiving messages? I can read them in the archives, but the
>  is one of the most active lists I am subscribed to.
>
> Not even  can beat that.
> (there I can whitelist me, without receiving messages)
>
> Oh, I should say that I live in Estonia, far away from any telephone
> land lines, hence no DSL available, and I am on GSM/4G/LTE, and if I
> would not have the NO-MAIL option, I would need to download more than
> 3000 messages a day where I am only interested in less than 5%.
>
> Note: I am currently programming a new PHP web interface for my mail
>       server which lets me better choose what I want to see and what
>       not, hence deleting anything which is not relevant to me and
>       has not some keywords in it.
>
> It is just worse that I have to be subscribed to more than 80 mailing
> lists only to be able to post 200-300 messages per month and get maybe
> 2000 messages back, but have to download more than 100,000 messages a
> month with more than 1000 MByte. This is just a waste of resources and
> is very costly over time.
>
> There is no benefit for the mailing lists except for my ISP and Telecom
> provider, which bill me nicely... more than 100€ per month!
>
> Thanks in advance
> Michelle
>
> --
> Michelle Konzack
> 00372-54541400
Returning Values from INSERT ON CONFLICT DO NOTHING
Hello,

I want to use the "UPSERT" syntax for returning an ID if it exists, or inserting a record and returning the new ID if it does not exist.

INSERT INTO users(email, name)
    VALUES('u...@domain.tld', 'User')
ON CONFLICT (email) DO NOTHING
RETURNING user_id, (xmax::text::int > 0) as existed;

When an email address does not exist then it works fine and I get the new user_id, but when it does exist, I get no results at all.

How can I get the results in case of a CONFLICT?

Thanks,

Igal
Re: Returning Values from INSERT ON CONFLICT DO NOTHING
It seems that if I do a simple update it resolves my issue:

INSERT INTO users(email, name)
    VALUES('u...@domain.tld', 'User')
ON CONFLICT (email) DO UPDATE
    SET email = excluded.email  -- users.email works too, not sure if it makes a difference
RETURNING user_id, (xmax::text::int > 0) as existed;

But if anyone has a better solution then I'd love to hear it.

Thanks,

Igal

On Mon, Jan 1, 2018 at 11:07 PM, Igal Sapir wrote:
> Hello,
>
> I want to use the "UPSERT" syntax for returning an ID if it exists, or
> inserting a record and returning the new ID if it does not exist.
>
> INSERT INTO users(email, name)
>     VALUES('u...@domain.tld', 'User')
> ON CONFLICT (email) DO NOTHING
> RETURNING user_id, (xmax::text::int > 0) as existed;
>
> When an email address does not exist then it works fine and I get the new
> user_id, but when it does exist, I get no results at all.
>
> How can I get the results in case of a CONFLICT?
>
> Thanks,
>
> Igal
Re: Returning Values from INSERT ON CONFLICT DO NOTHING
On Mon, Jan 1, 2018 at 11:59 PM, Jov wrote: > From https://www.postgresql.org/docs/devel/static/sql-insert.html: > >> The optional RETURNING clause causes INSERT to compute and return >> value(s) based on each row actually inserted (or updated, if an ON >> CONFLICT DO UPDATE clause was used). This is primarily useful for >> obtaining values that were supplied by defaults, such as a serial sequence >> number. However, any expression using the table's columns is allowed. The >> syntax of the RETURNING list is identical to that of the output list of >> SELECT. Only rows that were successfully inserted or updated will be >> returned. For example, if a row was locked but not updated because an ON >> CONFLICT DO UPDATE ... WHERE clause *condition* was not satisfied, the >> row will not be returned. > > > do update will return values while do nothing will not. > That explains it, thank you. > > 2018-01-02 15:43 GMT+08:00 Igal Sapir : > >> It seems that if I do a simple update it resolves my issue: >> >> INSERT INTO users(email, name) >> VALUES('u...@domain.tld', 'User') >> ON CONFLICT (email) DO UPDATE >> SET email = excluded.email -- users.email works too, not sure if >> makes a difference >> RETURNING user_id, (xmax::text::int > 0) as existed; >> > > Do not update email column because there is index on this column. It is > better to update other non-index column for HOT update. > Makes sense, thanks again. Igal
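A common alternative that avoids the dummy UPDATE entirely is to do the insert with DO NOTHING inside a CTE and fall back to a plain SELECT for the conflicting row — a sketch only, using the same users table and literals from this thread:

WITH ins AS (
    INSERT INTO users(email, name)
        VALUES('u...@domain.tld', 'User')
    ON CONFLICT (email) DO NOTHING
    RETURNING user_id
)
SELECT user_id, false AS existed FROM ins
UNION ALL
SELECT u.user_id, true AS existed
FROM users u
WHERE u.email = 'u...@domain.tld'
  AND NOT EXISTS (SELECT 1 FROM ins);

Under heavy concurrent inserts there is still a small race where neither branch returns a row (an uncommitted concurrent insert is not visible to the SELECT), so a retry loop may be needed in that case.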
Migrate2Postgres - A new tool for migration from other DBMSs
Hi everybody,

I published a tool that makes it easy to migrate a database from other DBMSs to Postgres: https://github.com/isapir/Migrate2Postgres

Currently it supports migrations from MS SQL Server, but it is written in a way that will make it easy to migrate from other DBMSs as well.

I also published a video (didn't spend too much time editing it, sorry) with an overview and a demo of migrating the AdventureWorks database, which is one of the MS SQL Server sample databases. The migration itself takes about a minute. The overview of the tool and the explanations take longer: https://youtu.be/5eF9_UB73TI

For the impatient, this is the point in the video where I start using the tool (so you can watch that first and then later watch the overview and the configuration options from the beginning): https://youtu.be/5eF9_UB73TI?t=9m51s

Feedback welcome!

Thank you,

Igal