Issues with inconsistent COLLATION installation

2019-09-09 Thread Cory Nemelka
We are having issues with some databases not picking up the locales we have
generated. We are using Ubuntu 18.04 and PostgreSQL 10.8.

Example:

*from bash prompt:*

$ locale -a | egrep fr
fr_BE
fr_BE@euro
fr_BE.iso88591
fr_BE.iso885915@euro
fr_BE.utf8
fr_CA
fr_CA.iso88591
fr_CA.utf8
fr_CH
fr_CH.iso88591
fr_CH.utf8
french
fr_FR
fr_FR@euro
fr_FR.iso88591
fr_FR.iso885915@euro
fr_FR.utf8
fr_LU
fr_LU@euro
fr_LU.iso88591
fr_LU.iso885915@euro
fr_LU.utf8

*from psql prompt:*

[local] cnemelka@postgres=# create collation french
(provider=libc,locale='fr_FR.utf8');
ERROR:  22023: could not create locale "fr_FR.utf8": No such file or
directory
DETAIL:  The operating system could not find any locale data for the locale
name "fr_FR.utf8".
LOCATION:  report_newlocale_failure, pg_locale.c:1312

*Is anyone having similar issues, or does anyone know of a solution?*

*TIA,*
--cnemelka


Re: Issues with inconsistent COLLATION installation

2019-09-09 Thread Cory Nemelka
We have already run pg_import_system_collations('pg_catalog')
--cnemelka
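
For reference, that import plus a quick catalog check looks roughly like this
from psql (the LIKE pattern below is only an example):

SELECT pg_import_system_collations('pg_catalog');
SELECT collname, collcollate FROM pg_collation WHERE collname LIKE 'fr_FR%';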


On Mon, Sep 9, 2019 at 12:43 PM Cory Nemelka  wrote:

> We are having issues with some databases not picking up the locales we have
> generated. We are using Ubuntu 18.04 and PostgreSQL 10.8.
>
> Example:
>
> *from bash prompt:*
>
> $ locale -a | egrep fr
> fr_BE
> fr_BE@euro
> fr_BE.iso88591
> fr_BE.iso885915@euro
> fr_BE.utf8
> fr_CA
> fr_CA.iso88591
> fr_CA.utf8
> fr_CH
> fr_CH.iso88591
> fr_CH.utf8
> french
> fr_FR
> fr_FR@euro
> fr_FR.iso88591
> fr_FR.iso885915@euro
> fr_FR.utf8
> fr_LU
> fr_LU@euro
> fr_LU.iso88591
> fr_LU.iso885915@euro
> fr_LU.utf8
>
> *from psql prompt:*
>
> [local] cnemelka@postgres=# create collation french
> (provider=libc,locale='fr_FR.utf8');
> ERROR:  22023: could not create locale "fr_FR.utf8": No such file or
> directory
> DETAIL:  The operating system could not find any locale data for the
> locale name "fr_FR.utf8".
> LOCATION:  report_newlocale_failure, pg_locale.c:1312
>
> *Is anyone having similar issues, or does anyone know of a solution?*
>
> *TIA,*
> --cnemelka
>


Re: Issues with inconsistent COLLATION installation

2019-09-09 Thread Cory Nemelka
Here is the encoding for the existing databases:
                                   List of databases
┌─────────────┬──────────┬──────────┬─────────────┬─────────────┬────────────────────────┐
│    Name     │  Owner   │ Encoding │   Collate   │    Ctype    │   Access privileges    │
├─────────────┼──────────┼──────────┼─────────────┼─────────────┼────────────────────────┤
│ thedatabase │ postgres │ UTF8     │ en_US.UTF-8 │ en_US.UTF-8 │                        │
│ postgres    │ postgres │ UTF8     │ en_US.UTF-8 │ en_US.UTF-8 │                        │
│ template0   │ postgres │ UTF8     │ en_US.UTF-8 │ en_US.UTF-8 │ =c/postgres           ↵│
│             │          │          │             │             │ postgres=CTc/postgres  │
│ template1   │ postgres │ UTF8     │ en_US.UTF-8 │ en_US.UTF-8 │ postgres=CTc/postgres ↵│
│             │          │          │             │             │ =c/postgres            │
└─────────────┴──────────┴──────────┴─────────────┴─────────────┴────────────────────────┘
--cnemelka


On Mon, Sep 9, 2019 at 12:45 PM Cory Nemelka  wrote:

> We have already run pg_import_system_collations('pg_catalog')
> --cnemelka


Re: Issues with inconsistent COLLATION installation

2019-09-09 Thread Cory Nemelka
Thank you :) I'll try restarting PostgreSQL during our next maintenance
window and report back.
--cnemelka


On Mon, Sep 9, 2019 at 3:08 PM Tom Lane  wrote:

> "Daniel Verite"  writes:
> > I can reproduce this by creating a new locale *after* starting
> > PostgreSQL and trying to use it before a restart.
>
> That is interesting.  I think it must mean that glibc's setlocale()
> and newlocale() maintain some kind of internal cache about available
> locales ... and there's no logic to flush it if /usr/share/locale
> changes.  The individual backends are probably inheriting the cache
> state via fork from the postmaster.
>
> regards, tom lane
>
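
For the archives, the sequence that follows from Daniel's and Tom's
observations would look roughly like this on Ubuntu; the exact commands are
assumptions, not taken from the thread.

*from bash prompt:*

$ sudo locale-gen fr_FR.UTF-8
$ sudo systemctl restart postgresql

*from psql prompt, after the restart:*

CREATE COLLATION french (provider = libc, locale = 'fr_FR.utf8');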


Re: pg full text search very slow for Chinese characters

2019-09-10 Thread Cory Nemelka
Well, there is a lot of information we would need to diagnose this. How much
tuning have you done? And so on.

My advice is pretty simple. Don't expect production-level performance on a
notebook. Unless you are planning on hosting it on a notebook, use the
notebook for development only, and test performance on a properly configured
and tuned server.

--cnemelka


On Tue, Sep 10, 2019 at 9:53 AM Jimmy Huang  wrote:

> Hi Team,
>
>
>
> Can anyone shed some light on why postgres 11 is extremely slow in my case?
>
>
>
> I am making a mirror of zh.wikisource.org and I have downloaded 303049
> pages and stored them in a postgres 11 database.
>
>
>
> My postgres instance is based on docker image postgres:11 and runs on my
> MacBook Pro i7 16GB.
>
>
>
> Database schema is as follows
>
>
>
> Table pages(id, url, html, downloaded, inserted_at, updated_at) and
> books(id, name, info, preface, text, html, url, parent_id, inserted_at,
> updated_at, info_html, preface_html)
>
>
>
> A wikisource web page is downloaded and its html text is inserted into
> table “pages”, column “html”.
>
> Later, books.{name, info, preface, text, html, info_html, preface_html}
> are extracted from pages.html. The text column of books is a plain-text
> version of the content of the html column of table pages.
>
>
>
> On average there are 7635 characters (each character is 3 bytes long
> because of UTF-8 encoding) in the text column of table books, and I want to
> add full text search to books(text).
>
>
>
> I tried pg_trgm and my own customized token parser
> https://github.com/huangjimmy/pg_cjk_parser
>
>
>
> To my surprise, postgres 11 is extremely slow when creating a full text
> index.
>
>
>
> I added a column of tsvector type and tried to create an index on that
> column. Pg could not finish creating a GIN index for a long time and I had
> to cancel the execution.
>
> I then tried to create a partial full text index for 500 rows and it took
> postgres 2 to 3 minutes to create the index. Based on this estimate, pg
> will need at least one day to create a full GIN full text search index for
> 303049 rows of data. I think this is ridiculously slow.
>
> When I created an FTS index on books(name) or books(info), it took just
> 3 minutes to create the index. However, name and info are extremely short
> compared to books(text).
>
>
>
> I switched to Elasticsearch and it turned out that Elasticsearch is
> extremely efficient for my case. It took Elasticsearch 3 hours to index all
> 303049 rows.
>
>
>
> Jimmy Huang
>
> jimmy_hu...@live.com
>
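
For context, the setup being described is roughly the following sketch; the
column name, index name, and the built-in 'simple' configuration are
illustrative stand-ins (Jimmy is using a custom CJK parser, not a built-in
configuration):

ALTER TABLE books ADD COLUMN text_tsv tsvector;                  -- new tsvector column
UPDATE books SET text_tsv = to_tsvector('simple', text);         -- populate it from books.text
CREATE INDEX books_text_tsv_gin ON books USING gin (text_tsv);   -- the slow step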


Re: pg full text search very slow for Chinese characters

2019-09-10 Thread Cory Nemelka
On Tue, Sep 10, 2019 at 10:11 AM Michael Lewis  wrote:

> >My postgres instance is based on docker image postgres:11 and runs on my
> MacBook Pro i7 16GB.
>
> How much RAM and so on did you give to this VM?
>
>
> >To my surprise, postgres 11 is extremely slow when creating a full text
> index. I added a column of tsvector type and tried to create an index on
> that column. Pg could not finish creating a GIN index for a long time and I
> had to cancel the execution.I then tried to create a partial full text
> index for 500 rows and it took postgres 2 to 3 minutes to create the index.
>
>
> Did you customize any config? maintenance_work_mem specifically would be
> relevant to the time to create an index, and the default value is only 64MB.
> Especially if you are running a spinning hard drive and not an SSD, this
> could be problematic.
>


I apologize for my top post. :D Won't happen again.
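
A minimal sketch of Michael's suggestion, assuming the tsvector column from
the earlier message is called text_tsv and that 1GB of maintenance memory is
available (both assumptions):

SET maintenance_work_mem = '1GB';   -- session-local bump for the index build
CREATE INDEX books_text_tsv_gin ON books USING gin (text_tsv);
RESET maintenance_work_mem;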


Re: DDL support for logical replication

2019-10-10 Thread Cory Nemelka
Another use case is installations that rely heavily on temporary tables in
queries. Since you can't run queries that create temporary tables on servers
that are binary replicas, this leaves the master (not horizontally scalable)
or logical replicas.

--cnemelka
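
For illustration, this is roughly what such a query runs into on a binary
(hot standby) replica; the table name is made up and the error text is
approximate:

CREATE TEMPORARY TABLE scratch_results (id int);
ERROR:  cannot execute CREATE TABLE in a read-only transaction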


On Thu, Oct 10, 2019 at 2:09 PM Lev Kokotov  wrote:

> Hi Miles,
>
> One issue is keeping the subscriber and the publisher schema identical.
> Running migrations on both the publisher and subscriber does not seem
> atomic to me, therefore I don't have a way to enforce consistency between
> the two. The use case is simple: schemas change all the time, and keeping
> two databases (or more!) in sync manually is tough.
>
> Another issue is migrating large databases (in the terabytes) to a new
> version of Postgres with minimal downtime. Say it takes a week or two to
> sync up the subscriber with the publisher, we have to enforce a migration
> freeze for that duration. That's often inconvenient.
>
> We often want to have a database with a subset of data of a particular
> table somewhere else, and logical replication is great, since it allows me
> to run write operations on the subscriber. Binary replication forces me to
> have two identical databases.
>
> Best,
> Lev
>
> On Thu, Oct 10, 2019 at 12:49 PM Miles Elam 
> wrote:
>
>> Hi Lev,
>>
>> While I don't have an answer to your roadmap question, you've raised a
>> different question for me.
>>
>> What are you expecting to get from logical replication of DDL commands
>> that is not served by binary replication?  I ask because typically someone
>> would want to use logical replication if they wanted triggers to fire on
>> the subscriber, wanted only a subset of all tables replicated, and so on.
>>
>> Perhaps a better question would be "What problem are you trying to
>> solve?" rather than focusing on how you expect to solve it.
>>
>>
>> Cheers,
>>
>> Miles Elam
>>
>> On Thu, Oct 10, 2019 at 11:08 AM Lev Kokotov 
>> wrote:
>>
>>> Hello,
>>>
>>> Is DDL support on a roadmap for logical replication?
>>>
>>> Thank you.
>>> - Lev
>>>
>>


Re: Writing WAL files

2020-10-10 Thread Cory Nemelka
On Sat, Oct 10, 2020 at 3:41 AM Peter J. Holzer  wrote:

> On 2020-10-05 11:29:04 -0600, Michael Lewis wrote:
> >
> > I suggest that in PG12 you can monitor the "lag" of a standby server
> > more directly by looking at columns write_lag, flush_lag, replay_lag in
> > the pg_stat_replication view.
> >
> >
> > And are those things updated when there are no changes to the master
> > database?
>
> Probably not, as there is nothing to replicate, so no new data is
> provided.
>
>
> > If you set up a scripted process to update a single row with a
> > timestamptz on the source/primary every minute, then you have a very
> > simple, consistent change and also a way to check on the replica the
> > current time vs. last_scripted_update_time and know the approximate lag.
> > It would seem like a simple, albeit hacky, solution to your wanting a
> > file every X minutes regardless of server activity.
>
> It also has the advantage that you don't have to wait for the WAL file
> to be written. You can just check whether the change appears on the
> replicas. About 2 years ago I wrote a Nagios/Icinga check that does
> that: Update a timestamp in a table on the master, then connect to all
> the replicas and wait for the change to show up on them. It then reports
> the lag for each replica and a final status (OK, WARNING, CRITICAL)
> based on the maximal lag.
>
> I think I wrote it because the PostgreSQL version we were using at the
> time didn't have the lag columns yet, but it does have the advantage of
> providing an end-to-end check (do I really get the correct value?), rather
> than just the database's idea of whether replication is working.
>
> (The check is written in Go and buried in a svn repo at work, but I
> could publish it if there is interest)
>
> hp
>
> --
>_  | Peter J. Holzer| Story must make more sense than reality.
> |_|_) ||
> | |   | h...@hjp.at |-- Charles Stross, "Creative writing
> __/   | http://www.hjp.at/ |   challenge!"
>

I would be interested in the Nagios/Icinga check you wrote.
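
In the meantime, a minimal SQL-only sketch of the same heartbeat idea (the
table and column names are assumptions, and this is not Peter's actual
check):

-- On the primary: a one-row heartbeat table, updated every minute (e.g. from cron).
CREATE TABLE IF NOT EXISTS replication_heartbeat (id int PRIMARY KEY, ts timestamptz);
INSERT INTO replication_heartbeat VALUES (1, now())
ON CONFLICT (id) DO UPDATE SET ts = EXCLUDED.ts;

-- On each replica: how far behind is the newest heartbeat we can see?
SELECT now() - ts AS approx_lag FROM replication_heartbeat WHERE id = 1;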


Re: Implicit table removal from logical replication publication

2021-06-10 Thread Cory Nemelka
On Thu, Jun 10, 2021 at 12:39 PM Vijaykumar Jain <
vijaykumarjain.git...@gmail.com> wrote:

> Wow, the DROP TABLE silently removed the entry from the publication without
> any logs.
>
> I could not find any monitoring view to help me figure out whether the
> publication is broken due to a DDL change.
> pg_stat_replication on the publisher and pg_stat_subscription on the
> subscriber only help with LSN-based lag.
> Unless this is not an issue but a feature?
> I'll look further into how best to monitor logical replication.
>
> But for your case, you can set up an event trigger that prevents the table
> from being dropped: basically any drop of a table, or any DDL, that would
> break the publication.
>
> functions-event-triggers
> event-trigger-definition
> how-use-event-triggers-postgresql
>
> You can have a custom filter that prevents accidentally dropping objects
> that are part of a publication.
>
> And when you do want to drop the table, you first have to remove it from
> the publication and then drop it.
>
> I have not run this in production, so others may chime in, but diagnosing
> logical replication issues from the logs is not the best.
> I am happy to be corrected.
> I'll update with more scenarios.
>
> Is pg_publication_tables what you are looking for?

>
>
> --
--cnemelka
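
A deliberately blunt, untested sketch of the event-trigger idea above: it
simply blocks every DROP TABLE on the publisher, leaving it to an operator to
remove the table from its publication (and disable the trigger, or use a
finer-grained check) before dropping:

-- Untested sketch: refuse DROP TABLE outright on the publisher.
CREATE OR REPLACE FUNCTION block_drop_table() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
    RAISE EXCEPTION 'DROP TABLE is blocked on this publisher; remove the table from its publication first';
END;
$$;

CREATE EVENT TRIGGER block_drop_table_trg
    ON ddl_command_start
    WHEN TAG IN ('DROP TABLE')
    EXECUTE FUNCTION block_drop_table();

-- To see which tables are currently part of a publication:
SELECT * FROM pg_publication_tables;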


Re: Logical replication from Rds into on-premise

2021-07-26 Thread Cory Nemelka
On Mon, Jul 26, 2021 at 12:06 PM Rama Krishnan  wrote:

> Hi all,
>
> I have a Postgres server on AWS RDS and I want to replicate the data
> (logical replication) into an on-premise server. I have looked at DMS,
> which provides this service, but its pricing was high. Do we have any other
> option or method to achieve this?
>
> Thanks
>
> RamaKrishnan
>

I would start here:
https://www.postgresql.org/docs/12/sql-createpublication.html

Make sure all your tables have primary keys. :D
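
A minimal sketch of the two sides, with illustrative names; note that the
matching table definitions must already exist on the subscriber, and that on
RDS logical replication has to be enabled first (the rds.logical_replication
parameter, which sets wal_level to logical):

-- On the RDS instance (publisher); the publication name and table set are examples.
CREATE PUBLICATION onprem_pub FOR ALL TABLES;

-- On the on-premise server (subscriber); connection details are placeholders.
CREATE SUBSCRIPTION onprem_sub
    CONNECTION 'host=mydb.example.rds.amazonaws.com dbname=appdb user=replicator password=secret'
    PUBLICATION onprem_pub;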