Can I tell libpq to connect to the primary?
Hi,

I'm setting up a new environment with a primary/hot standby replication pair. For read-only clients, I have found the host=host1,host2 connection string[1], which allows a connection when any one of the servers is up. However, I'm unsure how to achieve something similar for read/write clients: if I provide this list, the client will connect to the first host, which might have become the standby in the meantime.

I see that pgjdbc has additional options for targetServerType = any, primary, secondary, preferSlave and preferSecondary[2]. However, this seems to be Java-specific and not implemented in libpq? Is there a way to get this behaviour in PHP/Psycopg/Perl DBI? If not, what is the best alternative to achieve this?

My primary & standby are not in the same L3 network, so moving an IP address around with Linux OS clustering is not an option. I'm leaning towards scripting the REST API of our DNS service to point a CNAME at the new primary during promotion, but maybe there is an easier way I haven't found yet? Maybe something in pgpool/pgbouncer etc.?

Cheers
Christian

[1] https://www.postgresql.org/docs/current/libpq-connect.html
[2] https://jdbc.postgresql.org/documentation/head/connect.html
Re: Can I tell libpq to connect to the primary?
On 22.04.20 21:10, Christian Ramseyer wrote:
>
> I see that in pgjdbc there are additional options for targetServerType =
> any, primary, secondary, preferSlave and preferSecondary[2]. However
> this seems to be java-specific and not implemented in libpq? Is there a
> way to get this behaviour in PHP/Psycopg/Perl DBI?
>

Never mind, after RTFM'ing to the very end of
https://www.postgresql.org/docs/current/libpq-connect.html I have
discovered target_session_attrs=read-write|any, which seems to do
exactly what I want.

Cheers
Christian
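P.S. For the archives, a minimal sketch of the resulting connection string (host names and database are made up). libpq tries the hosts in order, and target_session_attrs=read-write rejects any server on which a read-write session cannot be established, so this connects to whichever of the two is currently the primary:

    psql "host=db1.example.com,db2.example.com dbname=app target_session_attrs=read-write"

And since Psycopg and Perl's DBD::Pg sit on top of libpq, passing the same keywords in their connection strings should work there too, provided the underlying libpq is version 10 or newer.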
Encoding conversion: Use replacement character instead of failing query with "ERROR: character with byte sequence 0xd0 0xad in encoding "UTF8" has no equivalent in encoding LATIN1" ?
Hello list

I'm slowly converting all of our databases from Postgres 9 and LATIN1 to Postgres 12 and UTF8, one by one. I was wondering if there is a solution for this issue: if a database that is still latin1 has a postgres_fdw foreign table onto a converted utf8 database, and somehow a character that is not in latin1 has already gotten in there, a query might fail with e.g.

PG9=# select * from fdw_table_test where hostname ~* 'moscow-ix-02';
ERROR:  character with byte sequence 0xd0 0xad in encoding "UTF8" has no equivalent in encoding "LATIN1"

(0xd0 0xad being CYRILLIC CAPITAL LETTER E: Э)

Can I somehow influence the client:UTF8->server:LATIN1 character set conversion so that instead of failing, it inserts an invalid codepoint character, the utf8 hex bytes as a string, drops the character, or something like that?

I do agree that the default behavior is correct, and in general failing is a lot better than mutilating or losing data in the conversion. However, in this specific case we don't care all that much about the bits in possibly foreign scripts, and just having some ? in the string would be easier to handle than one Cyrillic character in a single row failing a large import job completely.

Is there any way I can do this on the conversion level*?

Cheers
Christian

* I get that I could write views to look at the content first and filter out characters that won't work in LATIN1 before going through the fdw, but I don't quite know all the tables and columns where this can become an issue. But if you have a copy/paste-ready solution for that I'll take it as well, of course :)
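P.S. To make the footnote concrete, here is a minimal, untested sketch of the view idea (table and column names made up). The important bit is that the view has to live on the UTF-8 side, with the foreign table pointing at the view instead of the base table, because the failing conversion happens while the rows are transferred to the LATIN1 server:

    -- on the remote UTF8 database: replace anything outside the
    -- Latin-1 range (U+0001..U+00FF) with '?' before it crosses the fdw
    CREATE VIEW table_test_latin1_safe AS
    SELECT id,
           regexp_replace(hostname, '[^\u0001-\u00ff]', '?', 'g') AS hostname
    FROM table_test;

This would still have to be done per table and column, which is exactly the part I'd like to avoid.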
Re: Lock Postgres account after X number of failed logins?
On 05.05.20 16:13, Wolff, Ken L wrote:
> Hi, everyone. Wondering if there’s a way in PostgreSQL to automatically
> lock accounts after a number of failed logins (a security requirement
> for my organization).
>
> Locking accounts after X number of failed logins is an excellent way to
> defeat brute force attacks, so I’m just wondering if there’s a way to do
> this, other than the aforementioned hook.
>

Hi Ken

This doesn't seem mentioned in other replies so far: a very "unixy" approach to bolt this feature onto almost any Linux server process is the fail2ban (https://github.com/fail2ban/fail2ban) utility. This is a daemon that reads arbitrary logfiles and then triggers an action if some failure condition is seen a number of times.

Typically this will scan the logfile for an IP and on failure add a temporary firewall rule to block the source, but all of this is configurable. So in your case you can lock the account instead, and then decide if you want automatic unlocking after a while, whether you additionally want to drop the offending IP on the firewall as well, etc.

Here is a quick, rough example with still some blanks to fill in - I put it on github for readability:
<https://gist.github.com/rc9000/fd1be13b5c8820f63d982d0bf8154db1>

The main blanks are in the postgres-action.conf section. The called scripts in /usr/local/bin would need to be written. It can be as simple as "psql -c alter role xxx nologin", but you might add some features like connecting to the primary server if fail2ban triggered on the standby. Also I'm not sure if setting nologin is the best way to disable an account, but I'm sure somebody on here could tell you.

Cheers
Christian

--
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com
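P.S. Spelled out, the SQL that such lock/unlock scripts would boil down to is just the following (role name hypothetical; as said above, I'm not sure NOLOGIN is the best mechanism):

    ALTER ROLE suspicious_user NOLOGIN;  -- lock after too many failures
    ALTER ROLE suspicious_user LOGIN;    -- unlock again, e.g. from a fail2ban unban action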
Re: Abnormal Growth of Index Size - Index Size 3x large than table size.
Hi

On 06.05.20 11:48, Ram Pratap Maurya wrote:
> We are facing a problem in our PostgreSQL production database related to
> abnormal growth of index size. Some of the indexes are having abnormal
> growth and index size is larger than table data size.
>
> One table is having 75 G.B of index though table size is only 25 G.B. On
> monthly basis we are performing vacuum to release the used space.
>
> I am attaching the screen shot for your reference. Could you please help
> us in resolving the same as this is degrading performance drastically.
>

Under some usage patterns, a periodic REINDEX might be advisable. See <https://www.postgresql.org/docs/current/routine-reindex.html> for more details; it might free up a lot of space for you.

If it doesn't, you'll need to dive deeper into what these indexes actually are, whether they are really used, etc. But in cases of abnormal growth that gets worse and worse over time, the above is the first thing to try in my experience.

Cheers
Christian

--
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com
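P.S. A minimal sketch of both steps (table name hypothetical). The usage check first, then the rebuild; note that REINDEX ... CONCURRENTLY needs PostgreSQL 12 or later, on older versions a plain REINDEX takes exclusive locks:

    -- how big are the indexes, and are they used at all?
    SELECT indexrelname,
           idx_scan,
           pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
    FROM pg_stat_user_indexes
    WHERE relname = 'mytable';

    -- rebuild all indexes of the table
    REINDEX TABLE CONCURRENTLY mytable;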
Re: Encoding conversion: Use replacement character instead of failing query with "ERROR: character with byte sequence 0xd0 0xad in encoding "UTF8" has no equivalent in encoding LATIN1" ?
On 06.05.20 02:00, Tom Lane wrote:
> Christian Ramseyer writes:
>> Can I somehow influence the client:UTF8->server:LATIN1 character set
>> conversion so that instead of failing, it inserts an invalid codepoint
>> character, the utf8 hex bytes as string, drops the character or
>> something like that?
>
> There's nothing built-in for that, but it seems like it wouldn't be
> hard to modify the code if you wanted a quick hack to do this.
>
> In general, the system nominally supports multiple conversion functions
> per encoding pair, so you could imagine having an alternate conversion
> that doesn't throw errors. Problem is that it's quite difficult to get
> the system to actually *use* a non-default conversion for anything really
> significant, like say client I/O. I don't know that anyone's thought
> hard about how to improve that.
>

Thanks Tom, that's basically how I suspected it was, but I wanted to make sure I'm not overlooking an easy workaround with a simple "create conversion" or similar.

I really appreciate the quick answers from highly qualified people I'm getting on here, without exception. If only "enterprise" product support worked like that :)

Cheers
Christian

--
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com
Re: Lock Postgres account after X number of failed logins?
On 06.05.20 13:48, Guillaume Lelarge wrote:
> Le mer. 6 mai 2020 à 04:18, Christian Ramseyer <r...@networkz.ch> a écrit :
>
>> Here is a quick, rough example with still some blanks to fill in - I put
>> it on github for readability:
>> <https://gist.github.com/rc9000/fd1be13b5c8820f63d982d0bf8154db1>
>>
>> The main blanks are in the postgres-action.conf section. The called
>> scripts in /usr/local/bin would need to be written. It can be as simple
>> as "psql -c alter role xxx nologin", but you might add some features
>> like connecting to the primary server if fail2ban triggered on the
>> standby. Also I'm not sure if setting nologin is the best way to disable
>> an account, but I'm sure somebody on here could tell you.
>
> I already knew about fail2ban, but didn't know it could be set up this
> way. That's pretty impressive. I've just finished testing your config
> files, and it works really well (well, when you finally get rid of the
> selinux permission errors :) ). Anyway, thanks a lot for sharing this.
>

Thanks for trying it out and the kind words, Guillaume & Ken!

There are some rough corners; I think to make it useful we would need to do at least:

1. Write reasonable scripts for account locking/unlocking.

2. Currently the lockout will also be executed for non-existing user names and thus make the DoS worse, so we'd need a smart solution for that (a config file with valid users, cached queries into PG from time to time to get the existing users, or just being smarter when parsing the log's DETAIL line).

3. Examples of how to combine this with https://www.postgresql.org/docs/current/auth-delay.html and/or firewall drops, so that an attacker gets slowed down (a minimal example is in the P.S. below). Even if the account is locked already, the system will still be harmed otherwise.

I'm happy to host this project if it helps enterprise adoption of Postgres. I've converted the gist into an actual repository, and you're all very welcome to become contributors: https://github.com/rc9000/postgres-fail2ban-lockout

Cheers
Christian

--
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com
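P.S. For item 3, the auth_delay part is only two lines in postgresql.conf (the 500 ms value is an arbitrary example, and changing shared_preload_libraries requires a server restart):

    shared_preload_libraries = 'auth_delay'
    auth_delay.milliseconds = 500   # pause before reporting each authentication failure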
Re: Removing Last field from CSV string
On 16.05.20 17:18, Alex Magnum wrote:
> Now I try to remove the last field and comma ",Class"
>
> To get  Class V,Class VI,Class VII,Competitive Exam,Class VIII
>
> Is there a function or easy way to do this?
> Any help would be appreciated.
>

Hi Alex

Many options to do this with regexp_replace, here's one way:

with test as (
    select 'Class VII,Competitive Exam,Class VIII,Class' as str
    union
    select 'Class VIIx,Competitive Exam22,Class VIIIabc,Classx'
)
select str, regexp_replace(str, '^(.*),(.*?)$', '\1') res
from test;

|str                                                |res                                         |
|---------------------------------------------------|--------------------------------------------|
|Class VII,Competitive Exam,Class VIII,Class        |Class VII,Competitive Exam,Class VIII       |
|Class VIIx,Competitive Exam22,Class VIIIabc,Classx |Class VIIx,Competitive Exam22,Class VIIIabc |

(I cut some columns at the start to better fit email width)

Cheers
Christian

--
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com
Re: Slow SELECT
Hi

On 26.05.20 09:22, Frank Millman wrote:
>
> I have looked at the EXPLAIN, but I don't really know what to look for.
> I can supply it if that would help.
>

My favorite approach to tuning Postgres queries is:

1. Run EXPLAIN ANALYZE (a minimal example is in the P.S. below).
2. Copy/paste the output into the fantastic https://explain.depesz.com/

This will turn the somewhat hard-to-understand explain output into a nice colored structure. If it's not obvious from the orange-reddish boxes where the slowness comes from, please post the link here and somebody will certainly have some advice.

Cheers
Christian

--
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com
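P.S. In case EXPLAIN ANALYZE is new to you: you simply prefix the slow query with it, e.g. (query is made up; BUFFERS is optional but often helpful):

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT *
    FROM orders
    WHERE customer_id = 42;

Note that ANALYZE actually executes the query, so be careful with INSERT/UPDATE/DELETE statements (wrap them in a transaction you roll back).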
Re: Cluster for an appliance-type deployment
On 06.11.23 20:26, Matthias Leisi wrote:
> Dear all,
>
> I’m reasonably experienced with Postgres with simple (single, „rebuild
> and restore“) requirements, but would need some hints on what to look for
> in a more complex situation - deploying Postgres as the backend for a
> (virtual) appliance.
>
> This appliance can scale horizontally from a single to dozens of VMs
> (theoretically more, but most installations are small-ish). It is
> feasible to configure VMs for particular purposes (eg „you are [also] a
> DB node“), but basically all instances will/should be able to perform
> their main tasks besides (also) being a DB node.
>
> As the VMs may be installed in very different environments, network-based
> solutions are less feasible and we would prefer a DB-level solution. We
> assume that for most cases, primary/stand-by configurations would be
> sufficient in terms of availability / latency / throughput.
>
> We must also assume that there is no person who would be able to touch
> things if an error occurs. Data consistency and (as much as possible)
> automated recovery from error situations („VM down“, „network lost“, …)
> are therefore more important than „n nines“.
>
> We can assume that the VMs can talk to each other over TCP (eg using SSH
> tunnels, direct Postgres connection, or some other suitable protocol).
> Scripting „around“ the database is available to initialize instances and
> for similar tasks.
>
> Would Postgres’ own log-shipping (file-based + streaming replication,
> possibly with remote_write) be sufficient for such a set of requirements?
> What aspects would you consider important for such a scenario?
>

The replication that ships with Postgres gives you one writeable primary server and a number of standbys, but it has no tools to automatically discover or recover from failure. From https://www.postgresql.org/docs/current/warm-standby-failover.html:

---
PostgreSQL does not provide the system software required to identify a failure on the primary and notify the standby database server. Many such tools exist and are well integrated with the operating system facilities required for successful failover, such as IP address migration.

Once failover to the standby occurs, there is only a single server in operation. This is known as a degenerate state. The former standby is now the primary, but the former primary is down and might stay down. To return to normal operation, a standby server must be recreated, either on the former primary system when it comes up, or on a third, possibly new, system. The pg_rewind utility can be used to speed up this process on large clusters. Once complete, the primary and standby can be considered to have switched roles. Some people choose to use a third server to provide backup for the new primary until the new standby server is recreated, though clearly this complicates the system configuration and operational processes.
---

So you need to add additional parts from the ecosystem to detect failure, handle failover, potentially move an IP address with the primary, etc. Popular tools are repmgr, Patroni, CloudNativePG, BDR, pacemaker+corosync and endless others. They will address many of your requirements, but some work and understanding is still required to make them fully unattended, as well as to have them expand dynamically with new replicas when an appliance is added.

Postgres is an amazing product and I like to use it for almost everything, but in this scenario we are of course making our life hard with ACID compliance.
In NoSQL/"eventual consistency" land, there are products that are a lot friendlier to a setup like this - stuff like Cassandra, etcd or CouchDB comes to mind. I'd compare the pros and cons of such alternatives, the big con of course being much weaker consistency and durability guarantees - the question is whether the application needs them. But enough about non-Postgres topics on this list :)

To go with Postgres, on a hunch I'd try Patroni first: it does a lot of the advanced failover stuff, has a great track record, and supposedly runs on BSD: https://openports.pl/path/databases/patroni (a bare-bones config sketch is in the P.S. below).

Cheers
Christian

--
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com
Phone: +41 79 644 77 64
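P.S. To give an idea of what "some work and understanding" means in practice, here is a heavily abbreviated, hypothetical patroni.yml sketch (all names and addresses are made up, etcd is just one possible DCS, and a real file additionally needs authentication and bootstrap settings):

    scope: appliance-cluster
    name: node1

    restapi:
      listen: 0.0.0.0:8008
      connect_address: 10.0.0.1:8008

    etcd:
      host: 10.0.0.10:2379

    postgresql:
      listen: 0.0.0.0:5432
      connect_address: 10.0.0.1:5432
      data_dir: /var/lib/postgresql/data

Patroni then handles promotion and demotion itself: the agent on each node decides via the DCS (etcd here) who the current primary is, which covers the "no person available to touch things" requirement better than plain log-shipping.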
Re: postgresql order lowercase before uppercase
On 19.03.21 10:12, basti wrote:
> On 19.03.21 08:33, Laurenz Albe wrote:
>> On Thu, 2021-03-18 at 23:51 +0100, basti wrote:
>>> Am 18.03.21 um 17:19 schrieb Laurenz Albe:
>>>> On Thu, 2021-03-18 at 15:39 +0100, basti wrote:
>>>>> I need to as follow:
>>>>>
>>>>> ande
>>>>> Amma
>>>>> Anit
>>>>> Anti
>>>>> Brac
>>>>> Cali
>>>>>

Laurenz' approach is sound, it just needs a little tweak to not trip up on the "andere Marken" uppercase M. Try this:

select id, marke.name
from marke
ORDER BY left(marke.name, 1) <> left(lower(marke.name), 1), marke.name;

Fiddle: http://sqlfiddle.com/#!17/d9d83e/9

Cheers
Christian

--
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com
Re: PostgreSQL CHECK Constraint
On 03.10.21 09:31, Shaozhong SHI wrote:
>
> Has anyone got experience with data quality checking, validation and
> reporting within PostgreSQL?
>
> How best to use the PostgreSQL CHECK constraint for data quality
> checking, validation and reporting?
>
> Can we report on errors in a detailed and specific way? For instance,
> can we produce a report on specific issues of erroneous cells, in which
> row etc.?
>

Yes, that's all possible. Given a constraint like

alter table test_customers add constraint check_age check (age >= 18);

the reported error looks like this:

postgres@dellstore ERROR:  new row for relation "test_customers" violates check constraint "check_age"
postgres@dellstore DETAIL:  Failing row contains (1, Jimmy, Schmoe, 15).
postgres@dellstore STATEMENT:  insert into test_customers (firstname, lastname, age) values ( 'Jimmy', 'Schmoe', 15);

This error appears in the server log, which has many format and forwarding options; you can read about them here: https://www.postgresql.org/docs/current/runtime-config-logging.html

Cheers
Christian

--
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com
Re: PostgreSQL CHECK Constraint
> The reported error looks like this:
>
> postgres@dellstore ERROR:  new row for relation "test_customers"
> violates check constraint "check_age" ..
>
> This error appears in the server log, which has many format and
> forwarding options; you can read about them here:

On 03.10.21 20:16, Shaozhong SHI wrote:
> Hi, Christian,
> That is interesting. Can errors be captured and saved as data with
> scripting?

Yes, that works in much the same way; e.g. in Python you can do:

import psycopg2

conn = psycopg2.connect("dbname=dellstore")
cur = conn.cursor()

try:
    cur.execute("insert into test_customers (firstname, lastname, age)"
                " values (%s, %s, %s)", ("Bobby", "Tables", 10))
except psycopg2.errors.CheckViolation as e:
    print(f"That didn't work: {e.cursor.query} failed")
    print(f"{e.pgerror}")

HTH
Re: pg_trgm vs. Solr ngram
On 10.02.23 04:48, Laurenz Albe wrote:
> On Fri, 2023-02-10 at 03:20 +0100, Chris wrote:
>> In Solr I was using ngrams and customized the TokenizerFactories until
>> more or less only whitespace was as separator, while [.:-_\d] remains
>> part of the ngrams. This allows to search for ".12.255/32" or
>> "xzy-eth5.example.org" without any false positives. It looks like a
>> straight conversion of this method is not possible
>
> Here is a hack that you can try: pre-process your strings and replace
> symbols with rare characters:
>
>    SELECT show_trgm(translate('127.0.0.1/32', './', 'qx'));
>
>                            show_trgm
>    ═════════════════════════════════════════════════════════
>     {"  1"," 12",0q0,0q1,127,1x3,27q,"32 ",7q0,q0q,q1x,x32}
>    (1 row)
>
> Then you could search like
>
>    WHERE translate(search_string, './', 'qx') LIKE
>          translate('%127.0.0.1/32%', './', 'qx')
>      AND search_string LIKE '%127.0.0.1/32%'
>
> The first condition can use a trigram index, and the second filters
> out false positives.

Hehe, that is a nifty idea. I went to try this, but then it turned out that I was probably overthinking the whole issue already. Using a gist_trgm_ops index and % as operator works perfectly well:

create table docs (orig text);  -- added here for completeness
insert into docs (orig) values ('120.2.10.22');
insert into docs (orig) values ('120 2 10 22');
CREATE INDEX iorig ON docs USING GIST (orig gist_trgm_ops);

set enable_seqscan = off;
explain analyze verbose select * from docs where orig like '%.10.22%';

Index Scan using iorig on public.docs  (cost=0.14..8.16 rows=1 width=32) (actual time=0.952..1.018 rows=1 loops=1)
  Output: orig
  Index Cond: (docs.orig ~~ '%.10.22%'::text)
  Rows Removed by Index Recheck: 1

Even though this query has the same trigrams as e.g. '% 10 22%', the index recheck takes care of it and only the matching row is returned. Excellent - I'm not quite sure why I was expecting false positives in the first place; it would be a pretty stark violation of how % is supposed to behave.

Not quite sure how big the performance hit of not having the optimal trigrams with punctuation in the index and rechecking some hits will be, but for now I'll assume it's negligible. Otherwise I'll try the translate variant.

Many thanks!