Can I tell libpq to connect to the primary?

2020-04-22 Thread Christian Ramseyer
Hi

I'm setting up a new environment with a primary/hot standby replication
pair.

For read-only clients, I have found the host=host1,host2 connection
string[1] which allows a connection when any one of the servers is up.

However I'm unsure how to achieve something similar for read/write
clients. If I'm providing this list, the client will connect to the
first host, which might have become the standby in the meantime.

I see that in pgjdbc there are additional options for targetServerType =
any, primary, secondary, preferSlave and preferSecondary[2]. However,
these seem to be Java-specific and not implemented in libpq. Is there a
way to get this behaviour in PHP/Psycopg/Perl DBI?

If not, what is the best alternative to achieve this? My primary &
standby are not in the same L3 network, so moving an IP address around
with Linux OS clustering is not an option. I'm leaning towards scripting
the REST API of our DNS service to point a CNAME at the new primary
during promotion, but maybe there is an easier way I haven't found yet?
Maybe something in pgpool/pgbouncer etc.?


Cheers
Christian

[1] https://www.postgresql.org/docs/current/libpq-connect.html
[2] https://jdbc.postgresql.org/documentation/head/connect.html




Re: Can I tell libpq to connect to the primary?

2020-04-22 Thread Christian Ramseyer
On 22.04.20 21:10, Christian Ramseyer wrote:
> 
> I see that in pgjdbc there are additional options for targetServerType =
> any, primary, secondary, preferSlave and preferSecondary[2]. However,
> these seem to be Java-specific and not implemented in libpq. Is there a
> way to get this behaviour in PHP/Psycopg/Perl DBI?
> 

Never mind: after RTFM'ing to the very end of
https://www.postgresql.org/docs/current/libpq-connect.html I have
discovered target_session_attrs=read-write|any, which seems to do
exactly what I want.
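
For the archives, a minimal example (host names as in my original mail,
dbname/user just assumed for illustration):

    host=host1,host2 dbname=mydb user=app target_session_attrs=read-write

With this conninfo string, libpq tries the listed hosts in order and only
accepts a server that allows read-write transactions, i.e. the current
primary; read-only clients can keep target_session_attrs=any. And since
Psycopg and Perl's DBD::Pg are built on libpq, the same string works
there as well.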

Cheers
Christian




Encoding conversion: Use replacement character instead of failing query with "ERROR: character with byte sequence 0xd0 0xad in encoding "UTF8" has no equivalent in encoding LATIN1" ?

2020-05-05 Thread Christian Ramseyer
Hello list

I'm slowly converting all of our databases from Postgres 9 and LATIN1 to
Postgres 12 and UTF8, one by one.

I was wondering if there is a solution for this issue: if a database
that is still latin1 has a postgres_fdw foreign table onto a converted
utf8 database, and somehow a character that is not in latin1 has already
gotten in there, a query might fail with e.g.

PG9=# select * from fdw_table_test where hostname ~* 'moscow-ix-02';
ERROR:  character with byte sequence 0xd0 0xad in encoding "UTF8" has no
equivalent in encoding "LATIN1"
-- (0xd0 0xad being CYRILLIC CAPITAL LETTER E: Э)

Can I somehow influence the client:UTF8->server:LATIN1 character set
conversion so that instead of failing, it inserts a replacement
character, inserts the UTF-8 hex bytes as a string, drops the
character, or something like that?

I do agree that the default behavior is correct and in general failing
is a lot better than mutilating or losing data in the conversion.
However in this specific case we don't care all that much about the bits
in possibly foreign scripts, and just having some ? in the string would
be easier to handle than one Cyrillic character in a single row failing
a large import job completely.

Is there any way I can do this at the conversion level*?

Cheers
Christian


* I get that I could somehow write views to look at the content first
and filter out characters that won't work in LATIN1 before going through
the FDW, but I don't quite know all the tables and columns where this
can become an issue. If you have a copy/paste-ready solution for that,
I'll of course take it as well :)




Re: Lock Postgres account after X number of failed logins?

2020-05-05 Thread Christian Ramseyer



On 05.05.20 16:13, Wolff, Ken L wrote:
> Hi, everyone.  Wondering if there’s a way in PostgreSQL to automatically
> lock accounts after a number of failed logins (a security requirement
> for my organization).  
> 
> Locking accounts after X number of failed logins is an excellent way to
> defeat brute force attacks, so I’m just wondering if there’s a way to do
> this, other than the aforementioned hook.
> 
>  

Hi Ken

This doesn't seem to have been mentioned in other replies so far: a very
"unixy" approach to bolt this feature onto almost any Linux server
process is the fail2ban utility (https://github.com/fail2ban/fail2ban).
This is a daemon that reads arbitrary logfiles and triggers an action
when some failure condition is seen a number of times.

Typically it will scan the logfile for an IP address and, on repeated
failures, add a temporary firewall rule to block the source, but all of
this is configurable. So in your case you could lock the account
instead, and then decide whether you want automatic unlocking after a
while, whether you additionally want to drop the offending IP at the
firewall, etc.

Here is a quick, rough example with still some blanks to fill in - I put
it on github for readability:
<https://gist.github.com/rc9000/fd1be13b5c8820f63d982d0bf8154db1>

The main blanks are in the postgres-action.conf section. The called
scripts in /usr/local/bin would still need to be written. They can be as
simple as a psql one-liner that sets the role to NOLOGIN (a minimal
sketch follows below), but you might add some features like connecting
to the primary server if fail2ban triggered on the standby. Also, I'm
not sure if setting NOLOGIN is the best way to disable an account, but
I'm sure somebody on here could tell you.
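
A minimal sketch of what the lock/unlock scripts could run, with
some_user standing in for the account name that fail2ban extracts from
the log:

ALTER ROLE some_user NOLOGIN;  -- lock the account after too many failures
ALTER ROLE some_user LOGIN;    -- unlock it again later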

Cheers
Christian

-- 
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com






Re: Abnormal Growth of Index Size - Index Size 3x large than table size.

2020-05-06 Thread Christian Ramseyer
Hi

On 06.05.20 11:48, Ram Pratap Maurya wrote:

> We are facing a problem in our PostgreSQL production database related to
> abnormal growth of index size. Some of the indexes are having abnormal
> growth and index size is larger than table data size.
> 
> One table is having 75 G.B of index though table size is only 25 G.B. On
> monthly basis we are performing vacuum to release the used space.
> 
> 
> I am attaching the screen shot for your reference. Could you please help
> us in resolving the same as this is degrading performance drastically.
> 

Under some usage patterns, a periodic REINDEX might be advisable. See
<https://www.postgresql.org/docs/current/routine-reindex.html> for more
details; it might free up a lot of space for you.
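
For example (my_big_table being a placeholder for your table; REINDEX
... CONCURRENTLY requires PostgreSQL 12 or later, older versions only
have the plain, write-blocking REINDEX):

REINDEX TABLE CONCURRENTLY my_big_table;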

If it doesn't, you'll need to dive deeper into what these indexes
actually are, whether they are really used, etc. But in cases of
abnormal growth that gets worse and worse over time, the above is the
first thing to try in my experience.


Cheers
Christian



-- 
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com











Re: Encoding conversion: Use replacement character instead of failing query with "ERROR: character with byte sequence 0xd0 0xad in encoding "UTF8" has no equivalent in encoding LATIN1" ?

2020-05-06 Thread Christian Ramseyer



On 06.05.20 02:00, Tom Lane wrote:
> Christian Ramseyer  writes:
>> Can I somehow influence the client:UTF8->server:LATIN1 character set
>> conversion so that instead of failing, it inserts a replacement
>> character, inserts the UTF-8 hex bytes as a string, drops the
>> character, or something like that?
> 
> There's nothing built-in for that, but it seems like it wouldn't be
> hard to modify the code if you wanted a quick hack to do this.
> 
> In general, the system nominally supports multiple conversion functions
> per encoding pair, so you could imagine having an alternate conversion
> that doesn't throw errors.  Problem is that it's quite difficult to get
> the system to actually *use* a non-default conversion for anything really
> significant, like say client I/O.  I don't know that anyone's thought
> hard about how to improve that.
> 

Thanks Tom, that's basically how I suspected it was, but I wanted to
make sure I wasn't overlooking an easy workaround with a simple
CREATE CONVERSION or similar.

I really appreciate the quick answers from highly qualified people that
I'm getting on here, without exception. If only "enterprise" product
support worked like that :)


Cheers
Christian


-- 
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com











Re: Lock Postgres account after X number of failed logins?

2020-05-07 Thread Christian Ramseyer



On 06.05.20 13:48, Guillaume Lelarge wrote:
> On Wed, May 6, 2020 at 04:18, Christian Ramseyer <r...@networkz.ch> wrote:
> 
> Here is a quick, rough example with still some blanks to fill in - I put
> it on github for readability:
> <https://gist.github.com/rc9000/fd1be13b5c8820f63d982d0bf8154db1>
> 
> The main blanks are in the postgres-action.conf section. The called
> scripts in /usr/local/bin would need to be written. It can be as simple
> as "psql -c alter role xxx nologin", but you might add some features
> like connecting to the primary server if fail2ban triggered on the
> standby. Also I'm not sure if setting nologin is the best way to disable
> an account, but I'm sure somebody on here could tell you.
> 
> 
> I already knew about fail2ban, but didn't know it could be set up this
> way. That's pretty impressive. I've just finished testing your config
> files, and it works really well (well, when you finally get rid of the
> selinux permission errors :) ). Anyway, thanks a lot for sharing this.
> 

Thanks for trying it out and the kind words, Guillaume & Ken !

There are some rough corners; I think to make it useful we would need
to do at least the following:

1. Write reasonable scripts for account locking/unlocking

2. Currently the lockout will also be executed for non-existing user
names and thus makes the DoS potential worse, so we'd need a smarter
solution for that: a config file with valid users, cached queries into
PG from time to time to get the existing users (see the sketch after
this list), or just being smarter about parsing the log's DETAIL line

3. Examples of how to combine this with
https://www.postgresql.org/docs/current/auth-delay.html and/or firewall
drops, so that an attacker gets slowed down. Otherwise the system can
still be hammered even while the account is locked.
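
The query to get the valid user names could be as simple as this (how
and where the fail2ban helper caches the result is left open):

SELECT rolname FROM pg_roles WHERE rolcanlogin;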


I'm happy to host this project if it helps enterprise adoption of
Postgres. I've converted the gist into an actual repository, and you're
all very welcome to become contributors:
https://github.com/rc9000/postgres-fail2ban-lockout

Cheers
Christian


-- 
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com







Re: Removing Last field from CSV string

2020-05-16 Thread Christian Ramseyer



On 16.05.20 17:18, Alex Magnum wrote:

> Now I try to remove the last  field and comma  ",Class"
> 
> To get  Class V,Class VI,Class VII,Competitive Exam,Class VIII
> 
> Is there a function or easy way to do this?
> Any help would be appreciated.
> 

Hi Alex

Many options to do this with regexp_replace, here's one way:


with test as (
select 'Class VII,Competitive Exam,Class VIII,Class' as str
union
select 'Class VIIx,Competitive Exam22,Class VIIIabc,Classx'
)
select str, regexp_replace(str, '^(.*),(.*?)$', '\1') res from test;


|str                                               |res                                         |
|--------------------------------------------------|--------------------------------------------|
|Class VII,Competitive Exam,Class VIII,Class       |Class VII,Competitive Exam,Class VIII       |
|Class VIIx,Competitive Exam22,Class VIIIabc,Classx|Class VIIx,Competitive Exam22,Class VIIIabc |


(I cut some columns at the start to better fit email width)
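
An equivalent variant, in case the backreference feels like overkill:
just delete the final comma and everything after it (the test string is
the one from your mail):

select regexp_replace(
  'Class V,Class VI,Class VII,Competitive Exam,Class VIII,Class',
  ',[^,]*$', '');

-- result: Class V,Class VI,Class VII,Competitive Exam,Class VIII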

Cheers
Christian


-- 
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com









Re: Slow SELECT

2020-05-26 Thread Christian Ramseyer
Hi

On 26.05.20 09:22, Frank Millman wrote:
> 
> I have looked at the EXPLAIN, but I don't really know what to look for.
> I can supply it if that would help.
> 

My favorite approach to tuning Postgres queries is:

1. Run EXPLAIN ANALYZE on the slow query (see the example below)
2. Copy/paste the output into the fantastic https://explain.depesz.com/

This will turn the somewhat hard-to-understand explain output into a
nice colored structure. If it's not obvious from the orange-reddish
boxes where the slowness comes from, please post the link here and
somebody will certainly have some advice.
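
For step 1, the invocation would look something like this (orders and
customer_id are placeholders for your actual query; the BUFFERS option
is optional but often helpful):

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42;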

Cheers
Christian



-- 
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com







Re: Cluster for an appliance-type deployment

2023-11-07 Thread Christian Ramseyer




On 06.11.23 20:26, Matthias Leisi wrote:

> Dear all,
>
> I’m reasonably experienced with Postgres with simple (single, „rebuild
> and restore“) requirements, but would need some hints on what to look
> for in a more complex situation - deploying Postgres as the backend for
> a (virtual) appliance.
>
> This appliance can scale horizontally from a single to dozens of VMs
> (theoretically more, but most installations are small-ish). It is
> feasible to configure VMs for particular purposes (eg „you are [also] a
> DB node“), but basically all instances will/should be able to perform
> their main tasks besides (also) being a DB node. As the VMs may be
> installed in very different environments, network-based solutions are
> less feasible and we would prefer a DB-level solution. We assume that
> for most cases, primary/stand-by configurations would be sufficient in
> terms of availability / latency / throughput.
>
> We must also assume that there is no person who would be able to touch
> things if an error occurs. Data consistency and (as much as possible)
> automated recovery from error situations („VM down“, „network lost“, …)
> are therefore more important than „n nines". We can assume that the VMs
> can talk to each other over TCP (eg using SSH tunnels, direct Postgres
> connection, or some other suitable protocol). Scripting „around“ the
> database is available to initialize instances and for similar tasks.
>
> Would Postgres’ own log-shipping (file-based + streaming replication,
> possibly with remote_write) be sufficient for such a set of requirements?
>
> What aspects would you consider important for such a scenario?



The replication that ships with Postgres gives you one writeable primary 
server and a number of standbys, but it has no tools to automatically 
discover or recover from failure. From 
https://www.postgresql.org/docs/current/warm-standby-failover.html:


---

PostgreSQL does not provide the system software required to identify a 
failure on the primary and notify the standby database server. Many such 
tools exist and are well integrated with the operating system facilities 
required for successful failover, such as IP address migration.


Once failover to the standby occurs, there is only a single server in 
operation. This is known as a degenerate state. The former standby is 
now the primary, but the former primary is down and might stay down. To 
return to normal operation, a standby server must be recreated, either 
on the former primary system when it comes up, or on a third, possibly 
new, system. The pg_rewind utility can be used to speed up this process 
on large clusters. Once complete, the primary and standby can be 
considered to have switched roles. Some people choose to use a third 
server to provide backup for the new primary until the new standby 
server is recreated, though clearly this complicates the system 
configuration and operational processes.


---

So you need to add additional parts from the ecosystem to detect 
failure, handle failover, potentially move an IP address with the 
primary, etc. Popular tools are repmgr, Patroni, CloudNativePG, BDR, 
pacemaker+corosync and endless others. They will address many of your 
requirements, but some work and understanding is still required to make 
them fully unattended, and to have them expand dynamically with new 
replicas when an appliance is added.
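
The built-in building block underneath all of these is small, for
comparison: a standby is essentially a base backup of the primary plus
an empty standby.signal file and a primary_conninfo setting, roughly
like this (hostname and user are placeholders, PostgreSQL 12 or later):

# postgresql.conf on the standby
primary_conninfo = 'host=primary-vm user=replicator application_name=standby1'

Everything beyond that (failure detection, promotion, fencing) is what
the tools above add.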


Postgres is an amazing product and I like to use it for almost 
everything, but in this scenario we are of course making our lives hard 
with ACID compliance. In NoSQL/"eventual consistency" land, there are 
products that are a lot friendlier to a setup like this: Cassandra, 
etcd and CouchDB come to mind. I'd compare the pros and cons of such 
alternatives, the big con of course being much weaker consistency and 
durability guarantees; the question is whether the application needs 
them.


But enough about non-Postgres topics on this list :) To go with 
Postgres, on a hunch I'd try Patroni first: it does a lot of the 
advanced failover stuff, has a great track record, and supposedly runs 
on BSD: https://openports.pl/path/databases/patroni



Cheers
Christian


--
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com
Phone: +41 79 644 77 64




Re: postgresql order lowercase before uppercase

2021-03-19 Thread Christian Ramseyer



On 19.03.21 10:12, basti wrote:
> On 19.03.21 08:33, Laurenz Albe wrote:
>> On Thu, 2021-03-18 at 23:51 +0100, basti wrote:
>>> Am 18.03.21 um 17:19 schrieb Laurenz Albe:
>>>> On Thu, 2021-03-18 at 15:39 +0100, basti wrote:
>>>>> I need to as follow:
>>>>>
>>>>> ande
>>>>> Amma
>>>>> Anit
>>>>> Anti
>>>>> Brac
>>>>> Cali


Laurenz' approach is sound, it just needs a little tweak so it doesn't
trip up on the uppercase M in "andere Marken". Try this:

select id, marke.name
from marke
order by
  left(marke.name, 1) <> left(lower(marke.name), 1),
  marke.name;

The extra sort expression is false for names that already start with a
lowercase letter and true for the rest, and false sorts before true, so
all lowercase-initial names come first.

Fiddle:
http://sqlfiddle.com/#!17/d9d83e/9

Cheers
Christian

-- 
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com




Re: PostgreSQL CHECK Constraint

2021-10-03 Thread Christian Ramseyer



On 03.10.21 09:31, Shaozhong SHI wrote:
> 
> Has anyone got experience with data quality checking, validation and
> reporting within PostgreSQL?
> 
> How best to use PostgreSQL CHECK Constraint for data quality checking,
> validation and reporting?  
> 
> Can we report on errors in a detailed and specific way?  For instance,
> can we produce report on specific issues of erroneous cells in which row
> and etc.?
> 

Yes that's all possible. Given a constraint like

alter table test_customers
   add constraint check_age check (age >= 18);


The reported error looks like this:

postgres@dellstore ERROR:  new row for relation "test_customers"
violates check constraint "check_age"

postgres@dellstore DETAIL:  Failing row contains (1, Jimmy, Schmoe, 15).

postgres@dellstore STATEMENT:  insert into test_customers (firstname,
lastname, age) values ( 'Jimmy', 'Schmoe', 15);

These errors appear in the server log, which has many formatting and
forwarding options; you can read about them here:

https://www.postgresql.org/docs/current/runtime-config-logging.html
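
And if you want the report as data rather than log messages, you can
also query for the offending rows directly, reusing the check
expression from above:

select * from test_customers where not (age >= 18);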

Cheers
Christian

-- 
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com





Re: PostgreSQL CHECK Constraint

2021-10-03 Thread Christian Ramseyer
> The reported error looks like this:
>
> postgres@dellstore ERROR:  new row for relation "test_customers"
> violates check constraint "check_age"
..
>
> These errors appear in the server log, which has many formatting and
> forwarding options; you can read about them here:

On 03.10.21 20:16, Shaozhong SHI wrote:
> Hi, Christian,
> That is interesting.  Can errors be captured and saved as data with
> scripting?

Yes, that works in quite the same way. For example, in Python with
psycopg2 (the connection parameters are just an example):

import psycopg2

conn = psycopg2.connect("dbname=dellstore")
cur = conn.cursor()
try:
    cur.execute(
        "insert into test_customers (firstname, lastname, age) values (%s, %s, %s)",
        ("Bobby", "Tables", 10))
except psycopg2.errors.CheckViolation as e:
    print(f"That didn't work: {e.cursor.query} failed")
    print(f"{e.pgerror}")


HTH




Re: pg_trgm vs. Solr ngram

2023-02-11 Thread Christian Ramseyer (mx04)

On 10.02.23 04:48, Laurenz Albe wrote:
> On Fri, 2023-02-10 at 03:20 +0100, Chris wrote:
>> In Solr I was using ngrams and customized the
>> TokenizerFactories until more or less only whitespace was as separator,
>> while [.:-_\d] remains part of the ngrams. This allows to search for
>> ".12.255/32" or "xzy-eth5.example.org" without any false positives.
>>
>> It looks like a straight conversion of this method is not possible
> 
> Here is a hack that you can try: pre-process your strings and replace
> symbols with rare characters:
> 
>    SELECT show_trgm(translate('127.0.0.1/32', './', 'qx'));
> 
>                          show_trgm
>    ══════════════════════════════════════════════════════
>    {"  1"," 12",0q0,0q1,127,1x3,27q,"32 ",7q0,q0q,q1x,x32}
>    (1 row)
> 
> Then you could search like
> 
>    WHERE translate(search_string, './', 'qx')
>          LIKE translate('%127.0.0.1/32%', './', 'qx')
>      AND search_string LIKE '%127.0.0.1/32%'
> 
> The first condition can use a trigram index, and the second filters out
> false positives.



Hehe, that is a nifty idea. I went to try this, but then it turned out 
that I was probably overthinking the whole issue. Using a gist_trgm_ops 
index and a plain LIKE works perfectly well:

-- (assuming a table like: create table docs (orig text);)
insert into docs (orig) values ('120.2.10.22');
insert into docs (orig) values ('120 2 10 22');
CREATE INDEX iorig ON docs USING GIST (orig gist_trgm_ops);

set enable_seqscan = off;
explain analyze verbose select * from docs where orig like '%.10.22%';

Index Scan using iorig on public.docs  (cost=0.14..8.16 rows=1 width=32)
                       (actual time=0.952..1.018 rows=1 loops=1)
  Output: orig
  Index Cond: (docs.orig ~~ '%.10.22%'::text)
  Rows Removed by Index Recheck: 1

Even though this query produces the same trigrams as e.g. '% 10 22%', 
the index recheck takes care of it and only the matching row is 
returned. Excellent. I'm not quite sure why I was expecting false 
positives in the first place; it would be a pretty stark violation of 
how LIKE is supposed to behave.


I'm not quite sure how big the performance hit will be from not having 
the optimal punctuation-aware trigrams in the index and rechecking some 
hits, but for now I'll assume it's negligible. Otherwise I'll try the 
translate variant.


Many thanks!