Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-07 Thread Avinash Kumar
Hi,

On Thu, May 7, 2020 at 5:18 PM David G. Johnston 
wrote:

> On Thu, May 7, 2020 at 1:05 PM samhitha g 
> wrote:
>
>> Our application serves multiple tenants. Each tenant has the schema
>> with a few hundreds of tables and few functions.
>> We have 2000 clients so we have to create 2000 schemas in a single
>> database.
>>
>
> That is one option but I wouldn't say you must.  If you cannot get
> individual tables to be multi-tenant you are probably better off having one
> database per client on a shared cluster - at least given the size of the
> schema and number of clients.
>
I am working on a similar problem.
One database per client can be a killer when you have a connection
pooler that creates a separate pool for each unique combination of (user, database).

>
> David J.
>
>

-- 
Regards,
Avinash Vallarapu
+1-902-221-5976


Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-07 Thread Avinash Kumar
Hi,

On Thu, May 7, 2020 at 6:08 PM Rory Campbell-Lange 
wrote:

> On 07/05/20, Avinash Kumar ([email protected]) wrote:
> > >> Our application serves multiple tenants. Each tenant has the schema
> > >> with a few hundreds of tables and few functions.
> > >> We have 2000 clients so we have to create 2000 schemas in a single
> > >> database.
>
> > > That is one option but I wouldn't say you must.  If you cannot get
> > > individual tables to be multi-tenant you are probably better off having
> > > one database per client on a shared cluster - at least given the size of
> > > the schema and number of clients.
> > >
> > I am working on a similar problem.
> > 1 database per each client may be a killer when you have a connection
> > pooler that creates a pool for a unique combination of (user,database).
>
> One of our clusters has well over 500 databases fronted by pg_bouncer.
>
> We get excellent connection "flattening" using pg_bouncer with
> per-database connection spikes dealt with through a reserve pool.
>
What if each client establishes at least 4 connections during peak? Even if
you reserve only 2 or 4 connections per database, with 500 databases in a
cluster you end up with 1000 or more reserved connections.
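
Just to illustrate the math (the tenant names and pool sizes below are
hypothetical, not from your setup), a per-database setup in pgbouncer.ini
looks roughly like this, and every [databases] entry pays for its own pool:

[databases]
tenant_0001 = host=127.0.0.1 port=5432 dbname=tenant_0001
tenant_0002 = host=127.0.0.1 port=5432 dbname=tenant_0002
; ... one entry per tenant database (500 of them in this example)

[pgbouncer]
pool_mode = transaction
default_pool_size = 2    ; server connections kept per (user, database) pool
reserve_pool_size = 2    ; extra connections one pool may open during spikes
max_client_conn = 5000

; 500 databases x 2 = 1000 server connections from the base pools alone,
; and up to 500 x (2 + 2) = 2000 if every reserve pool is exhausted at once.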

>
> The nice thing about separate databases is that it is easy to scale
> horizontally.
>
Agreed. But how about autovacuum? Workers shift from DB to DB, and with 500
databases you may have to have a lot of manual vacuuming in place as well.

>
> Rory
>


-- 
Regards,
Avinash Vallarapu
+1-902-221-5976


Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-07 Thread Avinash Kumar
Hi,

On Fri, May 8, 2020 at 3:31 AM Laurenz Albe 
wrote:

> On Thu, 2020-05-07 at 18:17 -0300, Avinash Kumar wrote:
> > > The nice thing about separate databases is that it is easy to scale
> > > horizontally.
> >
> > Agreed. But how about autovacuum? Workers shift from DB to DB, and with 500
> > databases you may have to have a lot of manual vacuuming in place as well.
>
> Just set "autovacuum_max_workers" higher.
>
No, that alone wouldn't help. If you just increase autovacuum_max_workers, the
total cost limit set by autovacuum_vacuum_cost_limit (or vacuum_cost_limit, when
the former is -1) is shared among all the workers, so each individual worker is
throttled even more. Instead, you need to raise autovacuum_vacuum_cost_limit as
well when you increase the number of workers. But if you raise both, you can
easily hit the limits of the disk. I am not sure it is advisable anywhere to run
20 autovacuum workers unless I have a disk with a lot of IOPS and very tiny
tables across all the databases.
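
As a rough sketch (the numbers are made up for illustration, not a
recommendation), scaling the cost limit along with the workers in
postgresql.conf would look something like:

# postgresql.conf -- hypothetical values; raise the cost limit with the workers
autovacuum_max_workers = 6             # default 3; changing it requires a restart
autovacuum_vacuum_cost_limit = 400     # default -1 falls back to vacuum_cost_limit (200);
                                       # doubled here since the worker count was doubled
# autovacuum_vacuum_cost_delay left at its default; lowering it also increases
# the I/O autovacuum is allowed to consume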

>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>
>

-- 
Regards,
Avinash Vallarapu


Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-08 Thread Avinash Kumar
Hi,

On Fri, May 8, 2020 at 3:53 AM Laurenz Albe 
wrote:

> On Fri, 2020-05-08 at 03:47 -0300, Avinash Kumar wrote:
> > > Just set "autovacuum_max_workers" higher.
> >
> > No, that wouldn't help. If you just increase autovacuum_max_workers, the
> > total cost limit of autovacuum_vacuum_cost_limit (or vacuum_cost_limit) is
> > shared by so many workers and it further delays autovacuum per each worker.
> > Instead you need to increase autovacuum_vacuum_cost_limit as well when you
> > increase the number of workers.
>
> True, I should have mentioned that.
>
> > But, if you do that and also increase workers, well, you would easily
> > reach the limitations of the disk. I am not sure it is anywhere advised to
> > have 20 autovacuum_max_workers unless i have a disk with lots of IOPS and
> > with very tiny tables across all the databases.
>
> Sure, if you have a high database load, you will at some point exceed the
> limits of the machine, which is not surprising.  What I am trying to say is
> that you have to ramp up the resources for autovacuum together with
> increasing the overall workload.
> You should consider autovacuum as part of that workload.
>
> If your machine cannot cope with the workload any more, you have to scale,
> which is easily done by adding more machines if you have many databases.
>
Agreed. Getting back to the original question asked by Sammy, I think it is
still a bad idea to create 2000 databases to store 2000 clients (schemas) in a
multi-tenant setup.

>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>
>

-- 
Regards,
Avinash Vallarapu


Re: Postgresql server gets stuck at low load

2020-06-09 Thread Avinash Kumar
Hi,

On Fri, Jun 5, 2020 at 7:07 AM Krzysztof Olszewski 
wrote:

> I have a problem with one of my Postgres production servers. The server works
> fine almost always, but sometimes, without any increase in the number of
> transactions or statements, the machine gets stuck. Cores go up to 100%, load
> up to 160%. When it happens there are problems connecting to the database, and
> even when a connection succeeds, simple queries take several seconds instead
> of milliseconds. The problem sometimes stops after a period of time (e.g. 35
> min); sometimes we must restart Postgres, Linux, or even KVM (which is the
> virtualization host).
>
> My hardware
> 56 cores (Intel Core Processor (Skylake, IBRS))
> 400 GB RAM
> RAID10 with about 40k IOPS
>
> OS
> CentOS Linux release 7.7.1908
> kernel 3.10.0-1062.18.1.el7.x86_64
>
> Database size 100 GB (entirely fits in memory :) )
> server_version 10.12
> effective_cache_size 192000 MB
> maintenance_work_mem 2048 MB
> max_connections 150
> shared_buffers 64000 MB
> work_mem 96 MB
>
What value is random_page_cost set to?
Try setting it to 1 (the same as the default seq_page_cost) for a moment and see if it helps.
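
For example (just a quick experiment, not a permanent recommendation), from psql:

SHOW random_page_cost;                    -- default is 4

ALTER SYSTEM SET random_page_cost = 1;    -- written to postgresql.auto.conf
SELECT pg_reload_conf();                  -- no restart needed for this setting

-- or limit the experiment to a single session:
SET random_page_cost = 1;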

>
> In the normal state, I have about 500 tps, 5% usage of cores, about 3% load,
> the whole database fits in memory, no reads from disk, only writes at about
> the 500 IOPS level, sometimes spiking to 1500 IOPS, but on this hardware there
> is no problem with these values (no iowaits on cores). In the normal state
> this machine does "nothing". Connections to the database are created by two
> Java-based app servers through connection pools, so the connection count is
> limited by the pool configuration; the max is 120, which is lower than the
> Postgres configuration (150). In the normal state there are about 20
> connections; when it gets stuck it goes to the max (120).
>
> In correlation with the stuck periods I see messages in the kernel log like
> NMI watchdog: BUG: soft lockup - CPU#25 stuck for 23s! [postmaster:33935]
> but I don't know whether this is the cause or an effect of the problem.
> I investigated with pgBadger and ... nothing strange happens, just
> normal statements.
>
> Any ideas?
>
> Thanks,
> Kris
>
>
>

-- 
Regards,
Avinash Vallarapu


Re: Logical Replication speed-up initial data

2021-08-05 Thread Avinash Kumar
Hi,

On Thu, Aug 5, 2021 at 11:28 AM Vijaykumar Jain <
[email protected]> wrote:

> On Thu, 5 Aug 2021 at 10:27, Nikhil Shetty  wrote:
>
>> Hi,
>>
>> Thank you for the suggestion.
>>
>> We tried by dropping indexes and it worked faster compared to what we saw
>> earlier. We wanted to know if anybody has done any other changes that help
>> speed up the initial data load without dropping indexes.
>>
>>
You could leverage pg_basebackup or pg_dump with parallel jobs, taken from a
standby (preferably with replication paused if you use pg_dump; pg_basebackup
should be straightforward anyway) or even from the primary, for the purpose of
the initial data load.

As you are able to drop indexes and make some schema changes, I would
assume that you could pause your app temporarily. If that's the case,
you may look into the simple steps I am posting here, which demonstrate
pg_dump/pg_restore instead.

If you cannot pause the app, then you could look into how you
could use pg_replication_origin_advance.



Step 1 : Pause the App.

Step 2 : Create the Publication on the Primary.
CREATE PUBLICATION <publication_name> FOR ALL TABLES;

Step 3 : Create a Logical Replication Slot on the Primary.
SELECT * FROM pg_create_logical_replication_slot('<slot_name>', 'pgoutput');

Step 4 : Create the Subscription, but do not enable it.
CREATE SUBSCRIPTION <subscription_name> CONNECTION
'host=<primary_host> dbname=<database_name> user=postgres
password=secret port=5432' PUBLICATION <publication_name>
WITH (copy_data = false, create_slot = false, enabled = false,
slot_name = '<slot_name>');

Step 5 : Initiate pg_dump. We can take a parallel dump for a faster restore.

$ pg_dump -d <database_name> -Fd -j 4 -n <schema_name> -f <dump_directory>
-- If it is several hundreds of GBs or TBs, you may rather run this against one
of your standbys, with replay paused using -> select pg_wal_replay_pause();

Step 6 : You don't need to wait until pg_dump completes; you may start the App.
-- The hope is that the app does not perform changes that impact the pg_dump or
get blocked by pg_dump.

Step 7 : Restore the dump if you used pg_dump.
$ pg_restore -d <database_name> -j <number_of_jobs> <dump_directory>

Step 8 : Enable the subscription.
ALTER SUBSCRIPTION <subscription_name> ENABLE;
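
As a quick sanity check after Step 8, one way to confirm on the subscriber
that the apply worker is running and catching up is:

SELECT subname, received_lsn, latest_end_lsn
FROM pg_stat_subscription;
-- and on the primary, pg_stat_replication shows the walsender once the
-- subscription is enabled.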

If you have not stopped your app, then you must advance the LSN of the
subscription's replication origin using pg_replication_origin_advance().
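
A rough sketch of what that could look like on the subscriber (the origin name
and the LSN below are hypothetical; the origin of a subscription is named
pg_<subscription_oid>, and the LSN should be the consistent point your copy
corresponds to), run while the subscription is still disabled:

SELECT roname FROM pg_replication_origin;   -- e.g. 'pg_16402' for the new subscription
SELECT pg_replication_origin_advance('pg_16402', '0/5000060'::pg_lsn);
ALTER SUBSCRIPTION <subscription_name> ENABLE;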



These steps were hand-written while drafting this email, so please test
them on your end; some typos or adjustments are definitely expected.

> PS: i have not tested this in production level loads, it was just some exp
> i did on my laptop.
>
> one option would be to use pglogical extension (this was shared by
> Dharmendra in one the previous mails, sharing the same),
> and then use pglogical_create_subscriber cli to create the initial copy
> via pgbasebackup and then carry on from there.
> I ran the test case similar to one below in my local env, and it seems to
> work fine. of course i do not have TB worth of load to test, but it looks
> promising,
> especially since they introduced it to the core.
> pglogical/010_pglogical_create_subscriber.pl at REL2_x_STABLE ·
> 2ndQuadrant/pglogical (github.com)
> 
> Once you attain some reasonable sync state, you can drop the pglogical
> extension, and check if things continue fine.
> I have done something similar when upgrading from 9.6 to 11 using
> pglogical and then dropping the extension and it was smooth,
> maybe you need to try this out and share if things works fine.
> and
> The 1-2-3 for PostgreSQL Logical Replication Using an RDS Snapshot -
> Percona Database Performance Blog
> 
>
>

-- 
Regards,
Avinash Vallarapu (Avi)
CEO,
MigOps, Inc.