Re: POSTGRES/MYSQL

2019-03-12 Thread Chris Travers
On Mon, Mar 11, 2019 at 7:32 PM Sonam Sharma  wrote:

> Hi All,
>
> We are planning to migrate our database into any open source DB.
> Can someone please help me in knowing which one will be better among
> POSTGRESQL and MYSQL.
>
> In what terms postgres is better than MYSQL.
>

Of course you will get a lot of pro-Postgres answers here.  I am going to
try to give a balanced one.

I:  Performance

MySQL with InnoDB performs differently than PostgreSQL.  Primary key
lookups are marginally less expensive.  Secondary index lookups are
significantly more expensive.  Sequential scans are much more expensive.
If all you are ever doing is primary key lookups, MySQL might perform
better.  For most real-world workloads, PostgreSQL does better.

Also MySQL has a query cache that allows the results of very common queries
to be much faster.  PostgreSQL has a more complex cache system which
performs better on complex workloads.  So in most cases, Postgres is better
(assuming appropriate tuning on both).

For updates, MySQL avoids a lot of index write overhead.  PostgreSQL has
more overhead per update.  In some cases this is a big deal.  However I
have never seen such a situation that made PostgreSQL unsuitable.

I suspect parallel query is faster on MySQL but I have seen internal
parallelism lead to deadlocks with only a single session running.  In other
words, bulk inserts deadlocking against themselves.

II:  Porting

MySQL has per-client SQL modes which affect what data can be properly
stored and how data can be retrieved.  The positive side is that MySQL is
ok at pretending to be other database systems but not so good at ensuring
data integrity (strict mode can be turned off by any writer, so you can't
always trust what is written).  The downside is that MySQL's set of
standards-compliant features is not as rich, so what you gain from
being able to use a supported dialect you may lose in having to rewrite
queries anyway.

PostgreSQL is fairly strict about data insertion and does not support
multiple dialects of SQL, so porting non-ANSI-SQL queries to PostgreSQL can
sometimes take more effort, but the supported feature set is much richer.
Six of one, half a dozen of the other.
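
To illustrate the strict-mode point, a sketch (exact behavior depends on server
version and session settings; the table name is illustrative):

```sql
-- MySQL: strict mode can be disabled per session by any writer.
SET SESSION sql_mode = '';
CREATE TABLE t (v VARCHAR(3));
INSERT INTO t VALUES ('abcdef');  -- silently truncated to 'abc', only a warning

-- PostgreSQL rejects the equivalent insert outright:
-- INSERT INTO t VALUES ('abcdef');
-- ERROR:  value too long for type character varying(3)
```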

III:  Licensing

MySQL is owned by Oracle and GPL-licensed.  PostgreSQL is BSD-licensed and
owned by its individual contributors.  If you are considering MySQL you
might want to use MariaDB instead.  PostgreSQL avoids most of these
licensing issues entirely: even if you are distributing the db with a
proprietary application, there are no licensing implications of doing that.


> Regards,
> Sonam
>


-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: POSTGRES/MYSQL

2019-03-12 Thread Laurenz Albe
Chris Travers wrote:
> Also MySQL has a query cache that allows the results of very common queries 
> to be much faster.

I have used that feature, and it has bitten me:
https://stackoverflow.com/q/44244482/6464308

I guess only some rather pathological workloads really benefit from that.

> For updates, MySQL avoids a lot of index write overhead.  PostgreSQL has more 
> overhead per update.

That is what I meant when I said that PostgreSQL is less suitable for a 
key-value store.

There is HOT update which can mitigate the problem if the updated columns are 
not indexed.
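
A minimal sketch of the HOT case (the fillfactor choice and names are
illustrative; updates that touch only non-indexed columns can stay on the
same heap page and skip index writes):

```sql
-- Leave free space on each page so HOT updates have room.
CREATE TABLE kv (k int PRIMARY KEY, v text) WITH (fillfactor = 70);
INSERT INTO kv SELECT g, 'x' FROM generate_series(1, 10000) g;

-- v is not indexed, so these updates are HOT candidates:
UPDATE kv SET v = 'y' WHERE k <= 1000;

-- Statistics (updated with a short delay) show how many updates took
-- the HOT path:
SELECT n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
WHERE relname = 'kv';
```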

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: POSTGRES/MYSQL

2019-03-12 Thread Thomas Kellerer
Laurenz Albe schrieb am 12.03.2019 um 10:05:
>> Also MySQL has a query cache that allows the results of very common queries 
>> to be much faster.
> 
> I have used that feature, and it has bitten me:
> https://stackoverflow.com/q/44244482/6464308

Note that the query cache was removed in MySQL 8.0 (don't know about MariaDB)






Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

2019-03-12 Thread Nicola Contu
Hello,
do you have any advice on this?

Thanks a lot in advance

Il giorno gio 7 mar 2019 alle ore 09:39 Nicola Contu 
ha scritto:

> So the first file is on Postgres11.2 on a test server (and where I compare
> 10 vs 11)
> The second file, is our preprod machine running Postgres 11.2 (different
> hardware etc, it is a VM). I know that could be confusing, but I just
> wanted to compare that too because if you see the two files there's a lot
> of difference between the two machines.
> And they are both running CentOS 7.
>
> So at this point I have two problems. One inside the machine between
> Postgres 10 and 11 and another problem on the preprod (similar to prod)
> with a lot of lseek.
>
> Sorry if this is confusing, hope it is clear now.
>
> Regarding partitions/tables. The first file involves just one table. The
> second file (with a huge lseek) was running the test on a single table, but
> meanwhile it was accessible by the preprod web application. So it was maybe
> hit by some user and some other table.
>
>
> Question:
> 1) Is it possible that pgbench is not really a good tool for testing
> performance? If I use a SQL script with thousands of insert statements and
> compare on the same server between pg10 and pg11, I get pretty much the
> same result (maybe better on pg11)
> 2) regarding preprod, is there any way to reduce those lseek()?  Just to
> let you know, comparing the same insert script between the two servers, the
> first server takes 2m and the second one takes 5-7m.
>
> Thanks a lot,
>
>
>
>
>
> Il giorno gio 7 mar 2019 alle ore 04:47 Thomas Munro <
> thomas.mu...@gmail.com> ha scritto:
>
>> On Thu, Mar 7, 2019 at 12:12 AM Nicola Contu 
>> wrote:
>> > This is instead the strace of another server running the same version
>> compiled  but that is even slower.
>>
>> Huh.  That's a lot of lseek().  Some of these will be for random
>> reads/writes and will go away in v12, and some will be for probing the
>> size of relations while planning, and some while executing scans.  I
>> bet you could make some of them go away by using prepared statements.
>> Does the query in your test involve many partitions/tables?
>>
>> % time     seconds  usecs/call     calls    errors syscall
>> ------ ----------- ----------- --------- --------- ----------------
>>  32.50  143.010306           7  21044095           lseek
>>  26.21  115.354045          14   8144577           read
>>   6.18   27.185578          16   1669889        10 sendto
>>   5.29   23.300584          57    407528           fdatasync
>>   4.93   21.709522           9   2313529    824174 recvfrom
>>   3.31   14.547568          19    765897           write
>>   2.73   12.007486          14    867088     14494 epoll_wait
>>   2.18    9.597460          15    659871     84097 futex
>>   1.85    8.147759          14    567414           close
>>   1.77    7.767832          18    437656     11319 open
>>
>> The other results had 1 usec lseek(), and much fewer of them relative
>> to the number of reads and writes.  BTW, are you comparing v10 and v11
>> on the same hardware, kernel, filesystem?  Just wondering if there
>> could be some change in syscall overhead on different kernel patch
>> levels or something like that: we see 7 usec vs 1 usec in those two
>> files (though I have no idea how reliable these times are) and if
>> we're going to call it 21 million times at some point it might
>> matter...
>>
>> --
>> Thomas Munro
>> https://enterprisedb.com
>>
>


PostgreSQL logical replication slot LSN values

2019-03-12 Thread Rashmi V Bharadwaj
Hi,

We have an application that uses the PostgreSQL logical replication API to read
the changes made to the PostgreSQL database and applies them to a different
database (like Db2 etc). We are using logical replication slots for this.
Currently I am facing an issue where the replication slot is pointing to an
older restart_lsn and confirmed_flush_lsn (like 10 days back) and the
corresponding WAL file is already deleted from the pg_wal directory. Ideally
this should not happen, right? Since the slot is holding this LSN, the WAL file
should not have been deleted. Now when I try to use a query like
select * from pg_logical_slot_get_changes(,, NULL) or use the logical
replication API with a start position at any newer LSN, I get the following
error:

ERROR:  requested WAL segment pg_wal/00010036 has already been removed
SQL state: 58P01

How do I get past this issue? I have not enabled log archiving. I would also
like to know how I can modify the restart_lsn and confirmed_flush_lsn positions
of the slot?

Thanks,
Rashmi
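
For reference, the slot positions in question are visible in the
pg_replication_slots view:

```sql
-- restart_lsn is the oldest WAL position the slot still needs;
-- confirmed_flush_lsn is the position the consumer has confirmed.
SELECT slot_name, restart_lsn, confirmed_flush_lsn, active
FROM pg_replication_slots
WHERE slot_type = 'logical';
```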




Re: POSTGRES/MYSQL

2019-03-12 Thread Benedict Holland
MySQL isn't ACID. PostgreSQL is a full-featured database that doesn't allow
injection. It is very safe and secure. Also, PostgreSQL has a
much better user management system and database/table-level access ACLs.

Basically, you just asked a bunch of people who have chosen PostgreSQL over
MySQL why PostgreSQL is better. It is just better. The only time that MySQL
might be better is if you have a very simple website and you want a SQL
backend. For anything else, PostgreSQL is infinitely better.

Thanks,
~Ben

On Tue, Mar 12, 2019 at 5:34 AM Thomas Kellerer  wrote:

> Laurenz Albe schrieb am 12.03.2019 um 10:05:
> >> Also MySQL has a query cache that allows the results of very common
> queries to be much faster.
> >
> > I have used that feature, and it has bitten me:
> > https://stackoverflow.com/q/44244482/6464308
>
> Note that the query cache was removed in MySQL 8.0 (don't know about
> MariaDB)
>
>
>
>
>


Re: POSTGRES/MYSQL

2019-03-12 Thread Francisco Olarte
Benedict:

On Tue, Mar 12, 2019 at 3:11 PM Benedict Holland
 wrote:
> MySQL isn't ACID.

Are you sure of that? I mean, I stopped using it in the late nineties,
and it only had MyISAM then, certainly not ACID, but my understanding
is it's got some ACID storage backends and options ( I seem to recall
InnoDB is one of them, and last thing I know is it did ACID DML but
had problems with DDL ).

Don't get me wrong, I prefer postgres too, I think it is easier and
nicer, but MySql has gone a long way and is not that bad ( as "not
acid" ).

> Postgresql is a full-featured database that doesn't allow injection.

Also, regarding this, it seems to hint at "injections are mysql fault"
and "you are safe from injections using Pg". Databases do not allow or
disallow (SQL) injection. It's the app code which does it. I can write
code with and without injections which works equally well, or bad, in
both engines. Nearly anyone can.

> Basically, you just asked a bunch of people who have used  PostgreSQL over 
> MySQL why PostgerSQL is better. It is just better. The only time that MySQL 
> might be better is if you have a very simple website and you want a SQL 
> backend. For anything else, PostgreSQL is infinitely better.

No, it's not. I use postgres for a lot of things, and some of them are
done in it rather than mysql only because I have postgresql servers
installed, I am familiar with it, and the volume is not high enough
that switching to mysql would justify the learning cost. It would
probably be the other way round if I were a mysql guy.

And, not knowing much about the OP's detailed requirements, I would
recommend Pg for them because I know it can do the things he seems to
want, and I am not sure about mysql. I am nearly sure it can do them,
but I cannot assert it.

Pg is really good, but not a panacea.

Francisco Olarte.



Re: PostgreSQL logical replication slot LSN values

2019-03-12 Thread Andres Freund
Hi,

(please don't send HTML only emails to this list)

On 2019-03-12 11:08:56 +, Rashmi V Bharadwaj wrote:
> We have an application that uses the PostgreSQL logical replication API to 
> read
> the changes made to the PostgreSQL database and applies it to a different
> database (like Db2 etc). We are using logical replication slots for
> this.

Cool.


> Currently I am facing an issue where the replication slot is pointing to an
> older restart_lsn and confirmed_flush_lsn (like 10 days back) and the
> corresponding WAL file is already deleted from the pg_wal directory. Ideally
> this should not happen, right?

Well, did you consume the logical data, and if so how? When you use the
streaming interface - HIGHLY recommended - you need to send feedback
messages as to where you've received the data.


> since the slot is holding this LSN the wal file
> should not have been deleted. Now when I try to use query like
> select * from pg_logical_slot_get_changes(,, NULL)
> 

> or use the logical replication API with a start position as any newer LSN, I
> get the following error:
> 
> ERROR: requested WAL segment pg_wal/00010036 has already been
> removed
> SQL state: 58P01.

Hm, that should not happen. Did you by any chance externally (manually
or by script) delete WAL files?


> How do I get past this issue? I have not enabled log archiving. I would also
> like to know how I can modify the restart_lsn and confirmed_flush_lsn 
> positions
> of the slot?

You need to send feedback messages confirming up to where you've
consumed the data when using the streaming protocol. When using the SQL
functions, the _get_ function confirms when it returns; the _peek_
function never does. It's recommended to limit the size of the
result set a bit using the nchanges parameter so you can call it in
smaller increments.
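
For example, with the SQL interface the third argument (upto_nchanges)
bounds each call; the slot name here is illustrative:

```sql
-- _get_ consumes and confirms (advances the slot) when it returns;
-- _peek_ with the same arguments would leave the slot untouched.
SELECT lsn, xid, data
FROM pg_logical_slot_get_changes('my_slot', NULL, 1000);
```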

Greetings,

Andres Freund



Re: POSTGRES/MYSQL

2019-03-12 Thread Ron

The Percona fork of MySQL makes active-active clustering very simple to set up.

On 3/12/19 9:10 AM, Benedict Holland wrote:
MySQL isn't ACID. Postgresql is a full-featured database that doesn't 
allow injection. It is very safe and secure. Also, the way that PostgreSQL 
has a much better user management system and database/table level access 
ACLs.


Basically, you just asked a bunch of people who have used PostgreSQL over 
MySQL why PostgerSQL is better. It is just better. The only time that 
MySQL might be better is if you have a very simple website and you want a 
SQL backend. For anything else, PostgreSQL is infinitely better.


Thanks,
~Ben

On Tue, Mar 12, 2019 at 5:34 AM Thomas Kellerer wrote:


Laurenz Albe schrieb am 12.03.2019 um 10:05:
>> Also MySQL has a query cache that allows the results of very common
queries to be much faster.
>
> I have used that feature, and it has bitten me:
> https://stackoverflow.com/q/44244482/6464308

Note that the query cache was removed in MySQL 8.0 (don't know about
MariaDB)






--
Angular momentum makes the world go 'round.


Re: POSTGRES/MYSQL

2019-03-12 Thread Benedict Holland
In MySQL, when you create a table inside a transaction, it implicitly
commits the transaction and prevents a rollback. MySQL is not ACID.
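
For contrast, a sketch of the PostgreSQL side; the same sequence in MySQL
would implicitly commit at the CREATE TABLE:

```sql
BEGIN;
CREATE TABLE demo (id int);
INSERT INTO demo VALUES (1);
ROLLBACK;

-- In PostgreSQL neither the table nor the row exists now:
SELECT * FROM demo;  -- ERROR:  relation "demo" does not exist
```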

Thanks,
~Ben

On Tue, Mar 12, 2019 at 11:44 AM Ron  wrote:

> The Percona fork of MySQL makes active-active clustering very simple to
> set up.
>
> On 3/12/19 9:10 AM, Benedict Holland wrote:
>
> MySQL isn't ACID. Postgresql is a full-featured database that doesn't
> allow injection. It is very safe and secure. Also, the way that PostgreSQL
> has a much better user management system and database/table level access
> ACLs.
>
> Basically, you just asked a bunch of people who have used  PostgreSQL over
> MySQL why PostgerSQL is better. It is just better. The only time that MySQL
> might be better is if you have a very simple website and you want a SQL
> backend. For anything else, PostgreSQL is infinitely better.
>
> Thanks,
> ~Ben
>
> On Tue, Mar 12, 2019 at 5:34 AM Thomas Kellerer 
> wrote:
>
>> Laurenz Albe schrieb am 12.03.2019 um 10:05:
>> >> Also MySQL has a query cache that allows the results of very common
>> queries to be much faster.
>> >
>> > I have used that feature, and it has bitten me:
>> > https://stackoverflow.com/q/44244482/6464308
>>
>> Note that the query cache was removed in MySQL 8.0 (don't know about
>> MariaDB)
>>
>>
>>
>>
>>
> --
> Angular momentum makes the world go 'round.
>


Re: POSTGRES/MYSQL

2019-03-12 Thread Francisco Olarte
Benedict:

On Tue, Mar 12, 2019 at 4:56 PM Benedict Holland
 wrote:
> When you create a table in a transaction, it will commit the transaction and 
> prevent a rollback. MySQL is not ACID.

And when you call COMMIT in postgres it will commit and prevent a rollback.

This does not mean MySQL is not ACID, it means DDL cannot be transactioned.

It is, in fact, one of the reasons why I do not use MySql, but I think
InnoDB and friends are ACID, and the problem is the system catalogs
were ISAM and not transactioned.

And, if I remember correctly, one of the features Oracle announced for
MySQL 8.0 was transactional DDL (although I'm not sure if it can be mixed
with other things in a transaction; these things are hairy due to the
multiple storage engines normally involved in a single mysql instance).
But I'm not going to test it.

regards.
   Francisco Olarte.



Re: POSTGRES/MYSQL

2019-03-12 Thread Michael Nolan
The MySQL manual says that INNODB 'adheres closely' to the ACID model,
though there are settings where you can trade some ACID compliance for
performance.

See https://dev.mysql.com/doc/refman/5.6/en/mysql-acid.html

I've been running PostgreSQL for a client since 2005, we're on our 5th
hardware platform in that time period.  We also run a MySQL/MariaDB
database to support our website; it is currently on an AWS RDS server which
is refreshed from the PostgreSQL server.  I also administered an Oracle
system for a well-known educational publisher for about 10 years.  Given my
druthers, I'd take PostgreSQL over either Oracle or MySQL.
--
Mike Nolan

>


Re: POSTGRES/MYSQL

2019-03-12 Thread Benedict Holland
I am not saying it is not well documented. I am saying that it isn't ACID
compliant, which it isn't, as they document.

It comes up far more often than I would like, particularly with migrations
and schema changes. It is one of the big reasons I actively refuse to work
with MySQL. I have never upgraded a MySQL schema without running into a
problem where I needed to roll back but the transaction had already
committed on a table creation. That left my databases in an uncertain
state, which is about as bad as it gets from my perspective.

MSSQL is better at this, but it leans heavily on dynamic SQL, which opens
the database up to SQL injection. Apparently, PostgreSQL is even FedRAMP
compliant, and that actually matters to me.

This is still a very strange thread. It would be like asking a C#
developer why the .NET stack is the best thing ever created and why they
should use it over PHP or Django. I don't know what the OP really expected
to get out of this apart from basically everyone here saying "you should
probably use PostgreSQL".

Thanks,
~Ben

On Tue, Mar 12, 2019 at 12:09 PM Michael Nolan  wrote:

> The MySQL manual says that INNODB 'adheres closely' to the ACID model,
> though there are settings where you can trade some ACID compliance for
> performance.
>
> See https://dev.mysql.com/doc/refman/5.6/en/mysql-acid.html
>
> I've been running PostgreSQL for a client since 2005, we're on our 5th
> hardware platform in that time period.  We also run a MySQL/MariaDB
> database to support our website, it is currently on an AWS RDB server which
> is refreshed from the PostgreSQL server.  I also administered an Oracle
> system for a well-known educational publisher for about 10 years.  Given my
> druthers, I'd take PostgreSQL over either Oracle or MySQL.
> --
> Mike Nolan
>
>>


Re: POSTGRES/MYSQL

2019-03-12 Thread Francisco Olarte
Benedict:

On Tue, Mar 12, 2019 at 5:53 PM Benedict Holland
 wrote:
> I am not saying it is not well documented. I am saying that it isn't ACID 
> compliant, which it isn't, as they document.

You are using the term "ACID compliant". Could you please point me to
some definition of what that is? (Not ACID; I know what ACID is. Not
compliant or compliance; I know what compliance is, as in "SQL-99
compliant". "ACID compliant" is what I do not have a definition for.)

Francisco Olarte.



xmin and very high number of concurrent transactions

2019-03-12 Thread Vijaykumar Jain
I was asked this question in one of my demos, and it was interesting one.

we update xmin for new inserts with the current txid.
now in a very high concurrent scenario where there are more than 2000
concurrent users trying to insert new data,
will updating xmin value be a bottleneck?
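
The behavior being asked about is easy to observe (the table name is
illustrative):

```sql
BEGIN;
SELECT txid_current();        -- assigns and shows this transaction's id
CREATE TEMP TABLE t (v int);
INSERT INTO t VALUES (1);
SELECT xmin, v FROM t;        -- xmin matches the txid_current() value above
COMMIT;
```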

i know we should use pooling solutions to reduce concurrent
connections, but assume we have enough resources to handle
spawning a new process for each new connection.

Regards,
Vijay



Re: xmin and very high number of concurrent transactions

2019-03-12 Thread Adrian Klaver

On 3/12/19 12:19 PM, Vijaykumar Jain wrote:

I was asked this question in one of my demos, and it was interesting one.

we update xmin for new inserts with the current txid.


Why?


now in a very high concurrent scenario where there are more than 2000
concurrent users trying to insert new data,
will updating xmin value be a bottleneck?

i know we should use pooling solutions to reduce concurrent
connections but given we have enough resources to take care of
spawning a new process for a new connection,

Regards,
Vijay





--
Adrian Klaver
adrian.kla...@aklaver.com



Re: [External] Re: xmin and very high number of concurrent transactions

2019-03-12 Thread Vijaykumar Jain
No, I mean not we end users; postgres does it (?) via the xmin and xmax
fields from inherited tables :) if that is what you wanted in a "why".
Or are you asking whether postgres even updates those rows, and I am
wrong in assuming it that way?

Since the values need to be atomic, consider the below analogy.
Assume I (postgres) am a person giving out tokens to
people (connections/tx) in a queue.
If there is a single line (sequential), it is easy for me to
simply give out one token, increment the value, and so on.
But if there are thousands of users in parallel lines, I am still only
one person delivering tokens, so I operate sequentially, and each other
person is "blocked" for some time before it gets the token with the
required value.
So with 1000s of users, that "delay" may impact my
performance, because I need to maintain the token value to be able
to know what token value I need to give the next person.

I do not know if I am explaining it correctly; pardon my analogy.


Regards,
Vijay

On Wed, Mar 13, 2019 at 1:10 AM Adrian Klaver  wrote:
>
> On 3/12/19 12:19 PM, Vijaykumar Jain wrote:
> > I was asked this question in one of my demos, and it was interesting one.
> >
> > we update xmin for new inserts with the current txid.
>
> Why?
>
> > now in a very high concurrent scenario where there are more than 2000
> > concurrent users trying to insert new data,
> > will updating xmin value be a bottleneck?
> >
> > i know we should use pooling solutions to reduce concurrent
> > connections but given we have enough resources to take care of
> > spawning a new process for a new connection,
> >
> > Regards,
> > Vijay
> >
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com



Re: POSTGRES/MYSQL

2019-03-12 Thread Christopher Browne
On Tue, 12 Mar 2019 at 12:53, Benedict Holland
 wrote:
> I am not saying it is not well documented. I am saying that it isn't ACID 
> compliant, which it isn't, as they document.

I *love* the notion of being able to roll back DDL, but it has long
been common for DDL to *not* be transactional even with some of the
Big Expensive Databases (such as the one whose name begins with an
"O").

Up until version 11.something, "Big O" apparently did NOT have this,
and MS SQL Server didn't in version 2008.

https://docs.oracle.com/cd/B19306_01/server.102/b14220/transact.htm
https://docs.microsoft.com/en-us/previous-versions/sql/sql-server-2008-r2/ms189122(v=sql.105)

Of course, those are somewhat old versions.  But nobody would have
claimed those systems not to be "ACID Compliant" at the time; you're
setting the bar a bit too high.

Someone's asking the merits of PostgreSQL versus MySQL; it certainly
*is* possible to overplay the case.

I'm perfectly happy with a claim like...

 "PostgreSQL does transactional DDL, which we find quite valuable, and
while MySQL supports ACID for data manipulation, with suitable choice
of storage engines, there is not the same capability to be able to
roll back DDL within a transaction."
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"



Re: POSTGRES/MYSQL

2019-03-12 Thread Ron

On 3/12/19 3:19 PM, Christopher Browne wrote:

On Tue, 12 Mar 2019 at 12:53, Benedict Holland
 wrote:

I am not saying it is not well documented. I am saying that it isn't ACID 
compliant, which it isn't, as they document.

I *love* the notion of being able to roll back DDL, but it has long
been common for DDL to *not* be transactional even with some of the
Big Expensive Databases (such as the one whose name begins with an
"O").

Up until version 11.something, "Big O" apparently did NOT have this,
and MS SQL Server didn't in version 2008.


This has always shocked me.  DEC's relational and CODASYL DBMS products (now
owned by Big O, and still being updated) have had transactional DDL for 35 years.

I wouldn't be surprised if their PDP-11 predecessors had it 40 years ago.

--
Angular momentum makes the world go 'round.



Re: [External] Re: xmin and very high number of concurrent transactions

2019-03-12 Thread Adrian Klaver

On 3/12/19 1:02 PM, Vijaykumar Jain wrote:

no i mean not we end users, postgres does it (?) via the xmin and xmax
fields  from inherited tables :) if that is what you wanted in a why
or are you asking, does postgres even update those rows and i am wrong
assuming it that way?


Not sure where the inherited tables come in?

See below for more info:
https://www.postgresql.org/docs/11/storage-page-layout.html

AFAIK xmin and xmax are just done as part of the insert or delete 
operations so there is no updating involved.


I would say the impact to performance would come from the overhead of 
each connection rather than from maintaining xmin/xmax.




since the values need to be atomic,
consider the below analogy
assuming i(postgres) am person giving out token to
people(connections/tx) in a queue.
if there is a single line, (sequential) then it is easy for me to
simply give them 1 token incrementing the value and so on.
but if there are thousands of users in parallel lines, i am only one
person delivering the token, will operate sequentially, and the other
person is "blocked" for sometime before it gets the token with the
required value.
so if there are 1000s or users with the "delay" may impact my
performance  coz i need to maintain the value of the token to be able
to know what token value i need to give to next person?

i do not know if am explaining it correctly, pardon my analogy,


Regards,
Vijay

On Wed, Mar 13, 2019 at 1:10 AM Adrian Klaver  wrote:


On 3/12/19 12:19 PM, Vijaykumar Jain wrote:

I was asked this question in one of my demos, and it was interesting one.

we update xmin for new inserts with the current txid.


Why?


now in a very high concurrent scenario where there are more than 2000
concurrent users trying to insert new data,
will updating xmin value be a bottleneck?

i know we should use pooling solutions to reduce concurrent
connections but given we have enough resources to take care of
spawning a new process for a new connection,

Regards,
Vijay





--
Adrian Klaver
adrian.kla...@aklaver.com



--
Adrian Klaver
adrian.kla...@aklaver.com



Re: POSTGRES/MYSQL

2019-03-12 Thread 王pg
Different workloads behave differently; what you need to do is study your
business model and find the best fit.

I'm going to talk about the non-technical side, and hope it can help.

MySQL is splitting: when saying "MySQL", we now need to think about
Oracle's MySQL versus MariaDB. Both databases have their own road maps, and
you cannot benefit from both. If you finally choose "MySQL", you will still
face another choice: MySQL or MariaDB. Assuming you select MySQL, you will
keep facing new choices. For example: some tools (or practices) are for
MariaDB; will they work with MySQL? Some good features are included in
MariaDB; what about, or when, for MySQL? Some business models fit the
MariaDB code well but not the MySQL code, and so on.

Thanks
Denny Wang


On Wed, Mar 13, 2019 at 12:09 AM Michael Nolan  wrote:

> The MySQL manual says that INNODB 'adheres closely' to the ACID model,
> though there are settings where you can trade some ACID compliance for
> performance.
>
> See https://dev.mysql.com/doc/refman/5.6/en/mysql-acid.html
>
> I've been running PostgreSQL for a client since 2005, we're on our 5th
> hardware platform in that time period.  We also run a MySQL/MariaDB
> database to support our website, it is currently on an AWS RDB server which
> is refreshed from the PostgreSQL server.  I also administered an Oracle
> system for a well-known educational publisher for about 10 years.  Given my
> druthers, I'd take PostgreSQL over either Oracle or MySQL.
> --
> Mike Nolan
>
>>

-- 
regards
denny


Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

2019-03-12 Thread Thomas Munro
On Tue, Mar 12, 2019 at 10:49 PM Nicola Contu  wrote:
> Il giorno gio 7 mar 2019 alle ore 09:39 Nicola Contu  
> ha scritto:
>> So the first file is on Postgres11.2 on a test server (and where I compare 
>> 10 vs 11)
>> The second file, is our preprod machine running Postgres 11.2 (different 
>> hardware etc, it is a VM). I know that could be confusing, but I just wanted 
>> to compare that too because if you see the two files there's a lot of 
>> difference between the two machines.
>> And they are both running CentOS 7.
>>
>> So at this point I have two problems. One inside the machine between 
>> Postgres 10 and 11 and another problem on the preprod (similar to prod) with 
>> a lot of lseek.

So we still don't know what is different between 10 and 11.  We'll
need some more evidence to understand that.  We know that perf said 11
was spending more time making syscalls, but we haven't seen any
syscall measurement from the 10 system so we don't know what those
supposed extra syscalls are.  Also, backing up a bit, have you
compared the query plan to see if it's the same, and can we please see
it?
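
For example, running something like this on both servers would show the plan
and per-node timings (the table name is illustrative; note that ANALYZE
executes the statement for real, so use a scratch table):

```sql
-- Run the same statement on v10 and v11 and compare plans and timings.
EXPLAIN (ANALYZE, BUFFERS)
INSERT INTO scratch_table SELECT g FROM generate_series(1, 100000) AS g;
```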

As for the preprod/lots-of-lseek system, I don't know, it seems there
are even more unknown variables there...  if you can say more about
the virtualisation technology you're using, perhaps someone who knows
more about that would have some ideas.  Besides apparently slower
syscalls, one factor that is unexplained is why that system is calling
lseek() more times per query (but you said there may be other work
happening on the server, so who knows).

-- 
Thomas Munro
https://enterprisedb.com



Re: PostgreSQL logical replication slot LSN values

2019-03-12 Thread Rashmi V Bharadwaj
Hi,

> Well, did you consume the logical data, and if so how? When you use the
> streaming interface - HIGHLY recommended - you need to send feedback
> messages as to where you've received the data.
 Yes, I am consuming data using the PGReplicationStream.readPending() 
method in my program.


> Hm, that should not happen. Did you by any chance externally (manually
> or by script) delete WAL files?
No, I am not deleting the WAL files externally.


> You need to send feedback messages confirming up to wher eyou've
> consumed the data when using the streaming protocol. When using the SQL
> functions the _get_ function confirms when it returns, the _peek_
> function never does so. It's recommended to limit the size of the
> resultset a bit using the nchanges paramter so you can call it in
> smaller increments.
  I am sending feedback messages using method 
PGReplicationStream.setAppliedLSN(). I cannot do 
PGReplicationStream.setFlushedLSN() as I want the option to go back and 
read data in case of data loss. I have a separate utility that can be used to 
set the flush_lsn position periodically.


Currently, since the WAL file was deleted by PostgreSQL, I am not able to move 
forward at all; replication through the SQL or Java API is not happening. I tried 
doing a PGReplicationStream.setAppliedLSN and PGReplicationStream.setFlushedLSN 
(to the current LSN) through another Java program for the same replication slot, but 
that didn't work. It still gives the "WAL segment already removed" error.

Could you please suggest a solution for this? Is there a way to set the 
restart_lsn and flush_lsn of the slot? Or is recreating the slot the only possible 
solution?
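[Editorial note: if the WAL the slot points at is truly gone, recreating the
slot is, as far as I know, the only way forward. A minimal sketch; the slot
name and output plugin here are made up, and changes from before the new
slot's starting LSN cannot be replayed from it:]

```sql
-- Drop the stuck slot and create a fresh one. Decoding restarts from
-- the new slot's starting LSN; earlier changes are lost to this slot.
SELECT pg_drop_replication_slot('my_slot');
SELECT * FROM pg_create_logical_replication_slot('my_slot', 'test_decoding');
```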


Thanks,
Rashmi



-Andres Freund  wrote: -
To: Rashmi V Bharadwaj 
From: Andres Freund 
Date: 12/03/2019 09:07PM
Cc: pgsql-gene...@postgresql.org
Subject: Re: PostgreSQL logical replication slot LSN values

Hi,

(please don't send HTML only emails to this list)

On 2019-03-12 11:08:56 +, Rashmi V Bharadwaj wrote:
> We have an application that uses the PostgreSQL logical replication API to 
> read
> the changes made to the PostgreSQL database and applies it to a different
> database (like Db2 etc). We are using logical replication slots for
> this.

Cool.


> Currently I am facing an issue where the replication slot is pointing to an
> older restart_lsn and confirmed_flush_lsn (like 10 days back) and the
> corresponding WAL file is already deleted from the pg_wal directory. Ideally
> this should not happen, right?

Well, did you consume the logical data, and if so how? When you use the
streaming interface - HIGHLY recommended - you need to send feedback
messages as to where you've received the data.


> since the slot is holding this LSN the wal file
> should not have been deleted. Now when I try to use query like
> select * from pg_logical_slot_get_changes(,, NULL)
> 

> or use the logical replication API with a start position as any newer LSN, I
> get the following error:
> 
> ERROR: requested WAL segment pg_wal/00010036 has already been
> removed
> SQL state: 58P01.

Hm, that should not happen. Did you by any chance externally (manually
or by script) delete WAL files?


> How do I get past this issue? I have not enabled log archiving. I would also
> like to know how I can modify the restart_lsn and confirmed_flush_lsn 
> positions
> of the slot?

You need to send feedback messages confirming up to where you've
consumed the data when using the streaming protocol. When using the SQL
functions the _get_ function confirms when it returns, the _peek_
function never does so. It's recommended to limit the size of the
result set a bit using the nchanges parameter so you can call it in
smaller increments.
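[Editorial note: the _get_/_peek_ difference can be illustrated directly in
SQL; the slot name is assumed, and "nchanges" is the upto_nchanges argument:]

```sql
-- Peek: returns up to 100 changes but does NOT advance the slot,
-- so the same changes can be read again.
SELECT * FROM pg_logical_slot_peek_changes('my_slot', NULL, 100);

-- Get: returns the changes AND confirms them on return, advancing
-- the slot so the WAL behind it can eventually be recycled.
SELECT * FROM pg_logical_slot_get_changes('my_slot', NULL, 100);
```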

Greetings,

Andres Freund





Re: xmin and very high number of concurrent transactions

2019-03-12 Thread reg_pg_stefanz
I may have misunderstood the documentation or your question, but my 
understanding is that xmin is not updated in place; it is only set when a 
row version is inserted. (That includes updates, since for Postgres an 
update is effectively a delete plus an insert of a new row version.)


from https://www.postgresql.org/docs/10/ddl-system-columns.html
> xmin
> The identity (transaction ID) of the inserting transaction for this 
row version. (A row version is an individual state of > row; each update 
of a row creates a new row version for the same logical row.)


Therefore I assume there are no actual updates of xmin values.
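[Editorial note: this is easy to see from psql; a sketch, with a made-up
table name:]

```sql
CREATE TABLE t (id int PRIMARY KEY, v int);
INSERT INTO t VALUES (1, 10);
SELECT xmin, * FROM t;   -- xmin = txid of the inserting transaction

UPDATE t SET v = 20 WHERE id = 1;
SELECT xmin, * FROM t;   -- a NEW row version with a new xmin;
                         -- the old version's xmin was never rewritten
```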

Stefan

On 12.03.2019 20:19, Vijaykumar Jain wrote:

I was asked this question in one of my demos, and it was interesting one.

We update xmin for new inserts with the current txid.
Now, in a very highly concurrent scenario where more than 2000
concurrent users are trying to insert new data,
will updating the xmin value be a bottleneck?

I know we should use pooling solutions to reduce concurrent
connections, but assume we have enough resources to take care of
spawning a new process for each new connection.

Regards,
Vijay






Re: PostgreSQL logical replication slot LSN values

2019-03-12 Thread Rashmi V Bharadwaj
PGReplicationStream.setAppliedLSN is only applicable to physical 
replication, right? So it may not actually make a difference for my logical 
replication program.
Periodically running the utility that sets the flush LSN using 
PGReplicationStream.setFlushedLSN should still work for the feedback mechanism, 
right?

Thanks,
Rashmi

-"Rashmi V Bharadwaj"  wrote: -
To: Andres Freund 
From: "Rashmi V Bharadwaj" 
Date: 13/03/2019 10:59AM
Cc: pgsql-gene...@postgresql.org
Subject: Re: PostgreSQL logical replication slot LSN values
