Re: Question on overall design

2023-12-10 Thread veem v
Thank you so much, Ron.  I have some more doubts related to this.

We were wondering whether there is any utility in PG with which we can
generate a large sample data volume, so that we can run it on our on-premise
Oracle Exadata box and then run the same thing on Aurora PostgreSQL in the
cloud to compare read and write performance. Does any such utility exist for
quickly producing a large data volume? But I think what you are saying is
that we should use real data to get the actual, or closest possible,
benchmarks; please correct me if I am wrong.

In Oracle we use the data dictionary views (the AWR views) to look at
current and historical performance statistics such as CPU, IO and memory
usage, object-level contention, and so on. Is there anything equivalent in
Aurora PostgreSQL that we can use to monitor performance, so that we can see
how well the load test goes and whether we still have headroom or are
saturating the instance?
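
For example, would something along the lines of the below (just my guess at
the equivalent of a top-SQL AWR report; I understand the column names differ
a bit between versions) be the right thing to look at, alongside whatever
console AWS provides?

    SELECT query, calls, rows, total_exec_time, mean_exec_time
    FROM pg_stat_statements
    ORDER BY total_exec_time DESC
    LIMIT 10;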

When you said "*Beware of excessive partitioning.  We had to "departion"
most tables, because of the impact on performance.*", my understanding is
that partitioning reduces IO on reads, because partition pruning means less
data is scanned, and that on writes the overhead of routing the
INSERTed/UPDATEd/DELETEd rows to the right partition is minimal. So could
you please help me understand exactly which performance impact you are
highlighting here? Are you talking about the size of the partitions or the
total number of partitions? In our case we are planning either daily or
hourly range partitions depending on the data volume, and I am not sure
whether there is a sweet spot for partition size in PostgreSQL. If you are
pointing to a high partition count, then in our case, if we retain ~90 days
of data, a transaction table would have ~90 daily range partitions or ~2160
hourly range partitions in Aurora PostgreSQL. It would be helpful if you
could explain a bit more about exactly what performance impact you faced
with partitioning in Aurora PostgreSQL.
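
Roughly, the daily-partition option we have in mind would look something
like the below (just an illustrative, untested sketch with made-up names):

    CREATE TABLE txn (
        txn_id  bigint      NOT NULL,
        txn_ts  timestamptz NOT NULL,
        payload jsonb
    ) PARTITION BY RANGE (txn_ts);

    -- one of the ~90 daily partitions, created/dropped by a scheduled job
    CREATE TABLE txn_2023_12_10 PARTITION OF txn
        FOR VALUES FROM ('2023-12-10') TO ('2023-12-11');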

*"Use ora2pg to export views in the Oracle database. It's very easy; a
small EC2 VM running Linux with enough disk space lets you automate the
extraction from Oracle and importation into AWS Postgresql.)"*

I need to explore this a bit more, I believe. We have an on-premise Oracle
database, so we can move that data directly to Aurora PostgreSQL in the
cloud. The other thing is that we have some sample data available in
Snowflake on AWS, but I am not sure whether a mechanism is available to move
that same data into Aurora PostgreSQL?
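
(One route we were wondering about, purely from a quick read of the AWS docs
and not yet tested, is unloading the Snowflake tables to S3 and then pulling
the files in with the aws_s3 extension on Aurora/RDS PostgreSQL; the
function names and file names below are from memory and would need
verification:)

    CREATE EXTENSION IF NOT EXISTS aws_s3 CASCADE;
    SELECT aws_s3.table_import_from_s3(
        'txn_staging', '', '(format csv)',
        aws_commons.create_s3_uri('my-bucket', 'unload/txn_000.csv', 'us-east-1')
    );

Would that be a reasonable route, or is there something better?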

On Sun, 10 Dec 2023 at 02:27, Ron Johnson  wrote:

> I don't know anything about Aurora, only have experience with RDS
> Postgresql.
>
> We successfully migrated from on-prem Oracle (12c, I think) to RDS
> Postgresql 12, and were very happy: little down time (I take pride in
> designing/implementing that; thanks, ora2pg!), with lower disk (8TB, down
> to 5TB) and CPU usage.
>
> I'm not sure what the TPS was in Oracle, but the server level "we" are on
> (I'm no longer with that company, and don't remember the specifics; 48 vCPU
> / 132 GB RAM, I think, with 10,000 IOPS) is over-powered for *our* needs.
>
> You're going to have to spin up a full-sized instance, import a *lot* of
> real data(*) into a production-identical schema and then run your batch
> load process using test data (or copies of real batches).  That's the only
> way you'll *really* know.
>
> Beware of excessive partitioning.  We had to "departion" most tables,
> because of the impact on performance.
>
> (*) Use ora2pg to export views in the Oracle database.  It's *very* easy;
> a small EC2 VM running Linux with enough disk space lets you automate the
> extraction from Oracle and importation into AWS Postgresql.)
>
> On Sat, Dec 9, 2023 at 3:36 PM veem v  wrote:
>
>> Thank you so much for the response.
>>
>> Got your point, will check if we really need details or summary for the
>> historical data. But it looks like we will need detailed transaction data
>> for ~2 years at least.
>>
>> My understanding was that AWS has two different offerings and "aurora
>> postgresql" is more performant and costlier than "RDS postgresql". Will
>> double check on this though.
>>
>> However, how do we prove that RDS/Aurora PostgreSQL is going to serve our
>> OLTP requirement here, similar to the current on-premise Oracle Exadata?
>> For the OLTP use case we are expecting ~15K TPS write and ~2K TPS read, and
>> the responses for the UI queries are expected to be within sub-second. But
>> yes, as I mentioned, individual transactions will be batched and then
>> written to the database, so this should create less resource consumption
>> and contention.
>>
>> To test whether Aurora PostgreSQL can cater to the above needs (with the
>> expected performance at a nominal cost), how should we test it? As we
>> won't be able to test everything right away, S

Re: Question on overall design

2023-12-10 Thread Ron Johnson
* PG has pgbench; *maybe* you can hack it to work on Oracle.
* If you want to know how well an RDBMS will work on your workload, then
you must provide it with a simulated workload.  Right?
* AWS RDS Postgresql has a dashboard that *might* be similar to AWR.  Or it
might not...
* We departitioned because SELECT statements were *slow*.  All partitions
were scanned, even when the partition key was specified in the WHERE clause.
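
A rough, untested sketch of the first and last points above, with made-up
names: generate_series() cranks out bulk throwaway rows without pgbench, and
EXPLAIN shows whether the planner actually prunes partitions:

    CREATE TABLE txn_test (id bigint, txn_ts timestamptz NOT NULL, amount numeric)
        PARTITION BY RANGE (txn_ts);
    CREATE TABLE txn_test_d1 PARTITION OF txn_test
        FOR VALUES FROM ('2023-12-09') TO ('2023-12-10');
    CREATE TABLE txn_test_d2 PARTITION OF txn_test
        FOR VALUES FROM ('2023-12-10') TO ('2023-12-11');

    -- two days of one-row-per-second sample data
    INSERT INTO txn_test
    SELECT g, timestamptz '2023-12-09' + g * interval '1 second', random() * 100
    FROM generate_series(0, 172799) AS g;

    -- if pruning works, only txn_test_d2 should show up in this plan
    EXPLAIN (ANALYZE)
    SELECT count(*) FROM txn_test
    WHERE txn_ts >= '2023-12-10' AND txn_ts < '2023-12-11';

If you still see every partition in the plan, you're hitting what bit us.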

Re: running \copy through perl dbi ?

2023-12-10 Thread Vincent Veyron
On Fri, 8 Dec 2023 10:45:28 -0500
David Gauthier  wrote:
> 
> I'm trying to run a PG client side "\copy" command from a perl script.  I
> tried using $dbh->do("\\copy ...") but it barfed when it saw the '\'...
> ERROR:  syntax error at or near "\"
> 
> I can do this with a command line approach, attaching to the DB  then run
> using...

Duh! I just realized that what I proposed with system() is a command line
approach.

As David Johnston mentioned, you can use the SQL COPY command.

However, you then need to deal with permissions so that the server may write
the file, so I wonder which approach is the most elegant?
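
To illustrate the difference (untested, with a made-up path): server-side
COPY reads and writes files on the database server's filesystem as the
server process, while \copy is a psql meta-command that handles the file on
the client:

    -- server side: the path is on the database server and must be
    -- writable by the server process
    COPY (SELECT * FROM foo) TO '/var/lib/postgresql/export/foo.csv' WITH (FORMAT csv);

    -- client side: psql only; the file ends up wherever psql is running
    \copy (SELECT * FROM foo) TO 'foo.csv' WITH (FORMAT csv)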


-- 

Kind regards, Vincent Veyron

https://marica.fr
Software for managing legal disputes, contracts, and insurance claims





Re: [EXT]Re: running \copy through perl dbi ?

2023-12-10 Thread Johnson, Bruce E - (bjohnson)

On Dec 10, 2023, at 10:41 AM, Vincent Veyron  wrote:

> On Fri, 8 Dec 2023 10:45:28 -0500
> David Gauthier  wrote:
>
>> I'm trying to run a PG client side "\copy" command from a perl script.  I
>> tried using $dbh->do("\\copy ...") but it barfed when it saw the '\'...
>> ERROR:  syntax error at or near "\"
>>
>> I can do this with a command line approach, attaching to the DB  then run
>> using...
>
> Duh! I just realized that what I proposed with system() is a command line
> approach.
>
> As David Johnston mentioned, you can use the SQL COPY command.

One thing to remember with the Perl DBI is that you can use a string
variable in the $dbh->do() command.

Perl uses two different string variable delimiters:

1) ' ', which is exactly what you enter: $s = '\copy * from foo as json';
will send that to the database without the need for escaping anything
(unless you need to enter an actual ' in the command, in which case method
two is better).

2) " ", which allows declared Perl variables to be substituted in the
string: $table = 'foo'; $type = 'json'; $cmd = '\copy'; $s = "$cmd * from
$table as $type";

Concatenation (periods between strings) works as well:
$s = '\copy ' . "* from foo as json";

Then $dbh->do($s); will work in all three cases.

Been using perl and DBI for (does quick math, ulp!) over 20 years now wrangling 
a lot of things like this.


--
Bruce Johnson
University of Arizona
College of Pharmacy
Information Technology Group

Institutions do not have opinions, merely customs



Re: [EXT]Re: running \copy through perl dbi ?

2023-12-10 Thread Adrian Klaver

On 12/10/23 10:34, Johnson, Bruce E - (bjohnson) wrote:


> One thing to remember with the Perl DBI is that you can use a string
> variable in the $dbh->do() command.
>
> Perl uses two different string variable delimiters:
>
> 1) ' ', which is exactly what you enter: $s = '\copy * from foo as json';
> will send that to the database without the need for escaping anything
> (unless you need to enter an actual ' in the command, in which case method
> two is better).


The issue is that \copy is not an SQL statement; it is a psql (the Postgres
command-line client) meta-command. This means it can only be used in psql.
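
If you want the \copy-style client-side behavior from a driver rather than
from psql, the underlying SQL form is COPY ... FROM STDIN (or TO STDOUT);
DBD::Pg exposes that through its pg_putcopydata()/pg_putcopyend() methods
(check the DBD::Pg docs for the exact usage). In psql it looks roughly like
this, with a made-up table and rows:

    -- everything after the COPY line is data, terminated by \. on its own line
    COPY foo (id, name) FROM STDIN WITH (FORMAT csv);
    1,alpha
    2,beta
    \.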






--
Adrian Klaver
adrian.kla...@aklaver.com