Have you tried to set the instance running on GCP to have similar
shared_buffers as the AWS database ?
What you described has a much lower cache hit rate on GCS and 2X the
shared buffers on AWS which could well explain much of the difference
in execution times.
DETAILS:
Query explain for Postgres on GCP VM:
Buffers: shared hit=423 read=4821
Query explain for Postgres on AWS RDS:
Buffers: shared hit=3290 read=1948
and the configuration :
Instance on VM on GCP (2 vCPUs, 2 GB memory, 800 GB disk):
• shared_buffers: 510920kB (close to 499MB)
Instance managed by RDS (2 vCPUs, 2 GiB RAM, 250GB disk, 750 de IOPS):
• shared_buffers: 943896kB (close to 922MB)
Cheers
Hannu
On Fri, Feb 26, 2021 at 9:16 AM Justin Pitts <[email protected]> wrote:
>
> Since this is a comparison to RDS, and the goal presumably is to make the
> test as even as possible, you will want to pay attention to the network IO
> capacity for the client and the server in both tests.
>
> For RDS, you will be unable to run the client software locally on the server
> hardware, so you should plan to do the same for the GCP comparison.
>
> What is the machine size you are using for your RDS instance? Each machine
> size will specify CPU and RAM along with disk and network IO capacity.
>
> Is your GCP VM where you are running PG ( a GCP VM is the equivalent of an
> EC2 instance, by the way ) roughly equivalent to that RDS instance?
>
> Finally, is the network topology roughly equivalent? Are you performing these
> tests with the same region and/or availability zone?
>
>
>
> On Thu, Feb 25, 2021 at 3:32 PM Philip Semanchuk
> <[email protected]> wrote:
>>
>>
>>
>> > On Feb 25, 2021, at 4:04 PM, Igor Gois <[email protected]> wrote:
>> >
>> > Philip,
>> >
>> > The results in first email in this thread were using explain analyze.
>> >
>> > I thought that you asked to run using only 'explain'. My bad.
>> >
>> > The point is, the execution time with explain analyze is less the 1
>> > second. But the actual execution time (calculated from the python client)
>> > is 24 seconds (aws) and 300+ seconds in gcp
>>
>> Oh OK, sorry, I wasn’t following. Yes, network speed sounds like the source
>> of the problem.
>>
>> Under AWS sometimes we log into an EC2 instance if we have to run a query
>> that generates a lot of data so that both server and client are inside AWS.
>> If GCP has something similar to EC2, it might be an interesting experiment
>> to run your query from there and see how much, if any, that changes the time
>> it takes to get results.
>>
>> Hope this helps
>> Philip
>>
>>
>>
>> >
>> > Em qui., 25 de fev. de 2021 às 17:53, Philip Semanchuk
>> > <[email protected]> escreveu:
>> >
>> >
>> > > On Feb 25, 2021, at 3:46 PM, Igor Gois <[email protected]> wrote:
>> > >
>> > > Hi, Philip
>> > >
>> > > We ran: EXPLAIN (FORMAT JSON) SELECT "Id", "DateTime",
>> > > "SignalRegisterId", "Raw" FROM "SignalRecordsBlobs" WHERE
>> > > "SignalSettingId" = 103 AND "DateTime" BETWEEN
>> > > '2019-11-28T14:00:12.540200000' AND '2020-07-23T21:12:32.249000000';
>> > >
>> > > but it was really fast. I think the results were discarded.
>> >
>> > EXPLAIN and EXPLAIN ANALYZE are different in an important way. EXPLAIN
>> > merely plans the query, EXPLAIN ANALYZE plans *and executes* the query.
>> > From the doc —
>> >
>> > "The ANALYZE option causes the statement to be actually executed, not only
>> > planned....Keep in mind that the statement is actually executed when the
>> > ANALYZE option is used. Although EXPLAIN will discard any output that a
>> > SELECT would return, other side effects of the statement will happen as
>> > usual. “
>> >
>> > https://www.postgresql.org/docs/12/sql-explain.html
>> >
>> >
>> > >
>> > > AWS Execution time select without explain: 24.96505s (calculated in
>> > > python client)
>> > > AWS Execution time select with explain but without analyze: 0.03876s
>> > > (calculated in python client)
>> > >
>> > > https://explain.depesz.com/s/5HRO
>> > >
>> > > Thanks in advance
>> > >
>> > >
>> > > Em qui., 25 de fev. de 2021 às 15:13, Philip Semanchuk
>> > > <[email protected]> escreveu:
>> > >
>> > >
>> > > > On Feb 24, 2021, at 10:11 AM, Igor Gois <[email protected]>
>> > > > wrote:
>> > > >
>> > > > Hi, Julien
>> > > >
>> > > > Your hypothesis about network transfer makes sense. The query returns
>> > > > a big size byte array blobs.
>> > > >
>> > > > Is there a way to test the network speed against the instances? I have
>> > > > access to the network speed in gcp (5 Mb/s), but don't have access in
>> > > > aws rds.
>> > >
>> > > Perhaps what you should run is EXPLAIN ANALYZE SELECT...? My
>> > > understanding is that EXPLAIN ANALYZE executes the query but discards
>> > > the results. That doesn’t tell you the network speed of your AWS
>> > > instance, but it does isolate the query execution speed (which is what I
>> > > think you’re trying to measure) from the network speed.
>> > >
>> > > Hope this is useful.
>> > >
>> > > Cheers
>> > > Philip
>> > >
>> > > >
>> > > >
>> > > > Em qua., 24 de fev. de 2021 às 10:35, Julien Rouhaud
>> > > > <[email protected]> escreveu:
>> > > > Hi,
>> > > >
>> > > > On Wed, Feb 24, 2021 at 6:14 AM Maurici Meneghetti
>> > > > <[email protected]> wrote:
>> > > > >
>> > > > > I have 2 postgres instances created from the same dump (backup), one
>> > > > > on a GCP VM and the other on AWS RDS. The first instance takes 18
>> > > > > minutes and the second one takes less than 20s to run this simples
>> > > > > query:
>> > > > > SELECT "Id", "DateTime", "SignalRegisterId", "Raw" FROM
>> > > > > "SignalRecordsBlobs" WHERE "SignalSettingId" = 103 AND "DateTime"
>> > > > > BETWEEN '2019-11-28T14:00:12.540200000' AND
>> > > > > '2020-07-23T21:12:32.249000000';
>> > > > > I’ve run this query a few times to make sure both should be reading
>> > > > > data from cache.
>> > > > > I expect my postgres on GPC to be at least similar to the one
>> > > > > managed by AWS RDS so that I can work on improvements parallelly and
>> > > > > compare.
>> > > > >
>> > > > > DETAILS:
>> > > > > [...]
>> > > > > Planning time: 456.315 ms
>> > > > > Execution time: 776.976 ms
>> > > > >
>> > > > > Query explain for Postgres on AWS RDS:
>> > > > > [...]
>> > > > > Planning time: 0.407 ms
>> > > > > Execution time: 14.87 ms
>> > > >
>> > > > Those queries were executed in respectively ~1s and ~15ms (one thing
>> > > > to note is that the slower one had less data in cache, which may or
>> > > > may note account for the difference). Does those plans reflect the
>> > > > reality of your slow executions? If yes it's likely due to quite slow
>> > > > network transfer. Otherwise we would need an explain plan from the
>> > > > slow execution, for which auto_explain can help you. See
>> > > > https://www.postgresql.org/docs/11/auto-explain.html for more details.
>> > > >
>> > > >
>> > > > --
>> > > > Att,
>> > > >
>> > > > Igor Gois | Sócio Consultor
>> > > > (48) 99169-9889 | Skype: igor_msg
>> > > > Site | Blog | LinkedIn | Facebook | Instagram
>> > > >
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Att,
>> > >
>> > > Igor Gois | Sócio Consultor
>> > > (48) 99169-9889 | Skype: igor_msg
>> > > Site | Blog | LinkedIn | Facebook | Instagram
>> > >
>> > >
>> >
>> >
>> >
>> > --
>> > Att,
>> >
>> > Igor Gois | Sócio Consultor
>> > (48) 99169-9889 | Skype: igor_msg
>> > Site | Blog | LinkedIn | Facebook | Instagram
>> >
>> >
>>
>>
>>