cvs is updated :-)

Armin Groesslinger heeft op dinsdag, 20 mei 2003 om 21:01 (Europe/Amsterdam) het volgende geschreven:

Hello,

I had a look at the query used in pgsql/dbpgsql.c to calculate the used quota in db_get_quotum_used(). I executed the current query (with "size" instead of
"m.size")

SELECT SUM(messagesize)
FROM messages
WHERE mailbox_idnr IN
     (SELECT mailbox_idnr
      FROM mailboxes
      WHERE owner_idnr = %llu::bigint)
  AND status < 2;

and the equivalent query (which is used in mysql/dbmysql.c)

SELECT SUM(m.messagesize)
FROM messages m, mailboxes mb
WHERE m.mailbox_idnr = mb.mailbox_idnr
     AND mb.owner_idnr = %llu::bigint
     AND m.status < 2;

on an account which had 2155 mails in 52 mailboxes (the whole database
contains 26000 mails in 400 mailboxes). The server is a Celeron 1200MHz with
256 MB RAM.

I asked PostgreSQL 7.3 to EXPLAIN ANALYZE the queries (details see below): PostgreSQL uses two nested sequential scans (on messages and mailboxes) for
the first query and the total execution time is 12144.25 msec.
The second query is executed as a hash join and takes only 154.69 msec.

These results are repeatable and the interactive behaviour of the mail server
is _much_ better with the second query when, e.g., moving mails between
folders.

I've attached a patch to change the query in pgsql/dbpgsql.c to be the same as
the query used in mysql/dbmysql.c .

Is there a reason why the query in pgsql/dbpgsql.c has been formulated
differently, with a sub-select?

Regards,
Armin


Detailed output from "EXPLAIN ANALYZE":

dbmail=# explain analyze SELECT SUM(messagesize) FROM messages WHERE
mailbox_idnr IN (SELECT mailbox_idnr FROM mailboxes WHERE owner_idnr =
10::bigint) AND status < 2;
                                                      QUERY PLAN
----------------------------------------------------------------------- ------------------------------------------------
 Aggregate  (cost=140954.36..140954.36 rows=1 width=8) (actual
time=12144.10..12144.10 rows=1 loops=1)
-> Seq Scan on messages (cost=0.00..140922.85 rows=12601 width=8) (actual
time=11.92..12130.91 rows=2056 loops=1)
         Filter: ((status < 2) AND (subplan))
         SubPlan
-> Seq Scan on mailboxes (cost=0.00..11.03 rows=52 width=8)
(actual time=0.02..0.43 rows=50 loops=25105)
                 Filter: (owner_idnr = 10::bigint)
 Total runtime: 12144.25 msec


dbmail=# explain analyze SELECT SUM(m.messagesize) FROM messages m, mailboxes mb WHERE m.mailbox_idnr = mb.mailbox_idnr AND mb.owner_idnr = 10::bigint AND
m.status < 2;
                                                        QUERY PLAN
----------------------------------------------------------------------- ---------------------------------------------------- Aggregate (cost=856.95..856.95 rows=1 width=24) (actual time=154.51..154.51
rows=1 loops=1)
   ->  Hash Join  (cost=11.16..848.80 rows=3260 width=24) (actual
time=1.35..142.72 rows=2056 loops=1)
         Hash Cond: ("outer".mailbox_idnr = "inner".mailbox_idnr)
-> Seq Scan on messages m (cost=0.00..670.89 rows=25202 width=16)
(actual time=0.10..111.64 rows=25105 loops=1)
               Filter: (status < 2)
-> Hash (cost=11.03..11.03 rows=52 width=8) (actual time=1.11..1.11
rows=0 loops=1)
               ->  Seq Scan on mailboxes mb  (cost=0.00..11.03 rows=52
width=8) (actual time=0.18..1.05 rows=52 loops=1)
                     Filter: (owner_idnr = 10::bigint)
 Total runtime: 154.69 msec
<pg-quota-query.diff>

_________________________
R.A. Rozendaal
IC&S
T: +31 30 2322878
F: +31 30 2322305
www.ic-s.nl

Reply via email to