HDD vs SSD without explanation

2018-01-14 Thread Neto pr
Dear all

Could someone help me analyze the two execution plans below (EXPLAIN ANALYZE
was used)? It is query 9 of the TPC-H benchmark [1].
I'm using an HP server with an Intel Xeon 2.8 GHz 4-core CPU, 8 GB of memory,
a 320 GB 15 Krpm SAS HDD, and a 500 GB Samsung EVO SSD.
My DBMS parameters in postgresql.conf are the defaults, except that on the
SSD I changed random_page_cost = 1.0.
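As a side note, random_page_cost does not have to be changed globally: PostgreSQL allows it to be overridden per tablespace, which is convenient when HDD and SSD tablespaces coexist on one server. A sketch (the database and tablespace names follow this thread; the value 1.0 is the one used above):

```shell
# Lower random_page_cost only for the SSD-backed tablespace, leaving the
# global default (4.0) in effect for tablespaces on spinning disks.
# "tblpgssd" is the SSD tablespace name that appears later in this thread.
psql -d tpch40gnorssd -c "ALTER TABLESPACE tblpgssd SET (random_page_cost = 1.0);"

# Verify the per-tablespace override:
psql -d tpch40gnorssd -c "SELECT spcname, spcoptions FROM pg_tablespace;"
```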

I do not understand why the query runs in half the time on the HDD. To
explain better: on the HDD the query takes about 12 minutes on average,
while on the SSD it takes about 26 minutes.
I think maybe the execution plan performs more write operations, and that is
why the 15 Krpm SAS HDD has been faster.
In any case, I always thought an SSD would be equal or faster, but in this
case (and in four more cases we have here) it lost badly to the HDDs.

Any help in understanding, is welcome

Best Regards
Neto

--- Query execution time on SSD ---
execution 1: 00:23:29
execution 2: 00:28:38
execution 3: 00:27:32
execution 4: 00:27:54
execution 5: 00:27:35
execution 6: 00:26:19
Average: 26 min 54 s

--- Query execution time on HDD ---
execution 1: 00:12:44
execution 2: 00:12:30
execution 3: 00:12:47
execution 4: 00:13:02
execution 5: 00:13:00
execution 6: 00:12:47
Average: 12 min 48 s
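The averages above can be reproduced mechanically; a small sketch that converts the HH:MM:SS timings reported above to seconds and averages them:

```shell
#!/bin/sh
# Average a list of HH:MM:SS execution times, printing the mean in seconds.
avg_times() {
    echo "$@" | tr ' ' '\n' | awk -F: '
        { total += $1*3600 + $2*60 + $3; n++ }
        END { printf "%.1f\n", total / n }'
}

# SSD runs from the thread: prints 1614.5 (about 26 min 54 s).
avg_times 00:23:29 00:28:38 00:27:32 00:27:54 00:27:35 00:26:19

# HDD runs from the thread: prints 768.3 (about 12 min 48 s).
avg_times 00:12:44 00:12:30 00:12:47 00:13:02 00:13:00 00:12:47
```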

-- EXECUTION PLAN (SSD storage) --

Finalize GroupAggregate  (cost=15694362.41..15842178.65 rows=60150 width=66) (actual time=1670577.649..1674717.444 rows=175 loops=1)
  Group Key: nation.n_name, (date_part('year'::text, (orders.o_orderdate)::timestamp without time zone))
  ->  Gather Merge  (cost=15694362.41..15839923.02 rows=120300 width=66) (actual time=1670552.446..1674716.748 rows=525 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Partial GroupAggregate  (cost=15693362.39..15825037.39 rows=60150 width=66) (actual time=1640482.164..1644619.574 rows=175 loops=3)
              Group Key: nation.n_name, (date_part('year'::text, (orders.o_orderdate)::timestamp without time zone))
              ->  Sort  (cost=15693362.39..15709690.19 rows=6531119 width=57) (actual time=1640467.384..1641511.970 rows=4344197 loops=3)
                    Sort Key: nation.n_name, (date_part('year'::text, (orders.o_orderdate)::timestamp without time zone)) DESC
                    Sort Method: external merge  Disk: 319512kB
                    ->  Hash Join  (cost=4708869.23..14666423.78 rows=6531119 width=57) (actual time=1366753.586..1634128.122 rows=4344197 loops=3)
                          Hash Cond: (lineitem.l_suppkey = supplier.s_suppkey)
                          ->  Hash Join  (cost=4683027.67..14400582.74 rows=6531119 width=43) (actual time=1328019.213..1623919.675 rows=4344197 loops=3)
                                Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)
                                ->  Hash Join  (cost=1993678.29..11279593.98 rows=6531119 width=47) (actual time=245906.330..1316201.213 rows=4344197 loops=3)
                                      Hash Cond: ((lineitem.l_suppkey = partsupp.ps_suppkey) AND (lineitem.l_partkey = partsupp.ps_partkey))
                                      ->  Hash Join  (cost=273200.59..9157211.71 rows=6531119 width=45) (actual time=5103.563..1007657.993 rows=4344197 loops=3)
                                            Hash Cond: (lineitem.l_partkey = part.p_partkey)
                                            ->  Parallel Seq Scan on lineitem  (cost=0.00..5861332.93 rows=15093 width=41) (actual time=3.494..842667.110 rows=80004097 loops=3)
                                            ->  Hash  (cost=263919.95..263919.95 rows=565651 width=4) (actual time=4973.807..4973.807 rows=434469 loops=3)
                                                  Buckets: 131072  Batches: 8  Memory Usage: 2933kB
                                                  ->  Seq Scan on part  (cost=0.00..263919.95 rows=565651 width=4) (actual time=11.810..4837.287 rows=434469 loops=3)
                                                        Filter: ((p_name)::text ~~ '%orchid%'::text)
                                                        Rows Removed by Filter: 7565531
                                      ->  Hash  (cost=1052983.08..1052983.08 rows=31999708 width=22) (actual time=240711.936..240711.936 rows=3200 loops=3)
                                            Buckets: 65536  Batches: 512  Memory Usage: 3941kB
                                            ->  Seq Scan on partsupp  (cost=0.00..1052983.08 rows=31999708 width=22) (actual time=0.033..228828.149 rows=3200 loops=3)
                                ->  Hash  (cost=1704962.28..1704962.28 rows=6728 width=8) (actual time=253669.242..253669.242 rows=6000 loops=3)
                                      Buckets: 131072  Batches: 1024  Memory Usage: 3316kB
                                      ->  Seq Scan on orders  (cost=0.00..1704962.28 rows=6728 width=8) (actual time=0.038..237545.226 rows

Re: HDD vs SSD without explanation

2018-01-14 Thread Neto pr
Thanks for the reply.
I'll try to upload the execution plan, with EXPLAIN (ANALYZE, BUFFERS), to
the website https://explain.depesz.com/

I'm running an experiment for scientific research, and this is what I find
strange; explaining better: it is strange that the HDD's performance far
outweighs the SSD's.

Do you think that if I run a VACUUM FULL, performance on the SSD will beat
the 15 Krpm SAS HDD?

Best Regards
Neto
 





2018-01-14 19:40 GMT-02:00 Justin Pryzby :
> On Sun, Jan 14, 2018 at 12:44:00PM -0800, Neto pr wrote:
>> Dear all
>>
>> Someone help me analyze the two execution plans below (Explain ANALYZE
>> used), is the  query 9 of TPC-H benchmark [1].
>>
>> I'm using a server HP Intel Xeon 2.8GHz/4-core - Memory 8GB HDD SAS 320GB
>> 15 Krpm AND SSD Sansung EVO 500GB.
>>
>> I think maybe the execution plan is using more write operations, and so the
>> HDD SAS 15Krpm has been faster.
>
> The query plan is all garbled by mail , could you resend?  Or post a link from
> https://explain.depesz.com/
>
> To see if the query is causing many writes (due to dirty pages, sorts, etc),
> run with explain(analyze,buffers)
>
> But from what I could tell, your problems are here:
>
> ->  Parallel Seq Scan on lineitem  (cost=0.00..5861332.93 rows=15093 
> width=41) (actual TIME=3.494..842667.110 rows=80004097 loops=3)
> vs
> ->  Parallel Seq Scan on lineitem  (cost=0.00..5861333.40 rows=15140 
> width=41) (actual TIME=41.805..224438.909 rows=80004097 loops=3)
>
> ->  Seq Scan on partsupp (cost=0.00..1052983.08 rows=31999708 width=22) 
> (actual TIME=0.033..228828.149 rows=3200 loops=3)
> vs
> ->  Seq Scan on partsupp  (cost=0.00..1052934.38 rows=31994838 width=22) 
> (actual TIME=0.037..37865.003 rows=3200 loops=3)
>
> Can you reproduce the speed difference using dd ?
> time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K 
> skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>
> Or: bonnie++ -f -n0
>
> What OS/kernel are you using?  LVM?  filesystem?  I/O scheduler?  partitions?
> readahead?  blockdev --getra
>
> If you're running under linux, maybe you can just send the output of:
> for a in /sys/block/sdX/queue/*; do echo "$a `cat $a`"; done
> or: tail 
> /sys/block/sdX/queue/{minimum_io_size,optimal_io_size,read_ahead_kb,scheduler,rotational,max_sectors_kb,logical_block_size,physical_block_size}
>
> Justin



Re: HDD vs SSD without explanation

2018-01-14 Thread Neto pr
2018-01-14 13:40 GMT-08:00 Justin Pryzby :
> On Sun, Jan 14, 2018 at 12:44:00PM -0800, Neto pr wrote:
>> Dear all
>>
>> Someone help me analyze the two execution plans below (Explain ANALYZE
>> used), is the  query 9 of TPC-H benchmark [1].
>>
>> I'm using a server HP Intel Xeon 2.8GHz/4-core - Memory 8GB HDD SAS 320GB
>> 15 Krpm AND SSD Sansung EVO 500GB.
>>
>> I think maybe the execution plan is using more write operations, and so the
>> HDD SAS 15Krpm has been faster.
>
> The query plan is all garbled by mail , could you resend?  Or post a link from
> https://explain.depesz.com/
>
> To see if the query is causing many writes (due to dirty pages, sorts, etc),
> run with explain(analyze,buffers)
>
> But from what I could tell, your problems are here:
>
> ->  Parallel Seq Scan on lineitem  (cost=0.00..5861332.93 rows=15093 
> width=41) (actual TIME=3.494..842667.110 rows=80004097 loops=3)
> vs
> ->  Parallel Seq Scan on lineitem  (cost=0.00..5861333.40 rows=15140 
> width=41) (actual TIME=41.805..224438.909 rows=80004097 loops=3)
>
> ->  Seq Scan on partsupp (cost=0.00..1052983.08 rows=31999708 width=22) 
> (actual TIME=0.033..228828.149 rows=3200 loops=3)
> vs
> ->  Seq Scan on partsupp  (cost=0.00..1052934.38 rows=31994838 width=22) 
> (actual TIME=0.037..37865.003 rows=3200 loops=3)
>
> Can you reproduce the speed difference using dd ?
> time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K 
> skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>
> Or: bonnie++ -f -n0
>
> What OS/kernel are you using?  LVM?  filesystem?  I/O scheduler?  partitions?
> readahead?  blockdev --getra

OS = Debian 8 64bits - 3.16.0-4

See below for the disk and filesystem details:
root@hp2ml110deb:/# fdisk -l
Disk /dev/sda: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 26F5EB21-30DB-44E4-B9E2-E8105846B6C4

Device  StartEndSectors   Size Type
/dev/sda1204810506231048576   512M EFI System
/dev/sda2 1050624 1937274879 1936224256 923.3G Linux filesystem
/dev/sda3  1937274880 1953523711   16248832   7.8G Linux swap

Disk /dev/sdb: 465.8 GiB, 500107862016 bytes, 976773168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


The DBMS and the users' tablespace are installed on /dev/sdb, the SSD.
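To double-check which block device actually backs the database directory, and whether the kernel sees it as rotational, something like the following can be used (the mount point is the one from this thread):

```shell
# Which device backs the PostgreSQL directory? (mount point from this thread)
df -P /media/ssd500gb | awk 'NR==2 {print $1}'

# ROTA=0 means the kernel treats the device as non-rotational (SSD),
# ROTA=1 means rotational (HDD); /dev/sdb should show ROTA=0 here.
lsblk -o NAME,ROTA,SIZE,MOUNTPOINT /dev/sdb
```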

> If you're running under linux, maybe you can just send the output of:
> for a in /sys/block/sdX/queue/*; do echo "$a `cat $a`"; done
> or: tail 
> /sys/block/sdX/queue/{minimum_io_size,optimal_io_size,read_ahead_kb,scheduler,rotational,max_sectors_kb,logical_block_size,physical_block_size}
>
> Justin



Re: HDD vs SSD without explanation

2018-01-14 Thread Neto pr
2018-01-14 15:59 GMT-08:00 Neto pr :
> Thanks for the reply.
> I'll try upload the execution plan with Explain (analyse, buffer) for
> website:  https://explain.depesz.com/
>

Below is a new execution plan, with ANALYZE and BUFFERS. This time, without
changing anything in the DBMS configuration, I just restarted the DBMS, and
a time of 16 minutes was obtained, against the 26 minutes of another
execution. But it still has not managed to beat the execution time on the
15 Krpm SAS HDD.
I was not able to upload to the site, because I'm saving the execution plan
in the database, and when I retrieve it, it loses the line breaks, so the
site does not accept the upload.
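One way to avoid losing the line breaks is to write the plan straight to a text file from psql, instead of round-tripping it through a table. A sketch (the database name follows this thread; the query file name is an assumption):

```shell
# Run the EXPLAIN directly from psql and capture the plan into a file,
# preserving line breaks, so it can be uploaded to explain.depesz.com.
# q9.sql is assumed to contain the TPC-H Q9 statement prefixed with
# "EXPLAIN (ANALYZE, BUFFERS)".
psql -X -qAt -d tpch40gnorssd -f q9.sql -o plan_ssd.txt

wc -l plan_ssd.txt   # sanity check: the plan should span many lines
```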


--- Execution plan with BUFFERS, executed on SSD storage ---

Finalize GroupAggregate  (cost=1588.33..15980046.69 rows=60150 width=66) (actual time=969248.287..973686.679 rows=175 loops=1)
  Group Key: nation.n_name, (date_part('year'::text, (orders.o_orderdate)::timestamp without time zone))
  Buffers: shared hit=1327602 read=2305013, temp read=1183857 written=1180940
  ->  Gather Merge  (cost=1588.33..15977791.06 rows=120300 width=66) (actual time=969222.164..973685.582 rows=525 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=1327602 read=2305013, temp read=1183857 written=1180940
        ->  Partial GroupAggregate  (cost=15821228.31..15962905.44 rows=60150 width=66) (actual time=941985.137..946403.344 rows=175 loops=3)
              Group Key: nation.n_name, (date_part('year'::text, (orders.o_orderdate)::timestamp without time zone))
              Buffers: shared hit=3773802 read=7120852, temp read=3550293 written=3541542
              ->  Sort  (cost=15821228.31..15838806.37 rows=7031225 width=57) (actual time=941954.595..943119.850 rows=4344197 loops=3)
                    Sort Key: nation.n_name, (date_part('year'::text, (orders.o_orderdate)::timestamp without time zone)) DESC
                    Sort Method: external merge  Disk: 320784kB
                    Buffers: shared hit=3773802 read=7120852, temp read=3550293 written=3541542
                    ->  Hash Join  (cost=4708859.28..14719466.13 rows=7031225 width=57) (actual time=619996.638..933725.615 rows=4344197 loops=3)
                          Hash Cond: (lineitem.l_suppkey = supplier.s_suppkey)
                          Buffers: shared hit=3773732 read=7120852, temp read=3220697 written=3211409
                          ->  Hash Join  (cost=4683017.71..14434606.65 rows=7071075 width=43) (actual time=579893.395..926348.061 rows=4344197 loops=3)
                                Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)
                                Buffers: shared hit=3758207 read=7108695, temp read=3114271 written=3105025
                                ->  Hash Join  (cost=1993687.71..11297331.33 rows=7071075 width=47) (actual time=79741.803..805259.856 rows=4344197 loops=3)
                                      Hash Cond: ((lineitem.l_suppkey = partsupp.ps_suppkey) AND (lineitem.l_partkey = partsupp.ps_partkey))
                                      Buffers: shared hit=1754251 read=5797780, temp read=2369849 written=2366741
                                      ->  Hash Join  (cost=273201.71..9157213.44 rows=7071075 width=45) (actual time=5363.078..672302.517 rows=4344197 loops=3)
                                            Hash Cond: (lineitem.l_partkey = part.p_partkey)
                                            Buffers: shared hit=325918 read=5027133, temp read=1742658 written=1742616
                                            ->  Parallel Seq Scan on lineitem  (cost=0.00..5861333.20 rows=15120 width=41) (actual time=0.129..536226.436 rows=80004097 loops=3)
                                                  Buffers: shared hit=2 read=4861280
                                            ->  Hash  (cost=263921.00..263921.00 rows=565657 width=4) (actual time=5362.100..5362.100 rows=434469 loops=3)
                                                  Buckets: 131072  Batches: 8  Memory Usage: 2933kB
                                                  Buffers: shared hit=325910 read=165853, temp written=3327
                                                  ->  Seq Scan on part  (cost=0.00..263921.00 rows=565657 width=4) (actual time=0.025..5279.959 rows=434469 loops=3)
                                                        Filter: ((p_name)::text ~~ '%orchid%'::text)
                                                        Rows Removed by Filter: 7565531
                                                        Buffers: shared hit=325910 read=165853
                                      ->  Hash  (cost=1052986.00..1052986.00 rows=3200 width=22) (actual time=74231.061..74231.061 rows=3200 loops=3)
                                            Buckets: 65536  Batches: 512  Memory Usage: 3941kB
                                            Buffers: shared hit=1428311 read=770647, temp written=513846
                                            ->  Seq Scan on partsupp  (cost=0.00..1052986.00 rows=3200 width=22) (actual time=0.037..66316.652 rows=3200 loops=3)
                                                  Buffers: shared hit=1428311 read=770647
                                ->  Hash  (cost=1704955

Re: HDD vs SSD without explanation

2018-01-15 Thread Neto pr
2018-01-14 19:09 GMT-08:00 Justin Pryzby :
> On Sun, Jan 14, 2018 at 06:25:40PM -0800, Neto pr wrote:
>> > The query plan is all garbled by mail , could you resend?  Or post a link 
>> > from
>> > https://explain.depesz.com/
>
> On Sun, Jan 14, 2018 at 06:36:02PM -0800, Neto pr wrote:
>> I was not able to upload to the site, because I'm saving the execution
>> plan in the database, and when I retrieve it, it loses the line breaks,
>
> That's why it's an issue for me, too..
>
>> > What OS/kernel are you using?  LVM?  filesystem?  I/O scheduler?  
>> > partitions?
>>
>> See below the Disk FileSystem 
>> root@hp2ml110deb:/# fdisk -l
>> Disk /dev/sda: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
>>
>> Disk /dev/sdb: 465.8 GiB, 500107862016 bytes, 976773168 sectors
>> Units: sectors of 1 * 512 = 512 bytes
>> Sector size (logical/physical): 512 bytes / 512 bytes
>> I/O size (minimum/optimal): 512 bytes / 512 bytes
>> ----
> What about sdb partitions/FS?

I used the EXT4 filesystem on Debian.

>
> On Sun, Jan 14, 2018 at 06:25:40PM -0800, Neto pr wrote:
>> The DBMS and tablespace of users is installed in /dev/sdb  SSD.
>
> Is that also a temp_tablespace ?  Or are your hashes spilling to HDD instead ?
>

How can I find out where my temp_tablespace is?
The \db+ command (see below) does not show the location. I installed the
DBMS on the SSD, but how can I find out the exact location of the
temp_tablespace?


tpch40gnorssd=# \db+
                                     List of tablespaces
    Name    |  Owner   |            Location            | Access privileges | Options |  Size  | Description
------------+----------+--------------------------------+-------------------+---------+--------+-------------
 pg_default | postgres |                                |                   |         | 21 MB  |
 pg_global  | postgres |                                |                   |         | 573 kB |
 tblpgssd   | postgres | /media/ssd500gb/dados/pg101ssd |                   |         | 206 GB |
(3 rows)
--
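For reference: when temp_tablespaces is empty, PostgreSQL spills sort and hash temporary files under the data directory, in base/pgsql_tmp, so they land on whatever device holds PGDATA. A small sketch (the data directory path is the one reported later in this thread):

```shell
#!/bin/sh
# When temp_tablespaces is empty, temporary files go to
# <data_directory>/base/pgsql_tmp. Derive that path from the data
# directory, then check which device it lives on.
pg_temp_dir() {
    echo "$1/base/pgsql_tmp"
}

tmpdir=$(pg_temp_dir /media/ssd500gb/opt/pgv101norssd/data)
echo "$tmpdir"
# df -P "$tmpdir" would then show the backing device (sdb if on the SSD).
```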

> Group Key: nation.n_name, (date_part(_year_::text, 
> (orders.o_orderdate)::timestamp without time zone))
> Buffers: shared hit=3773802 read=7120852, temp read=3550293 written=3541542
>
> Are your SSD being used for anything else ?
>
> What about these?
>
>> > readahead?  blockdev --getra
>

About whether the SSD is being used by another process: I will still run
the suggested commands and send the results.

But I can say that the SSD is used only by the DBMS.
Explaining better: my server has an HDD and an SSD. The Debian OS is
installed on the HDD, and the DBMS and the data tablespace are installed
on the SSD.
The server is dedicated to the DBMS, and nothing else runs while I execute
the queries. I still cannot understand how an HDD can be faster than an SSD.
I ran the queries again on the SSD and the results were not good; see:

execution 1- 00:16:13
execution 2- 00:25:30
execution 3- 00:28:09
execution 4- 00:24:33
execution 5- 00:24:38

Regards
Neto
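The readahead setting Justin asked about can be read, and experimented with, like this. The device names follow this thread; the 8192-sector value is only an example for experimentation, not a recommendation:

```shell
# Read the current readahead window, in 512-byte sectors, for both drives.
blockdev --getra /dev/sda   # HDD
blockdev --getra /dev/sdb   # SSD

# Temporarily raise readahead on the SSD (example value; not persistent
# across reboots). Large sequential scans such as TPC-H Q9 are sensitive
# to this setting.
blockdev --setra 8192 /dev/sdb
```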




>> > If you're running under linux, maybe you can just send the output of:
>> > for a in /sys/block/sdX/queue/*; do echo "$a `cat $a`"; done
>> > or: tail 
>> > /sys/block/sdX/queue/{minimum_io_size,optimal_io_size,read_ahead_kb,scheduler,rotational,max_sectors_kb,logical_block_size,physical_block_size}
>
>> > Can you reproduce the speed difference using dd ?
>> > time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K 
>> > skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>> >
>> > Or: bonnie++ -f -n0
>
> Justin



Re: HDD vs SSD without explanation

2018-01-15 Thread Neto pr
2018-01-15 3:04 GMT-08:00 Neto pr :
> 2018-01-14 19:09 GMT-08:00 Justin Pryzby :
>> On Sun, Jan 14, 2018 at 06:25:40PM -0800, Neto pr wrote:
>>> > The query plan is all garbled by mail , could you resend?  Or post a
link from
>>> > https://explain.depesz.com/
>>
>> On Sun, Jan 14, 2018 at 06:36:02PM -0800, Neto pr wrote:
>>> I was not able to upload to the site, because I'm saving the execution
>>> plan in the database, and when I retrieve it, it loses the line breaks,
>>
>> That's why it's an issue for me, too..
>>
>>> > What OS/kernel are you using?  LVM?  filesystem?  I/O scheduler?
 partitions?
>>>
>>> See below the Disk FileSystem 
>>> root@hp2ml110deb:/# fdisk -l
>>> Disk /dev/sda: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
>>>
>>> Disk /dev/sdb: 465.8 GiB, 500107862016 bytes, 976773168 sectors
>>> Units: sectors of 1 * 512 = 512 bytes
>>> Sector size (logical/physical): 512 bytes / 512 bytes
>>> I/O size (minimum/optimal): 512 bytes / 512 bytes
>>>
--------
>> What about sdb partitions/FS?
>
> I used EXT4 filesystem in Debian SO.
>
>>
>> On Sun, Jan 14, 2018 at 06:25:40PM -0800, Neto pr wrote:
>>> The DBMS and tablespace of users is installed in /dev/sdb  SSD.
>>
>> Is that also a temp_tablespace ?  Or are your hashes spilling to HDD
instead ?
>>
>
> How can I find out where my temp_tablesapce is?
> With the command \db+ (see below) does not show the location. But the
> DBMS I asked to install inside the SSD, but how can I find out the
> exact location of the temp_tablespace ?
>
>

> tpch40gnorssd=# \db+
>  List of tablespaces
> Name|  Owner   |Location| Access
> privileges | Options |  Size  | Description
>
+--++---+-++-
>  pg_default | postgres ||
>  | | 21 MB  |
>  pg_global  | postgres ||
>  | | 573 kB |
>  tblpgssd   | postgres | /media/ssd500gb/dados/pg101ssd |
>  | | 206 GB |
> (3 rows)
>
--
>

I checked that the temporary tablespace (pg_default) is on the SSD: running
SHOW temp_tablespaces in psql returns empty, and according to the
documentation,
https://www.postgresql.org/docs/current/static/runtime-config-client.html#GUC-TEMP-TABLESPACES
temporary files then go to the default location under the data directory,
where I installed the DBMS: /media/ssd500gb/opt/pgv101norssd/data.

The server where I executed the query on the SAS HDD is not the same one
where I executed it on the SSD, but they are identical servers (HP ProLiant
ML110), with the same model and configuration; only the disks differ, see:

Server 1
- SAS HDD, 15 Krpm, 320 GB (where the Debian OS and PostgreSQL are installed)

Server 2
- Samsung Evo SSD, 500 GB (where PostgreSQL is installed)
- SATA HDD, 7,500 rpm, 1 TB (where the Debian OS is installed)


>> Group Key: nation.n_name, (date_part(_year_::text,
(orders.o_orderdate)::timestamp without time zone))
>> Buffers: shared hit=3773802 read=7120852, temp read=3550293
written=3541542
>>
>> Are your SSD being used for anything else ?
>>
>> What about these?
>>
>>> > readahead?  blockdev --getra
>>
>
> About knowing if the SSD is being used by another process, I will
> still execute the command and send the result.
>
> But I can say that the SSD is only used by the DBMS.
> Explaining better, My server has an HDD and an SSD. The Debian OS is
> installed on the HDD and I installed the DBMS inside the SSD and the
> data tablespace also inside the SSD .
> The server is dedicated to the DBMS and when I execute the queries,
> nothing else is executed. I still can not understand how an HDD is
> faster than an SSD.
> I ran queries again on the SSD and the results were not good see:
>
> execution 1- 00:16:13
> execution 2- 00:25:30
> execution 3- 00:28:09
> execution 4- 00:24:33
> execution 5- 00:24:38
>
> Regards
> Neto
>
>
>
>
>>> > If you're running under linux, maybe you can just send the output of:
>>> > for a in /sys/block/sdX/queue/*; do echo "$a `cat $a`"; done
>>> > or: tail
/sys/block/sdX/queue/{minimum_io_size,optimal_io_size,read_ahead_kb,scheduler,rotational,max_sectors_kb,logical_block_size,physical_block_size}
>>
>>> > Can you reproduce the speed difference using dd ?
>>> > time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K
skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>>> >
>>> > Or: bonnie++ -f -n0
>>
>> Justin


Re: HDD vs SSD without explanation

2018-01-15 Thread Neto pr
Hi Georg,
I believe your answer has revealed the real problem.
I looked at the specifications of my SATA SSD and of my SAS HDD, and saw
that the SAS interface runs at 12 Gb/s versus the SSD's 6 Gb/s.

SSD: Samsung 500 GB SATA III 6 Gb/s - Model: 850 Evo
http://www.samsung.com/semiconductor/minisite/ssd/product/consumer/850evo/

HDD: HPE 300GB 12G SAS Part-Number: 737261-B21
https://h20195.www2.hpe.com/v2/GetPDF.aspx%2Fc04111744.pdf

I see that the SAS bandwidth is double, and I suspect that is the reason
for the performance difference.

Another question: if I instead compare the HDD below, a 6 Gb/s SAS drive
with the same transfer rate as the 6 Gb/s SATA SSD, do you think the SSD
would be faster in that case?
HDD: HP 450GB 6G SAS 15K rpm LFF (3.5-inch) Part-Number: 652615-B21

best Regards
Neto

2018-01-15 16:32 GMT-02:00 Georg H. :
>
> Hello Neto
>
> Am 14.01.2018 um 21:44 schrieb Neto pr:
>>
>> Dear all
>>
>> Someone help me analyze the two execution plans below (Explain ANALYZE
>> used), is the  query 9 of TPC-H benchmark [1].
>> I'm using a server HP Intel Xeon 2.8GHz/4-core - Memory 8GB HDD SAS 320GB
>> 15 Krpm AND SSD Sansung EVO 500GB.
>> My DBMS parameters presents in postgresql.conf is default, but in SSD I
>> have changed random_page_cost = 1.0.
>>
> you are comparing a SAS Drive against a SATA SSD. Their interfaces serve a
> completely different bandwidth.
> While a SAS-3 device does 12 Gbit/s  SATA-3 device  is only able to transfer
> 6 Gbit/s  (a current SAS-4 reaches 22.5 Gbit/s)
> Do a short research on SAS vs SATA and then use a SAS SSD for comparison :)
>
> regards
> Georg
>



Re: HDD vs SSD without explanation

2018-01-15 Thread Neto pr
2018-01-15 17:55 GMT-02:00 Fernando Hevia :
>
>
> 2018-01-15 15:32 GMT-03:00 Georg H. :
>>
>>
>> Hello Neto
>>
>> Am 14.01.2018 um 21:44 schrieb Neto pr:
>>>
>>> Dear all
>>>
>>> Someone help me analyze the two execution plans below (Explain ANALYZE
>>> used), is the  query 9 of TPC-H benchmark [1].
>>> I'm using a server HP Intel Xeon 2.8GHz/4-core - Memory 8GB HDD SAS 320GB
>>> 15 Krpm AND SSD Sansung EVO 500GB.
>>> My DBMS parameters presents in postgresql.conf is default, but in SSD I
>>> have changed random_page_cost = 1.0.
>>>
>> you are comparing a SAS Drive against a SATA SSD. Their interfaces serve a
>> completely different bandwidth.
>> While a SAS-3 device does 12 Gbit/s  SATA-3 device  is only able to
>> transfer 6 Gbit/s  (a current SAS-4 reaches 22.5 Gbit/s)
>> Do a short research on SAS vs SATA and then use a SAS SSD for comparison
>> :)
>
>
> The query being all read operations both drives should perform somewhat
> similarly. Therefore, either the SAS drive has some special sauce to it
> (a.k.a very fast built-in cache) or there is something else going on these
> systems. Otherwise he shouldn't be stressing the 6 Gbit/s interface limit
> with a single drive, be that the SATA or the SAS drive.
>
> Neto, you have been suggested to provide a number of command outputs to know
> more about your system. Testing the raw read throughput of both your drives
> should be first on your list.
>


Guys, sorry for the top post, I forgot.

Fernando, I think the difference between the SAS interface at 12 Gb/s and
the SATA interface at 6 Gb/s is what caused the difference in query
execution time, because looking at the execution plans and the cost
estimates, I did not see many differences in access methods, among other
things.
Regarding the query: none of the runs use indexes, since this was a first
test without indexes.
Do you think that if I compare the HDD below, a 6 Gb/s SAS drive with the
same transfer rate as the 6 Gb/s SATA SSD, the SSD would be faster in that
case?

HDD: HP 450GB 6G SAS 15K rpm LFF (3.5-inch) Part-Number: 652615-B21

Neto

> Cheers.
>
>
>
>



Re: HDD vs SSD without explanation

2018-01-15 Thread Neto pr
2018-01-15 16:18 GMT-08:00 Fernando Hevia :
>
>
> 2018-01-15 20:25 GMT-03:00 Neto pr :
>>
>> 2018-01-15 17:55 GMT-02:00 Fernando Hevia :
>> >
>> >
>> > 2018-01-15 15:32 GMT-03:00 Georg H. :
>> >>
>> >>
>> >> Hello Neto
>> >>
>> >> Am 14.01.2018 um 21:44 schrieb Neto pr:
>> >>>
>> >>> Dear all
>> >>>
>> >>> Someone help me analyze the two execution plans below (Explain ANALYZE
>> >>> used), is the  query 9 of TPC-H benchmark [1].
>> >>> I'm using a server HP Intel Xeon 2.8GHz/4-core - Memory 8GB HDD SAS
>> >>> 320GB
>> >>> 15 Krpm AND SSD Sansung EVO 500GB.
>> >>> My DBMS parameters presents in postgresql.conf is default, but in SSD
>> >>> I
>> >>> have changed random_page_cost = 1.0.
>> >>>
>> >> you are comparing a SAS Drive against a SATA SSD. Their interfaces
>> >> serve a
>> >> completely different bandwidth.
>> >> While a SAS-3 device does 12 Gbit/s  SATA-3 device  is only able to
>> >> transfer 6 Gbit/s  (a current SAS-4 reaches 22.5 Gbit/s)
>> >> Do a short research on SAS vs SATA and then use a SAS SSD for
>> >> comparison
>> >> :)
>> >
>> >
>> > The query being all read operations both drives should perform somewhat
>> > similarly. Therefore, either the SAS drive has some special sauce to it
>> > (a.k.a very fast built-in cache) or there is something else going on
>> > these
>> > systems. Otherwise he shouldn't be stressing the 6 Gbit/s interface
>> > limit
>> > with a single drive, be that the SATA or the SAS drive.
>> >
>> > Neto, you have been suggested to provide a number of command outputs to
>> > know
>> > more about your system. Testing the raw read throughput of both your
>> > drives
>> > should be first on your list.
>> >
>>
>>
>> Guys, sorry for the Top Post, I forgot 
>>
>> Fernando, I think the difference of 6 Gb/s to 12 Gb/s from SAS is what
>> caused the difference in query execution time.
>> Because looking at the execution plans and the cost estimate, I did
>> not see many differences, in methods of access among other things.
>> Regarding the query, none of them use indexes, since I did a first
>> test without indexes.
>> Do you think that if I compare the disk below HDD SAS that has a
>> transfer rate of 6Gb/s equal to the SSD SATA 6Gb/s, do you think the
>> SSD would be more agile in this case?
>>
>> HDD: HP 450GB 6G SAS 15K rpm LFF (3.5-inch) Part-Number: 652615-B21
>>
>> Neto
>
>
> The 6 Gb/s interface is capable of a maximum throughput of around 600 MB/s.
> None of your drives can achieve that, so I don't think you are limited by the
> interface speed. The 12 Gb/s interface speed advantage kicks in when there
> are several drives installed, and it won't make a difference in a single
> drive or even a two drive system.
>
> But don't take my word for it. Test your drives throughput with the command
> Justin suggested so you know exactly what each drive is capable of:
>
>> Can you reproduce the speed difference using dd ?
>> time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K
>> skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>
>
> While common sense says SSD drive should outperform the mechanical one, your
> test scenario (large volume sequential reads) evens out the field a lot.
> Still I would have expected somewhat similar results in the outcome, so yes,
> it is weird that the SAS drive doubles the SSD performance. That is why I
> think there must be something else going on during your tests on the SSD
> server. It can also be that the SSD isn't working properly or you are
> running an suboptimal OS+server+controller configuration for the drive.

Ok.

Can you help me analyze the output of the command: dd if=/dev/sdX
of=/dev/null bs=1M count=32K skip=$((128*$RANDOM/32)) # set bs to
optimal_io_size
I left a heavy query running in the DBMS and ran the timed dd command
three times in each environment (SAS HDD and SATA SSD). See below: the
SSD reached 412, 325 and 120 MB/s, while the SAS HDD reached 183, 176
and 183 MB/s. It is strange that, in the end, the SAS HDD can still
execute the query faster. Is there something else to analyze in the
output below?

--- SAS HDD 320 GB, 12 Gb/s ---
root@deb:/etc#  time sudo dd if=/dev/sda2 of=/dev/null bs=1M count=32K
skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
32768+

Re: HDD vs SSD without explanation

2018-01-15 Thread Neto pr
2018-01-15 17:58 GMT-08:00 Justin Pryzby :
> On Mon, Jan 15, 2018 at 05:19:59PM -0800, Neto pr wrote:
>> >> Can you reproduce the speed difference using dd ?
>> >> time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K
>> >> skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>> >
>> > Still I would have expected somewhat similar results in the outcome, so 
>> > yes,
>> > it is weird that the SAS drive doubles the SSD performance. That is why I
>> > think there must be something else going on during your tests on the SSD
>> > server. It can also be that the SSD isn't working properly or you are
>> > running an suboptimal OS+server+controller configuration for the drive.
>>
>> Ok.
>>
>> Can you help me to analyze the output of the command: dd if=/dev/sdX
>> of=/dev/null bs=1M count=32K skip=$((128*$RANDOM/32)) # set bs to
>> optimal_io_size
>
> You should run the "dd" without the DB or anything else using the drive.  That
> gets peformance of the drive, without the DB.

Oh, an important observation.

>
> You should probably rerun the "dd" command using /dev/sdb1 if there's an
> partition table on top (??).
>
> I'm still wondering about these:

See below:

--- SSD SATA 500 GB, 6 Gb/s ---
root@hp2ml110deb:/etc# time sudo dd if=/dev/sdb of=/dev/null bs=1M
count=32K skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 71.0047 s, 484 MB/s

real1m11.109s
user0m0.008s
sys 0m16.584s
root@hp2ml110deb:/etc# time sudo dd if=/dev/sdb of=/dev/null bs=1M
count=32K skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 70.937 s, 484 MB/s

real1m11.089s
user0m0.012s
sys 0m16.312s
root@hp2ml110deb:/etc#


--- HDD SAS 300 GB, 12 Gb/s ---
root@deb:/home/user1#  time sudo dd if=/dev/sda2 of=/dev/null bs=1M
count=32K skip=$((128*$RANDOM/32)) # set bs to optimal_io_size

32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 147.232 s, 233 MB/s

real2m27.277s
user0m0.036s
sys 0m23.096s
root@deb:/home/user1#
root@deb:/home/user1#  time sudo dd if=/dev/sda2 of=/dev/null bs=1M
count=32K skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 153.698 s, 224 MB/s

real2m33.766s
user0m0.032s
sys 0m22.812s
root@deb:/home/user1#
-  END
---

I had not mentioned this before, but my SAS HDD is connected to the HBA
controller through a SATA adapter: the proper SAS cable kit was not
available at the supplier, so they shipped the SAS HDD with a SATA
adapter. I found it strange that the sequential speed of the SAS drive is
below the SSD's, and yet it executes the query much faster.



>
> On Sun, Jan 14, 2018 at 09:09:41PM -0600, Justin Pryzby wrote:
>> What about sdb partitions/FS?
>
>> > > readahead?  blockdev --getra
>>
>> > > If you're running under linux, maybe you can just send the output of:
>> > > for a in /sys/block/sdX/queue/*; do echo "$a `cat $a`"; done
>> > > or: tail 
>> > > /sys/block/sdX/queue/{minimum_io_size,optimal_io_size,read_ahead_kb,scheduler,rotational,max_sectors_kb,logical_block_size,physical_block_size}
>
> Justin



Re: HDD vs SSD without explanation

2018-01-16 Thread Neto pr
2018-01-15 20:04 GMT-08:00 Mark Kirkwood :
> On 16/01/18 13:18, Fernando Hevia wrote:
>
>>
>>
>>
>> The 6 Gb/s interface is capable of a maximum throughput of around 600
>> MB/s. None of your drives can achieve that, so I don't think you are limited
>> by the interface speed. The 12 Gb/s interface speed advantage kicks in when
>> there are several drives installed; it won't make a difference in a
>> single-drive or even a two-drive system.
>>
>> But don't take my word for it. Test your drives throughput with the
>> command Justin suggested so you know exactly what each drive is capable of:
>>
>> Can you reproduce the speed difference using dd ?
>> time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K
>> skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>>
>>
>> While common sense says an SSD drive should outperform the mechanical one,
>> your test scenario (large-volume sequential reads) evens out the field a
>> lot. Still, I would have expected somewhat similar results in the outcome, so
>> yes, it is weird that the SAS drive doubles the SSD performance. That is why
>> I think there must be something else going on during your tests on the SSD
>> server. It can also be that the SSD isn't working properly or you are
>> running a suboptimal OS+server+controller configuration for the drive.
>>
>
> I would second the analysis above - unless you see your read MB/s slammed up
> against 580-600MB/s continuously then the interface speed is not the issue.
> We have some similar servers that we replaced 12x SAS with 1x SATA 6 GBit/s
> (Intel DC S3710) SSD...and the latter way outperforms the original 12 SAS
> drives.
>
> I suspect the problem is the particular SSD you have - I have benchmarked
> the 256GB EVO variant and was underwhelmed by the performance. These
> (budget) triple-cell NAND SSDs seem to have highly variable read and write
> performance (the write is all about when the SLC NAND cache gets
> full)...the read I'm not so sure of - but it could be a crappy chipset/firmware
> combination. In short I'd recommend *not* using that particular SSD for a
> database workload. I'd recommend one of the Intel Datacenter DC range (FWIW
> I'm not affiliated with Intel in any way...but their DC stuff works well).
>
> regards
>
> Mark

Hi Mark
In other forums someone told me that on a Samsung EVO the partition
should be aligned to sector 3072, not the default 2048, so that it
starts on an erase-block boundary, and that the filesystem block should
be 8 kB. I am studying this too. Some DBAs have reported, in other
situations, that SSDs become very slow when they are nearly full. Mine
is 85% full, so maybe that is also influencing the result. I'm
disappointed with this Samsung SSD because, in theory, an SSD's read
speed should be more than 300 times faster than an HDD's, and that is
not happening here.
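The alignment advice above is just arithmetic; here is a minimal sketch
(assuming a 1536 KiB erase block, which is the premise behind the
3072-sector figure) that checks whether a partition's start sector lands
on an erase-block boundary:

```shell
# Assumption: erase block of 1536 KiB; the real value depends on the drive.
# Partition start sectors are in 512-byte units (as reported by fdisk -l).
erase_block=$((1536 * 1024))
for start in 2048 3072; do
    offset=$((start * 512))
    if [ $((offset % erase_block)) -eq 0 ]; then
        echo "start sector $start: on an erase-block boundary"
    else
        echo "start sector $start: NOT aligned"
    fi
done
```

The actual start sector comes from `fdisk -l /dev/sdX`; `parted /dev/sdX
align-check opt 1` performs a similar check against the reported optimal
I/O size.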

regards
Neto



query execution time (with cache)

2018-01-21 Thread Neto pr
Hi all,
I need to know the actual execution time of a query, assuming the data
is already cached. I also need to make sure that cached data from other
queries is cleared first.
I believe that, to measure the real time of a query, it is necessary to
"warm up" the cache with the data the query reads.

Below are the steps a DBA suggested to me:

Step 1 - run ANALYZE on all tables involved before the test;
Step 2 - restart the DBMS (to clear the DBMS cache);
Step 3 - clear the OS cache;
Step 4 - execute the same query at least 5 times.

After the executions, take as the actual execution time the run whose
time is the median among all of them.

Example:

Execution 1: 07m 58s
Execution 2: 14m 51s
Execution 3: 17m 59s
Execution 4: 17m 55s
Execution 5: 17m 07s

In this case to calculate the median, you must first order each
execution by its time:
Execution 1: 07m 58s
Execution 2: 14m 51s
Execution 5: 17m 07s
Execution 4: 17m 55s
Execution 3: 17m 59s

In this example the median would be execution 5 (17m 07s). Could
someone tell me if this is a good strategy?
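As a sanity check, the median pick described above can be computed
mechanically; a small sketch using the five run times:

```shell
# Convert each run time (mm:ss) to seconds, sort numerically, and take
# the middle (3rd of 5) value.
times="07:58 14:51 17:59 17:55 17:07"
median=$(echo "$times" | tr ' ' '\n' \
    | awk -F: '{ print $1 * 60 + $2 }' \
    | sort -n | sed -n '3p')
echo "median: ${median}s"    # 1027 s = 17m 07s, i.e. execution 5
```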
Since this is scientific work, a reference to any article or book on
this subject would also be very useful.

Best Regards
Neto



help in analysis of execution plans

2018-05-05 Thread Neto pr
Dear all

Could you help me understand these two execution plans for the same
query (query 3 benchmark TPCH www.tpc.org/tpch), executed in two
different environments of Postgresql, as described below. These plans
were generated by the EXPLAIN ANALYZE command, and the time of plan 1
was 4.7 minutes and plan 2 was 2.95 minutes.

Execution Plan 1 (query execution time 4.7 minutes):
- https://explain.depesz.com/s/Ughh
- Postgresql version 10.1 (default) with index on l_shipdate (table lineitem)

Execution Plan 2 (query execution time 2.95 minutes):
- https://explain.depesz.com/s/7Zb7
- Postgresql version 9.5 (version with source code changed by me) with
index on l_orderkey (table lineitem).

Some doubts:
- What is the difference between GroupAggregate and Finalize GroupAggregate?
- Why do some operators show measurements on "Disk" and others in
"Memory"? For example:
 - External sort  Disk: 52784kB
 - quicksort  Memory: 47770kB

Also, why was one execution plan so much faster than the other,
considering that the query and the data are the same?
--
select
    l_orderkey,
    sum(l_extendedprice * (1 - l_discount)) as revenue,
    o_orderdate,
    o_shippriority
from
    customer,
    orders,
    lineitem
where
    c_mktsegment = 'HOUSEHOLD'
    and c_custkey = o_custkey
    and l_orderkey = o_orderkey
    and o_orderdate < date '1995-03-21'
    and l_shipdate > date '1995-03-21'
group by
    l_orderkey,
    o_orderdate,
    o_shippriority
order by
    revenue desc,
    o_orderdate
--

best regards



Re: help in analysis of execution plans

2018-05-05 Thread Neto pr
Further information: in the modified PostgreSQL, I changed some internal
cost functions (in the source code) and parameters in postgresql.conf so
that the DBMS can differentiate the cost of reads (random and
sequential) from writes (random and sequential); this is because reading
on an SSD is more than 400 times faster than on an HDD. This is part of
academic research I am doing.

See schema of the tables used below:
https://docs.snowflake.net/manuals/_images/sample-data-tpch-schema.png

I am using scale factor 40, so the lineitem table has 40 * 6 million =
240 million rows.

Regards
Neto

2018-05-05 8:16 GMT-07:00 Neto pr :

> Dear all
>
> Could you help me understand these two execution plans for the same
> query (query 3 benchmark TPCH www.tpc.org/tpch), executed in two
> different environments of Postgresql, as described below. These plans
> were generated by the EXPLAIN ANALYZE command, and the time of plan 1
> was 4.7 minutes and plan 2 was 2.95 minutes.
>
> Execution Plan 1 (query execution time 4.7 minutes):
> - https://explain.depesz.com/s/Ughh
> - Postgresql version 10.1 (default) with index on l_shipdate (table
> lineitem)
>
> Execution Plan 2 (query execution time 2.95 minutes):
> - https://explain.depesz.com/s/7Zb7
> - Postgresql version 9.5 (version with source code changed by me) with
> index on l_orderkey (table lineitem).
>
> Some doubts
> - Difference between GroupAggregate and Finalize GroupAggregate
> - because some algorithms show measurements on "Disk" and others on
> "Memory" example:
>  - External sort Disk: 52784kB
>  - quicksort Memory: 47770kB
>
> Because one execution plan was much smaller than the other,
> considering that the query is the same and the data are the same.
> --
> select
> l_orderkey,
> sum(l_extendedprice * (1 - l_discount)) as revenue,
> o_orderdate,
> o_shippriority
> from
> customer,
> orders,
> lineitem
> where
> c_mktsegment = 'HOUSEHOLD'
> and c_custkey = o_custkey
> and l_orderkey = o_orderkey
> and o_orderdate < date '1995-03-21'
> and l_shipdate > date '1995-03-21'
> group by
> l_orderkey,
> o_orderdate,
> o_shippriority
> order by
> revenue desc,
> o_orderdate
> --
>
> best regards
>


Why HDD performance is better than SSD in this case

2018-07-16 Thread Neto pr
Dear all,
Can some of you help me understand this?

The execution plans below are for the query shown (query 9 of the TPC-H
benchmark, scale 40, database of approximately 40 GB).

The experiment consisted of running the query on an HDD (RAID 0), then
running the same query on an SSD (RAID 0).

Why did the HDD (7200 rpm) perform better?
HDD - TIME 9 MINUTES
SSD - TIME 15 MINUTES

As far as I know, the SSD has a read speed that is 300 times faster
than an HDD's.

--- Execution  Plans---
ssd 40g
https://explain.depesz.com/s/rHkh

hdd 40g
https://explain.depesz.com/s/l4sq

Query 

select
    nation,
    o_year,
    sum(amount) as sum_profit
from
    (
        select
            n_name as nation,
            extract(year from o_orderdate) as o_year,
            l_extendedprice * (1 - l_discount) - ps_supplycost *
            l_quantity as amount
        from
            part,
            supplier,
            lineitem,
            partsupp,
            orders,
            nation
        where
            s_suppkey = l_suppkey
            and ps_suppkey = l_suppkey
            and ps_partkey = l_partkey
            and p_partkey = l_partkey
            and o_orderkey = l_orderkey
            and s_nationkey = n_nationkey
            and p_name like '%orchid%'
    ) as profit
group by
    nation,
    o_year
order by
    nation,
    o_year desc



Re: HDD vs SSD without explanation

2018-07-17 Thread Neto pr
Thanks all, but I still have not figured it out.
This is really strange because the tests were done on the same machine
(an HP ML110 ProLiant, 8 GB RAM, Xeon 2.8 GHz processor, 4 cores) with
PostgreSQL 10.1.
- Only the mentioned query was running at the time of the test.
- I repeated the query 7 times and the results did not change.
- Before running each batch of 7 executions, I dropped the operating
system cache and restarted the DBMS, like this:
echo 3 > /proc/sys/vm/drop_caches

Disks:
- 2 Samsung EVO SSD 500 GB units (in RAID 0)
- 2 SATA 7200 rpm HDD units, 1 TB (in RAID 0)

- The operating system and the PostgreSQL DBMS are installed on the SSD disk.

Best Regards
[ ]`s Neto


2018-01-16 13:24 GMT-08:00 Mark Kirkwood :
>
>
> On 16/01/18 23:14, Neto pr wrote:
>>
>> 2018-01-15 20:04 GMT-08:00 Mark Kirkwood :
>>>
>>> On 16/01/18 13:18, Fernando Hevia wrote:
>>>
>>>>
>>>>
>>>> The 6 Gb/s interface is capable of a maximum throughput of around 600
>>>> Mb/s. None of your drives can achieve that so I don't think you are
>>>> limited
>>>> to the interface speed. The 12 Gb/s interface speed advantage kicks in
>>>> when
>>>> there are several drives installed and it won't make a diference in a
>>>> single
>>>> drive or even a two drive system.
>>>>
>>>> But don't take my word for it. Test your drives throughput with the
>>>> command Justin suggested so you know exactly what each drive is capable
>>>> of:
>>>>
>>>>  Can you reproduce the speed difference using dd ?
>>>>  time sudo dd if=/dev/sdX of=/dev/null bs=1M count=32K
>>>>  skip=$((128*$RANDOM/32)) # set bs to optimal_io_size
>>>>
>>>>
>>>> While common sense says SSD drive should outperform the mechanical one,
>>>> your test scenario (large volume sequential reads) evens out the field a
>>>> lot. Still I would have expected somewhat similar results in the
>>>> outcome, so
>>>> yes, it is weird that the SAS drive doubles the SSD performance. That is
>>>> why
>>>> I think there must be something else going on during your tests on the
>>>> SSD
>>>> server. It can also be that the SSD isn't working properly or you are
>>>> running an suboptimal OS+server+controller configuration for the drive.
>>>>
>>> I would second the analysis above - unless you see your read MB/s slammed
>>> up
>>> against 580-600MB/s contunuously then the interface speed is not the
>>> issue.
>>> We have some similar servers that we replaced 12x SAS with 1x SATA 6
>>> GBit/s
>>> (Intel DC S3710) SSD...and the latter way outperforms the original 12 SAS
>>> drives.
>>>
>>> I suspect the problem is the particular SSD you have - I have benchmarked
>>> the 256GB EVO variant and was underwhelmed by the performance. These
>>> (budget) triple cell nand SSD seem to have highly variable read and write
>>> performance (the write is all about when the SLC nand cache gets
>>> full)...read I'm not so sure of - but it could be crappy chipset/firmware
>>> combination. In short I'd recommend *not* using that particular SSD for a
>>> database workload. I'd recommend one of the Intel Datacenter DC range
>>> (FWIW
>>> I'm not affiliated with Intel in any way...but their DC stuff works
>>> well).
>>>
>>> regards
>>>
>>> Mark
>>
>> Hi Mark
>> In other forums one person said me that on samsung evo should be
>> partition aligned to 3072 not  default 2048 , to start on erase block
>> bounduary .  And fs block should be 8kb. I am studing this too. Some
>> DBAs have reported in other situations that the SSDs when they are
>> full, are very slow. Mine is 85% full, so maybe that is also
>> influencing. I'm disappointed with this SSD from Samsung, because in
>> theory, the read speed of an SSD should be more than 300 times faster
>> than an HDD and this is not happening.
>>
>>
>
> Interesting - I didn't try changing the alignment. However I could get the
> rated write and read performance on simple benchmarks (provided it was in a
> PCIe V3 slot)...so figured it was ok with the default aligning. However once
> more complex workloads were attempted (databases and distributed object
> store) the performance was disappointing.
>
> If the SSD is 85% full that will not help either (also look at the expected
> lifetime of these EVO's - not that great for a server)!
>
> One thing worth trying is messing about with the IO scheduler: if you are
> using noop, then try deadline (like I said crappy firmware)...
>
> Realistically, I'd recommend getting an enterprise/DC SSD (put the EVO in
> your workstation, it will be quite nice there)!
>
> Cheers
> Mark



Re: Why HDD performance is better than SSD in this case

2018-07-17 Thread Neto pr
Sorry, I replied in the wrong message before... my response follows.
-

Thanks all, but I still have not figured it out.
This is really strange because the tests were done on the same machine
(an HP ML110 ProLiant, 8 GB RAM, Xeon 2.8 GHz processor, 4 cores) with
PostgreSQL 10.1.
- Only the mentioned query was running at the time of the test.
- I repeated the query 7 times and the results did not change.
- Before running each batch of 7 executions, I dropped the operating
system cache and restarted the DBMS, like this:
echo 3 > /proc/sys/vm/drop_caches

Disks:
- 2 Samsung EVO SSD 500 GB units (in RAID 0)
- 2 SATA 7200 rpm HDD units, 1 TB (in RAID 0)

- The operating system and the PostgreSQL DBMS are installed on the SSD disk.

Best Regards
[ ]`s Neto

2018-07-17 1:08 GMT-07:00 Fabio Pardi :
> As already mentioned by Robert, please let us know whether you made sure
> that nothing was served from RAM during the faster test.
>
> In other words, make sure that all caches are dropped between one test
> and another.
>
> Also, to get a better picture of the situation, it would be good to know:
>
> - which SSD (brand/model) are you using?
> - which HDD?
> - how are the disks configured? RAID? or not?
> - on which OS?
> - what are the mount options? SSD requires tuning
> - did you make sure that no other query was running at the time of the
> bench?
> - are you making a comparison on the same machine?
> - is it HW or a VM? benchmarks should run on bare metal to avoid
> result pollution (e.g. other VMs on the same hypervisor using the disk,
> host caching, and so on)
> - how many times did you run the tests?
> - did you change postgres configuration over tests?
> - can you post postgres config?
> - what about vacuums or maintenance tasks running in the background?
>
> Also, to benchmark disks I would not use a custom query but pgbench.
>
> Be aware: running benchmarks is a science, and therefore needs a scientific
> approach :)
>
> regards
>
> fabio pardi
>
>
>
> On 07/17/2018 07:00 AM, Neto pr wrote:
>> Dear,
>> Some of you can help me understand this.
>>
>> This query plan is executed in the query below (query 9 of TPC-H
>> Benchmark, with scale 40, database with approximately 40 gb).
>>
>> The experiment consisted of running the query on a HDD (Raid zero).
>> Then the same query is executed on an SSD (Raid Zero).
>>
>> Why did the HDD (7200 rpm)  perform better?
>> HDD - TIME 9 MINUTES
>> SSD - TIME 15 MINUTES
>>
>> As far as I know, the SSD has a reading that is 300 times faster than SSD.
>>
>> --- Execution  Plans---
>> ssd 40g
>> https://explain.depesz.com/s/rHkh
>>
>> hdd 40g
>> https://explain.depesz.com/s/l4sq
>>
>> Query 
>>
>> select
>> nation,
>> o_year,
>> sum(amount) as sum_profit
>> from
>> (
>> select
>> n_name as nation,
>> extract(year from o_orderdate) as o_year,
>> l_extendedprice * (1 - l_discount) - ps_supplycost *
>> l_quantity as amount
>> from
>> part,
>> supplier,
>> lineitem,
>> partsupp,
>> orders,
>> nation
>> where
>> s_suppkey = l_suppkey
>> and ps_suppkey = l_suppkey
>> and ps_partkey = l_partkey
>> and p_partkey = l_partkey
>> and o_orderkey = l_orderkey
>> and s_nationkey = n_nationkey
>> and p_name like '%orchid%'
>> ) as profit
>> group by
>> nation,
>> o_year
>> order by
>> nation,
>> o_year desc
>>
>



Re: Why HDD performance is better than SSD in this case

2018-07-17 Thread Neto pr
2018-07-17 10:04 GMT-03:00 Neto pr :
> Sorry.. I replied in the wrong message before ...
> follows my response.
> -
>
> Thanks all, but I still have not figured it out.
> This is really strange because the tests were done on the same machine
> (I use  HP ML110 Proliant 8gb RAM - Xeon 2.8 ghz processor (4
> cores), and POSTGRESQL 10.1.
> - Only the mentioned query running at the time of the test.
> - I repeated the query 7 times and did not change the results.
> - Before running each batch of 7 executions, I discarded the Operating
> System cache and restarted DBMS like this:
> (echo 3> / proc / sys / vm / drop_caches;
>
> discs:
> - 2 units of Samsung Evo SSD 500 GB (mounted on ZERO RAID)
> - 2 SATA 7500 Krpm HDD units - 1TB (mounted on ZERO RAID)
>
> - The Operating System and the Postgresql DBMS are installed on the SSD disk.
>

One more piece of information: I used the default postgresql.conf
configuration. The only exception is:
random_page_cost on the SSD is 1.1
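Since a single postgresql.conf serves both drives, per-tablespace cost
overrides are another way to express the difference; a sketch, assuming
hypothetical tablespace names ssd_space and hdd_space:

```sql
-- Hypothetical tablespace names; per-tablespace options override
-- postgresql.conf, so one instance can model fast and slow storage at once.
ALTER TABLESPACE ssd_space SET (random_page_cost = 1.1);
ALTER TABLESPACE hdd_space SET (random_page_cost = 4.0);

-- Inspect the stored options:
SELECT spcname, spcoptions FROM pg_tablespace;
```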


> Best Regards
> [ ]`s Neto
>
> 2018-07-17 1:08 GMT-07:00 Fabio Pardi :
>> As already mentioned by Robert, please let us know if you made sure that
>> nothing was fished from RAM, over the faster test.
>>
>> In other words, make sure that all caches are dropped between one test
>> and another.
>>
>> Also,to better picture the situation, would be good to know:
>>
>> - which SSD (brand/model) are you using?
>> - which HDD?
>> - how are the disks configured? RAID? or not?
>> - on which OS?
>> - what are the mount options? SSD requires tuning
>> - did you make sure that no other query was running at the time of the
>> bench?
>> - are you making a comparison on the same machine?
>> - is it HW or VM? benchs should better run on bare metal to avoid
>> results pollution (eg: other VMS on the same hypervisor using the disk,
>> host caching and so on)
>> - how many times did you run the tests?
>> - did you change postgres configuration over tests?
>> - can you post postgres config?
>> - what about vacuums or maintenance tasks running in the background?
>>
>> Also, to benchmark disks i would not use a custom query but pgbench.
>>
>> Be aware: running benchmarks is a science, therefore needs a scientific
>> approach :)
>>
>> regards
>>
>> fabio pardi
>>
>>
>>
>> On 07/17/2018 07:00 AM, Neto pr wrote:
>>> Dear,
>>> Some of you can help me understand this.
>>>
>>> This query plan is executed in the query below (query 9 of TPC-H
>>> Benchmark, with scale 40, database with approximately 40 gb).
>>>
>>> The experiment consisted of running the query on a HDD (Raid zero).
>>> Then the same query is executed on an SSD (Raid Zero).
>>>
>>> Why did the HDD (7200 rpm)  perform better?
>>> HDD - TIME 9 MINUTES
>>> SSD - TIME 15 MINUTES
>>>
>>> As far as I know, the SSD has a reading that is 300 times faster than SSD.
>>>
>>> --- Execution  Plans---
>>> ssd 40g
>>> https://explain.depesz.com/s/rHkh
>>>
>>> hdd 40g
>>> https://explain.depesz.com/s/l4sq
>>>
>>> Query 
>>>
>>> select
>>> nation,
>>> o_year,
>>> sum(amount) as sum_profit
>>> from
>>> (
>>> select
>>> n_name as nation,
>>> extract(year from o_orderdate) as o_year,
>>> l_extendedprice * (1 - l_discount) - ps_supplycost *
>>> l_quantity as amount
>>> from
>>> part,
>>> supplier,
>>> lineitem,
>>> partsupp,
>>> orders,
>>> nation
>>> where
>>> s_suppkey = l_suppkey
>>> and ps_suppkey = l_suppkey
>>> and ps_partkey = l_partkey
>>> and p_partkey = l_partkey
>>> and o_orderkey = l_orderkey
>>> and s_nationkey = n_nationkey
>>> and p_name like '%orchid%'
>>> ) as profit
>>> group by
>>> nation,
>>> o_year
>>> order by
>>> nation,
>>> o_year desc
>>>
>>



Re: Why HDD performance is better than SSD in this case

2018-07-17 Thread Neto pr
2018-07-17 10:44 GMT-03:00 Nicolas Charles :
> Hi Neto,
>
> You should also list the SSD model - there are plenty of Samsung EVO drives
> - and they are not professional grade.
>
> Among the possible issues, the most likely (from my point of view) are:
>
> - The TRIM command doesn't go through the RAID (which is really likely) - so
> the SSD controller thinks it's full and keeps pushing blocks around to level
> wear, causing massive perf degradation - please check this setting on your
> RAID driver/adapter
>
> - TRIM is not configured on the OS level for the SSD
>
> - Partitions are not correctly aligned with the SSD blocks
>
>
> With so few details on your system, we can only try to guess at the real
> issues.
>

Thank you Nicolas for your tips.
I believe your assumption is right.

This SSD really is not professional grade, even if Samsung's advertising
says so. If I have to buy another SSD I will prefer Intel SSDs.

I had a previous problem with it (the Samsung EVO), when it lost in
performance to a SAS HDD; however, that SAS HDD had a 12 Gb/s transfer
rate while the SSD's was 6 Gb/s.

But now I tested against an HDD (7200 RPM) with the same 6 Gb/s transfer
rate as the SSD, and it still could not win in performance.

Maybe it's the unconfigured TRIM.

Could you give me some help on how to check whether my RAID is
configured for this? I use hardware RAID with HP software (HP Storage
Provider at boot). And on the Debian 8 operating system, how can I check
the TRIM configuration?
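For reference, on Debian the usual checks are `lsblk --discard` (nonzero
DISC-GRAN/DISC-MAX columns mean the device advertises TRIM through the
controller) and `fstrim -v /mountpoint` for a manual trim. Whether
*continuous* TRIM is active depends on the `discard` mount option; a tiny
sketch of that check (the `has_discard` helper is illustrative, not a
system tool):

```shell
# Report whether a mount-option string (as shown in /proc/mounts) enables
# continuous TRIM via the 'discard' option.
has_discard() {
    case ",$1," in
        *,discard,*) echo "continuous TRIM enabled" ;;
        *)           echo "no discard option (rely on periodic fstrim)" ;;
    esac
}

has_discard "rw,relatime,discard,data=ordered"
has_discard "rw,relatime,data=ordered"
```

With hardware RAID, the pass-through matters most: if `lsblk --discard`
shows zeroes for the device, TRIM is not reaching the SSD at all.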

Best
[]'s Neto
>
> Nicolas
>
> Nicolas CHARLES
>
> Le 17/07/2018 à 15:19, Neto pr a écrit :
>>
>> 2018-07-17 10:04 GMT-03:00 Neto pr :
>>>
>>> Sorry.. I replied in the wrong message before ...
>>> follows my response.
>>> -
>>>
>>> Thanks all, but I still have not figured it out.
>>> This is really strange because the tests were done on the same machine
>>> (I use  HP ML110 Proliant 8gb RAM - Xeon 2.8 ghz processor (4
>>> cores), and POSTGRESQL 10.1.
>>> - Only the mentioned query running at the time of the test.
>>> - I repeated the query 7 times and did not change the results.
>>> - Before running each batch of 7 executions, I discarded the Operating
>>> System cache and restarted DBMS like this:
>>> (echo 3> / proc / sys / vm / drop_caches;
>>>
>>> discs:
>>> - 2 units of Samsung Evo SSD 500 GB (mounted on ZERO RAID)
>>> - 2 SATA 7500 Krpm HDD units - 1TB (mounted on ZERO RAID)
>>>
>>> - The Operating System and the Postgresql DBMS are installed on the SSD
>>> disk.
>>>
>> One more information.
>> I used default configuration to Postgresql.conf
>> Only exception is to :
>> random_page_cost on SSD is 1.1
>>
>>
>>> Best Regards
>>> [ ]`s Neto
>>>
>>> 2018-07-17 1:08 GMT-07:00 Fabio Pardi :
>>>>
>>>> As already mentioned by Robert, please let us know if you made sure that
>>>> nothing was fished from RAM, over the faster test.
>>>>
>>>> In other words, make sure that all caches are dropped between one test
>>>> and another.
>>>>
>>>> Also,to better picture the situation, would be good to know:
>>>>
>>>> - which SSD (brand/model) are you using?
>>>> - which HDD?
>>>> - how are the disks configured? RAID? or not?
>>>> - on which OS?
>>>> - what are the mount options? SSD requires tuning
>>>> - did you make sure that no other query was running at the time of the
>>>> bench?
>>>> - are you making a comparison on the same machine?
>>>> - is it HW or VM? benchs should better run on bare metal to avoid
>>>> results pollution (eg: other VMS on the same hypervisor using the disk,
>>>> host caching and so on)
>>>> - how many times did you run the tests?
>>>> - did you change postgres configuration over tests?
>>>> - can you post postgres config?
>>>> - what about vacuums or maintenance tasks running in the background?
>>>>
>>>> Also, to benchmark disks i would not use a custom query but pgbench.
>>>>
>>>> Be aware: running benchmarks is a science, therefore needs a scientific
>>>> approach :)
>>>>
>>>> regards
>>>>
>>>> fabio pardi
>>>>
>>>>
>>>>
>>>> On 07/17/2018 07:00 AM, Neto pr wrote:
>>>>>
>>>>> Dear,
>>>>> Some of you can help me understand this.
>>>>>
>&

Re: Why HDD performance is better than SSD in this case

2018-07-17 Thread Neto pr
2018-07-17 10:55 GMT-03:00 Fabio Pardi :
> If you have a RAID cache, I would disable it, since we are only focusing
> on the disks. Cache can give you inconsistent data (even if that does not
> look like the case here).
>
> Also, we can do a step backward, and exclude postgres from the picture
> for the moment.
>
> try to perform a dd test in reading from disk, and let us know.
>
> like:
>
> - create big_enough_file
> - empty OS cache
> - dd if=big_enough_file of=/dev/null
>
> and post the results for both disks.
>
> Also, I think it makes little sense to test on RAID 0. I would start by
> performing tests on a single disk, bypassing RAID (or, as mentioned, at
> least disabling the cache).
>

But in my case both the 2 SSDs and the 2 HDDs are in RAID 0.
Wouldn't it still be a valid test, given that both environments are
in RAID 0?
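Fabio's file-based dd procedure above can be sketched like this (the
file size is shrunk here; on a real test the file must be larger than
RAM so the read cannot be served from the page cache):

```shell
# 1. create big_enough_file  2. drop the OS cache  3. time a sequential read.
f=/tmp/big_enough_file
dd if=/dev/zero of="$f" bs=1M count=64 2>/dev/null   # use many GiB for a real test
sync
# echo 3 > /proc/sys/vm/drop_caches   # root only: clear the page cache between runs
dd if="$f" of=/dev/null bs=1M         # the MB/s figure printed here is what we compare
bytes=$(wc -c < "$f")
rm -f "$f"
```

Run it once against a file on each array and compare the reported MB/s.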



> The findings should narrow the focus
>
>
> regards,
>
> fabio pardi
>
> On 07/17/2018 03:19 PM, Neto pr wrote:
>> 2018-07-17 10:04 GMT-03:00 Neto pr :
>>> Sorry.. I replied in the wrong message before ...
>>> follows my response.
>>> -
>>>
>>> Thanks all, but I still have not figured it out.
>>> This is really strange because the tests were done on the same machine
>>> (I use  HP ML110 Proliant 8gb RAM - Xeon 2.8 ghz processor (4
>>> cores), and POSTGRESQL 10.1.
>>> - Only the mentioned query running at the time of the test.
>>> - I repeated the query 7 times and did not change the results.
>>> - Before running each batch of 7 executions, I discarded the Operating
>>> System cache and restarted DBMS like this:
>>> (echo 3> / proc / sys / vm / drop_caches;
>>>
>>> discs:
>>> - 2 units of Samsung Evo SSD 500 GB (mounted on ZERO RAID)
>>> - 2 SATA 7500 Krpm HDD units - 1TB (mounted on ZERO RAID)
>>>
>>> - The Operating System and the Postgresql DBMS are installed on the SSD 
>>> disk.
>>>
>>
>> One more information.
>> I used default configuration to Postgresql.conf
>> Only exception is to :
>> random_page_cost on SSD is 1.1
>>
>>
>>> Best Regards
>>> [ ]`s Neto
>>>
>>> 2018-07-17 1:08 GMT-07:00 Fabio Pardi :
>>>> As already mentioned by Robert, please let us know if you made sure that
>>>> nothing was fished from RAM, over the faster test.
>>>>
>>>> In other words, make sure that all caches are dropped between one test
>>>> and another.
>>>>
>>>> Also,to better picture the situation, would be good to know:
>>>>
>>>> - which SSD (brand/model) are you using?
>>>> - which HDD?
>>>> - how are the disks configured? RAID? or not?
>>>> - on which OS?
>>>> - what are the mount options? SSD requires tuning
>>>> - did you make sure that no other query was running at the time of the
>>>> bench?
>>>> - are you making a comparison on the same machine?
>>>> - is it HW or VM? benchs should better run on bare metal to avoid
>>>> results pollution (eg: other VMS on the same hypervisor using the disk,
>>>> host caching and so on)
>>>> - how many times did you run the tests?
>>>> - did you change postgres configuration over tests?
>>>> - can you post postgres config?
>>>> - what about vacuums or maintenance tasks running in the background?
>>>>
>>>> Also, to benchmark disks i would not use a custom query but pgbench.
>>>>
>>>> Be aware: running benchmarks is a science, therefore needs a scientific
>>>> approach :)
>>>>
>>>> regards
>>>>
>>>> fabio pardi
>>>>
>>>>
>>>>
>>>> On 07/17/2018 07:00 AM, Neto pr wrote:
>>>>> Dear,
>>>>> Some of you can help me understand this.
>>>>>
>>>>> This query plan is executed in the query below (query 9 of TPC-H
>>>>> Benchmark, with scale 40, database with approximately 40 gb).
>>>>>
>>>>> The experiment consisted of running the query on a HDD (Raid zero).
>>>>> Then the same query is executed on an SSD (Raid Zero).
>>>>>
>>>>> Why did the HDD (7200 rpm)  perform better?
>>>>> HDD - TIME 9 MINUTES
>>>>> SSD - TIME 15 MINUTES
>>>>>
>>>>> As far as I know, the SSD has a reading that is 300 times faster than SSD.
>>>>>
>>>>> --- Executio

Re: Why HDD performance is better than SSD in this case

2018-07-17 Thread Neto pr
2018-07-17 20:04 GMT-03:00 Mark Kirkwood :
> Ok, so dropping the cache is good.
>
> How are you ensuring that you have one test setup on the HDDs and one on the
> SSDs? i.e do you have 2 postgres instances? or are you using one instance
> with tablespaces to locate the relevant tables? If the 2nd case then you
> will get pollution of shared_buffers if you don't restart between the HDD
> and SSD tests. If you have 2 instances then you need to carefully check the
> parameters are set the same (and probably shut the HDD instance down when
> testing the SSD etc).
>
Dear Mark,
To ensure that the test is honest and uses the same OS and DBMS
configuration, the OS and the DBMS are both installed on the SSD.
I have a single DBMS instance and two databases:
- a database called tpch40gnorhdd with its tablespace on the HDD disk;
- a database called tpch40gnorssd with its tablespace on the SSD disk.
See below:

postgres=# \l
                                  List of databases
     Name      |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
---------------+----------+----------+-------------+-------------+-----------------------
 postgres      | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
               |          |          |             |             | postgres=CTc/postgres
 template1     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
               |          |          |             |             | postgres=CTc/postgres
 tpch40gnorhdd | user1    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 tpch40gnorssd | user1    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
(5 rows)

postgres=#

After the 7 query executions against the tpch40gnorhdd database, I
restart the DBMS (/etc/init.d/pg101norssd restart) and drop the OS
cache, then run the tests against the tpch40gnorssd database.
Do you think there is pollution of shared_buffers in this case?
Also, why do you think having the OS on the SSD is bad? Could you
explain in more detail?

Best regards
[]`s Neto

> I can see a couple of things in your setup that might pessimize the SDD
> case:
> - you have OS on the SSD - if you tests make the system swap then this will
> wreck the SSD result
> - you have RAID 0 SSD...some of the cheaper ones slow down when you do this.
> maybe test with a single SSD
>
> regards
> Mark
>
> On 18/07/18 01:04, Neto pr wrote (note snippage):
>
>> (echo 3 > /proc/sys/vm/drop_caches;
>>
>> discs:
>> - 2 Samsung Evo SSD 500 GB units (configured as RAID 0)
>> - 2 SATA 7,500 rpm HDD units - 1 TB (configured as RAID 0)
>>
>> - The Operating System and the Postgresql DBMS are installed on the SSD
>> disk.
>>
>>
>



Re: Why HDD performance is better than SSD in this case

2018-07-17 Thread Neto pr
2018-07-17 22:13 GMT-03:00 Neto pr :
> 2018-07-17 20:04 GMT-03:00 Mark Kirkwood :
>> Ok, so dropping the cache is good.
>>
>> How are you ensuring that you have one test setup on the HDDs and one on the
>> SSDs? i.e. do you have 2 postgres instances? Or are you using one instance
>> with tablespaces to locate the relevant tables? In the second case you
>> will get pollution of shared_buffers if you don't restart between the HDD
>> and SSD tests. If you have 2 instances then you need to carefully check that the
>> parameters are set the same (and probably shut the HDD instance down when
>> testing the SSD, etc.).
>>
> Dear Mark,
> To ensure the test is fair and uses the same O.S. and DBMS configuration,
> both the O.S. and the DBMS are installed on the SSD.
> I have a single DBMS instance with two databases:
> - a database called tpch40gnorhdd with its tablespace on the HDD;
> - a database called tpch40gnorssd with its tablespace on the SSD.
> See below:
>
> postgres=# \l
>                                    List of databases
>      Name      |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
> ---------------+----------+----------+-------------+-------------+-----------------------
>  postgres      | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
>  template0     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
>                |          |          |             |             | postgres=CTc/postgres
>  template1     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
>                |          |          |             |             | postgres=CTc/postgres
>  tpch40gnorhdd | user1    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
>  tpch40gnorssd | user1    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
> (5 rows)
>
> postgres=#
>
> After 7 query executions against the tpch40gnorhdd database, I restart the
> DBMS (/etc/init.d/pg101norssd restart, plus dropping the O.S. cache) and
> then run the tests against the tpch40gnorssd database.
> Do you think there is still pollution of shared_buffers in this case?
> And why do you think having the O.S. on the SSD is bad? Could you explain further?
>
> Best regards
> []`s Neto
>

One additional piece of information about the Samsung EVO SSD:

 Model: 850 Evo 500 GB SATA III 6Gb/s -
http://www.samsung.com/semiconductor/minisite/ssd/product/consumer/850evo/


>> I can see a couple of things in your setup that might pessimize the SSD
>> case:
>> - you have the OS on the SSD - if your tests make the system swap, this will
>> wreck the SSD result
>> - you have RAID 0 SSDs... some of the cheaper ones slow down when you do this;
>> maybe test with a single SSD
>>
>> regards
>> Mark
>>
>> On 18/07/18 01:04, Neto pr wrote (note snippage):
>>
>>> (echo 3 > /proc/sys/vm/drop_caches;
>>>
>>> discs:
>>> - 2 Samsung Evo SSD 500 GB units (configured as RAID 0)
>>> - 2 SATA 7,500 rpm HDD units - 1 TB (configured as RAID 0)
>>>
>>> - The Operating System and the Postgresql DBMS are installed on the SSD
>>> disk.
>>>
>>>
>>



Re: Why HDD performance is better than SSD in this case

2018-07-17 Thread Neto pr
2018-07-17 11:43 GMT-03:00 Fabio Pardi :
>
>
> On 07/17/2018 04:05 PM, Neto pr wrote:
>> 2018-07-17 10:55 GMT-03:00 Fabio Pardi :
>
>>> Also, I think it does not make much sense to test on RAID 0. I would start
>>> by performing tests on a single disk, bypassing RAID (or, as mentioned, at
>>> least disabling the cache).
>>>
>>
>> But in my case, both the 2 SSDs and the 2 HDDs are in RAID 0.
>> Would that not be a valid test? Because both environments are
>> in RAID 0.
>>
>>
>
> In theory, probably yes; in practice, maybe not.
> In RAID 0, data is (usually) striped in a round-robin fashion, so you
> should be able to rely on the fact that, on average, the data is spread 50% on
> each disk. For the sake of knowledge, you can check what algorithm your RAID
> controller actually uses to spread data over RAID 0.
>
> But you might be in an unlucky case in which more data is on one disk
> than on the other.
> Unlucky, or created by events: for instance, you deleted the records that
> are on disk 0 and are only querying those on disk 1.
>
> The fact is, the more complexity you add to your test, the further the
> results will be from your expectations.
>
> Since you are testing disks, and not RAID, I would start empirically and
> perform the test directly on a single disk.
> A simple test, like the dd I mentioned above.
> If dd, or other more tailored disk tests, show that the SSD is way slower,
> then you can focus on tuning the disk... or trashing it :)
>
> When you are satisfied with your results, you can build up complexity
> from the reliable/consolidated level you reached.
>
> As a side note: why run a test on a setup you could never use in production?
>
> regards,
>
> fabio pardi
>
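Fabio's point about round-robin striping, and about how skewed access can defeat it, can be illustrated with a toy sketch (pure illustration; real controllers stripe in larger units and the algorithm varies):

```python
def stripe(num_blocks, num_disks=2):
    """Toy RAID 0 placement: block b lands on disk b % num_disks."""
    placement = {d: [] for d in range(num_disks)}
    for b in range(num_blocks):
        placement[b % num_disks].append(b)
    return placement

# Round-robin puts ~50% of the blocks on each disk...
layout = stripe(10)

# ...but a workload touching only even-numbered blocks hits disk 0 alone,
# so the benchmark would measure one device, not the array.
hot_blocks = [b for b in range(10) if b % 2 == 0]
hot_disks = {b % 2 for b in hot_blocks}
```

This is the "unlucky case" from the reply above: the placement is balanced, yet the blocks a particular query reads need not be.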

Fabio, I understand and agree with you about testing without RAID;
this way it is easier to avoid problems unrelated to my test of the
disks (SSD and HDD).
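A minimal single-disk sequential-read check in the spirit of the dd test Fabio suggests could look like this sketch (the file path is a placeholder; without O_DIRECT or dropping caches first, the page cache will inflate the result):

```python
import time

def sequential_read_mb_per_s(path, block_size=1 << 20, max_bytes=256 << 20):
    """Read a file sequentially in 1 MiB blocks and return MiB/s.

    Point `path` at a large file on the disk under test. Unless the OS
    page cache was dropped beforehand (echo 3 > /proc/sys/vm/drop_caches),
    this measures the cache, not the disk.
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:  # unbuffered binary reads
        while total < max_bytes:
            chunk = f.read(block_size)
            if not chunk:  # EOF before max_bytes
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return (total / (1 << 20)) / elapsed
```

If a test like this (or plain dd) already shows the SSD slower than the HDD on one device, the problem is below PostgreSQL and the query plans are a red herring.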

Can you just explain why you said the following?

"As a side note: why run a test on a setup you could never use in production?"

Do you think a RAID 0 configuration is rarely used for a DBMS?
Which configuration do you think would be good? I welcome suggestions,
because I am in the middle of my postgraduate research work and I can
change the environment to something more useful and actually used in
real production environments.

Best Regards
[]`s Neto
>