I'm running into an issue where there is a high number of read IOPS
hitting the disks, and physical free memory is fluctuating between 200MB
and 450MB out of 16GB total. We have the L2ARC configured on a 32GB Intel
X25-E SSD and the slog on another 32GB X25-E SSD.
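(In case it helps others reproduce, this is roughly how we've been
watching ARC size and memory; standard OpenSolaris tools, nothing
site-specific:

    # current ARC size in bytes, from the arcstats kstat
    kstat -p zfs:0:arcstats:size

    # breakdown of kernel / ZFS file data / free pages (run as root)
    echo ::memstat | mdb -k
)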
According to our tester, Oracle writes are extremely slow (high latency).
Below is a snippet of the iostat output:
r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c0t0d0
4898.3 34.2 23.2 1.4 0.1 385.3 0.0 78.1 0 1246 c1
0.0 0.8 0.0 0.0 0.0 0.0 0.0 16.0 0 1 c1t0d0
401.7 0.0 1.9 0.0 0.0 31.5 0.0 78.5 1 100 c1t1d0
421.2 0.0 2.0 0.0 0.0 30.4 0.0 72.3 1 98 c1t2d0
403.9 0.0 1.9 0.0 0.0 32.0 0.0 79.2 1 100 c1t3d0
406.7 0.0 2.0 0.0 0.0 33.0 0.0 81.3 1 100 c1t4d0
414.2 0.0 1.9 0.0 0.0 28.6 0.0 69.1 1 98 c1t5d0
406.3 0.0 1.8 0.0 0.0 32.1 0.0 79.0 1 100 c1t6d0
404.3 0.0 1.9 0.0 0.0 31.9 0.0 78.8 1 100 c1t7d0
404.1 0.0 1.9 0.0 0.0 34.0 0.0 84.1 1 100 c1t8d0
407.1 0.0 1.9 0.0 0.0 31.2 0.0 76.6 1 100 c1t9d0
407.5 0.0 2.0 0.0 0.0 33.2 0.0 81.4 1 100 c1t10d0
402.8 0.0 2.0 0.0 0.0 33.5 0.0 83.2 1 100 c1t11d0
408.9 0.0 2.0 0.0 0.0 32.8 0.0 80.3 1 100 c1t12d0
9.6 10.8 0.1 0.9 0.0 0.4 0.0 20.1 0 17 c1t13d0
0.0 22.7 0.0 0.5 0.0 0.5 0.0 22.8 0 33 c1t14d0
Is this an indicator that we need more physical memory? According to
http://blogs.sun.com/brendan/entry/test, the order in which a read
request is satisfied is (hit/miss counters for each cache level can be
read directly; see the kstat lines after the list):
1) ARC
2) vdev cache of L2ARC devices
3) L2ARC devices
4) vdev cache of disks
5) disks
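(To see where reads are actually landing, the counters behind levels 1
and 3 are exposed in the same arcstats kstat; these are the standard
statistic names on OpenSolaris:

    # ARC hits/misses, then L2ARC hits/misses
    kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses
    kstat -p zfs:0:arcstats:l2_hits zfs:0:arcstats:l2_misses
)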
Using arc_summary.pl, we determined that prefetch was not helping much,
so we disabled it (the standard commands are shown after the stats
below).
CACHE HITS BY DATA TYPE:
Demand Data: 22% 158853174
Prefetch Data: 17% 123009991 <---not helping???
Demand Metadata: 60% 437439104
Prefetch Metadata: 0% 2446824
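For reference, the usual way to disable prefetch is the
zfs_prefetch_disable tunable, which is essentially what we did:

    # runtime (takes effect immediately, lost on reboot):
    echo zfs_prefetch_disable/W0t1 | mdb -kw

    # persistent (add this line to /etc/system, then reboot):
    set zfs:zfs_prefetch_disable = 1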
After that, the write IOPS started to kick in and latency on the spinning
disks dropped:
r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c0t0d0
1629.0 968.0 17.4 7.3 0.0 35.9 0.0 13.8 0 1088 c1
0.0 1.9 0.0 0.0 0.0 0.0 0.0 1.7 0 0 c1t0d0
126.7 67.3 1.4 0.2 0.0 2.9 0.0 14.8 0 90 c1t1d0
129.7 76.1 1.4 0.2 0.0 2.8 0.0 13.7 0 90 c1t2d0
128.0 73.9 1.4 0.2 0.0 3.2 0.0 16.0 0 91 c1t3d0
128.3 79.1 1.3 0.2 0.0 3.6 0.0 17.2 0 92 c1t4d0
125.8 69.7 1.3 0.2 0.0 2.9 0.0 14.9 0 89 c1t5d0
128.3 81.9 1.4 0.2 0.0 2.8 0.0 13.1 0 89 c1t6d0
128.1 69.2 1.4 0.2 0.0 3.1 0.0 15.7 0 93 c1t7d0
128.3 80.3 1.4 0.2 0.0 3.1 0.0 14.7 0 91 c1t8d0
129.2 69.3 1.4 0.2 0.0 3.0 0.0 15.2 0 90 c1t9d0
130.1 80.0 1.4 0.2 0.0 2.9 0.0 13.6 0 89 c1t10d0
126.2 72.6 1.3 0.2 0.0 2.8 0.0 14.2 0 89 c1t11d0
129.7 81.0 1.4 0.2 0.0 2.7 0.0 12.9 0 88 c1t12d0
90.4 41.3 1.0 4.0 0.0 0.2 0.0 1.2 0 6 c1t13d0
0.0 24.3 0.0 1.2 0.0 0.0 0.0 0.2 0 0 c1t14d0
Is it true that if your MFU hit percentage goes over 50%, more memory is
needed?
CACHE HITS BY CACHE LIST:
  Anon:                       10%    74845266               [ New Customer, First Cache Hit ]
  Most Recently Used:         19%   140478087 (mru)         [ Return Customer ]
  Most Frequently Used:       65%   475719362 (mfu)         [ Frequent Customer ]
  Most Recently Used Ghost:    2%    20785604 (mru_ghost)   [ Return Customer Evicted, Now Back ]
  Most Frequently Used Ghost:  1%     9920089 (mfu_ghost)   [ Frequent Customer Evicted, Now Back ]
CACHE HITS BY DATA TYPE:
Demand Data: 22% 158852935
Prefetch Data: 17% 123009991
Demand Metadata: 60% 437438658
Prefetch Metadata: 0% 2446824
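(If it helps, the raw counters behind those percentages can be pulled
straight from arcstats; the ghost-list hits are the ones that hint
whether a bigger ARC would have caught those reads. Standard statistic
names:

    # ghost-list hits: reads that would have hit with a larger ARC
    kstat -p zfs:0:arcstats:mru_ghost_hits zfs:0:arcstats:mfu_ghost_hits

    # ARC target size (c), MRU target (p), and current size
    kstat -p zfs:0:arcstats:c zfs:0:arcstats:p zfs:0:arcstats:size
)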
My theory is that since there's not enough memory for the ARC to cache
the data, reads fall through to the L2ARC, miss there as well, and have
to go to disk. That read traffic contends with the writes, inflating the
service times.
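One way I figure we can sanity-check this is zpool iostat -v, which
breaks traffic out per vdev including the cache device; if the theory
holds, the L2ARC SSD should show few reads while the spindles stay
pegged. (Pool name below is made up; substitute your own:)

    # per-vdev I/O every 5 seconds; 'cache' section shows the L2ARC device
    zpool iostat -v oradata 5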
Thoughts?