Good morning,

I have a weird problem with two of the 15+ OpenSolaris storage servers in our environment. All the Nearline servers are essentially the same: Supermicro X9DR3-F based servers with dual E5-2609s, 64GB memory, dual 10Gb SFP+ NICs, an LSI 9200-8e HBA, Supermicro CSE-826E26-R1200LPB storage arrays, and Seagate enterprise 2TB SATA or SAS drives (not mixed within a server). Root, L2ARC and ZIL are all on Intel SSDs (SLC 313 series for ZIL, MLC 520 for L2ARC and MLC 330 for boot).

The volumes are built out of 9-drive RAID-Z1 groups, and ashift is set to 9 (which is supposed to be appropriate for the enterprise Seagates). The pools are large (120-130TB) but are only between 27% and 32% full. Each server serves an iSCSI (COMSTAR) volume and a CIFS (in-kernel server) volume from the same pool. I realize this is not optimal from a recovery/resilver/rebuild standpoint, but the servers are replicated and the data is easily rebuildable.

Initially these servers did great for several months. While certainly no speed demons, 300+ MB/s for sequential reads and writes was not a problem.

Several weeks ago, literally overnight, replication times went through the roof for one server. Simple testing showed that reads from the pool would no longer go over 25MB/s. Even a scrub that used to run at 400+ MB/s is now crawling along at below 40MB/s.

Sometime yesterday the second server started to exhibit the exact same behaviour. This one is used even less (it's our D2D2T server): data is written to it at night and read back during the day to be written to tape.

I've exhausted all I know and I'm at a loss. Does anyone have any ideas of what to look at, or do any obvious reasons for this behaviour jump out from the configuration above?

Thanks
W

_______________________________________________
OpenIndiana-discuss mailing list
[email protected]
http://openindiana.org/mailman/listinfo/openindiana-discuss
