A couple of things I see in this thread:

1. Metadata performance! A lot of bioinformatics code is heavy on metadata searches. Anyone running these kinds of workloads should tune and spec metadata separately from data storage. In Spectrum Scale there is a choice of dedicated, fully distributed, or partially distributed metadata. Most bio houses I know choose to throw really good flash at dedicated metadata. I'm not familiar with the options in BeeGFS, but I do know that superior distributed metadata performance was a key target for the Fraunhofer team, and it sounds like they are hitting it. My limited history with Ceph & Gluster suggests they could not support bioinformatics metadata requirements; it wasn't what they were built to do. A rough sketch of the dedicated-flash approach in Scale is below.
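For anyone who hasn't set this up, here is roughly what dedicating flash to metadata looks like in Scale/GPFS. Treat it as a sketch rather than a recipe: the device names, NSD server names, failure groups and block size are all made up, and you would size, replicate and spread them properly for your own site.

    # nsd.stanza: flash NSDs carry metadata only (system pool),
    # spinning-disk NSDs carry data only. All names/devices are illustrative.
    %nsd:
      device=/dev/nvme0n1
      nsd=md_nsd01
      servers=nsd01,nsd02
      usage=metadataOnly
      failureGroup=1
      pool=system

    %nsd:
      device=/dev/mapper/big_lun01
      nsd=data_nsd01
      servers=nsd01,nsd02
      usage=dataOnly
      failureGroup=2
      pool=data

    # create the NSDs, then a filesystem on them
    mmcrnsd -F nsd.stanza
    mmcrfs fs1 -F nsd.stanza -B 1M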
2. Small files! Small file performance was a big focus of IBM development from GPFS 3.x to IBM Spectrum Scale 4.x and has improved a lot, so don't compare the old performance with the new. With Spectrum Scale, look into the client-side caching for reads and/or writes; a local SSD can be a great boon for performance (a quick sketch is below). There are lots of tuning options for Scale/GPFS. Small file performance and a simplified install were also targets for BeeGFS when they started development.

3. Spectrum Scale (GPFS) is absolutely fundamental to IBM. There is a lot of focus now on making it easier to adopt and tune. IBM dev has delivered a new GUI, monitoring, troubleshooting tools, performance counters, and parameterized tuning in the last 18 months.

4. IBM Spectrum Scale does differentiate itself with multi-cluster support, tiered storage, policy-driven migration of data to the cloud, HDFS support, etc. (a policy example is below). These may be overkill for a lot of this mailing list, but they are really useful in shared settings.
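On the client-side caching in point 2: with a local SSD in a client, Scale 4.x can use it as a local read-only cache (LROC); HAWC does something similar for small synchronous writes, but check the docs for how to enable that on your release. A rough sketch, with made-up device and node names (double-check the exact localCache stanza fields against the mmcrnsd documentation for your level):

    # lroc.stanza, for the client node that owns the SSD.
    # A localCache NSD is private to that node; no storage pool applies.
    %nsd:
      device=/dev/nvme1n1
      nsd=lroc_node07
      servers=node07
      usage=localCache

    mmcrnsd -F lroc.stanza

    # allow LROC to cache data blocks, directory blocks and inodes
    # on the nodes in the (made-up) node class "clientNodes"
    mmchconfig lrocData=yes,lrocDirectories=yes,lrocInodes=yes -N clientNodes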
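And on the policy-driven tiering in point 4: the ILM policy language is readable enough that one rule shows the idea. The pool names and the 90-day cutoff below are invented; the same engine drives movement to external or cloud pools, just with EXTERNAL POOL rules instead of an internal target pool.

    /* migrate.pol: push files not accessed in 90 days out of the
       fast data pool down to a capacity pool.
       Pool names and the age cutoff are illustrative. */
    RULE 'cold_down' MIGRATE
      FROM POOL 'data'
      TO POOL 'nearline'
      WHERE (CURRENT_TIMESTAMP - ACCESS_TIME) > INTERVAL '90' DAYS

    # dry-run first to see what would move, then apply
    mmapplypolicy fs1 -P migrate.pol -I test
    mmapplypolicy fs1 -P migrate.pol -I yes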
Sorry about the guy with the suit. IBM also has a good set of Scale/GPFS people who don't own ties. I'm passing along your feedback on the client license costs to the guys with ties.

doug

On Wed, Feb 15, 2017 at 8:43 AM, Scott Atchley <e.scott.atch...@gmail.com> wrote:
> Hi Chris,
>
> Check with me in about a year.
>
> After using Lustre for over 10 years to initially serve ~10 PB of disk and now serve 30+ PB with very nice DDN gear, later this year we will be installing 320 PB (250 PB useable) of GPFS (via IBM ESS storage units) to support Summit, our next-gen HPC system from IBM with Power9 CPUs and NVIDIA Volta GPUs. Our current Lustre system is capable of 1 TB/s for large sequential writes, but random write performance is much lower (~400 GB/s, or 40% of sequential). The target performance for GPFS will be 2.5 TB/s sequential writes and 2.2 TB/s random (~90% of sequential). The initial targets are slightly lower, but we are supposed to achieve these rates by 2019.
>
> We are very familiar with Lustre, the good and the bad, and ORNL is the largest contributor to the Lustre codebase outside of Intel. We have encountered many bugs at our scale that few other sites can match, and we have tested patches for Intel before their release to see how they perform at scale. We have been testing GPFS for the last three years in preparation for the change, and IBM has been a very good partner in understanding our performance and scale issues. Improvements that IBM is adding to support the CORAL systems will also benefit the larger community.
>
> People are attracted to the "free" aspect of Lustre (in addition to the open source), but it is not truly free. For both of our large Lustre systems, we bought block storage from DDN and we added Lustre on top. We have support contracts with DDN for the hardware and Intel for Lustre, as well as a large team within our operations to manage Lustre and a full-time Lustre developer. The initial price is lower, but at this scale running without support contracts and an experienced operations team is untenable. IBM is proud of GPFS and their ESS hardware (i.e. licenses and hardware are expensive) and they also require support contracts, but the requirement for operations staff is lower. It is probably more expensive than any other combination of hardware/licenses/support, but we have one vendor to blame, which our management sees as a value.
>
> As I said, check back in a year or two to see how this experiment works out.
>
> Scott
>
> On Wed, Feb 15, 2017 at 1:53 AM, Christopher Samuel <sam...@unimelb.edu.au> wrote:
>> Hi John,
>>
>> On 15/02/17 17:33, John Hanks wrote:
>>
>>> So "clusters" is a strong word, we have a collection of ~22,000 cores of assorted systems, basically if someone leaves a laptop laying around unprotected we might try to run a job on it. And being bioinformatic-y, our problem with this and all storage is metadata related. The original procurement did not include dedicated NSD servers (or extra GPFS server licenses) so we run solely off the SFA12K's.
>>
>> Ah right, so these are the embedded GPFS systems from DDN. Interesting, as our SFA10K's hit EOL in 2019 and so (if our funding continues beyond 2018) we'll need to replace them.
>>
>>> Could we improve with dedicated NSD frontends and GPFS clients? Yes, most certainly. But again, we can stand up a PB or more of brand new SuperMicro storage fronted by BeeGFS that performs as well or better for around the same cost, if not less.
>>
>> Very nice - and for what you're doing it sounds like just what you need.
>>
>>> I don't have enough of an emotional investment in GPFS or DDN to convince myself that suggesting further tuning that requires money and time is worthwhile for our environment. It more or less serves the purpose it was bought for, we learn from the experience and move on down the road.
>>
>> I guess I'm getting my head around how other sites' GPFS performs, given I have a current sample size of 1 and that was spec'd out by IBM as part of a large overarching contract. :-)
>>
>> I guess I was assuming that because that was what we had, it was how most sites did it, apologies for that!
>>
>> All the best,
>> Chris
>> --
>> Christopher Samuel    Senior Systems Administrator
>> VLSCI - Victorian Life Sciences Computation Initiative
>> Email: sam...@unimelb.edu.au    Phone: +61 (0)3 903 55545
>> http://www.vlsci.org.au/    http://twitter.com/vlsci
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf