Awesome, thanks for the info! Best,
leo

On Thu, 10 Aug 2023 at 22:01, Jeff Johnson <jeff.john...@aeoncomputing.com> wrote:

> Leo,
>
> Both BeeGFS and Lustre require a backend file system on the disks
> themselves. Both Lustre and BeeGFS support a ZFS backend.
>
> --Jeff
>
>
> On Thu, Aug 10, 2023 at 1:00 PM leo camilo <lhcam...@gmail.com> wrote:
>
>> Hi there,
>>
>> thanks for your response.
>>
>> BeeGFS indeed looks like a good option, though realistically I can
>> only afford to use a single node/server for it.
>>
>> Would it be feasible to use ZFS as the volume manager coupled with
>> BeeGFS for the shares, or should I write ZFS off altogether?
>>
>> thanks again,
>>
>> best,
>>
>> leo
>>
>> On Thu, 10 Aug 2023 at 21:29, Bernd Schubert
>> <bernd.schub...@fastmail.fm> wrote:
>>
>>>
>>> On 8/10/23 21:18, leo camilo wrote:
>>> > Hi everyone,
>>> >
>>> > I was hoping to get some sage advice from you guys.
>>> >
>>> > At my department we have built a small prototyping cluster with 5
>>> > compute nodes, 1 name node and 1 file server.
>>> >
>>> > Up until now, the name node contained the scratch partition, which
>>> > consisted of 2x4TB HDDs forming an 8 TB striped ZFS pool. The pool
>>> > is shared to all the nodes using NFS. The name node and the compute
>>> > nodes are connected with both Cat6 Ethernet cable and InfiniBand.
>>> > Each compute node has 40 cores.
>>> >
>>> > Recently I attempted to launch computations from each node (40
>>> > tasks per node), so one computation per node, and the performance
>>> > was abysmal. I reckon I might have reached the limits of NFS.
>>> >
>>> > I then confirmed that this was indeed due to very poor performance
>>> > from NFS. I am not using stateless nodes, so each node has about
>>> > 200 GB of SSD storage, and running directly from there was a lot
>>> > faster.
>>> >
>>> > So, to solve the issue, I reckon I should replace NFS with
>>> > something better. I have ordered 2x4TB NVMe drives for the new
>>> > scratch and I was thinking of:
>>> >
>>> >   * using the 2x4TB NVMe drives in a striped ZFS pool with
>>> >     single-node GlusterFS on top to replace NFS
>>> >   * using the 2x4TB NVMe drives with GlusterFS in a distributed
>>> >     arrangement (still a single node)
>>> >
>>> > Some people told me to use Lustre, but I reckon that might be
>>> > overkill, and I would only use a single file server machine
>>> > (1 node).
>>> >
>>> > Could you guys give me some sage advice here?
>>> >
>>>
>>> So GlusterFS uses FUSE, which doesn't have the best performance
>>> reputation (although hopefully not for long - feel free to search
>>> for "fuse" + "uring").
>>>
>>> If you want to avoid the complexity of Lustre, maybe look into
>>> BeeGFS. Well, I would recommend looking into it anyway (as a former
>>> developer I'm biased anyway ;) ).
>>>
>>>
>>> Cheers,
>>> Bernd
>>>
>>> _______________________________________________
>> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit
>> https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
>>
>
>
> --
> ------------------------------
> Jeff Johnson
> Co-Founder
> Aeon Computing
>
> jeff.john...@aeoncomputing.com
> www.aeoncomputing.com
> t: 858-412-3810 x1001   f: 858-412-3845
> m: 619-204-9061
>
> 4170 Morena Boulevard, Suite C - San Diego, CA 92117
>
> High-Performance Computing / Lustre Filesystems / Scale-out Storage
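
A quick way to confirm that the bottleneck really is NFS rather than the
disks is a small concurrent-write probe, run once against the NFS-mounted
scratch and once against a node's local SSD. Below is a minimal sketch in
Python; the target path, worker count and per-worker size are placeholder
assumptions, not values from the thread.

#!/usr/bin/env python3
# Minimal concurrent-write probe: run once with the target set to the
# NFS scratch mount and once to a local SSD path, then compare MiB/s.
# The default path, worker count and sizes below are placeholders.
import os
import sys
import time
from concurrent.futures import ThreadPoolExecutor

TARGET = sys.argv[1] if len(sys.argv) > 1 else "/scratch/iotest"
WORKERS = 40          # mimic 40 tasks per node, one writer per core
MIB_PER_WORKER = 256  # total data written = WORKERS * MIB_PER_WORKER

def write_file(i):
    # Stream incompressible data to a private file; fsync at the end so
    # the page cache doesn't flatter the result.
    block = os.urandom(1 << 20)  # 1 MiB block
    with open(os.path.join(TARGET, f"worker{i}.dat"), "wb") as f:
        for _ in range(MIB_PER_WORKER):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())

os.makedirs(TARGET, exist_ok=True)
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=WORKERS) as pool:
    list(pool.map(write_file, range(WORKERS)))
elapsed = time.perf_counter() - start
total = WORKERS * MIB_PER_WORKER
print(f"{total} MiB in {elapsed:.1f} s -> {total / elapsed:.0f} MiB/s")

If the NFS run comes in far below the local-SSD run, the shared file
system rather than the pool is the limit, which matches what Leo observed.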
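For the planned NVMe scratch itself, the striped pool amounts to a couple
of ZFS commands. Here is a hypothetical helper script; the pool name,
device paths and export subnet are assumptions, and the sharenfs step
presumes an NFS server is already running on the file server.

#!/usr/bin/env python3
# Hypothetical setup helper for the striped NVMe scratch pool described
# in the thread. Pool name, device paths and subnet are assumptions.
import subprocess

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# zpool stripes across top-level vdevs by default, so listing both
# devices with no mirror/raidz keyword yields the RAID0-style layout.
run(["zpool", "create", "scratch", "/dev/nvme0n1", "/dev/nvme1n1"])

# Common scratch-space tuning: skip atime updates, enable lz4.
run(["zfs", "set", "atime=off", "scratch"])
run(["zfs", "set", "compression=lz4", "scratch"])

# Export through ZFS's built-in NFS sharing; the options are passed
# through to the system's NFS exports mechanism.
run(["zfs", "set", "sharenfs=rw=@10.0.0.0/24,no_root_squash", "scratch"])

Note the stripe has no redundancy, which is usually an accepted trade-off
for scratch space. If BeeGFS ends up on top instead, the same pool can
back a BeeGFS storage target and the sharenfs line simply goes away.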