On 6/28/19 1:47 PM, Mark Hahn wrote:
> Hi all,
>
> I wonder if anyone has comments on ways to avoid metadata bottlenecks
> for certain kinds of small-I/O-intensive jobs. For instance, ML on
> ImageNet, which seems to be a massive collection of trivial-sized files.
>
> A good answer is "beef up your MD server, since it helps everyone".
> That's a bit naive, though (no money trees here).
>
> How about things like putting the dataset into squashfs or some other
> image that can be loop-mounted on demand? SQLite? Perhaps even a format
> that can simply be mmapped as a whole?
>
> Personally, I tend to dislike the approach of having a job stage tons of
> stuff onto node storage (when it exists), simply because that guarantees
> wasted CPU/GPU/memory resources for however long the stage-in takes...
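For reference, the squashfs idea above is only a couple of commands; the
dataset path, image name, compression choice, and mountpoint here are
illustrative:

  # Pack the whole tree into one read-only image: millions of tiny
  # files on the parallel FS become a single large file.
  mksquashfs /scratch/imagenet imagenet.sqsh -comp lz4 -no-xattrs

  # On the compute node, loop-mount it read-only. Directory lookups
  # are then served from the locally cached image, not the MD server.
  mkdir -p /mnt/imagenet
  mount -o loop,ro imagenet.sqsh /mnt/imagenet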
I'd suggest something akin to a collection of ramdisks using zram,
distributed across your nodes. Then put a BeeGFS file system atop
those. Stage in the images. Run.
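A rough per-node sketch of that, assuming the zram module, XFS, and the
stock BeeGFS setup scripts (the size, paths, and management host name are
placeholders, and the BeeGFS mgmtd/meta services are assumed to be up):

  # Carve out a compressed ramdisk; size it to this node's share
  # of the dataset.
  modprobe zram
  zramctl --find --size 32GiB --algorithm lz4   # allocates e.g. /dev/zram0
  mkfs.xfs /dev/zram0
  mkdir -p /mnt/zram
  mount /dev/zram0 /mnt/zram

  # Register it as a BeeGFS storage target and start the service.
  /opt/beegfs/sbin/beegfs-setup-storage -p /mnt/zram/storage -m mgmt-node
  systemctl start beegfs-storage

  # Stage the dataset into the BeeGFS mountpoint, then run the job.
  cp -a /scratch/imagenet /mnt/beegfs/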
This is cheap compared to building the storage you actually need for
this workload.
--
Joe Landman
e: joe.land...@gmail.com
t: @hpcjoe
w: https://scalability.org
g: https://github.com/joelandman
l: https://www.linkedin.com/in/joelandman
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf