i'm not familiar with the imagenet set, but i'm surprised you'd see a
bottleneck.  my understanding of the ML image sets is that they're
mostly read.  do you have things like noatime set on the filesystem?
do you know specifically which ops are pounding the metadata?
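
if you want to double-check the mount options, here's a quick sketch
(linux-only, untested, and the default path is just an example -- point
it at wherever the dataset actually lives) that walks up to the mount
point and prints what's in /proc/self/mounts for it:

    #!/usr/bin/env python3
    # print the mount options for the filesystem holding a given path,
    # so you can confirm whether noatime/relatime is actually in effect
    import os, sys

    def mount_options(path):
        # walk up until the parent directory is on a different device;
        # at that point 'path' is the mount point
        path = os.path.realpath(path)
        dev = os.stat(path).st_dev
        while path != '/':
            parent = os.path.dirname(path)
            if os.stat(parent).st_dev != dev:
                break
            path = parent
        # /proc/self/mounts fields: device mountpoint fstype options ...
        with open('/proc/self/mounts') as f:
            for line in f:
                fields = line.split()
                if fields[1] == path:
                    return fields[3].split(',')
        return []

    if __name__ == '__main__':
        target = sys.argv[1] if len(sys.argv) > 1 else '.'
        opts = mount_options(target)
        print(target, '->', ','.join(opts))
        print('noatime set' if 'noatime' in opts else 'noatime NOT set')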

On Fri, Jun 28, 2019 at 1:47 PM Mark Hahn <h...@mcmaster.ca> wrote:
>
> Hi all,
> I wonder if anyone has comments on ways to avoid metadata bottlenecks
> for certain kinds of small-io-intensive jobs.  For instance, ML on imagenet,
> which seems to be a massive collection of trivial-sized files.
>
> A good answer is "beef up your MD server, since it helps everyone".
> That's a bit naive, though (no money-trees here.)
>
> How about things like putting the dataset into squashfs or some other
> image that can be loop-mounted on demand?  sqlite?  perhaps even a format
> that can simply be mmaped as a whole?
>
> personally, I tend to dislike the approach of having a job stage tons of
> stuff onto node storage (when it exists) simply because that guarantees a
> waste of cpu/gpu/memory resources for however long the stage-in takes...
>
> thanks, mark hahn.
> --
> operator may differ from spokesperson.              h...@mcmaster.ca
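
fwiw, on the sqlite idea above: something along these lines (rough,
untested sketch -- the db filename and table layout are made up) packs
the tree once and then serves every read out of a single file, so the
metadata server only ever sees one open instead of millions:

    #!/usr/bin/env python3
    # pack a tree of small files into one sqlite db, then read them back
    # without a per-file open/stat against the shared filesystem
    import os, sqlite3, sys

    def pack(src_dir, db_path):
        con = sqlite3.connect(db_path)
        con.execute("CREATE TABLE IF NOT EXISTS files"
                    " (path TEXT PRIMARY KEY, data BLOB)")
        with con:  # one transaction for the whole pack
            for root, _, names in os.walk(src_dir):
                for name in names:
                    full = os.path.join(root, name)
                    rel = os.path.relpath(full, src_dir)
                    with open(full, 'rb') as f:
                        con.execute("INSERT OR REPLACE INTO files"
                                    " VALUES (?, ?)", (rel, f.read()))
        con.close()

    def read(db_path, rel_path):
        con = sqlite3.connect(db_path)
        row = con.execute("SELECT data FROM files WHERE path = ?",
                          (rel_path,)).fetchone()
        con.close()
        return row[0] if row else None

    if __name__ == '__main__':
        # e.g.:  pack.py /path/to/imagenet imagenet.db
        pack(sys.argv[1], sys.argv[2])
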
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf