Re: [Beowulf] help for metadata-intensive jobs (imagenet)

2019-06-28 Thread John Hearns via Beowulf
Igor, if there are any papers published on what you are doing with these images, I would be very interested. I went to the new London HPC and AI Meetup on Thursday; one talk, by Odin Vision, was excellent. I recommend the new Meetup to anyone in the area. Next meeting 21st August. And a plug …

Re: [Beowulf] help for metadata-intensive jobs (imagenet)

2019-06-28 Thread INKozin via Beowulf
Converting the files to TFRecords or similar would be one obvious approach if you are concerned about metadata. But then I'd understand why some people would not want that (size, augmentation process). I assume you are doing the training in a distributed fashion using MPI via Horovod or similar …
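
A minimal sketch of that conversion, assuming TensorFlow and integer labels (the feature names and the write_shard helper are illustrative, not from the thread): raw JPEG bytes plus labels go into a few large record files, so an epoch opens shards rather than individual images.

    import tensorflow as tf

    def _bytes_feature(value):
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

    def _int64_feature(value):
        return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

    def write_shard(jpeg_paths, labels, out_path):
        # One TFRecord file replaces thousands of tiny image files,
        # so the filesystem sees one open instead of thousands.
        with tf.io.TFRecordWriter(out_path) as writer:
            for path, label in zip(jpeg_paths, labels):
                with open(path, "rb") as f:
                    image_bytes = f.read()
                example = tf.train.Example(features=tf.train.Features(feature={
                    "image/encoded": _bytes_feature(image_bytes),
                    "image/class/label": _int64_feature(label),
                }))
                writer.write(example.SerializeToString())

Reading back with tf.data.TFRecordDataset over the shard list is then sequential I/O on a few large files, which sidesteps the per-file metadata cost.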

Re: [Beowulf] help for metadata-intensive jobs (imagenet)

2019-06-28 Thread Michael Di Domenico
Oh, and knowing what type of filesystem you're on would help with recommendations. On Fri, Jun 28, 2019 at 1:51 PM Michael Di Domenico wrote: > I'm not familiar with the ImageNet set, but I'm surprised you'd see a bottleneck. My understanding of the ML image sets is that they're mostly read.
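
For reference, both answers (the filesystem type and whether noatime/relatime is set) can be pulled from /proc/mounts by matching the longest mountpoint prefix of the data path. A Linux-only sketch; mount_info and the /data/imagenet path are placeholders:

    import os

    def mount_info(path):
        # Find the mount entry whose mountpoint is the longest prefix of path.
        path = os.path.realpath(path)
        best = None
        with open("/proc/mounts") as f:
            for line in f:
                dev, mnt, fstype, opts = line.split()[:4]
                if path == mnt or path.startswith(mnt.rstrip("/") + "/"):
                    if best is None or len(mnt) > len(best[1]):
                        best = (dev, mnt, fstype, opts)
        return best

    dev, mnt, fstype, opts = mount_info("/data/imagenet")  # placeholder path
    print(f"{fstype} mounted at {mnt}, noatime={'noatime' in opts.split(',')}")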

Re: [Beowulf] help for metadata-intensive jobs (imagenet)

2019-06-28 Thread Michael Di Domenico
I'm not familiar with the ImageNet set, but I'm surprised you'd see a bottleneck. My understanding of the ML image sets is that they're mostly read. Do you have things like noatime set on the filesystem? Do you know specifically which ops are pounding the metadata? On Fri, Jun 28, 2019 at 1:47 PM …
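
One rough way to find out which ops hurt (strace -c on a training process is another) is to time the stat pass and the open+read pass separately over a sample of the files. A sketch, with the directory path as a placeholder:

    import os
    import time

    def profile_metadata(paths):
        # If the stat column dominates open+read, the job is
        # metadata-bound rather than bandwidth-bound.
        t0 = time.perf_counter()
        for p in paths:
            os.stat(p)
        t1 = time.perf_counter()
        for p in paths:
            with open(p, "rb") as f:
                f.read()
        t2 = time.perf_counter()
        n = len(paths)
        print(f"stat:      {1e6 * (t1 - t0) / n:9.1f} us/file")
        print(f"open+read: {1e6 * (t2 - t1) / n:9.1f} us/file")

    sample = [os.path.join(root, name)
              for root, _, files in os.walk("/data/imagenet/train")  # placeholder
              for name in files][:10000]
    profile_metadata(sample)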

Re: [Beowulf] help for metadata-intensive jobs (imagenet)

2019-06-28 Thread Joe Landman
On 6/28/19 1:47 PM, Mark Hahn wrote: Hi all, I wonder if anyone has comments on ways to avoid metadata bottlenecks for certain kinds of small-IO-intensive jobs. For instance, ML on ImageNet, which seems to be a massive collection of trivial-sized files. A good answer is "beef up your MD server …

[Beowulf] help for metadata-intensive jobs (imagenet)

2019-06-28 Thread Mark Hahn
Hi all, I wonder if anyone has comments on ways to avoid metadata bottlenecks for certain kinds of small-IO-intensive jobs. For instance, ML on ImageNet, which seems to be a massive collection of trivial-sized files. A good answer is "beef up your MD server, since it helps everyone". That's a …
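
The filesystem-agnostic alternative to beefing up the MD server is to change the workload itself: pack the trivial-sized files into large shards so an epoch costs thousands of opens instead of millions. A sketch using plain tar files, in the spirit of webdataset-style loaders (shard size, the flat naming, and the paths are all illustrative):

    import os
    import tarfile

    def make_shards(file_list, out_dir, files_per_shard=10000):
        # Sequentially pack small files into tar shards; each shard is
        # one directory entry and one open for the metadata server.
        os.makedirs(out_dir, exist_ok=True)
        for i in range(0, len(file_list), files_per_shard):
            shard = os.path.join(out_dir, f"shard-{i // files_per_shard:05d}.tar")
            with tarfile.open(shard, "w") as tar:
                for path in file_list[i:i + files_per_shard]:
                    # flat namespace for illustration only
                    tar.add(path, arcname=os.path.basename(path))

    # Reading back is one open plus streaming I/O per shard:
    with tarfile.open("shards/shard-00000.tar") as tar:
        for member in tar:
            payload = tar.extractfile(member).read()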