Eric Thibodeau wrote:
You can look into OpenAFS but be warned that you have to know infrastructure 
software quite well (LDAP+kerberos). It's cross-platform, can be distributed 
but don't think it's up to multiple writes on different mirrors though.


Indeed. Distributed filesystems involve many tough compromises, because they
have many conflicting goals. Coherency vs. performance is a big one: you
pretty much get one or the other. Locking is another ugly one: databases and
some applications assume byte-range locking, which is sometimes available and
sometimes not. Many unix programs assume POSIX locking, which again is only
sometimes available. So, unfortunately, it's easy to ask for a distributed
filesystem that does not exist.
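Since the locking point trips people up in practice, here is a minimal Python
sketch of POSIX advisory byte-range locking via fcntl.lockf, run against a
local temp file (the path and helper name are mine, for illustration). The
portability trap is that this exact call may fail, block, or be a silent no-op
on some network filesystems:

```python
# Sketch of POSIX advisory byte-range locking with fcntl.lockf.
# Works on a local filesystem; behavior over NFS/AFS/etc. varies.
import fcntl
import os
import tempfile

def lock_range(fd, start, length, exclusive=True):
    """Try to lock `length` bytes at offset `start`; True on success, False if busy."""
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    try:
        fcntl.lockf(fd, mode | fcntl.LOCK_NB, length, start)
        return True
    except OSError:          # lock held by someone else (EAGAIN/EACCES)
        return False

path = os.path.join(tempfile.mkdtemp(), "demo.dat")   # hypothetical scratch file
fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
os.write(fd, b"\0" * 1024)

# Bytes 0-99 and 100-199 are independent ranges, so different processes
# could hold exclusive locks on them concurrently.
assert lock_range(fd, 0, 100)
assert lock_range(fd, 100, 100)
fcntl.lockf(fd, fcntl.LOCK_UN, 1024, 0)   # drop both locks
os.close(fd)
```

Note the locks are advisory: a process that never calls lockf can still write
the locked bytes, which is one more reason database-style applications are
picky about which filesystems they sit on.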

I'll provide my current brain dump on the various pieces I've been tracking.
I'm sure there are some inaccuracies included, but hopefully they are small
ones. As always, comments and corrections are welcome.

A high-level overview of OpenAFS:
* OpenAFS is distributed, but not p2p.
* performs well (assuming cache friendliness, and a single peer accessing
  the same files/directories)
* scales well (for reads, because RO volumes can be replicated)
* has a universal namespace
* places little trust in a peer (getting root on a client != ability to
  read all files)
* allows for transparent volume migration (the client doesn't complain when a
  volume is migrated)
* perfect coherency (via a subscription model)
* It also supports Linux, OS X, and Windows (among others).
* relatively complex.

NFS in contrast:
* Isn't distributed (unless you count automount)
* has loose coherency (poll based)
* No replication (corrections?)
* Doesn't scale easily
* Volume migration isn't easy (NFSv4 claims to enable this, but I've yet to
  see it demonstrated in the real world).
* Is mostly unix-specific (Microsoft had an NFS client, but EoL'd it?)
* relatively simple
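The coherency difference between the two can be sketched as a toy model (my
illustration, not protocol code): an AFS-style client is invalidated by a
server callback the moment a write happens, while an NFS-style client trusts
its attribute cache until a timeout (standing in for the actimeo mount option)
expires, and can serve stale data in the meantime:

```python
# Toy model of callback (subscription) vs. poll-based cache coherency.
# "now" is a fake clock in seconds; none of this is real AFS or NFS code.

class Server:
    def __init__(self):
        self.data = "v1"
        self.callbacks = []            # AFS-style subscribers
    def write(self, data):
        self.data = data
        for client in self.callbacks:  # break callbacks: push invalidation
            client.invalidate()

class AFSClient:
    def __init__(self, server):
        self.server = server
        self.cache = None
        server.callbacks.append(self)
    def invalidate(self):
        self.cache = None
    def read(self, now):
        if self.cache is None:         # refetch only after invalidation
            self.cache = self.server.data
        return self.cache

class NFSClient:
    def __init__(self, server, ac_timeout=3):
        self.server = server
        self.ac_timeout = ac_timeout
        self.cache, self.fetched_at = None, float("-inf")
    def read(self, now):
        if now - self.fetched_at >= self.ac_timeout:   # poll on timeout
            self.cache, self.fetched_at = self.server.data, now
        return self.cache

srv = Server()
afs, nfs = AFSClient(srv), NFSClient(srv, ac_timeout=3)
afs.read(0); nfs.read(0)     # both cache "v1" at t=0
srv.write("v2")              # another client writes shortly after
print(afs.read(1))           # "v2": the callback invalidated the cache
print(nfs.read(1))           # "v1": stale until the attribute cache expires
print(nfs.read(4))           # "v2": timeout elapsed, client revalidates
```

The poll model is simpler and cheaper for the server (no callback state to
track), which is roughly the coherency-vs-complexity trade mentioned above.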

Lustre:
* client server
* scales extremely well, seems popular on the largest of clusters.
* Can survive hardware failures assuming more than 1 block server is connected
  to each set of disks
* unix only.
* relatively complex.

PVFS2:
* Client server
* scales well
* cannot survive the death of a block server.
* unix only
* relatively simple.
* designed for use within a cluster.

Oceanstore:
* p2p
* claims scalability to billions of users
* Highly available/byzantine fault tolerant
* complex
* slow
* in prototype stage
* Requires use of an API (AFAIK it is not available as a transparently mounted
  filesystem)

So the end result (from my skewed perspective) is:
* NFS is hugely popular and easy, but not very secure (at least by default)
  and has poor coherency; still, for things like sharing /home within a
  cluster it works reasonably well.  It seems most appropriate for LAN usage.
  To most people, diskless implies NFS (and it works well within a cluster
  or LAN).
* Lustre and PVFS2 are popular for sharing files in larger clusters, where
  more than a single file server's worth of bandwidth is required.  Both, I
  believe, scale well in bandwidth, but each allows only a single metadata
  server, so they will ultimately scale only as far as a single machine for
  metadata-intensive workloads (lock-intensive, directory-intensive, or file
  creation/deletion-intensive workloads).  Granted, this also allows for
  exotic hardware solutions (like solid-state storage) if you really need
  the performance.
* AFS is popular for internet-wide file service; researchers love the ability
  to run an application that requires 100 different libraries anywhere in the
  world.  Sysadmins love it because they can migrate volumes without having
  to notify users or schedule downtime.  I believe performance is usually
  somewhat worse than NFS within a cluster (because of higher overhead), and
  usually significantly better outside a cluster (better caching and
  coherency).
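To put the single-metadata-server ceiling in rough numbers, here is a
back-of-envelope calculation; every figure is an assumption I made up for
illustration, not a benchmark of Lustre or PVFS2:

```python
# Back-of-envelope: data bandwidth scales with the number of block servers,
# but metadata throughput is pinned to one machine. All numbers are
# illustrative assumptions, not measurements.

data_bw_per_server_mb = 200    # assumed streaming bandwidth per block server
mds_ops_per_sec = 20_000       # assumed ceiling of a single metadata server
creates_per_client = 50        # assumed rate of a file-creation-heavy job

for servers in (4, 16, 64):
    agg_bw = servers * data_bw_per_server_mb
    print(f"{servers:3d} block servers -> ~{agg_bw} MB/s aggregate data "
          f"bandwidth, still ~{mds_ops_per_sec} metadata ops/s total")

# How many such clients saturate the metadata server, regardless of how
# many block servers you add:
max_clients = mds_ops_per_sec // creates_per_client
print(f"metadata ceiling: ~{max_clients} creation-heavy clients")
```

With these made-up numbers the data path grows 16x from 4 to 64 servers while
the metadata ceiling stays at ~400 clients, which is why exotic hardware under
the metadata server can be worth the money.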

I'm less familiar with the various commercial filesystems like Ibrix.

Hopefully others will expand and correct the above.

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
