-----Original Message-----
From: Mark Hahn [mailto:[EMAIL PROTECTED]
Sent: Friday, January 19, 2007 12:23 PM
To: Ryan Waite
Cc: Beowulf@beowulf.org
Subject: RE: [Beowulf] SGI to offer Windows on clusters

>> I know some of you aren't, um, tolerant of Microsoft for various reasons
>
> "tolerant" is the wrong word. beowulf is, definitionally, open-source.
> anyone using *nix has obviously made a decision (aesthetic, commercial, etc.)
> to avoid the dominant OS; this is rejection rather than ill tolerance...
>
>> and we've designed for those customers. So, users get an OS, job
>> scheduler, management package, and MPI stack for < $500.
>
> compared to $0.
>
> in that light, the question is: does the academic cost include some
> form of support?
The academic version is provided through the MSDN Academic Alliance.
Support depends on the type of agreement. Our dev team also monitors
http://windowshpc.net and provides support there.

>> Our MPI stack is based on MPICH2 but we've made performance and security
>> enhancements. The folks at ANL are very talented UNIX developers but
>
> I cynically guess the "security" enhancements are not really enhancements,
> but rather simply making it work in the MSFT universe (domain controller,
> etc). is that correct?

We did some other security work as well. One example: the MPICH2 code
stored password information unencrypted in the registry, and that needed
to be changed. A couple of similar issues were uncovered during our
security review.

When used with our job scheduler we do integrate with Active Directory.
When a job executes on a compute node we first create a Windows Job
Object and then log on using the submitting user's credentials, so the
Job Object runs as that user. If users keep their data on a secured
server, they can access that data directly. Also, any processes spawned
by the job are contained in the same Job Object, so if the user's job is
cancelled we can clean up all of its child processes.

>> Windows is more efficient using async overlapped I/O. We've made other,
>
> what _would_ be interesting is to hear about the technical aspects.
> that is: how much difference does this make? is the impetus to use
> async mainly to batch completion events (sort of like mpi_waitall)?
> does it affect latency? how does it compare to normal mpich on the same
> hardware under linux?

There were a number of changes. If you think it would be interesting I
could arrange a conference call or an online presentation where we walk
through our changes and the rationale behind them.

> also, is there any documentation on the job scheduler?

Yep.
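As an aside, the Job Object cleanup described above (cancelling a job also tears down every process the job spawned) has a rough POSIX analogue in process groups. Here is a minimal Python sketch of that containment idea, assuming a Unix host; it illustrates the concept only, not the Windows implementation, which would use CreateJobObject/AssignProcessToJobObject and a kill-on-close limit:

```python
import os
import signal
import subprocess
import time

# Start a "job" in its own session/process group (the POSIX analogue of
# placing a process in a Windows Job Object). start_new_session=True
# calls setsid() in the child, so the job and anything it spawns share
# one process group.
job = subprocess.Popen(
    ["sh", "-c", "sleep 60 & sleep 60"],  # job that spawns a child
    start_new_session=True,
)

time.sleep(0.5)  # give the job a moment to spawn its children

# "Cancel" the job: signal the whole group, so spawned children are
# cleaned up along with the job itself -- the behavior the scheduler
# gets from Job Objects on Windows.
os.killpg(os.getpgid(job.pid), signal.SIGTERM)
job.wait()
print("job cancelled, exit status:", job.returncode)
```

Without the group-wide signal, killing only the top-level process would orphan the background `sleep`; that leak is exactly what the Job Object (or process group) prevents.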
Here's a paper with an overview of the scheduler:
http://www.microsoft.com/downloads/details.aspx?FamilyID=c4dd011a-42e0-4978-b518-dd6cfef7131f&displaylang=en

Here's an article on linking with the CCP Job Scheduling API. Our
partners like MathWorks, Ansys, and Schlumberger use these APIs to
integrate their applications directly with the cluster job scheduler.
The result is that users don't have to learn job control languages; they
just press the compute button from inside Fluent, etc.
http://msdn2.microsoft.com/en-us/library/aa578732.aspx

> these kinds of specifics would, truly, be most gratefully "tolerated" ;)

>> ANL for incorporation in future MPICH stacks. We're also the first group
>> at Microsoft making these kinds of sizable contributions back to the
>> open source community.
>
> are the contributions entirely specific to the windows API?
> (I would not call it a contribution if you've simply ported to the
> architecture you own...)

>> think we're going to help the community bring HPC into mainstream
>> computing.
>
> I'd be curious to understand what that means. HPC as implemented by
> beowulves seems entirely mainstream to me. (lots of places already have
> substituted a GUI button for a queue-submit command...)

Yeah, I think HPC is much more mainstream than before. I'd like to get
to a place where an HPC cluster is just another shared network resource.
Just as you can plug a printer into a network and share it with your
colleagues, setting up and sharing an HPC cluster should be just as
simple. Ideally you could buy a 16- or 32-node cluster for your
workgroup, plug it in, have it image itself (or come with all the nodes
pre-installed), have it integrate with your network security (Active
Directory, Kerberos, NIS, etc.), decide who you want to share it with,
and it's ready to run jobs.
With applications like MATLAB's Distributed Computing Toolbox (DCT) and
gridMathematica integrated directly with the job scheduler, it's easy to
submit jobs, and the results are returned directly to the application.
So, mainstream for us isn't about building bigger and bigger clusters;
it's about building clusters that are easy to manage, integrated
directly with your applications, and that work well with the rest of the
software infrastructure you have in place. Whoa, starting to sound like
a marketing person.

> regards, mark hahn.

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf