On Sun, 5 Feb 2006, Walid wrote:

> I believe I have seen on this mailing list*, and other internet forums**, some
> limitations of NIS, but I have failed to find a documented limitation from
> Sun, or from the various Linux distributions. Did anyone try to research
> the scalability of NIS servers?

The scalability depends on many details of your environment, and on the
timing of the requests.

Remember that NIS was designed for a workstation environment, where
human-rate requests generate asynchronous events. It wasn't designed for a
cluster environment where a single application generates queries from every
node simultaneously, and where the system state (e.g. number of nodes) might
change frequently and new applications expect the state to be current.

There are ways to tune NIS (increase the backlog) to minimize the observable
problems. But you haven't fixed the problems; you have only made them less
obvious for the current cluster scale and application set.

> The reason I am asking: on a 256-node cluster using GigE with two NIS Linux
> slaves we do see lots of RPC timeouts. The moment we added an extra slave
> we have not experienced much, but on the other hand our Solaris Linux slaves
> handle triple the amount of clients, and users have not reported problems.
>
> So my question: in these big clusters that have 256 nodes and more, what do
> people use for host and name lookups, and how many NIS slaves, if any, do
> they deploy? Does anyone know how many concurrent connections an NIS can
> handle?

We developed a cluster-specific name service / directory service called
BeoNSS. It uses knowledge about the cluster structure to cache, compute or
avoid name lookups.

Some examples:

  Host map
    We number cluster compute nodes sequentially starting at '0', and map
    them to sequential IP addresses. We then use names based on these
    numbers: node 23 is named ".23" with aliases "cluster.23", "23.cluster",
    "23.cluster0" and "<prefix>23". BeoNSS knows these formats, and returns
    the address calculated from the known IP address of node 0 and other
    info (node count, netmask, preferred interface, cluster name).

  Netgroup map
    Netgroups are used for file server exports and security. We use much
    the same approach to generate a list of compute node names in the
    cluster.
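
[Editor's illustration] The host map and netgroup ideas above can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions, not BeoNSS code: the node 0 address, node count, cluster name and prefix are made-up values, and the function names are hypothetical.

```python
import ipaddress
import re

# All values below are assumptions for illustration only.
NODE0_ADDR = ipaddress.IPv4Address("10.0.0.100")  # hypothetical address of node 0
NODE_COUNT = 256                                  # hypothetical cluster size
CLUSTER = "cluster"                               # hypothetical cluster name
PREFIX = "n"                                      # hypothetical "<prefix>"

# The name formats listed in the post: ".23", "cluster.23",
# "23.cluster", "23.cluster0" and "<prefix>23".
_PATTERNS = [
    re.compile(r"\.(\d+)"),
    re.compile(re.escape(CLUSTER) + r"\.(\d+)"),
    re.compile(r"(\d+)\." + re.escape(CLUSTER)),
    re.compile(r"(\d+)\." + re.escape(CLUSTER) + "0"),
    re.compile(re.escape(PREFIX) + r"(\d+)"),
]


def lookup_host(name):
    """Compute a node address from its name, with no network traffic.

    Returns the address as a string, or None as a soft fail so the
    caller can ask the next name service.
    """
    for pattern in _PATTERNS:
        match = pattern.fullmatch(name)
        if match:
            node = int(match.group(1))
            if node < NODE_COUNT:
                # Nodes map to sequential addresses starting at node 0.
                return str(NODE0_ADDR + node)
            return None
    return None


def netgroup_members():
    """Generate the list of compute node names instead of storing it."""
    return [".%d" % node for node in range(NODE_COUNT)]
```

With these assumed values, lookup_host(".23") and lookup_host("23.cluster") both return "10.0.0.123", while lookup_host("www.example.com") returns None and would be passed to the next service.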
  Password and group
    We send credentials out with each job, so that the process has a
    preserved passwd and group entry. BeoNSS uses the information to
    generate a getpwent() entry for the user and a synthetic entry for
    "root". (Note that this approach automatically handles disjoint user
    sets from multiple masters, and is one element of highly secure servers
    since the process doesn't have access to the list of other users.)

These are not the only name services that BeoNSS provides, but they are good
examples of how a cluster-specific name service can make the cluster faster,
easier to scale and more consistent.

BeoNSS works with other name services. If a cluster requires other name
services, it's easy to configure them as fall-back services. This works very
well, since BeoNSS handles the really troublesome queries (an application
generating an all-to-all IP address map on each node simultaneously, or libc
looking up a user name at start-up from a 10,000-entry passwd map), while
taking a negligible amount of time to return a soft fail ("don't know, ask
the next service on the list").

There are other approaches that clusters have used:

The most obvious is copying out files to each /etc/. This has the problem of
consistency and synchronization. You might think that you'll remember to
push out new copies with each update. But what about machines that are down?
Or booting? Or up but not responding right now?

I've seen systems that use NSCD, the Name Service Caching Daemon. It's
another "it seems to work for me, at least today" solution. Like most
caching systems, it reduces traffic in the common case. But it doesn't
handle update consistency, and won't handle the start-up backlog and
dropped-request problem.
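
[Editor's illustration] The password handling and the soft-fail fall-back described earlier can be sketched as a chain of lookup services. This is a minimal sketch under assumptions, not how BeoNSS is implemented: the Passwd tuple, the function names and the sample users are all hypothetical.

```python
from collections import namedtuple

# Hypothetical, simplified passwd record for illustration.
Passwd = namedtuple("Passwd", "name uid gid home shell")


def make_job_service(job_user):
    """Service that knows only the job's own user plus a synthetic root."""
    known = {
        job_user.name: job_user,
        "root": Passwd("root", 0, 0, "/root", "/bin/sh"),
    }

    def service(name):
        # A dictionary miss returns None, which is the soft fail:
        # "don't know, ask the next service on the list".
        return known.get(name)

    return service


def make_table_service(table):
    """Stand-in for a slower fall-back service such as a large passwd map."""

    def service(name):
        return table.get(name)

    return service


def lookup_user(name, services):
    """Ask each service in order and stop at the first real answer."""
    for service in services:
        entry = service(name)
        if entry is not None:
            return entry
    return None


# Usage with made-up users: the job runs as "alice"; "bob" exists only
# in the fall-back table.
job = make_job_service(Passwd("alice", 1001, 1001, "/home/alice", "/bin/sh"))
fallback = make_table_service(
    {"bob": Passwd("bob", 1002, 1002, "/home/bob", "/bin/sh")}
)
chain = [job, fallback]
```

Here lookup_user("alice", chain) is answered by the first service without touching the fall-back, and lookup_user("bob", chain) falls through to it after a cheap miss.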

-- 
Donald Becker                           [EMAIL PROTECTED]
Scyld Software                          Scyld Beowulf cluster systems
914 Bay Ridge Road, Suite 220           www.scyld.com
Annapolis MD 21403                      410-990-9993