[SAGE] NIS conniption fits
-----BEGIN PGP SIGNED MESSAGE-----
I'm having what is becoming an ever more perplexing and bizarre
problem with NIS. I have a collection of about 5 FreeBSD servers,
one of which is an NIS master for around 5000 users and about 400
groups, the others of which are clients. The passwd and group maps
are refreshed on a periodic basis from an LDAP directory.
Every once in a while, this network gets into a situation where
the ypserv process on the NIS master chews up all the CPU it can
get and the clients lose the ability to do much of anything useful
in any kind of timely fashion. Tracing this down, it appears that
what's going on is that the ypclient process and the ypserv process
are passing back and forth megabytes worth of group map data, but
it's somehow not getting processed by the client. Rebooting the
client in question almost always fixes the problem, but on those
occasions when it doesn't, the problem spreads to other clients
and the server has to be rebooted as well.
In no particular order, here are some of the things I've noticed about
this behavior:
* No log messages with any error indications.
* All systems involved are running packet filters (ipf), but
clearing and resetting them doesn't achieve anything. The filters
are configured to allow all traffic among these systems anyway.
* Restarting ypclient or ypserv (and associated rpcinfo wrangling)
doesn't achieve anything except a rebinding of the RPC system and
continuation of the problem state.
* Sudo on the problem client still works (that's how I reboot it).
* Any connections to the problem client (ssh, imap, pop, web) hang.
* This behavior seems independent of the NIS refreshes from the LDAP
server, at least time-wise.
* All the systems (except the LDAP server) are running FreeBSD 4.4
to 4.9. There are some Solaris NIS clients in the network as
well, but I've never noticed them to have problems (frankly, until
recently the problems only ever involved one particular client and
the master).
* NIS client and master are passing group map information like it
was going out of style (shown on tcpdump).
* The NIS master is never heavily loaded, usually a load average of
about 0.5 on a two CPU system. Problem clients have variable load
averages, but the problem doesn't seem to be directly connected
with that -- it just usually happens during the day (i.e., load
-related, but I'm not sure how).
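To put numbers on the tcpdump observation above, something like the
following could tally bytes per host pair from a text capture. This is
only a sketch: the function names are made up for illustration, and the
regular expression assumes the one-line-per-packet format that recent
tcpdump versions print with -n (timestamp, "IP", source, destination,
and a trailing "length N"). Older tcpdump releases print a different
format, so the pattern would likely need adjusting there.

```python
import re
from collections import Counter

# Assumed line shape (recent tcpdump, run with -n); not verified
# against the tcpdump shipped with FreeBSD 4.x:
#   14:49:01.123456 IP 10.0.0.5.1021 > 10.0.0.1.834: UDP, length 120
LINE_RE = re.compile(r"^\S+ IP (\S+) > (\S+): .*\blength (\d+)")


def strip_port(endpoint):
    """Drop the trailing .port from an address.port endpoint."""
    host, _, _port = endpoint.rpartition(".")
    return host or endpoint


def tally_bytes(lines):
    """Return a Counter mapping (src_host, dst_host) to total bytes."""
    totals = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that don't fit the assumed format
        src, dst, length = match.groups()
        totals[(strip_port(src), strip_port(dst))] += int(length)
    return totals
```

Passing it the lines of a saved capture and printing the result of
most_common() on the returned Counter would show which client/server
pairs account for the bulk of the traffic.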
I'm sure there's more I could say about this, but hopefully this is
enough to at least pique someone's interest. I'm about at the end of
my rope with this puzzle. Has anyone run into this kind of behavior
before, or have any suggestions of what else to poke at?
Thanks,
--rowan
- --
John "Rowan" Littell
Systems Administrator
Earlham College Computing Services
http://www.earlham.edu/~littejo/
2004-01-21 14:49
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (Darwin)
Comment: Made with pgp4pine 1.76
iQCVAwUBQA7eEJdUNSJ2nf/5AQGT2gQAz3MU6VPHeclEgNJHWuy4W9/zJYF7KlWt
p+Ajgju2GgZWgJDR0aQuz1z8w9td8Ggy6vxmaV7OugL8EtacaHOZrHGd+m8Urvsx
kOynDJnoQiQFHK/0KsbQmPWZttV8hA6Ak8JJemHr4OHaTU4H5zVTNBNY3RIxfUfB
Gu1LyYJ/Uhw=
=ssyr
-----END PGP SIGNATURE-----