actual technical questions (was Re: [SAGE] Did everyone Migrate to LOPSA ?)
On Wed, Nov 23, 2005 at 07:08:57PM -0800, Jim Dennis wrote:
> I suppose I could offer up the following questions for those who
> wish to spawn off some technical threads on them:
>
> * Has anyone here been trying to run RHEL3 on x86_64 hardware
> with 8 to 32 GiB of RAM? Do you also have stability problems?
> Anyone have a stable Linux x86_64 infrastructure on any 2.4 based
> kernels? (RE: RHEL3 and 2.4 on x86_64)
We run RHEL3 on Opterons with up to 128GB of RAM, and have tons of
stability problems, mostly related to memory. The 32GB+ machines
are mostly HP DL585 4-CPU boxes. For the 2GB-16GB machines, we use
Rackable boxes. We've just put RHEL4 on a few machines, but it's
too soon to judge it yet.
> * Anyone using RHEL3 with a large number (50 to 100) autofs maps
> in LSF and/or Condor "grids?" Any good documents on "best
> practices" for that arrangement?
We have thousands of map entries, but most machines only end up
mounting a dozen or two. The best practice is to have really good
NFS servers. We use NetApps, which while pricey, are very reliable.
The default unmount timeout is really low (60 seconds IIRC). We've
increased that to 15 minutes. That leaves a few more mounts sticking
around, but we've had issues (slow network) where an automount doesn't
happen quickly enough, which causes LSF jobs to fail. Since our
machines are segregated by which groups use them, the next job to
land on a machine is likely to use the same mounts as the last job.
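
For anyone who wants to try the longer timeout, here is a rough
sketch of what an auto.master entry might look like with autofs 4.
The mount point /proj and the map file /etc/auto.proj are made-up
names for illustration, not our actual layout; --timeout is given in
seconds, so 900 is the 15 minutes mentioned above.

```
# /etc/auto.master (sketch; /proj and /etc/auto.proj are hypothetical)
# mount-point   map-file         options
/proj           /etc/auto.proj   --timeout=900
```

The trade-off is simply that idle mounts stay around longer in
exchange for fewer automount round trips at job start.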
> * How do people feel about NFS/TCP vs. UDP? On Linux?
NFS over TCP all the way. While UDP is theoretically slightly
faster, it doesn't mix well with really busy clients. Since our
machines are running bloated cpu hog programs from Cadence and
Synopsys, they are pretty much always cpu bound, except when they
are out of memory.
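
As an illustration only, forcing TCP from an autofs map could look
something like the entry below. The key "tools", the server name
"filer1" and the export path are hypothetical, and the other options
(rw, hard, intr) are just common choices, not a statement of what we
run; "tcp" is the standard Linux NFS mount option that selects TCP
instead of UDP.

```
# /etc/auto.proj (sketch; key, server and path are hypothetical)
# key     options               location
tools     -rw,hard,intr,tcp     filer1:/vol/tools
```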
Linux, in my experience, is a great example of the old 'good, fast,
cheap: pick at most two'. Intel or AMD are fast enough that we've
completely moved off of Sparc, and they are cheap enough that we
can buy extra machines to run extra jobs to account for the lack
of 'good'.