
[SAGE] LSF (was: Re: actual technical questions)




Unless you've tweaked LSF's prioritization algorithms, the best reward for a job 
well done is another job.  That is, the host which most recently successfully 
completed a job, if eligible for this next job, will get it, even if other hosts 
have been standing idly about looking sullen and scuffing their feet.  However, 
I've come to see this as a feature rather than a bug, as it neatly supports the 
blackhole detection feature.


Speaking of which, what are you using for your blackhole settings for LSF, and 
are you having it page you, drop it into the ticket system, or just send mail to 
lsfadmin (the default)?

The client running LSF is doing FPGA development, so multi-hundred job 
submission groups are de rigueur during cmodel verification.  We found that 15 
jobs in a 5 minute window was about the sweet spot for us.  The default of 
10 would sometimes be tripped falsely by these 20 or 30 second pico-verification 
job slices.
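
For anyone who wants to play with the numbers before touching a live cluster, 
here is a rough Python sketch of the sliding-window idea behind that threshold.  
It is not LSF's implementation: the class and method names are made up for 
illustration, and I'm assuming a host is flagged once its count of abnormal 
exits inside the window reaches the threshold.

```python
from collections import deque


class ExitRateMonitor:
    """Hypothetical per-host sliding-window counter (not an LSF API)."""

    def __init__(self, threshold=15, window_seconds=300):
        # Assumed defaults: 15 abnormal exits within a 5 minute window.
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._exits = {}  # host name -> deque of exit timestamps (seconds)

    def record_exit(self, host, timestamp):
        """Record one abnormal job exit; return True if the host now
        looks like a blackhole under the assumed rule."""
        exits = self._exits.setdefault(host, deque())
        exits.append(timestamp)
        # Drop exits that have aged out of the window.
        while exits and timestamp - exits[0] > self.window_seconds:
            exits.popleft()
        return len(exits) >= self.threshold


def flagged(threshold, exit_times, host="hostA"):
    """Replay a list of exit timestamps; report whether the host was ever flagged."""
    monitor = ExitRateMonitor(threshold=threshold)
    return any([monitor.record_exit(host, t) for t in exit_times])


# Twelve 20-second job slices failing back to back on one host.
burst = [20 * i for i in range(12)]
print(flagged(10, burst))  # threshold of 10
print(flagged(15, burst))  # threshold of 15
```

Replaying your own job exit times through something like this gives a feel for 
how often a given threshold would have false-alarmed on short job slices.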

I should mention that blackhole detection only worked well for us AFTER we set 
the queues to 1 min queue time.  With the queues at '0', we triggered a couple 
of irksome bugs relating to time-based queues and the lsf.licensescheduler 
facility, such that Platform said 'Put them all at 1 or more!'

Now please repeat after me, Strata is not to have diet Pepsi after dinnertime. 
It's 3am and I'm still bloody WIDE awake.  Ah well.

SRC

John Clear wrote:

> ...
> The default unmount time out is really low (60seconds IIRC).  We've
> increased that to 15 minutes.  A few more mounts to stick around,
> but we've had issues (slow network) where an automount doesn't
> happen quickly enough, which causes LSF jobs to fail.  Since our
> machines are segregated by which groups use them, the next job to
> land on a machine is likely to use the same mounts as the last job.


-- 
========================================================================
Strata R Chalup [KF6NBZ]                         strata "@" virtual.net
Virtual.Net Inc                                  http://www.virtual.net/
           ** Strategic IT for the Growing Enterprise **
=========================================================================