[SAGE] Re: [Seattle-SAGE] unexplainable freeze



On 28 Aug 2003 at 18:45, gest wrote:

> There is a computer at work (Dell, dual processors,
> something ...).  It was there before I was there. 
> Occasionally, it would freeze, can't move mouse, can't
> type, can't do anything.  The logs don't show anything
> which can help pinpoint the problem.  Thus, reboot is
> the inevitable answer, every time.
> 
> I have a computer at home that freezes similarly to
> the one at work.  Different specs, setup, and what
> not, though.  Click, click, type, type, and bam,
> frozen.  Reboot is also the answer.  (Can't even power
> off.  I have to reboot first, then power off.) 
> However, I think I've fixed the computer at home. 
> Before, I had the case fans connected to the "full"
> power line (i.e., it uses four pins).  It didn't freeze
> then.  It's when I switched it over to the "fan only"
> power line (i.e., it uses only two pins) that freezes
> started showing up.  I also noticed that my CPU temp
> was going up really quickly.  So, I switched it back
> to the "full" power line, and things are okay once again.
> 
> Do you think I can apply that same logic to the
> machine at work?  (i.e., somehow the CPU temp or temp
> inside the case is getting too high, so it freezes.)
> Additionally, the machine at work is underneath, in a
> closed cabinet, cooled by an external fan, because
> it's a production or "live" machine that the business
> doesn't want customers seeing.  I'm wondering, if the
> machine were taken out of the cabinet, would that make
> a difference?  Or maybe inspect how the case fans are
> powered?  Or maybe call Dell to look at it.

It could well be a cooling issue. In a freeze situation, though, I'd also check that the CPU
and RAM are seated properly, swap in different RAM if available, and check that the PSU
isn't overloaded with too many devices. Your best bet would be to check with Dell and have
them inspect the machine first; they may have a database with precisely your problem in it,
which would save you time and effort.
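
Since the logs show nothing before the freeze, one way to test the cooling theory is to
record the temperature yourself and force each reading to disk, so the last entry survives
a hard lock-up. Below is a rough sketch only, not a tested tool. It assumes the box runs
Linux and exposes a sensor at /sys/class/thermal/thermal_zone0/temp in millidegrees
Celsius, which may not be true of your machine; the log file name and the function names
are made up for illustration.

```python
import os
import time

# Assumption: a Linux sysfs thermal sensor that reports millidegrees Celsius.
SENSOR_PATH = "/sys/class/thermal/thermal_zone0/temp"
# Hypothetical log location; put it on a local disk, not a network share.
LOG_PATH = "temperature.log"


def parse_millidegrees(raw):
    """Convert a raw reading such as '48500\\n' to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def format_entry(timestamp, celsius):
    """Build one log line: local time followed by the temperature."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(timestamp))
    return "%s %.1f C\n" % (stamp, celsius)


def append_and_sync(path, line):
    """Append a line and push it to disk before returning."""
    with open(path, "a") as log:
        log.write(line)
        log.flush()
        os.fsync(log.fileno())


def main(interval_seconds=10):
    """Log one reading every interval_seconds until interrupted."""
    while True:
        with open(SENSOR_PATH) as sensor:
            celsius = parse_millidegrees(sensor.read())
        append_and_sync(LOG_PATH, format_entry(time.time(), celsius))
        time.sleep(interval_seconds)
```

Call main() to start logging. After the next freeze and reboot, the tail of the log shows
whether the temperature was climbing just before the machine locked up.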

We had a similar problem with an IBM server. It turned out the voltage regulator on the
daughterboard (it was a two-CPU machine with a daughterboard) was faulty. They swapped out
the daughterboard, and the machine ran perfectly afterwards. I don't really want to bring
up the faulty capacitor issue that affects many motherboards (depending on age), but a lot
of boards have it.

> Since it's a "live" machine, I don't want to mess with
> it and then have it really not work.  I've encountered
> that happening one time at a previous job.  A server
> was turned off in a room because they were having a
> meeting in there, it was too loud.  But, when they
> turned it back on, it didn't work (faulty or blown
> card or something).  The network admin had to call the
> card maker, and they came out to fix or replace it.

A machine is most likely to fail when it is turned off and back on; thermal stress on
components is one reason. This is why machines are generally kept in a machine room at
a set temperature, in a stable environment, and run continuously.

> I was thinking, how about having a redundant system
> ready to go, before I start messing with the "live"
> machine.

Can't hurt. There's nothing worse than having a live machine down for an extended period of
time. People never notice when things are going right, only when they go wrong. :-)

> How have any of you dealt with these things?
> 
> Thanks in advance!