
Troubleshooting a Blue Screen

on November 11, 2011

One of the most annoying things an administrator can face is a PC that crashes at random intervals and generates a so-called blue screen of death. A blue screen can be caused by a number of faults, and it is often very hard to pinpoint the cause simply by looking at the information on the blue screen itself. There are other ways to diagnose a blue screen and, if you have a memory dump, you can debug the crash and find out exactly what caused it.

 

 Generating a Memory Dump:

If your system is not configured to generate a memory dump file when a blue screen occurs, you need to enable this functionality before we can proceed with debugging the root cause of the issue. To do this:

  • Open the Control Panel
  • Open the System settings
  • Switch to the Advanced Tab
  • Click on the Settings button under the Start-up and Recovery section

A dialog will open with various settings; towards the bottom there is a section called “Write debugging information”.

The first combo box selects the kind of memory dump to write when Windows experiences a crash. For our purposes a kernel memory dump will suffice. The edit box below it contains the location where the memory dump will be stored.
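The same two settings can also be checked from a script. The following is a minimal sketch, assuming they are stored under the CrashControl registry key as the values CrashDumpEnabled and DumpFile, with the numeric codes listed in the table below. The function names are made up for this example, and the registry part only runs on Windows.

```python
import sys

# Assumed meaning of the CrashDumpEnabled registry value.
DUMP_TYPES = {
    0: "none",
    1: "complete memory dump",
    2: "kernel memory dump",
    3: "small memory dump",
}

# Assumed location of the crash dump settings under HKEY_LOCAL_MACHINE.
CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"


def describe_dump_type(value):
    """Translate a CrashDumpEnabled value into a readable name."""
    return DUMP_TYPES.get(value, "unknown (%d)" % value)


def read_crash_dump_settings():
    """Return (dump type, dump file path) from the registry. Windows only."""
    import winreg  # imported here so the module still loads on other systems

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        enabled, _ = winreg.QueryValueEx(key, "CrashDumpEnabled")
        dump_file, _ = winreg.QueryValueEx(key, "DumpFile")
    return describe_dump_type(enabled), dump_file


if __name__ == "__main__" and sys.platform == "win32":
    dump_type, dump_file = read_crash_dump_settings()
    print("Dump type:", dump_type)
    print("Dump file:", dump_file)
```

The dump file path may contain an unexpanded environment variable such as %SystemRoot%, so treat the printed value as a template rather than a final path.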

 

Getting the Necessary Tool:

In order to debug a memory dump we will need WinDbg, a debugger supplied by Microsoft. It can be downloaded for free from the Microsoft website.

Make sure you download the correct version of the debugging tools for your architecture, then run the installer and you’re ready to debug the blue screen.

 

Debugging the Issue:

A lot of people are not comfortable debugging a memory dump but the process is simpler than most people think.

The first thing we need to do when WinDbg loads is configure the symbol path for the debugger. Symbols are information that, for efficiency’s sake, a compiler strips out of executables. Things like variable and function names are very important to a programmer but not to Windows. For this reason, when your compiler compiles your source code, this information is kept out of the executable to make it smaller and more efficient. To debug a problem, however, symbols are very useful. Luckily for us, Microsoft provides a symbol server which WinDbg can use to fetch symbols as required.

 To configure symbols click on:

  •  The File Menu
  •  Select Symbol Search Path

Now we need to enter the following line:

SRV*c:\symbols*http://msdl.microsoft.com/download/symbols

This will instruct WinDbg to fetch any needed symbols from the Microsoft symbol server and store them locally in the provided folder which in this case is c:\symbols. You can choose another folder if you want.

Click on the OK button and we can start to debug our dump file.

Note: WinDbg will need access to the Internet in order to fetch the symbol files it needs.
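The symbol path entered above follows the pattern SRV*local cache folder*symbol server URL. As a small illustration, the sketch below assembles that string for a cache folder of your choice; the helper name build_symbol_path is hypothetical and not part of WinDbg.

```python
MICROSOFT_SYMBOL_SERVER = "http://msdl.microsoft.com/download/symbols"


def build_symbol_path(cache_dir, server_url=MICROSOFT_SYMBOL_SERVER):
    """Return a symbol path of the form SRV*<local cache>*<symbol server>."""
    if "*" in cache_dir:
        # '*' separates the parts of the path, so it cannot appear in the folder.
        raise ValueError("cache_dir must not contain '*'")
    return "SRV*%s*%s" % (cache_dir, server_url)


# Reproduces the line used in this article.
print(build_symbol_path(r"c:\symbols"))
```

Passing a different folder, for example d:\debug\symbols, only changes the middle part of the string.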

We now need to open the dump file itself. To do this:

  • Click on the File Menu
  • Select Open Crash Dump
  • Select the Crash Dump you want to debug and click OK

It will take a short while for WinDbg to open your dump file and load up the symbols required.

To run a detailed analysis once the dump file has finished loading, type !analyze -v at the prompt and press Enter. Note that the option is introduced by a plain hyphen.

After some time we’ll get all the information we need to determine what is causing the blue screen.
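If you have many dump files to go through, the same analysis can be scripted. The sketch below is one possible approach, assuming the console debugger cdb.exe (installed with the same debugging tools as WinDbg) is on the PATH and accepts -z for the dump file, -y for the symbol path and -c for the commands to run. The function names are made up for this example.

```python
import subprocess


def build_cdb_command(dump_file, symbol_path, debugger="cdb.exe"):
    """Build an argument list that opens a dump, runs !analyze -v and quits."""
    return [
        debugger,
        "-z", dump_file,         # crash dump to open
        "-y", symbol_path,       # symbol search path
        "-c", "!analyze -v; q",  # commands to run once the dump is loaded
    ]


def analyze_dump(dump_file, symbol_path):
    """Run the debugger and return its text output (needs the tools installed)."""
    result = subprocess.run(
        build_cdb_command(dump_file, symbol_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling analyze_dump with the path to your dump file and the symbol path configured earlier should return the same report that WinDbg shows in its command window.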

 

Information of Interest:

Right below Bugcheck Analysis, WinDbg prints a short report on which error occurred and what information is relevant to that error, such as the parameters that were in use when the crash occurred.

PROCESS_NAME contains the name of the process in which the crash occurred.

BUGCHECK_STR displays the exception code. A list of codes can be found on the MSDN site.

DEFAULT_BUCKET_ID displays the category of the error.

STACK_TEXT displays the stack trace.

This should give you the information you need to determine the cause of the blue screen, and a starting point for solving the problem.
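When the report is long, it can help to pull out just the fields listed above. The following is a rough sketch, assuming each field appears at the start of a line as NAME: value and that the stack trace lines follow STACK_TEXT: until the first empty line. The sample text is made up for illustration and is not real debugger output.

```python
import re

FIELDS = ("PROCESS_NAME", "BUGCHECK_STR", "DEFAULT_BUCKET_ID")


def summarize_analysis(text):
    """Extract selected single-line fields and the stack trace from a report."""
    summary = {}
    for name in FIELDS:
        match = re.search(r"^%s:[ \t]*(.*)$" % name, text, re.MULTILINE)
        if match:
            summary[name] = match.group(1).strip()

    # STACK_TEXT is followed by one line per frame, up to the first empty line.
    stack = re.search(r"^STACK_TEXT:[ \t]*\n((?:.+\n?)*)", text, re.MULTILINE)
    if stack:
        summary["STACK_TEXT"] = [
            line.strip() for line in stack.group(1).splitlines()
        ]
    return summary


# Made-up report fragment, for illustration only.
sample = (
    "BUGCHECK_STR:  0xD1\n"
    "\n"
    "PROCESS_NAME:  example.exe\n"
    "\n"
    "STACK_TEXT:  \n"
    "first frame\n"
    "second frame\n"
    "\n"
)
for field, value in summarize_analysis(sample).items():
    print(field, "=", value)
```

Fields that are missing from the report are simply left out of the result, so check for a key before using it.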

 
Comments
Helen Walkers November 16, 2011 3:44 am

This is the reason why I’m not into Windows-based operating systems. I used to use and love XP but after the vulnerabilities and failures of Windows Vista, I came to “hate” all of Microsoft’s OSs. Even with the advent of Windows 7, I’m still not convinced that Microsoft has improved its operating system’s functionality. There are even reports that the blue screen of death still exists on Windows 7. How come? I thought this OS was so powerful and crash-free???

I’ll wait and see what Windows 8 will do to improve the image of Microsoft. But for now, I’m still for Apple’s OS X Lion.

Emmanuel Carabott November 16, 2011 12:24 pm

Hi Helen,
A blue screen of death occurs when an error occurs at kernel level. A computer system has two levels, a kernel level and a user level. When an exception occurs in a program at user level, the application might crash but the system remains unaffected. When such an error occurs at kernel level, there is no higher level to handle the crash; the system itself is in a so-called bad state, and to safeguard everything else the kernel issues the blue screen of death, preventing the system from continuing to function. This action is intentional. This is not a Microsoft thing. Errors at kernel level can happen on any operating system, including Apple’s. Apple does not display a blue screen when this happens; OS X is BSD-based, and in the Unix world this is called a Kernel Panic. For example, OS X will display a multilingual message informing the user that they need to restart the system when such an event occurs.

Why system crashes occur more frequently on Windows, if this is the case at all, can be for many reasons. There are some fundamental design differences between Windows and Macs, and another factor might be that on Windows there are more third-party drivers and combinations than you would find on a Mac. Every driver works at kernel level, and this can cause blue screens if the driver crashes due to hardware issues or bad coding in the driver itself.

The bottom line is that if any operating system claims to be crash-free, it is deceiving users. Even kernels and drivers that are coded perfectly with no bugs at all (which do not really exist) can still cause a blue screen/Kernel Panic if the contents of the memory itself get corrupted due to faulty RAM (which develops due to wear and tear) or even due to the hardware they’re driving developing faults.

Perry B. November 17, 2011 5:52 am

True, NO operating system is perfect. Even Apple and all Unix-based OSs suffer from crashes (blue screen of death or Kernel Panic). The problem lies in what happens after a crash. In my opinion, all Microsoft operating systems (yes, even the jurassic Windows 98 and the most modern Windows 7) handle almost all computer crashes with uneasiness, and this is based on my own experiences as a student before and as an IT worker now.

Steve Ballmer (the current CEO of Microsoft) said that Windows 8’s hybrid Kernel type will “lessen” or completely minimize all blue screen incidents. I can’t comment on this until I get my hands on Windows 8.

Tana George November 28, 2011 10:15 am

The simplest troubleshooting is to suspect a newly-installed driver/hardware as the culprit. If the BSoD didn’t appear before that, it is almost 100 per cent certain that the newcomer is to blame. The problem is that even after you uninstall/remove the new driver/hardware, the problem might persist.
Btw, have you seen this BSoD screensaver: http://technet.microsoft.com/en-us/sysinternals/bb897558 ? I don’t know if it works with the latest versions of Windows because I also ditched Windows (almost) completely after the Vista failure and don’t have anywhere to try it, but it used to be cute back in the day. :)