
Troubleshooting Startup Problems

Windows Server 2003 is certainly the most reliable version of Windows released so far, possessing a level of robustness that its predecessors, including Windows 2000 and Windows XP, lack. Does this mean startup problems cannot occur in Windows Server 2003? No, it doesn't. No existing operating system can be considered crash- or corruption-proof, not even the specialized operating systems used by military organizations. Any operating system can be rendered unbootable, and the newest release of Windows is no exception.

This issue becomes extremely important in a large-scale corporate network, which depends on the availability of the network operating system (OS), particularly on network servers. You must be prepared to handle anything that goes terribly wrong, so that you do not feel helpless when it does.

A significant part of this chapter is dedicated to troubleshooting startup problems. These, I admit, are the most frustrating ones, especially if your system won't boot when you have a lot of work to do. So, what should you do in an emergency? First, don't panic. Next, try to detect what is preventing the operating system from booting.

Since the boot sequence in Windows Server 2003 closely resembles that in Windows 2000/XP, most (but not all) techniques described here can be applied to all three versions. A detailed description of the Windows Server 2003 boot sequence was provided in Chapter 6. Therefore, I will provide only a short explanation of the boot sequence, and then proceed with problem detection and troubleshooting. Table 12.1 lists the Windows Server 2003 startup phases with brief descriptions of the processes that take place at each stage of the normal boot process.

Table 12.1: Windows Server 2003 Startup Process (x86-based systems)

Startup stage: POST routine
Description: The CPU initiates the system board POST routines. POST routines of the individual adapters start after the motherboard POST has completed successfully.

Startup stage: Initial startup process
Description: The system searches for a boot device according to the boot order setting stored in CMOS. If the boot device is a hard disk, Ntldr starts.

Startup stage: Operating system load
Description: Ntldr switches the CPU to protected mode, starts the file system, then reads the contents of the Boot.ini file. This information determines the startup options and initial boot menu selections.

Startup stage: Hardware detection and configuration selection
Description: Ntdetect.com gathers basic hardware configuration data and passes this information to Ntldr. If more than one hardware profile exists, Windows XP and Windows Server 2003 attempt to use the correct one for the current configuration. Notice that if your computer is ACPI-compliant, Windows XP or Windows Server 2003 ACPI functionality will be used for device enumeration and initialization. (More information on this topic was provided in Chapter 5.)

Startup stage: Kernel loading
Description: Ntldr passes the information collected by Ntdetect.com to Ntoskrnl.exe. Ntoskrnl then loads the kernel, HAL, and registry information. A status bar at the bottom of the screen indicates progress.

Startup stage: Operating system logon process
Description: Networking-related components (such as TCP/IP) load asynchronously with other services, and the Begin Logon prompt appears on screen. After a user logs on successfully, Windows updates the Last Known Good Configuration information to reflect the current state.

Startup stage: New devices are detected by Plug and Play
Description: If Windows XP or Windows Server 2003 detects new devices, they are assigned system resources. The operating system extracts the required driver files from the Driver.cab file. If this file is not found, Windows XP or Windows Server 2003 prompts the user to provide them. Device detection occurs asynchronously with the operating system logon process.

Diagnosing Startup Problems

Fortunately, boot failures in Windows XP and Windows Server 2003 are rare, especially if you perform regular maintenance and take preventive measures against disaster. However, problems can still arise. As with any other operating system, they might be caused by hardware malfunctions as well as by software errors.

If the problem is severe enough, the system stops booting and displays an error message. A brief list of error messages and their meanings is presented in Table 12.2. Although this list is by no means comprehensive, it covers the most common problems that can cause startup failures of Windows NT-based operating systems, including Windows 2000, Windows XP, and Windows Server 2003.
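As an aside on the operating system load stage of Table 12.1: Ntldr reads Boot.ini to build the boot menu, and Table 12.2 lists Boot.ini entries that point to the wrong partition as one cause of startup failure. The following is a minimal Python sketch of how such a file could be inspected offline, for example from a copy saved on another machine. The sample file contents, the function names (parse_boot_ini, boot_entries, default_entry), and the simplified parsing rules are illustrative assumptions, not part of Windows or Recovery Console.

```python
import re

# Hypothetical sample contents; a real Boot.ini will differ.
SAMPLE_BOOT_INI = r"""
[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003" /fastdetect
multi(0)disk(0)rdisk(0)partition(2)\WINNT="Windows 2000 Server" /fastdetect
"""

# Matches the partition(N) component of an ARC path.
ARC_PARTITION = re.compile(r"partition\((\d+)\)", re.IGNORECASE)


def parse_boot_ini(text):
    """Split the text into sections, each a list of (key, value) pairs."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip().lower()
            sections[current] = []
            continue
        if current is None or "=" not in line:
            continue
        # Split on the first "=" only; the value may contain more.
        key, value = line.split("=", 1)
        sections[current].append((key.strip(), value.strip()))
    return sections


def boot_entries(sections):
    """Return one dict per line of the [operating systems] section."""
    entries = []
    for path, value in sections.get("operating systems", []):
        part = ARC_PARTITION.search(path)
        desc = re.match(r'"([^"]*)"\s*(.*)', value)
        entries.append({
            "path": path,
            "partition": int(part.group(1)) if part else None,
            "description": desc.group(1) if desc else value,
            "switches": desc.group(2).split() if desc else [],
        })
    return entries


def default_entry(sections):
    """Return the path named by default= in [boot loader], if any."""
    return dict(sections.get("boot loader", [])).get("default")


if __name__ == "__main__":
    parsed = parse_boot_ini(SAMPLE_BOOT_INI)
    print("default:", default_entry(parsed))
    for entry in boot_entries(parsed):
        print(entry["partition"], entry["description"], entry["switches"])
```

Comparing the partition number of each entry against the partition that actually holds the system root is a quick way to spot an entry that points to the wrong place before attempting a repair.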
Table 12.2: Startup Problem Symptoms

Startup problem symptom: The POST routine emits a series of beeps and displays error messages, for example:
    Hard disk error.
    Hard disk absent/failed.
Possible cause: The system self-test routines stopped because of improperly installed devices.
To recover from hardware problems, carefully review the documentation supplied with your system and perform the basic hardware checks. Verify that all cables are attached correctly and all internal adapters are installed properly. Make sure that all peripheral devices (such as keyboards) necessary to complete the POST without error messages are installed and functioning. If applicable, verify that you have correctly configured all jumpers or dual in-line package (DIP) switches. Jumpers and DIP switches are especially important for hard disks. Run diagnostic software to detect hardware malfunction, and replace the faulty device.
Unfortunately, the topic of troubleshooting hardware problems goes beyond the range of problems discussed in this book. It deserves a separate comprehensive volume. However, I can recommend some resources on the topic that would help you make sense of the BIOS error codes:
• BIOS Survival Guide, available at http://burks.bton.ac.uk/burks/pcinfo/hardware/bios_sg/bios_sg.htm
• Definitions and Solutions for BIOS Error Beeps and Messages/Codes, available at http://www.earthweb.com

Startup problem symptom: CMOS or NVRAM settings are not retained
Possible cause: The CMOS memory is faulty, data is corrupt, or the battery needs replacing.

Startup problem symptom: Master boot record (MBR)-related error messages similar to the following:
    Missing operating system.
    Insert a system diskette and restart the system.
Possible cause: The MBR is corrupt.
The easiest method of recovering the damaged MBR is provided by Recovery Console (the methods of starting Recovery Console were discussed in Chapter 2). Once you are in Recovery Console, use the FIXMBR command to repair the MBR.
The FIXMBR command uses the following syntax:
    fixmbr [device_name]
The parameter device_name specifies the drive on which you need to repair the damaged MBR. For example:
    fixmbr \Device\HardDisk0
If the device_name parameter is omitted, the new MBR will be written to the boot device, from which your primary system is loaded. Notice that you'll be prompted to confirm your intention to continue if an invalid partition table is detected.

Startup problem symptom: Partition table-related error message similar to the following:
    Invalid partition table.
    A disk-read error occurred.
Possible cause: The partition table is invalid.
You can recover from this problem using the DiskProbe Resource Kit utility or any third-party low-level disk editor. Note that to be able to recover this way, you must create a backup copy of the MBR beforehand. (You can use the DiskProbe tool for this purpose.) Detailed information on this topic can be found in the Resource Kit documentation.
If the MBR on the disk used to start Windows is corrupt, most likely you will be unable to start Windows XP or Windows Server 2003 (and, consequently, DiskProbe). Therefore, before proceeding any further, you'll need to start Recovery Console to replace the damaged MBR.

Startup problem symptom: Boot failure caused by disk or file system corruption, not related to damaged MBR or partition table
Possible cause: Start Recovery Console and run the CHKDSK command to repair the disk. If this proves to be insufficient, you will need to take additional actions to fully recover the damaged file system.

Startup problem symptom: Windows XP or Windows Server 2003 cannot start after you have installed another operating system
Possible cause: The Windows XP or Windows Server 2003 boot sector was overwritten by the other operating system's setup program.
Recovery Console provides the FIXBOOT command that enables you to restore the overwritten boot sector.

Startup problem symptom: Missing Boot.ini, Ntoskrnl.exe, or Ntdetect.com files (x86-based systems)
Possible cause: Required startup files are missing or damaged, or entries in Boot.ini are pointing to the wrong partition.
Start the Recovery Console and use available commands, such as REN, DEL, or COPY, to restore working copies of boot files.

Startup problem symptom: Bootstrap loader error messages similar to the following:
    Couldn't find loader
    Please insert another disk.
Possible cause: Ntldr is missing or corrupt.
If Ntldr or any other file required to boot the system is missing or corrupt, start Recovery Console and copy the required file.

Startup problem symptom: Windows NT-based OS cannot start and displays a message similar to the one provided below:
    Windows could not start because the following file is missing or corrupt:
    \WINNT\SYSTEM32\CONFIG\SYSTEM
    You can attempt to repair this file by starting Windows Setup using the original Setup floppy disk or CD-ROM.
    Select 'r' at the first screen to repair.
Possible cause: This and similar error messages specifying different file names indicate that the boot failure was caused by a damaged registry hive(s) or by invalid registry settings.
First, try to boot using the safe mode startup option. If your attempt has failed, try the Last Known Good boot option. If you still can't boot successfully, start Recovery Console and use the COPY command to restore known good registry files (for example, those located in the %SystemRoot%\Repair folder) to the %SystemRoot%\System32\Config folder.
If the problem is related to settings for a specific service or driver, you also may be able to use Recovery Console's DISABLE command to disable the offending service or driver.

Startup problem symptom: Boot failure caused by a video display driver problem
Possible cause: Use the safe mode startup option, then repair or replace the driver.

Startup problem symptom: Boot failure caused by service or driver initialization
Possible cause: As a first line of defense, try to boot in safe mode and disable the offending service or driver. If your attempt fails, start Recovery Console and use the LISTSVC and DISABLE commands to identify and disable the service or driver that prevents Windows from booting. If you have a working copy of the system registry, you can use the Recovery Console's COPY command to restore the system registry.

Startup problem symptom: Boot failure caused by invalid file attributes set on system files or folders
Possible cause: Start Recovery Console and use the ATTRIB command to restore the correct attributes.

Startup problem symptom: Boot failure caused by unknown system startup event
Possible cause: Try to boot into safe mode. If this is not successful, try to use the Boot Logging startup option. Then start Recovery Console and use the TYPE command on the resulting log file to identify the failed initialization event.

Startup problem symptom: Stop messages appear
Possible cause: Many software or hardware issues can cause these messages. In addition to official Microsoft documentation (such as the long list of common STOP messages usually supplied in the Resource Kit documentation), there are other useful resources on troubleshooting STOP messages. One such resource can be found at http://www.aumha.org/kbestop.htm

As mentioned in Chapter 6, all Windows NT-based systems generate system messages known as blue screens, or "Blue Screens of Death", if they encounter serious errors that they can't correct. If Windows stops loading, the blue screen also may appear to prevent further data corruption.

If the STOP message appears during system startup, it's likely that the cause of the problem is among the following:
• The user installed third-party software that destroyed part of the system registry (that is, the HKEY_LOCAL_MACHINE root key). This may happen if the application tries to install a new service or driver. The blue screen will appear, informing the user that the registry or one of its hives couldn't be loaded.
• The user incorrectly modified the hardware configuration and, as a result, one of the critical system files was overwritten or corrupted.
• The user installed a new service or system driver that is incompatible with the hardware, causing the blue screen to appear after rebooting. Strictly speaking, it's the attempt to load an incompatible file that leads to the corruption of a correct system file.
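The message-based rows of Table 12.2 lend themselves to a simple lookup from on-screen text to likely cause and recovery steps. The following is a minimal Python sketch of that idea, covering only four rows of the table. It is not part of Recovery Console or any Microsoft tool; the names SYMPTOMS and diagnose are hypothetical, and matching on a lowercase message fragment is an assumption made for illustration.

```python
# Hypothetical lookup built from selected rows of Table 12.2.
# Each entry: (message fragment, possible cause, suggested steps).
SYMPTOMS = [
    ("missing operating system",
     "The MBR is corrupt.",
     ["Start Recovery Console", "Run FIXMBR"]),
    ("invalid partition table",
     "The partition table is invalid.",
     ["Recover with DiskProbe or a third-party low-level disk editor"]),
    ("couldn't find loader",
     "Ntldr is missing or corrupt.",
     ["Start Recovery Console", "Copy the required file"]),
    ("following file is missing or corrupt",
     "A registry hive is damaged or registry settings are invalid.",
     ["Try the safe mode startup option",
      "Try the Last Known Good boot option",
      "Start Recovery Console and COPY known good registry files"]),
]


def diagnose(message):
    """Return (cause, steps) for the first matching fragment, else None."""
    text = message.lower()
    for fragment, cause, steps in SYMPTOMS:
        if fragment in text:
            return cause, steps
    return None


if __name__ == "__main__":
    result = diagnose("Couldn't find loader. Please insert another disk.")
    if result is not None:
        cause, steps = result
        print(cause)
        for number, step in enumerate(steps, start=1):
            print(number, step)
```

A message that matches none of the fragments returns None, which mirrors the table's own caveat that the list is by no means comprehensive.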