In this step, the targets (i.e., hosts, devices) to be investigated must be identified, access to the target devices must be obtained, and information must be gathered. During this step, the technician may gather and document more symptoms, depending on the characteristics that are identified.
Possible causes must be identified. The gathered information is interpreted and analyzed by using network documentation and network baselines, searching organizational knowledge bases, searching the Internet, and talking with other technicians.
If multiple causes are identified, then the list must be reduced by progressively eliminating possible causes to eventually identify the most probable cause. Troubleshooting experience is extremely valuable to quickly eliminate causes and identify the most probable cause.
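The elimination process can be pictured as filtering a list of candidate causes against the symptoms that were gathered. The following is a minimal sketch, not a prescribed procedure: the symptom names, the candidate causes, and the `eliminate` helper are all hypothetical and chosen only for illustration.

```python
# Hypothetical observations gathered from the target device.
observed = {
    "link_light_on": True,
    "ip_address_assigned": True,
    "ping_gateway_ok": False,
}

# Hypothetical candidate causes, each with the observations it would predict.
candidates = {
    "unplugged cable": {"link_light_on": False},
    "no DHCP lease": {"ip_address_assigned": False},
    "wrong default gateway": {"ip_address_assigned": True, "ping_gateway_ok": False},
}


def eliminate(candidates, observed):
    """Return the causes whose predictions do not contradict any observation."""
    remaining = []
    for cause, predicted in candidates.items():
        consistent = all(
            observed[symptom] == value
            for symptom, value in predicted.items()
            if symptom in observed  # unobserved symptoms cannot rule a cause out
        )
        if consistent:
            remaining.append(cause)
    return remaining


print(eliminate(candidates, observed))  # ['wrong default gateway']
```

In practice the "predictions" live in the technician's experience and in the network documentation rather than in a table, which is why experience speeds up this step.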
When the most probable cause has been identified, a solution must be formulated. At this stage, troubleshooting experience is very valuable when proposing a plan.
Before testing the solution, it is important to assess the impact and urgency of the problem. For instance, could the solution have an adverse effect on other systems or processes? The severity of the problem should be weighed against the impact of the solution. For example, if a critical server or router must be offline for a significant amount of time, it may be better to wait until the end of the workday to implement the fix. Sometimes, a workaround can be created until the actual problem is resolved.
When the problem is solved, inform the users and anyone involved in the troubleshooting process that the problem has been resolved. Other IT team members should be informed of the solution. It is important to properly document the cause and the solution, because this record can help other support technicians prevent and solve similar problems in the future.
Troubleshooting with Layered Models (37.1.3)
The OSI and TCP/IP models can be applied to isolate network problems when troubleshooting. For example, if the symptoms suggest a physical connection problem, the network technician can focus on troubleshooting the cables and their connections at the physical layer.
Figure 37-2 shows some common devices and the OSI layers that must be examined when troubleshooting each device.
Figure 37-2 Layers of the OSI Model and Where Troubleshooting Typically Starts for Different Devices
Notice that routers and multilayer switches are shown at Layer 4, the transport layer. Although routers and multilayer switches usually make forwarding decisions at Layer 3, ACLs on these devices can be used to make filtering decisions using Layer 4 information.
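To see why Layer 4 matters on a device that forwards at Layer 3, consider the simplified model below. It is only a sketch of first-match ACL evaluation with an implicit deny at the end; the `Packet` and `AclEntry` classes, the `evaluate` function, and the addresses are assumptions made for illustration and do not represent any device's actual implementation.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass
class Packet:
    src_ip: str
    dst_ip: str
    protocol: str                   # Layer 4 protocol, e.g. "tcp" or "udp"
    dst_port: Optional[int] = None  # Layer 4 destination port


@dataclass
class AclEntry:
    action: str                     # "permit" or "deny"
    src_net: str                    # Layer 3 match: source network
    protocol: str                   # "ip" matches any protocol
    dst_port: Optional[int] = None  # Layer 4 match; None means any port


def evaluate(acl, pkt):
    """Return the action of the first matching entry, or "deny" if none match."""
    for entry in acl:
        if ip_address(pkt.src_ip) not in ip_network(entry.src_net):
            continue
        if entry.protocol not in ("ip", pkt.protocol):
            continue
        if entry.dst_port is not None and entry.dst_port != pkt.dst_port:
            continue
        return entry.action
    return "deny"  # implicit deny at the end of the list


# Hypothetical ACL: block Telnet from one subnet, permit everything else from it.
acl = [
    AclEntry("deny", "192.168.10.0/24", "tcp", 23),
    AclEntry("permit", "192.168.10.0/24", "ip"),
]

telnet = Packet("192.168.10.5", "10.0.0.1", "tcp", 23)
web = Packet("192.168.10.5", "10.0.0.1", "tcp", 80)
print(evaluate(acl, telnet))  # deny
print(evaluate(acl, web))     # permit
```

Both packets have identical Layer 3 addresses, yet they are treated differently because of the Layer 4 port. A technician who checks only routing on such a device could miss the cause.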
Structured Troubleshooting Methods (37.1.4)
There are several structured troubleshooting methods that can be used to solve computer and network problems. The troubleshooting method used will vary depending on the type of problem and the personal experience of the technician.
A technician may choose one or more of the following troubleshooting methods to solve a problem:
- Bottom-up—Start with the physical layer and the physical components of the network and move up through the layers of the OSI model until the cause of the problem is identified.
- Top-down—Start with the end-user applications and move down through the layers of the OSI model until the cause of the problem has been identified.
- Divide-and-conquer—Start by collecting user experiences of the problem, document the symptoms, and then, using that information, make an informed guess as to which OSI layer to start your investigation.
- Follow-the-path—Discover the traffic path all the way from source to destination. This approach usually complements one of the other approaches.
- Substitution—Physically swap the problematic device or component with a known, working one. If the problem is fixed, then the problem is with the removed item. If the problem remains, then the cause is elsewhere.
- Comparison—Compare specifics such as configurations, software versions, hardware, or other device properties, links, or processes between working and nonworking situations and spot significant differences between them.
- Educated guess—A less-structured troubleshooting method that uses an educated guess based on the experience of the technician and their ability to solve problems.
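The three layer-based methods differ mainly in the order in which the OSI layers are checked. The sketch below illustrates that difference under stated assumptions: `layer_ok` is a hypothetical callback that reports whether a layer passes its checks, and the divide-and-conquer function assumes the common convention that a working layer implies the layers below it also work, so the search moves up from a working layer and down from a failing one.

```python
OSI_LAYERS = [
    "physical", "data link", "network", "transport",
    "session", "presentation", "application",
]


def check_order(method):
    """Return the order in which layers are examined for a given method."""
    if method == "bottom-up":
        return list(OSI_LAYERS)
    if method == "top-down":
        return list(reversed(OSI_LAYERS))
    raise ValueError(f"unsupported method: {method}")


def divide_and_conquer(start_layer, layer_ok):
    """Locate the lowest failing layer, starting from an informed guess.

    layer_ok is a hypothetical callable: layer_ok(layer_name) -> bool.
    Returns None if no failing layer is found.
    """
    i = OSI_LAYERS.index(start_layer)
    if layer_ok(start_layer):
        # Layers below are assumed to work, so move up the stack.
        for layer in OSI_LAYERS[i + 1:]:
            if not layer_ok(layer):
                return layer
        return None
    # The starting layer fails, so move down until a working layer is found.
    faulty = start_layer
    for layer in reversed(OSI_LAYERS[:i]):
        if layer_ok(layer):
            break
        faulty = layer
    return faulty


# Hypothetical fault: everything from the network layer up is failing.
working = {"physical", "data link"}
layer_ok = lambda layer: layer in working

print(check_order("top-down")[0])                   # application
print(divide_and_conquer("transport", layer_ok))    # network
print(divide_and_conquer("data link", layer_ok))    # network
```

A good starting guess reduces the number of layers that must be checked, which is the main advantage of divide-and-conquer over a strict bottom-up or top-down pass.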