Elements of System Level Troubleshooting
Video Lectures created by Tim Fiegenbaum at North Seattle Community College.
Lecture Transcript: We are in section 1.5, Elements of System-Level Troubleshooting. The ability to diagnose and correct faults is an important part of a career in computer technology or electronics in general, and we are going to have a look at some of the strategies used in troubleshooting. These strategies include block diagram thinking, signal tracing, signal injection, diagnostic software, observation and substitution.
The first one we'll look at is Block Diagram Thinking. Troubleshooting complex electronic systems can be an overwhelming task; many systems are too complex to be able to visualize all of the system details. One way to avoid being overwhelmed is to logically troubleshoot the system from a block diagram perspective. Earlier in this chapter we looked at block diagrams and observed that they give us the big picture of a system without the detail. This is often helpful when trying to diagnose a problem in a large system, because you don't want that much detail; you get lost in it. So you want the bigger picture that a block diagram can provide.
The next troubleshooting technique that we'll mention is Signal Tracing. This is a technique that involves monitoring circuit quantities at various points throughout the circuit. At each point you evaluate whether the measurement is good or bad with respect to the correct, expected value. Typically oscilloscopes or voltmeters are used to do this. One of the necessary skills here is that you actually know what value should be measured at a given point, and usually with a given system you'll have technical manuals or schematics that provide that information. By tracing the signal through the system, particularly a sequential system, you will reach a point where the signal is no longer what it should be. At that point you determine that this is probably where the fault is.
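The signal-tracing procedure described above can be sketched in a few lines of Python. This is only an illustration, not something from the lecture: the test point names, the voltage values, and the 10 percent tolerance are all assumptions made up for the example. In practice the expected values would come from the technical manual or schematic.

```python
from typing import List, Optional, Tuple

# Hypothetical test-point record: (name, expected volts, measured volts).
# The expected values stand in for what a manual or schematic would specify.
TestPoint = Tuple[str, float, float]


def within_tolerance(expected: float, measured: float, tolerance: float = 0.10) -> bool:
    """Return True if the measurement is within a fractional tolerance of the expected value."""
    return abs(measured - expected) <= tolerance * abs(expected)


def trace_signal(points: List[TestPoint], tolerance: float = 0.10) -> Optional[str]:
    """Check the test points in signal order; return the first bad one, or None if all are good."""
    for name, expected, measured in points:
        if not within_tolerance(expected, measured, tolerance):
            return name
    return None


# Assumed readings, listed from system input to system output.
readings = [
    ("TP1", 5.0, 5.1),
    ("TP2", 3.3, 3.2),
    ("TP3", 12.0, 0.4),
    ("TP4", 12.0, 0.4),
]

first_bad = trace_signal(readings)
print(first_bad)  # TP3
```

Here the signal is still good at TP2 and bad at TP3, so the fault is probably in the stage between those two points.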
One way to implement signal tracing is called Divide and Conquer. In this strategy, measurements are made in the middle of the system. If I can draw a very basic example: let's just pretend that here we have a given system, the signal comes in here and the signal comes out over here, and let's pretend that we have test points throughout this circuit. We could go in and measure every single one of these test points to determine at what point the system fails. Or we could go in and say 'Let's measure right here' and make a determination: is that a good signal or is it a bad signal? Again, you have to be familiar enough with the system to determine whether that is a good signal or a bad one. If it is a good signal, then you know the problem is over here somewhere, so you'll probably divide and conquer again and make a determination as to whether it is here or here. So that's the concept of divide and conquer utilizing signal tracing.
Then there is Diagnostic Software, which is a powerful tool in the troubleshooting endeavor. This specially designed software requires only limited system capability in order to execute successfully. Diagnostic software is common in troubleshooting computer systems for memory and hardware errors. A couple of the diagnostic software programs that I am familiar with are Norton Utilities and SiSoft Sandra; in fact SiSoft Sandra is a free download, and it's an interesting program that you can use to diagnose computer system faults. Diagnostic software based systems are also used in automotive troubleshooting.
Observation: the most important tool in your troubleshooting toolkit is the power of observation. When we say 'observation', this means all of the senses: you can see, but you can also smell and you can also hear, and oftentimes in diagnosing a system it is not always what you see; sometimes it's what you smell that tells you a lot.
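Going back to the divide-and-conquer strategy for a moment: it is essentially a binary search over the test points. The Python sketch below is an illustration under stated assumptions, not a procedure from the lecture. It assumes the test points are numbered in signal order, that the output of the system is known to be bad, and that once the signal goes bad it stays bad at every later point. The `probe` function is hypothetical and stands in for a technician taking a measurement and judging it good or bad.

```python
from typing import Callable, List


def find_first_bad(num_points: int, is_good: Callable[[int], bool]) -> int:
    """Return the index of the first bad test point using divide and conquer.

    Assumes at least one point is bad (the system has failed) and that every
    point after the first bad one is also bad.
    """
    low, high = 0, num_points - 1  # the first bad point lies in [low, high]
    while low < high:
        mid = (low + high) // 2
        if is_good(mid):
            low = mid + 1   # signal is fine here, so the fault is downstream
        else:
            high = mid      # signal is bad here, so the fault is here or upstream
    return low


# Hypothetical system with 8 test points where the signal first goes bad at point 5.
measured: List[int] = []


def probe(index: int) -> bool:
    """Stand-in for taking one measurement; records which points were probed."""
    measured.append(index)
    return index < 5


fault = find_first_bad(8, probe)
print(fault, measured)  # 5 [3, 5, 4]
```

With eight test points this takes three measurements instead of up to eight, which is the reason for measuring in the middle rather than walking every test point in order.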
Those of you that have been around in this business know there's that smell of burning components that tells you something is very wrong. Sometimes you'll even hear sounds that shouldn't be there. Anyway, observation is very important in diagnosing faults; many defects can be diagnosed simply by carefully observing the operation of the circuit.
Observation extends to three other important troubleshooting strategies besides direct observation. There is the user interview: what has the user observed about the circuit? Then there's a concept... this particular term, front panel milking, I hadn't heard before, but this is the way your textbook describes it. What it is is the intentional operation of all the front panel controls while observing the behavior of the system. This simply means that you're putting a given system through all of its paces, adjusting it and doing all the things that the system should be able to do, and observing its operation. Again, this is only going to be helpful if you know how that system is supposed to operate; then, when you notice something that's not quite right, as a technician you should be able to pick that up. But if you do not know the correct operation of a given system, this may or may not help.
Then there is the review of history using log books. Log books can be of great value if they are maintained. Oftentimes, if a group of technicians is keeping a log of the system they are maintaining, they can identify weaknesses in the system, and then when a fault comes up they can look in their log books and say 'Oh, that particular fault has been occurring frequently' and rather than doing a… they can just go directly to the fault based on documentation.
Substitution: benefits and risks. Well, first, what is substitution? Once the fault is localized, the suspected component can be substituted and the circuit evaluated again to verify the component operation.
Well, let's pretend that this is a radio, and this one is bad; it doesn't work. And here we have another radio; this one is good, and the technician has it available. This bad radio has come in and he's supposed to repair it. Now typically, and this is probably a communication system for an aircraft or something, it's probably a high-end device. And most radios are going to have - we'll draw some lines here - and we're going to pretend that these are circuit cards, or modules, within this bad radio. Now, in the substitution method of troubleshooting, the technician would simply take circuit card one from the good radio and use it to replace circuit card one in the bad radio, then check whether the radio works. Typically you would replace several before you got to the faulty one. Now let's just pretend for the sake of discussion that he discovers circuit card number four was the faulty component. He would then go over to his supply system and order circuit card number four, put it into the radio, and the radio is fixed. This is the concept of troubleshooting by the method of substitution when you have a group of known good components.
Substitution can be used at nearly any level of system integration. In aviation, when a system faults, the entire system is often removed and replaced, and we're going to look at aviation as an example. We talked about that box in the previous example; let's pretend that in the aircraft the radio system has failed. The pilot can't talk to the ground, he's just landed, and he's got to take off soon. He can't wait for technicians to fix this thing, so what they would do, and this is major substitution, is take the entire black box out, take a known good black box and plug it in; the radio works and the pilot flies away. The bad box is now going to go down to the folks in maintenance, and maintenance is going to fix this thing.
So the technicians take the black box apart and remove and replace modules until the box is fixed; they would be doing the process that we looked at before, substituting modules until they've repaired the box. When repaired, the radio goes back to the supply system until needed by another aircraft. So in this case the box, now that it is fixed, goes into the supply system and waits until another radio fails; it is then plugged into an aircraft and away it goes. In this particular cycle the defective modules are then repaired by the maintenance personnel, and this would typically involve going down to the component level and repairing the faulty circuit card. There is no substitution at that level, but in the two upper levels of the maintenance cycle substitution is a very viable way of repairing systems.
Substitution is not always practical as the primary method of troubleshooting, as it requires a tremendous inventory of spare parts in complex systems. Now, in the aviation industry there's a lot of revenue and a need to get an aircraft going immediately, so typically the supply system will be able to support that. But in smaller systems that type of inventory wouldn't always be available, so in that case substitution will not always be a viable alternative.
Occasionally, and this is another problem with substitution, a system chassis will have a short that damages connected modules. As known good modules are plugged into the system, it destroys them as well. For example, most cars today have an onboard computer, and occasionally the onboard computer will fail. The technician notices this and says 'I'll just replace it', and he replaces it only to have the replacement go up in smoke as well. The problem probably is that there is something in the chassis of that vehicle that is shorting out the onboard computer, so the original problem wasn't the onboard computer; it was a short in the chassis connected to the onboard computer.
So now he's got two broken computers. Substitution doesn't always work. It requires experience and good judgment on the part of the technician to be able to determine when substitution is going to work and when it is not.
In section 1.5 we have addressed the elements of troubleshooting. We've looked at substitution; we have looked at observation (remember, your senses can tell you a lot when troubleshooting); we've looked at diagnostic software, used mainly in computers but also in the automotive industry, and lots of industries now use diagnostic software to troubleshoot electronic systems; and we've looked at signal tracing and block diagram thinking. So that concludes elements of system level troubleshooting, section 1.5.