by Priscilla Oppenheimer
When troubleshooting problems on a WAN-based internetwork, it's important to dissect the components of a WAN circuit and consider the functions of the different components. In the C-plane, signaling protocols make the connection between Data Terminal Equipment (DTE) and Data Circuit-Terminating Equipment (DCE). DTE devices are typically owned by your organization. The most common DTE is a router. A DCE is typically a switch inside a service provider's network.
A WAN circuit from a carrier enters a customer's building at a Point of Presence (POP), commonly known as a demarcation (demarc) point. Typically, the circuit then traverses building wiring to interface with the carrier (network) side of a CSU/DSU. A CSU/DSU adapts the signal provided by the carrier to the interface used by a DTE. A DTE router typically connects to the data interface of the CSU/DSU via a V.35 serial cable.
In Figure 3, you can see that there are many potential problem areas to consider. Cisco calls these troubleshooting targets.
Figure 3. Troubleshooting Targets in a WAN Circuit
DSUs versus CSUs
Most enterprise WANs use a single device, often called a CSU/DSU, to adapt the signal provided by the carrier to the interface expected by a DTE router. The CSU/DSU has a network port that connects to the carrier and a data port that connects to the router. In reality, however, the box that we call a CSU/DSU acts mostly like a DSU.
A DSU provides a standard interface to a terminal, which in a Cisco-routed environment is a router. The DSU handles such functions as signal regeneration, reformatting, and timing. The DSU is bit transparent, and presents a synchronous signal such as V.35 to the terminal.
From the DTE standpoint, a CSU is not bit transparent, and indeed may stuff ones to keep up pulse density. A CSU presents a four-wire isochronous signal (raw DS-x) to a DTE.
A CSU connects to the communication carrier and is used by customers who wish to use their own equipment to retime and regenerate the incoming signals. The customer must supply all of the transmit logic, receive logic, and timing recovery in order to use the CSU.
A CSU can help with maintenance by going into a loopback mode after it receives a controlled sequence of bipolar violations. A combined CSU/DSU can also be placed into loopback mode when troubleshooting. Loopback is discussed in more detail in the "Loopback Testing" section later in this Tutorial. This Tutorial assumes that you are using a combined CSU/DSU (that provides a DSU-like interface to the router) and that you do not have a separate CSU.
When troubleshooting WAN problems, if the WAN has never worked or has completely stopped working, your first troubleshooting steps should involve the Physical Layer, especially cables. The easiest place to start troubleshooting is the cable between the router and the CSU/DSU. You should try to keep a few of these cables on hand so that you can swap out a cable that you suspect is not working. These cables are normally available from the router or CSU/DSU vendor. As mentioned, the cable is usually required to conform to the V.35 standard, but check with your vendors in case this isn't true for your equipment.
If you prefer a more scientific method than the "swap till you drop" procedure, then you can insert a breakout box to check the signaling between the CSU/DSU and the router. A breakout box is a compact, handheld device that analyzes the signals carried on serial interfaces, usually displaying the results with green or red LEDs. The use of breakout boxes when troubleshooting serial problems was common in the past, and Cisco tests may still mention them. These days most network administrators don't troubleshoot at that level, however. Swapping works and doesn't usually result in dropping.
Another troubleshooting target is the cabling between the CSU/DSU and the telco jack at the demarc. For 56-Kbps service, T1, fractional T1, Frame Relay, and ISDN, the jack will be labeled either RJ-48S or RJ-48C. A straight-through four-pair cable conforming to the EIA/TIA 568A/B standard should be used between the telco and the CSU/DSU. Different cable pairs are used for different signals, but as long as all four cable pairs in the cable are wired according to the EIA/TIA 568A/B standard, the cable should function. Some cable vendors sell a shielded cable to protect the cable from electrical fields. The shield should be grounded at one end. For long distances, you can purchase larger shielded cables to run multiple T1s in one bundle.
The cabling can be tested with a cable tester. Cable testers (also sometimes called scanners) are useful tools for checking physical connectivity and reporting cable conditions such as Near-End Crosstalk (NEXT), attenuation, and noise. Some of the tools also have a Time-Domain Reflectometer (TDR) function, wire-mapping features, and traffic-monitoring capabilities. Some testers display higher-level information about network utilization and error rates. Some tools also allow for limited protocol testing, such as IP pinging.
Intermittent cabling problems are the hardest to troubleshoot. If your WAN circuit generally works but you observe frequent errors or router interface resets, you may want to work with a professional cable installer to check all cabling, patch panels, and jacks. Remember, your goal is to gather as much information as possible to prove that a problem doesn't exist with your equipment or cabling. Often, you need to do a lot of work to prove to a carrier that the problem is on their end and not yours.
After verifying cabling, you should isolate any problems with the CSU/DSU and make sure it can communicate with the DCE at the service provider. Most CSU/DSUs have extensive error reporting and loopback testing facilities. (Some routers have a built-in CSU/DSU. In this case, the error reporting may not be as extensive, but you can generally see a lot of diagnostic data by using the show controller command, as will be discussed.)
To receive data correctly, an electronic device must synchronize its clock with the frequency and phase of the clock used to transmit the data. Otherwise, the device will be unable to determine when a bit starts and ends. On LANs, encoding schemes such as Manchester Encoding embed clocking information in the data. On WANs, you must configure which component in the WAN circuit provides the clocking.
Network clock means that the carrier provides the clock. If the carrier provides clocking, then the CSU/DSUs on both ends of the circuit need to be configured to reflect this, normally with the clocking parameter set to "network." On a Cisco router, look for "Clock Source is Line Primary" in the output from the show controller command to verify that the clock source is derived from the network.
On short digital links, you may be required to provide the clock. A CSU/DSU on one end of the circuit (but not both) must be configured for internal clocking in this case. On some Cisco router platforms, you can use the transmit-clock-internal command to enable an internally generated clock on a serial interface. There may be other options for clocking also, depending on the hardware and topology of your WAN. You should check your CSU/DSU and Cisco IOS documentation for more details.
In lab environments, it's common to connect two Cisco routers back-to-back without a CSU/DSU and without an actual carrier providing clocking. In this case, one router must be configured to do clocking with the clockrate command. The routers should be connected via a special DTE/DCE crossover serial cable. The DCE end of the cable must be connected to the router that is providing the clock. If the cable is not clearly marked, use the show controllers command to determine which router is connected to the DCE end of the cable. (In fact, even if the cable is marked, verify which end is really DCE with the show controllers command. Sometimes cables are marked incorrectly.)
We all know that electronic devices operate on data that consists of ones and zeros. In most cases, however, we don't need to know how these ones and zeros are represented as changes in voltage levels on a transmission medium. Although it can help in a LAN environment to understand how problems at this level can result in frame errors (as discussed in the "Ethernet Troubleshooting" Tutorial at CertificationZone), the reality is that a LAN engineer cares little about the encoding of a digital signal on a transmission medium. Unfortunately, this isn't entirely true with WANs. On WAN circuits, encoding must be configured. This is often an area for troubleshooting because of misconfigurations.
Common encoding methods used on WANs in the United States are Alternate Mark Inversion (AMI) and Bipolar 8 Zero Substitution (B8ZS). AMI is typically used for voice, whereas B8ZS is used for data. In Europe, the High Density Binary 3 (HDB3) encoding method is common.
With AMI, B8ZS, and HDB3, the signal is transmitted using a bipolar, return-to-zero scheme. Bipolar means that each logical 1 bit is transmitted as a positive or negative pulse. After a 1 is transmitted, the line voltage always returns to zero. A logical 0 bit is transmitted as zero voltage on the line. In an AMI scheme, each pulse, or mark, is of opposite polarity from the previous pulse. Notice in Figure 4 that a logical 1 can be either a positive or negative pulse, but that polarity must alternate with each logical 1 bit. This ensures that direct current (DC) does not flow through the circuit.
Figure 4. Alternate Mark Inversion
A benefit of AMI encoding is the ability to detect line errors. If a problem on the line causes a pulse to be deleted or inserted, two consecutive pulses with the same polarity will result, which is a bipolar violation (BPV). A disadvantage of the AMI encoding is that a long sequence of zeros is indistinguishable from a loss of signal. For that reason, data lines often use more sophisticated encoding schemes that enforce a sufficient ones density. Both B8ZS (typically used on T1 lines) and HDB3 (typically used on E1 lines) substitute long strings of zeros with other patterns.
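The alternating-polarity rule and the way a line hit shows up as a BPV can be illustrated with a short Python sketch. It is only a model: each bit becomes one symbol (+1, 0, or -1) rather than a return-to-zero waveform, and the function names and the starting polarity are assumptions made for the example.

```python
def ami_encode(bits, last_polarity=-1):
    """Encode 0/1 bits as AMI line symbols: +1, 0, or -1."""
    symbols = []
    for bit in bits:
        if bit:
            last_polarity = -last_polarity  # each mark flips polarity
            symbols.append(last_polarity)
        else:
            symbols.append(0)  # a zero is sent as no pulse
    return symbols


def find_bpvs(symbols):
    """Return indexes of pulses that repeat the previous pulse's polarity."""
    violations = []
    previous = 0
    for index, symbol in enumerate(symbols):
        if symbol == 0:
            continue
        if symbol == previous:
            violations.append(index)
        previous = symbol
    return violations


line = ami_encode([1, 0, 1, 1, 0, 1])
print(line)             # [1, 0, -1, 1, 0, -1]
print(find_bpvs(line))  # []

# Simulate a line problem that deletes the pulse at index 2.
damaged = list(line)
damaged[2] = 0
print(find_bpvs(damaged))  # [3]
```

Deleting one pulse leaves two consecutive pulses of the same polarity, which is exactly what the receiver reports as a BPV.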
As illustrated in Figure 5, in the B8ZS technique, a sequence of eight consecutive zeros is replaced on the line by three zeros, an intentional BPV, a valid pulse, another zero, another BPV, and a valid pulse.
Figure 5. Bipolar 8-Zero Substitution
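As a rough sketch of the substitution, the following Python function encodes bits with AMI and replaces each run of eight zeros with the eight-symbol pattern 000VB0VB, where V is an intentional BPV (same polarity as the previous pulse) and B is a valid, alternating pulse. The function name and the starting polarity are assumptions for illustration; real equipment does this in hardware.

```python
def b8zs_encode(bits, last_polarity=-1):
    """Encode 0/1 bits as line symbols (+1, 0, -1) with B8ZS substitution."""
    symbols = []
    zero_run = 0
    for bit in bits:
        if bit:
            last_polarity = -last_polarity  # normal AMI mark
            symbols.append(last_polarity)
            zero_run = 0
            continue
        symbols.append(0)
        zero_run += 1
        if zero_run == 8:
            p = last_polarity
            # 000VB0VB: V repeats the previous polarity, B alternates.
            symbols[-8:] = [0, 0, 0, p, -p, 0, -p, p]
            # The last pulse sent has polarity p, so last_polarity is unchanged.
            zero_run = 0
    return symbols


print(b8zs_encode([1] + [0] * 8 + [1]))
# [1, 0, 0, 0, 1, -1, 0, -1, 1, -1]
```

Because the two violations have opposite polarity, the substituted pattern carries no net DC, and the pulse after the pattern still alternates correctly.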
In the HDB3 technique, a sequence of four consecutive zeros is replaced. If there has been an even number of pulses of either polarity since the last intentional BPV, the first bit is coded as a valid pulse. (Per the rules of bipolar AMI, a "valid pulse" is one that alternates in polarity from the previous pulse.) If there has been an odd number of pulses of either polarity since the last intentional BPV, the first bit is coded as a 0. The next two bits are coded as zeros. The fourth bit is coded as an intentional BPV.
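The even/odd rule is easier to follow in code. The sketch below is a simplified model, not a reference implementation: the function name and starting polarity are assumptions, and it assumes that the violation pulse itself is not counted when tallying pulses since the last intentional BPV.

```python
def hdb3_encode(bits, last_polarity=-1):
    """Encode 0/1 bits as line symbols (+1, 0, -1) with HDB3 substitution."""
    symbols = []
    zero_run = 0
    pulses_since_violation = 0
    for bit in bits:
        if bit:
            last_polarity = -last_polarity  # normal AMI mark
            symbols.append(last_polarity)
            pulses_since_violation += 1
            zero_run = 0
            continue
        symbols.append(0)
        zero_run += 1
        if zero_run == 4:
            if pulses_since_violation % 2 == 0:
                # Even count: the first bit becomes a valid (alternating) pulse.
                last_polarity = -last_polarity
                symbols[-4] = last_polarity
            # The fourth bit repeats the last polarity: an intentional BPV.
            symbols[-1] = last_polarity
            pulses_since_violation = 0
            zero_run = 0
    return symbols


print(hdb3_encode([1, 0, 0, 0, 0]))     # [1, 0, 0, 0, 1]
print(hdb3_encode([1, 1, 0, 0, 0, 0]))  # [1, -1, 1, 0, 0, 1]
```

Choosing between the two substitution forms this way makes successive intentional violations alternate in polarity, so the line stays free of DC.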
When troubleshooting a WAN circuit, especially if it has never worked, you should check the signal encoding configuration. The encoding must match the encoding used by your provider. If you are using a router that contains a CSU/DSU, then you should configure the router for the correct signal encoding by entering controller configuration mode and using the linecode command. If you have an external CSU/DSU, check the vendor documentation to see how to configure encoding.
Framing is another parameter that must be configured. Just as there are many options for encoding, there are multiple options for framing, including Superframe (also known as D4), Extended Superframe (ESF), and others. Framing is usually configured on the CSU/DSU. If you are using a router that contains a CSU/DSU, then you must configure the router with the correct framing by going into controller configuration mode and entering the framing command.
Physical transmission facilities, such as T1 in the United States and E1 in Europe, carry multiple channels. Framing specifies a method for a sender to group multiple channels into one circuit. Framing allows the recipient to detect the beginning and end of the data for each set of channels. A channel is also called a timeslot.
We won't cover all the various framing options, but to give you an idea of the concepts, we'll briefly describe DS1 framing. A single DS1 frame carries 24 DS0 channels. As illustrated in Figure 6, each frame is 193 bits long, made up of 8 bits from each of the 24 channels plus one framing bit. A DS1 frame repeats every 125 microseconds -- in other words 8,000 times per second. Thus, the bandwidth of a DS1 is 193 x 8,000 = 1,544,000 bits per second.
Figure 6. DS1 Framing
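The DS1 arithmetic can be checked with a few lines of Python. The constant names are ours, chosen for the example; the values come from the description above.

```python
CHANNELS_PER_FRAME = 24
BITS_PER_CHANNEL = 8
FRAMING_BITS_PER_FRAME = 1
FRAMES_PER_SECOND = 8000

bits_per_frame = CHANNELS_PER_FRAME * BITS_PER_CHANNEL + FRAMING_BITS_PER_FRAME
line_rate_bps = bits_per_frame * FRAMES_PER_SECOND
payload_rate_bps = CHANNELS_PER_FRAME * BITS_PER_CHANNEL * FRAMES_PER_SECOND
ds0_rate_bps = BITS_PER_CHANNEL * FRAMES_PER_SECOND
framing_rate_bps = FRAMING_BITS_PER_FRAME * FRAMES_PER_SECOND
frame_period_us = 1_000_000 / FRAMES_PER_SECOND

print(bits_per_frame)    # 193
print(line_rate_bps)     # 1544000
print(payload_rate_bps)  # 1536000
print(ds0_rate_bps)      # 64000
print(framing_rate_bps)  # 8000
print(frame_period_us)   # 125.0
```

The 8,000 bits per second taken by the framing bit account for the difference between the 1.544-Mbps line rate and the 1.536 Mbps available to the 24 channels.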
In addition to the basic definition of a single 193-bit frame, standards specify how frames can be grouped and how the 193rd bit should be interpreted. The D4 method groups 12 frames into a Superframe and specifies a bit pattern that must occur in the framing bits of the grouped frames. The recipient looks for a specific sequence of framing bits (100011011100) in order to synchronize with the sender. In this manner, the recipient can gain an understanding of where frames begin and end.
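To make the synchronization idea concrete, here is a minimal Python sketch of a receiver hunting for the Superframe pattern in a stream of framing bits (one bit per received frame). The function name and the requirement of two consecutive matching Superframes are assumptions for the example; real framers use their own criteria for declaring and losing sync.

```python
D4_PATTERN = [1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0]


def find_superframe_start(framing_bits, superframes_required=2):
    """Return the offset where the D4 pattern begins, or None if not found."""
    span = len(D4_PATTERN) * superframes_required
    expected = D4_PATTERN * superframes_required
    for offset in range(len(D4_PATTERN)):
        window = framing_bits[offset:offset + span]
        if len(window) < span:
            break  # not enough framing bits left to decide
        if window == expected:
            return offset
    return None


# The receiver starts listening 5 frames before a Superframe boundary.
stream = D4_PATTERN[7:] + D4_PATTERN * 3
print(find_superframe_start(stream))    # 5
print(find_superframe_start([0] * 30))  # None
```

A result of None corresponds to the situation in which a CSU cannot synchronize on the received framing pattern, for example because the wrong framing type is configured.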
Extended Superframe (ESF) is an enhancement to Superframe. By the time ESF came around, there was enough confidence in DS1 that just 6 bits were considered adequate for framing. ESF groups 24 frames and looks for a 6-bit pattern of 001011. The recipient expects to see this pattern occurring in the framing bits for frames 4, 8, 12, 16, 20, and 24. This allows the other bits to be used for different purposes.
Twenty-five percent of the framing bits (2000 bits per second) are used for CRC line error detection. The CRC detection channel (C in Table 4) uses the framing bits in frames 2, 6, 10, 14, 18, and 22. Half of the framing bits (4000 bits per second) provide a diagnostic data channel (D in Table 4) that can be used for line maintenance and repair purposes. The diagnostic channel uses the framing bits in all the odd-numbered frames.
Table 4. Framing Bits in ESF
|Frame||Channel 1||Channel 2||...||Channel 24||Framing Bit|
|Frame 1||DS0 1||DS0 2||...||DS0 24||D|
|Frame 2||DS0 1||DS0 2||...||DS0 24||C|
|Frame 3||DS0 1||DS0 2||...||DS0 24||D|
|Frame 4||DS0 1||DS0 2||...||DS0 24||0|
|Frame 5||DS0 1||DS0 2||...||DS0 24||D|
|Frame 6||DS0 1||DS0 2||...||DS0 24||C|
|Frame 7||DS0 1||DS0 2||...||DS0 24||D|
|Frame 8||DS0 1||DS0 2||...||DS0 24||0|
|Frame 9||DS0 1||DS0 2||...||DS0 24||D|
|Frame 10||DS0 1||DS0 2||...||DS0 24||C|
|Frame 11||DS0 1||DS0 2||...||DS0 24||D|
|Frame 12||DS0 1||DS0 2||...||DS0 24||1|
|Frame 13||DS0 1||DS0 2||...||DS0 24||D|
|Frame 14||DS0 1||DS0 2||...||DS0 24||C|
|Frame 15||DS0 1||DS0 2||...||DS0 24||D|
|Frame 16||DS0 1||DS0 2||...||DS0 24||0|
|Frame 17||DS0 1||DS0 2||...||DS0 24||D|
|Frame 18||DS0 1||DS0 2||...||DS0 24||C|
|Frame 19||DS0 1||DS0 2||...||DS0 24||D|
|Frame 20||DS0 1||DS0 2||...||DS0 24||1|
|Frame 21||DS0 1||DS0 2||...||DS0 24||D|
|Frame 22||DS0 1||DS0 2||...||DS0 24||C|
|Frame 23||DS0 1||DS0 2||...||DS0 24||D|
|Frame 24||DS0 1||DS0 2||...||DS0 24||1|
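The layout in Table 4 follows a simple rule that can be written as a small Python function. This is a sketch for study purposes; the function name is ours, and the 8,000 framing bits per second come from the DS1 frame rate described earlier.

```python
ESF_SYNC_PATTERN = [0, 0, 1, 0, 1, 1]
FRAMING_BITS_PER_SECOND = 8000
FRAMES_PER_ESF = 24


def esf_framing_bit_role(frame_number):
    """Return 'D', 'C', or the sync bit ('0' or '1') for frames 1 through 24."""
    if not 1 <= frame_number <= FRAMES_PER_ESF:
        raise ValueError("frame_number must be between 1 and 24")
    if frame_number % 2 == 1:
        return "D"  # diagnostic data channel: all odd-numbered frames
    if frame_number % 4 == 2:
        return "C"  # CRC channel: frames 2, 6, 10, 14, 18, 22
    return str(ESF_SYNC_PATTERN[frame_number // 4 - 1])  # frames 4, 8, ..., 24


roles = [esf_framing_bit_role(n) for n in range(1, FRAMES_PER_ESF + 1)]
print("".join(roles))  # DCD0DCD0DCD1DCD0DCD1DCD1


def channel_rate_bps(role):
    """Bits per second carried by framing bits with the given role."""
    return roles.count(role) * FRAMING_BITS_PER_SECOND // FRAMES_PER_ESF


print(channel_rate_bps("D"))  # 4000
print(channel_rate_bps("C"))  # 2000
```

The remaining six framing bits per Extended Superframe carry the 001011 synchronization pattern, which also works out to 2000 bits per second.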
There's one more issue when configuring or troubleshooting digital WAN circuits, and that's signaling. Signaling refers to the borrowing of data bits (or entire timeslots) to provide a channel for sending control information. Note that these borrowed bits are in addition to the framing bits that we already discussed. Signaling is necessary when a digital circuit carries voice traffic. The signaling channel, provided by the borrowed bits, can carry information necessary for telephony, such as off-hook, ringing, and so on.
In the T1 Superframe format, signaling is placed in the least significant bit position of every DS0 channel in the 6th and 12th frame of every Superframe. The signaling bit in the 6th frame is called the A bit; the signaling bit in the 12th frame is called the B bit. Together, this is called A/B signaling. With ESF, there are also C and D bits. The C bit is the least significant bit of every channel in the 18th frame, and the D bit is the least significant bit of every channel in the 24th frame. This is called A/B/C/D signaling. Both A/B and A/B/C/D signaling are also known as robbed bit signaling.
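The following Python sketch shows where the robbed bits sit. It assumes a hypothetical in-memory representation in which a Superframe is a list of frames and each frame is a list of 24 channel bytes (integers from 0 to 255), with the least significant bit of the integer standing for the least significant bit of the DS0 sample.

```python
def signaling_bit_name(frame_number, esf=False):
    """Return 'A', 'B', 'C', 'D', or None for a 1-based frame number."""
    names = {6: "A", 12: "B"}
    if esf:
        names.update({18: "C", 24: "D"})
    return names.get(frame_number)


def extract_signaling(superframe, channel, esf=False):
    """Collect the robbed signaling bits for one 1-based DS0 channel."""
    result = {}
    for frame_number, frame in enumerate(superframe, start=1):
        name = signaling_bit_name(frame_number, esf)
        if name is not None:
            result[name] = frame[channel - 1] & 1  # least significant bit
    return result


# A 12-frame Superframe of all-zero samples, with two samples changed.
superframe = [[0] * 24 for _ in range(12)]
superframe[5][0] = 0b10101011   # frame 6, channel 1: A bit is 1
superframe[11][0] = 0b10101010  # frame 12, channel 1: B bit is 0
print(extract_signaling(superframe, channel=1))  # {'A': 1, 'B': 0}
```

In every other frame, all eight bits of each channel carry user information, which is why the bits in frames 6 and 12 (and 18 and 24 with ESF) are described as robbed.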
In Europe, the E1 system is based on a frame structure of 32 timeslots, numbered 0 through 31. Timeslot 0 is used for framing. Timeslot 16 was originally designed to carry signaling information for the other 30 channels. The 64-Kbps bandwidth in timeslot 16 was allocated to the 30 information channels using a multiframe structure. Each channel was allocated a total of 2000 bits per second to carry four signaling bits. This is called Channel Associated Signaling (CAS).
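The E1 numbers can be worked through the same way as the DS1 numbers. In this Python sketch the constant names are ours, and the 8,000 frames per second is the same frame rate used by DS1.

```python
FRAMES_PER_SECOND = 8000
TIMESLOTS_PER_FRAME = 32
BITS_PER_TIMESLOT = 8
INFORMATION_CHANNELS = 30
SIGNALING_BITS_PER_CHANNEL = 4
SIGNALING_RATE_PER_CHANNEL_BPS = 2000

timeslot_rate_bps = BITS_PER_TIMESLOT * FRAMES_PER_SECOND
e1_rate_bps = TIMESLOTS_PER_FRAME * timeslot_rate_bps

# Four signaling bits at 2000 bps means 500 updates per second per channel.
updates_per_second = SIGNALING_RATE_PER_CHANNEL_BPS // SIGNALING_BITS_PER_CHANNEL
frames_per_multiframe = FRAMES_PER_SECOND // updates_per_second

signaling_total_bps = INFORMATION_CHANNELS * SIGNALING_RATE_PER_CHANNEL_BPS
timeslot16_remainder_bps = timeslot_rate_bps - signaling_total_bps

print(timeslot_rate_bps)         # 64000
print(e1_rate_bps)               # 2048000
print(frames_per_multiframe)     # 16
print(signaling_total_bps)       # 60000
print(timeslot16_remainder_bps)  # 4000
```

The arithmetic shows that each channel's four signaling bits repeat once every 16 frames, and that the 30 channels together use 60 Kbps of timeslot 16, leaving 4 Kbps for the multiframe structure itself.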
CAS wastes bandwidth because, for any given channel, the signaling bits are generally only necessary at the beginning of a call to set up the connection and at the end of the call to tear it down. Consequently, a newer signaling method was invented, called Common Channel Signaling (CCS). In the CCS format, a single timeslot (usually 16, but not necessarily) provides a reserved 64-Kbps transparent signaling channel that can be used to exchange signaling information of any type and in any format. Signaling information is sent for a particular channel only when necessary.
In recent years, the term CCS has come to mean any system that does not use a specific bit structure for signaling. Instead, all or part of a channel is used to transparently pass control and signaling messages between two devices. This type of system is commonly found in ISDN, where the D channel is used to pass signaling messages.
The T1 and E1 systems, and their complementary framing and signaling methods, were built with troubleshooting in mind. We gave you a glimpse of some of the bits and channels involved, but because this guide is written for WAN enterprise engineers (with a Cisco router and certification focus), we spared you the gory details. Nonetheless, it is a good idea to recognize some of the strange terminology that may spout from the mouths of service provider technicians. In the United States, three phrases that should make your ears perk up are Red Alarm, Blue Alarm, and Yellow Alarm. These three alarms are used to indicate different problems in the transmission or reception of data in a T1 system.
When Customer Premises Equipment (CPE), such as a CSU/DSU or a router with a built-in CSU/DSU, detects an incoming signal failure, it goes into Red Alarm. The CPE then transmits a Yellow Alarm to the upstream provider to notify the provider and the far end of the problem. Intermediate equipment, such as a repeater between the carrier's Central Office and CPE, can also notice a problem and transmit a Blue Alarm.
On a Cisco router, the show controller t1 command provides information on alarms and other errors. On a CSU/DSU, you can monitor LEDs to determine if alarms or other errors are occurring. Table 5 lists some typical errors that a CSU/DSU or router might report and what they mean.
Table 5. T1 Errors and Alarms
|Error or Alarm||Meaning||Possible Causes|
|Loss of Signal (LOS)||The CSU is not receiving a valid T1 signal.||Improperly wired cable from the demarc; carrier problem.|
|Out of Framing (OOF) or Red Alarm||The CSU cannot synchronize on the received T1 framing pattern.||Incorrect framing configured; excessive errors on the T1.|
|Alarm Indication Signal (AIS)||The CSU is receiving unframed all ones on the network interface.||Upstream equipment is in a test or Blue Alarm state.|
|Yellow Alarm||The received T1 signal contains the Yellow Alarm bit pattern.||Far-end equipment has a problem with its incoming signal, causing the far end to send the Yellow Alarm bit pattern.|
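As a study aid, the conditions in Table 5 can be expressed as a small decision function in Python. This is only a sketch: the function name and its boolean inputs are hypothetical, and real CSU/DSUs apply timing thresholds before declaring or clearing an alarm.

```python
# Possible causes, transcribed from Table 5.
POSSIBLE_CAUSES = {
    "LOS": ["Improperly wired cable from the demarc", "Carrier problem"],
    "OOF": ["Incorrect framing configured", "Excessive errors on the T1"],
    "AIS": ["Upstream equipment is in a test or Blue Alarm state"],
    "Yellow Alarm": ["Far-end equipment has a problem with its incoming signal"],
}


def classify_receive_condition(signal_present, framed,
                               all_ones=False, yellow_pattern=False):
    """Map what the CSU sees on its network interface to a Table 5 entry."""
    if not signal_present:
        return "LOS"
    if not framed:
        # Unframed all ones is AIS; anything else unframed is OOF.
        return "AIS" if all_ones else "OOF"
    if yellow_pattern:
        return "Yellow Alarm"
    return "OK"


condition = classify_receive_condition(signal_present=True, framed=False)
print(condition)                   # OOF
print(POSSIBLE_CAUSES[condition])  # ['Incorrect framing configured', 'Excessive errors on the T1']
```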
When a WAN circuit is down, or experiencing excessive errors and alarms, your service provider may ask you to do some loopback testing. In generic terms, loopback simply means that the sender and the receiver are the same device. Transmitted data is looped back to the sender. Successively including more pieces of a WAN circuit in the loop allows components of the circuit to be tested and the source of the problem to be determined. Loopback testing should begin on your premises with local CSU/DSU loopback testing and then proceed on to loopback testing involving the service provider.
Figure 7 shows the types of loopback testing that can be done. Local CSU/DSU loopback testing can be done at either end of the circuit. If local loopback testing shows that the routers, CSU/DSUs, and any cables connecting them are not faulty, then further loopback testing is done with the help of the provider.
Figure 7. Loopback Testing
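The logic of successively widening the loop can be sketched in a few lines of Python. The loop-point names, the descriptions of what each loop adds, and the test results below are hypothetical, chosen only to illustrate the procedure; in practice each result comes from pinging through the looped circuit and checking error counters.

```python
def isolate_fault(loop_points, loop_passes):
    """Return what the first failing loop added to the test path, or None."""
    for loop_name, newly_included in loop_points:
        if not loop_passes(loop_name):
            return newly_included
    return None


# Ordered from the smallest loop to the largest.
LOOP_POINTS = [
    ("local CSU/DSU loopback",
     "router, serial cable, or local CSU/DSU"),
    ("first provider switch loopback",
     "demarc cabling or the circuit to the first provider switch"),
    ("remote CSU/DSU line loopback",
     "provider network or remote CSU/DSU"),
]

observed = {
    "local CSU/DSU loopback": True,
    "first provider switch loopback": False,
    "remote CSU/DSU line loopback": False,
}
print(isolate_fault(LOOP_POINTS, observed.get))
# demarc cabling or the circuit to the first provider switch
```

Because each loop includes everything the previous loop covered, the first loop that fails points to the piece of the circuit that was newly added.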
Local Loopback Testing
Local loopback testing sends all data that your local equipment transmits back to that equipment, without sending the data into the provider's network. If the data successfully returns to the sender, then local equipment can be eliminated as a possible cause of problems with the WAN circuit. To implement local loopback testing, the WAN circuit is looped at the CSU/DSU interface that faces the provider's network. This is called the network interface on the CSU/DSU. If the router has a built-in CSU/DSU, the circuit is looped at the router.
Both hardware and software local loopback testing are possible. Hardware loopback testing is accomplished with a loopback plug. A T1 CSU/DSU uses an RJ-48C interface. A 56-Kbps CSU/DSU uses an RJ-48S interface, which has different pinouts. Both interfaces are compatible with RJ-45 plugs. To create a loopback plug for a T1 CSU/DSU, follow these steps:
1. Use wire cutters to remove the connector from one end of a working RJ-45 cable that is 5 inches long.
2. Strip the wires.
3. Twist the wires from pins 1 and 4 together. (Use 1 and 7 for a 56-Kbps CSU/DSU.)
4. Twist the wires from pins 2 and 5 together. (Use 2 and 8 for a 56-Kbps CSU/DSU.)
5. Replace the connector.
6. Insert the connector into the network interface on the CSU/DSU.
Software loopbacks are implemented with software configuration commands or with a loopback button on a CSU/DSU. For most Cisco router platforms, the command will be of the form loopback or loopback dte or loopback local. This will loop the circuit from inside the CSU/DSU back toward the router, thereby isolating that section of the circuit.
To run a loopback test on channelized T1s using Primary Rate Interface (PRI) or Channel Associated Signaling (CAS), you need to use the channel-group T1 controller command to create one or more serial interfaces mapped to a set of timeslots in the channelized T1. If the T1 is configured as a PRI, you need to remove the pri-group before using the channel-group command.
While in loopback mode, run some test data through the looped circuit. For example, try pinging the router's serial interface. First, make sure that the router's serial interface is set to use HDLC encapsulation and that it has an IP address configured. Next, clear the counters for the serial interface with the clear counters command. Then, use Cisco's extended ping facility to send numerous pings. The router should respond to its own pings. Examine the output from the show interface serial command to determine if input errors are increasing. If the pings succeed and input errors have not increased, the local hardware (DSU, cable, router interface card) is probably in good condition.
Loopback Testing with a Provider
If local CSU/DSU loopback testing on both ends of the circuit rules out problems with the CSU/DSUs, routers, and any cables connecting them, then the next step is to involve the service provider. The provider will need your assistance at this point. [This is not the same thing as the provider performing diagnostic or Bit Error Rate Test (BERT) testing on the line, which can be done without your assistance.]
In this type of loopback testing, the circuit will be looped from the provider's network back to your premises. (See the "Provider-assisted loopback" section of Figure 7.) Once the circuit is looped, you can then monitor the looped circuit from your router or CSU/DSU. On a router, you can ping the local serial interface, as you did in the local loopback testing discussed earlier.
To begin the testing, the provider should supply a loopback at the first switch that your circuit normally passes through, and loop the circuit back toward your router. In this way, you can remove the provider's internal network "cloud" from the equation and test only the piece of the circuit that covers the first provider switch down to and including your telco jack at the demarc, CSU/DSU, and router.
Once this testing is completed and you have verified that data can get through the looped circuit without errors, the same procedure should be completed on the other end of the circuit. [If the remote end is your Internet Service Provider (ISP), then you will need to involve the ISP to help test this portion of the circuit.]
If the testing proves that there is a problem in the circuit between the first provider switch and your equipment, the provider can help you test that portion of the circuit. Between the first provider switch and your telco jack at the demarc, the provider probably has several pieces of equipment that can be looped for additional diagnostic tests. Also, keep in mind that if you have a so-called extended demarc, this should be investigated as a potential problem area. When implemented incorrectly, extended demarcs can produce errors on the line. (Extended demarcs occur when the provider extends the original demarc point to a location closer to your equipment.)
If the looped circuits work on both ends, you have proof that the problem is within the provider's network. The provider can either investigate on its own at this point or continue the loopback testing with you, backing off one switch at a time further into the provider's network, doing a loopback toward your equipment.
At some point the provider will back up against your CSU/DSU at the other end of the circuit. At that point, you can configure that CSU/DSU to send all data that it receives from the provider's network back to the network. CSU/DSU documentation may call this a remote loopback or line loopback. (Many CSU/DSU vendors support numerous loopback tests that we haven't discussed. You should check your CSU/DSU documentation for more information.)
In the next few sections, we will get into more detail on troubleshooting from the viewpoint of a router WAN interface. In most cases, we will be concerned with standard serial interfaces. Some of the discussion applies to other types of WAN interfaces also, such as ISDN, High-Speed Serial Interface (HSSI), Asynchronous Transfer Mode (ATM), and packet OC3 interfaces.
One of the first steps when troubleshooting a WAN problem is to examine the output of the show interface command on a Cisco router. With leased lines, Frame Relay WANs, and ISDN PRI, use the show interface serial command. If the interface is an ISDN Basic Rate Interface (BRI), use the show interface bri command. On ATM networks, use the show interface atm command, and on packet OC3 interfaces, use the show interface pos command.
The show interface command displays the type of encapsulation in use on the interface, the number of interface resets that have occurred, the reliability and load of the interface, and other useful information, as shown in the following output from a router in Bend, Oregon, that just booted:
Bend#show interface s0
Serial0 is up, line protocol is up
  Hardware is MCI Serial
  Internet address is 192.168.40.1 255.255.255.0
  MTU 1500 bytes, BW 1544 Kbit, DLY 20000 usec, rely 255/255, load 1/255
  Encapsulation HDLC, loopback not set, keepalive set (10 sec)
  Last input 0:00:01, output 0:00:01, output hang never
  Last clearing of "show interface" counters never
  Output queue 0/40, 0 drops; input queue 0/75, 0 drops
  5 minute input rate 0 bits/sec, 0 packets/sec
  5 minute output rate 0 bits/sec, 1 packets/sec
     51 packets input, 3658 bytes, 0 no buffer
     Received 18 broadcasts, 0 runts, 0 giants
     3 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 3 abort
     51 packets output, 3439 bytes, 0 underruns
     0 output errors, 0 collisions, 4 interface resets, 0 restarts
     0 output buffer failures, 0 output buffers swapped out
     7 carrier transitions
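If you collect this output from many routers, a short script can pull out the fields that matter most for troubleshooting. The Python sketch below parses an excerpt of the output above with regular expressions. The function name and the choice of counters are ours, and output formats vary between platforms and IOS versions, so treat the patterns as assumptions to adjust.

```python
import re

SAMPLE = """Serial0 is up, line protocol is up
  Encapsulation HDLC, loopback not set, keepalive set (10 sec)
     3 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 3 abort
     0 output errors, 0 collisions, 4 interface resets, 0 restarts
     7 carrier transitions"""

STATUS_RE = re.compile(
    r"(\S+) is (administratively down|up|down), "
    r"line protocol is (up|down)(?: \((looped)\))?"
)
COUNTER_RE = re.compile(
    r"(\d+) (input errors|CRC|frame|overrun|ignored|abort|"
    r"output errors|interface resets|carrier transitions)"
)


def parse_show_interface(text):
    """Extract the first-line status and selected error counters."""
    status = STATUS_RE.search(text)
    if status is None:
        raise ValueError("no interface status line found")
    counters = {name: int(value) for value, name in COUNTER_RE.findall(text)}
    return {
        "interface": status.group(1),
        "line": status.group(2),
        "protocol": status.group(3),
        "looped": status.group(4) is not None,
        "counters": counters,
    }


result = parse_show_interface(SAMPLE)
print(result["interface"], result["line"], result["protocol"])  # Serial0 up up
print(result["counters"]["input errors"])                       # 3
print(result["counters"]["carrier transitions"])                # 7
```

Comparing the counters from two runs taken a few minutes apart shows whether input errors, interface resets, or carrier transitions are still increasing.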
There Are No Collisions on Serial Interfaces!
One of the statistics displayed in the output of the show interface serial command causes confusion. That's the collision count. Just ignore it! Cisco programmers used a template for the output that is based on the output from the show interface ethernet command. There are no collisions on a serial interface, regardless of the encapsulation or technology. Collisions occur only on shared-media networks where stations contend for access, including Ethernet, 802.3, LocalTalk, Aloha, and 802.11 networks.
Inspecting the first line of output from the show interface serial command is an important step when troubleshooting. When a serial interface is enabled (not administratively shut down), but is still not working, there are three possible messages in the first line of output:
SerialX is down, line protocol is down (down/down)
SerialX is up, line protocol is down (up/down)
SerialX is up, line protocol is up (looped)
Serial Interface Is Down/Down
When a serial interface is down/down, there is probably a Physical Layer problem. The meaning of this message is that the interface is not detecting a Carrier Detect (CD) signal. The most likely cause is a disconnected, faulty, or improperly constructed cable between the router interface and the CSU/DSU. The CD signal is carried on the cable between the CSU/DSU and the router. For most implementations, this is a V.35 cable. Check the LEDs on the CSU/DSU to see whether the CD signal is active. Also swap out the cable with a cable that is known to work. If necessary, insert a breakout box to check the signaling between the CSU/DSU and the router.
Your next troubleshooting step should be to make sure there are no faulty hardware components, including the router serial interface card and the CSU/DSU. If you suspect faulty router hardware, move the WAN circuit to another serial interface. If the connection comes up, the previously connected interface has a problem.
It is also possible that the carrier's line is not connected correctly to the demarc or that your cabling to the CSU/DSU from the demarc is faulty. You can use a cable tester to verify your cabling from the demarc. If local hardware and cabling check out, then contact your service provider. The problem may be with the provider's equipment or service. The line may be noisy or there may be a misconfigured or failed switch in the provider's network.
Cisco documentation says that a down/down interface means the router interface cannot determine that CD has been asserted. As mentioned earlier, a CSU/DSU acts more like a DSU than a CSU and presents a synchronous signal to the router. The CSU/DSU must assert Data Carrier Detect (DCD or CD), Data Set Ready (DSR), and Clear to Send (CTS). The router, which is playing the DTE role, must assert Data Terminal Ready (DTR) and Request to Send (RTS). For all these control leads to be asserted, and for the router to recognize that they are asserted, the router and CSU/DSU hardware must be operational, and the cabling must be properly constructed. However, even if all hardware checks out, there are still cases when the interface may be down/down or flapping between up/up, up/down, and down/down. For example, some CSU/DSUs won't assert CD if the link to the carrier is having problems. In these cases, check the CSU/DSU configuration. Make sure the right clocking, framing, and encoding are configured. If they are correctly configured, then call the service provider and work with the provider to isolate the problem.
Serial Interface Is Up/Down
When the interface is up/down, the router is reporting that it is unable to send and receive Data Link Layer keepalive frames. Possible causes for the interface being up/down are a misconfiguration on one of the routers, a failed local or remote CSU/DSU, or a problem with the carrier's network. A router could be misconfigured with the wrong encapsulation that doesn't match the router at the other end, for example. In the case of Frame Relay, the router could be using the wrong Local Management Interface (LMI) to send keepalives to the carrier's local switch, as will be discussed in the "Frame Relay" section.
To isolate the problem, try putting the CSU/DSU in local loopback mode. Use the show interface serial command to determine if the line protocol comes up (looped). If the line protocol comes up, the problem is probably a carrier problem or a failed remote CSU/DSU or router. If the problem appears to be on the remote end, put the remote CSU/DSU in local loopback mode and see if the line protocol comes up on that end. If the problem appears to be in the middle somewhere, you can put the remote CSU/DSU into remote (line) loopback mode and test from the local end. You can also work with your provider to do additional loopback testing.
While testing in loopback mode, enable the debug serial interface command and monitor the keepalive counter. The keepalive counter should increment. Specifically, the values for mineseen and yourseen should increment every 10 seconds. (See the section "Troubleshooting Cisco's HDLC Implementation" later in this Tutorial for more information on debugging keepalive frames.)
If the line protocol does not come up in local loopback mode, and if the output of the debug serial interface command shows that the keepalive counter is not incrementing, a router hardware problem is possible. If you suspect faulty router hardware, change the serial line to an unused interface. If the connection comes up, the previously connected interface has a problem.
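The local loopback procedure above amounts to a short decision process. Here is a minimal Python sketch of it, assuming you have already noted whether the line protocol came up and have written down a few successive mineseen and yourseen values by hand. The function names and the input format are hypothetical, and the "inconclusive" branch is an addition for a case the text does not cover.

```python
def keepalives_incrementing(samples):
    """Return True if both counters increase from each sample to the next.

    `samples` is a list of (mineseen, yourseen) pairs noted at successive
    keepalive intervals (hypothetical input format).
    """
    if len(samples) < 2:
        return False  # not enough data to see any movement
    return all(
        later[0] > earlier[0] and later[1] > earlier[1]
        for earlier, later in zip(samples, samples[1:])
    )


def next_step(line_protocol_up, counters_incrementing):
    """Suggest the next action after a local loopback test."""
    if line_protocol_up:
        return ("Suspect the carrier or the remote CSU/DSU or router; "
                "repeat the local loopback test at the remote end.")
    if not counters_incrementing:
        return ("Suspect router hardware; move the serial line to an "
                "unused interface and retest.")
    # Not covered by the text: protocol down although counters move.
    return "Inconclusive; recheck the configuration and repeat the test."


samples = [(101, 101), (102, 102), (103, 103)]
print(next_step(False, keepalives_incrementing(samples)))
```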
Serial Interface Is Up/Up (Looped)
When a serial interface is up/up (looped), a loop exists in the circuit. When an interface detects a loop, the sequence number in the keepalive frame immediately changes to a random number, which then increments from that starting point. If the same random number is returned over the link, the interface status will change to up/up (looped).
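The idea behind this check can be shown with a toy model: send a random sequence number and see whether the very same number comes back. The Python sketch below is a simplified illustration, not Cisco's implementation. The 32-bit range, the function name probe_for_loop, and the modeling of a link as a callable are all assumptions.

```python
import random

SEQ_SPACE = 2 ** 32  # assumption: 32-bit keepalive sequence numbers


def probe_for_loop(link, probe=None):
    """Send one random sequence number and report whether it is echoed.

    `link` is a callable that takes the sequence number we sent and
    returns the sequence number seen in the next received keepalive.
    """
    if probe is None:
        probe = random.randrange(SEQ_SPACE)
    return link(probe) == probe


def looped_link(sent_seq):
    # A loop returns our own keepalive, so we see our own number.
    return sent_seq


def remote_router_link(sent_seq):
    # A real peer sends its own, unrelated sequence number.
    return 12345


print(probe_for_loop(looped_link))
print(probe_for_loop(remote_router_link, probe=99))
```

Because the probe value is random, a remote router is very unlikely to send the same number by coincidence, which is why an echoed value is treated as evidence of a loop.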
A looped interface may be intentional if you are still testing. Once you have completed your testing, however, you should not see this result. If an interface claims to be looped and you don't think it should be, use the show running-config command to look for any loopback interface configuration commands. Remove them with the no loopback command. Also, examine the CSU/DSU to determine whether it is configured in loopback mode. If it is, disable loopback mode. Reset the CSU/DSU, and inspect the line status. If the line protocol comes up, no other action is needed. If the CSU/DSUs and routers are not configured in loopback mode, but the interface insists on claiming the link is looped, contact the service provider for assistance. The carrier's equipment may be in loopback mode.
BRI Interface Is Up/Up (Spoofing)
When using Dial-on-Demand Routing (DDR), the show interface bri command may show that the line protocol is spoofing. Spoofing tricks DDR into thinking that the interface is up/up so that entries in a routing table can point to this interface. This enables DDR to wake up and trigger a call to the ISDN network when user traffic requires the connection. Spoofing does not necessarily mean that the interface is really up. To make sure that the ISDN B channels are actually up, use the show interface bri number 1 2 command for the interface selected by the number parameter. (The 1 and 2 refer to the two B channels. If you want to check just one B channel, type just 1 or 2.)
The show interface serial command displays various input errors. Input errors are caused by noise, bad connections, improper grounding, bad cables, and cables that are too long. If you suspect that noise is a problem, you may need to shield your cables better. Make sure that one and only one end of the shield is grounded. Although there are scenarios where it is appropriate to ground both ends, this gets into the province of expert bonding, shielding, and grounding engineers and is beyond the scope of this Tutorial.
If you suspect that the errors are coming from outside your premises, ask your provider to perform Bit Error Rate Tests (BERTs) on the circuit. Input errors can also be caused by misconfigurations. Make sure that clocking, encoding, framing, and signaling are correctly configured to match the provider's equipment. To isolate the source of problems, use the same processes mentioned for troubleshooting a serial interface that is down/down or up/down (swapping cables and interfaces, loopback testing, extended pings, and so on). Table 6 describes the input errors.
Table 6. Serial Interface Input Errors
|Input Error||Description|
|CRC||The CRC calculation for a received frame doesn't match the CRC in the transmitted frame due to a noisy serial line, unshielded cabling, or cabling that is badly constructed or too long.|
|Frame||The frame does not end on an 8-bit boundary due to a noisy serial line, unshielded cabling, or cabling that is badly constructed or too long.|
|Overrun||The serial receiver hardware was incapable of handing received data to a hardware buffer because the input rate exceeded the receiver's capability to handle the data.|
|Ignored||The interface ignored the frame because the interface hardware ran low on internal buffers. Broadcast storms and bursts of noise can cause the ignored count to be increased.|
|Abort||An abort error occurs when a frame terminates in the middle due to an interface reset or hardware problem. Cisco documentation also claims that aborts occur on serial interfaces when an illegal sequence of 1 bits (more than seven) is received, which could indicate a clocking or encoding problem.|
Troubleshooters often wonder at what point they should be concerned with CRC and Frame errors. Some documents from Cisco and other vendors specify a threshold of one bad frame per megabyte of data. In other words, an interface should not experience more than one CRC or Frame error per megabyte of data received. This method is better than simply calculating a percentage of bad frames compared to good frames, which does not account for the variable size of frames. If you have a constant flow of 64-byte frames, for example, and a percentage of them are getting damaged, that probably represents a more serious bit error rate problem than the same percentage of 1500-byte frames getting damaged. So it's better to use a total number of bytes rather than a total number of frames in the calculation.
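To make the per-megabyte rule concrete, the following Python sketch computes CRC and Frame errors per megabyte received and compares the result with the threshold of one. The function names are made up for this example, and a megabyte is assumed to be 1,000,000 bytes. You would take the counter values from the show interface serial output by hand.

```python
BYTES_PER_MEGABYTE = 1_000_000  # assumption: decimal megabyte


def errors_per_megabyte(crc_errors, frame_errors, bytes_received):
    """Return CRC plus Frame errors per megabyte of data received."""
    if bytes_received <= 0:
        raise ValueError("bytes_received must be positive")
    return (crc_errors + frame_errors) * BYTES_PER_MEGABYTE / bytes_received


def exceeds_threshold(crc_errors, frame_errors, bytes_received, threshold=1.0):
    """Return True if the error rate is above `threshold` errors per MB."""
    return errors_per_megabyte(crc_errors, frame_errors, bytes_received) > threshold


# Same percentage of bad frames (1,000 out of 1,000,000), different sizes.
small = errors_per_megabyte(1000, 0, 1_000_000 * 64)    # 64-byte frames
large = errors_per_megabyte(1000, 0, 1_000_000 * 1500)  # 1500-byte frames
print(small, exceeds_threshold(1000, 0, 1_000_000 * 64))
print(round(large, 3), exceeds_threshold(1000, 0, 1_000_000 * 1500))
```

With 0.1% of frames damaged in both cases, the 64-byte traffic works out to about 15.6 errors per megabyte, while the 1500-byte traffic stays under one error per megabyte. The byte-based measure separates two cases that a frame percentage would treat as identical.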
When troubleshooting at the Data Link Layer, which deals with frames rather than bits, you can't actually determine a bit error rate, but you can at least get a rough estimate by considering the number of CRC or Frame errors compared to the number of megabytes received. The megabyte of data threshold comes from industry cabling standards that state that copper cables should not have a bit error rate that exceeds 1 in 10^6. (Fiber-optic cables should not have a bit error rate that exceeds 1 in 10^11.)
A lot of Cisco documentation simply states that a problem exists if input errors are in excess of 1% of total interface traffic. This is easier to remember, but it's actually just as hard to comprehend. The documents don't specify whether you should compare the input errors to the number of frames or the number of bytes received. If they mean frames, then we have the problem already mentioned (no accounting for variable frame sizes). If they mean bytes, then 1% is very high. On a loaded network, 1% of total bytes represents a very high bit error rate. When troubleshooting your own networks, you may want to use a number less than 1%, even if you do want to remember 1% to pass Cisco tests.
When troubleshooting input errors, you should also consider a timeframe and whether there's been a burst of errors and how long the burst has lasted. The telco practice is to report total errors along with errored seconds, for example.
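As a small illustration of reporting errors over time, the Python sketch below takes a list of per-second error counts and returns the total errors, the number of errored seconds (seconds with at least one error), and the length of the longest burst. The input format and the function name summarize_errors are assumptions made for this example. A router does not provide per-second counts directly, so you would have to build them by polling the counters.

```python
def summarize_errors(errors_per_second):
    """Return (total_errors, errored_seconds, longest_burst_seconds).

    `errors_per_second` holds one error count for each one-second interval.
    """
    total = sum(errors_per_second)
    errored_seconds = sum(1 for count in errors_per_second if count > 0)

    longest = current = 0
    for count in errors_per_second:
        # Extend the current run of errored seconds, or reset it.
        current = current + 1 if count > 0 else 0
        longest = max(longest, current)

    return total, errored_seconds, longest


print(summarize_errors([0, 0, 5, 7, 3, 0, 0, 1, 0, 0]))  # (16, 4, 3)
```

In this example, 15 of the 16 errors arrive in a single three-second burst, which suggests a different kind of problem than 16 errors spread evenly over the interval.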
WANs and Retransmissions
The consequence of an input error is that the entire Data Link Layer frame is dropped, and in many cases must be retransmitted. (In some applications, dropped frames don't matter, but for the purposes of this discussion, assume that the application requires reliable delivery.) The question of who retransmits if there are errors on a WAN link is one that CCIE candidates debate endlessly. We will summarize the answer in this section.
Most WAN (and LAN) protocols have a Frame-Check Sequence (FCS) field in the Data Link Layer frame that is used for error checking. The sender performs a Cyclic Redundancy Check (CRC) on the bits in the frame and places the result in the FCS field. The recipient executes the same calculation on the received bits. If the calculation doesn't yield the same FCS that is in the frame, because of changed or dropped bits, most WAN (and LAN) protocols simply drop the frame silently, with no notification to the sender that there was a problem.
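The FCS procedure just described can be sketched in a few lines of Python. This example uses the CRC-32 function from Python's zlib module as a stand-in, together with a made-up frame layout in which a 4-byte FCS trails the payload. Real WAN protocols define their own polynomial, FCS length, and bit ordering, so treat this as an illustration of the principle only.

```python
import zlib

FCS_LENGTH = 4  # assumption: 4-byte FCS appended to the payload


def add_fcs(payload: bytes) -> bytes:
    """Sender side: compute the CRC and place it in the FCS field."""
    fcs = zlib.crc32(payload)
    return payload + fcs.to_bytes(FCS_LENGTH, "big")


def receive(frame: bytes):
    """Receiver side: recompute the CRC and drop the frame on a mismatch.

    Returns the payload, or None if the frame is silently dropped.
    """
    if len(frame) < FCS_LENGTH:
        return None
    payload, fcs = frame[:-FCS_LENGTH], frame[-FCS_LENGTH:]
    if zlib.crc32(payload).to_bytes(FCS_LENGTH, "big") != fcs:
        return None  # bad FCS: no notification is sent to the sender
    return payload


frame = add_fcs(b"hello")
damaged = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip one bit in transit
print(receive(frame))    # b'hello'
print(receive(damaged))  # None
```

Note that receive returns nothing to the sender when it drops a frame. Any recovery has to come from a higher layer or from a Data Link Layer protocol that adds acknowledgments.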
A router interface connected to a WAN circuit checks incoming frames for a bad FCS and drops a frame if the FCS is bad. The router increments the CRC or Frame counter, but does no error correction. The router at the other end of the circuit (that sent the frame) does not know that the frame was dropped. An end system device must recognize that its packet never got acknowledged and retransmit if necessary. Reliability is an end-to-end service offered by protocols such as TCP.
The question about retransmissions on WANs has some history, however. In the past, many WAN protocols provided reliability at the Data Link Layer. Data Link retransmission is an entirely appropriate technology on slow, long-delay, or high-error paths, even today, so you may still encounter WAN protocols that offer reliability with retransmissions.
A router interface running one of these protocols expects a Data Link Layer acknowledgment and retransmits if one is not received. These protocols may be used in conjunction with TCP or other protocols that also offer reliability. Table 7 lists some WAN protocols and whether they provide a reliable service for end user traffic. (Note that the table refers to end user traffic. In some cases, signaling traffic in the C-plane, used for purposes such as call setup and sending telephone numbers, is sent in a reliable fashion, even though end user traffic is not.)
Table 7. WAN Technologies and Reliability for End User Traffic
|Technology||Provides Reliability? (ACKs and Retransmissions)|
|Asynchronous Transfer Mode (ATM)||No|
|Binary Synchronous Communication Protocol (BISYNC)||Yes|
|Cisco's High-Level Data Link Control (HDLC)||No|
|Link Access Procedure, Balanced (LAPB)||Yes|
|Point-to-Point Protocol (PPP)||No|
|Synchronous Data Link Control (SDLC)||Yes|
|X.25||Yes (uses LAPB)|
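Table 7 can also be expressed as a simple lookup that answers the question of who retransmits. The Python sketch below copies the table's values. The dictionary keys and the function name who_retransmits are chosen for this example only.

```python
# Values taken from Table 7 (reliability for end user traffic).
PROVIDES_RELIABILITY = {
    "ATM": False,
    "BISYNC": True,
    "Cisco HDLC": False,
    "LAPB": True,
    "PPP": False,
    "SDLC": True,
    "X.25": True,  # uses LAPB
}


def who_retransmits(technology):
    """Return which layer recovers from a frame dropped on the WAN link."""
    if PROVIDES_RELIABILITY[technology]:
        return "Data Link Layer (end systems may also retransmit)"
    return "end systems (for example, TCP)"


for name in ("PPP", "X.25"):
    print(name, "->", who_retransmits(name))
```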