Encoding Methods

Internetworking Basics

Chapter 7: Ethernet Technologies

These notes guide you through the encoding methods used on Ethernet.

Introduction

Overview
       Simple NRZ Encoding
       Clocking
       Bit Rate vs Frequency

Encoding on 10BASE-X Legacy Ethernet

       Manchester Encoding

Encoding on 100Mbps Fast Ethernet

       4B5B Encoding
       MLT-3 Encoding
       NRZ-I Encoding

Encoding on 1000Mbps Gigabit Ethernet

       8B1Q4 and 4D-PAM5 Encoding on 1000BASE-T
       8B10B and NRZ Encoding on 1000BASE-X

Encoding on 10Gbps Ethernet

Summary

Introduction

On completion of these notes you should...

Understand the encoding methods used for each different Ethernet technology.
Be aware that clocking information needs to be embedded in a signal for asynchronous data transfer.
Understand that both clocking requirements and the frequency limitations of different copper and fiber cables influence the choice of encoding method used.

Overview

Ethernet has evolved from the original 10Mbps copper-based technology to 100Mbps, then 1000Mbps, and on to the 10Gbps fiber-based technology that is available today. For each technology, the transmission of data across the medium is no simple matter. To transport digital bits of data across any medium, several issues need to be considered: the bit transmission speed, the frequency of the carrier wave, the signal-to-noise ratio, DC balance and clocking.

For each Ethernet technology, particular encoding techniques were developed such that data could be transmitted across the medium with the greatest efficiency. This document briefly describes the encoding techniques used.

~~ Simple NRZ Encoding ~~

NRZ encoding is one of the simplest encoding methods you can use. The method uses 0 volts for a logic '0' data bit and a high voltage for a logic '1' data bit. Various examples are shown in the diagram below:-

There is a problem with the NRZ encoding method, though. When data is transmitted on Ethernet, there could be long runs of 1's or 0's in sequence. When there are long runs of identical bits, there are no voltage level transitions, making it difficult for the receiving station to know whether it is clocking the signal correctly and so deducing the correct number of bits. The first example in the diagram above shows this lack of voltage level transitions with long runs of 1's and 0's in sequence.

Since it can be difficult for a receiving station to know if it is correctly detecting the boundaries of each bit with the NRZ method, other encoding methods are more often used on Ethernet LANs.
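The problem can be sketched in a few lines of Python. This is only an illustration: the function names and the "+V"/"0V" labels are made up for this example. It encodes two bit patterns with NRZ and counts how many level changes the receiver gets to see.

```python
def nrz_encode(bits, high="+V", low="0V"):
    """Map each bit to a level: '1' -> high voltage, '0' -> 0 volts."""
    return [high if b == "1" else low for b in bits]


def count_transitions(levels):
    """Count how often the level changes between adjacent bit periods."""
    return sum(1 for a, b in zip(levels, levels[1:]) if a != b)


alternating = nrz_encode("10101010")
long_runs = nrz_encode("11110000")

print(count_transitions(alternating))  # 7
print(count_transitions(long_runs))    # 1
```

Both patterns carry eight bits, but the second gives the receiver only one transition to check its timing against.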

~~ Clocking ~~

When bits are sent from a sender to a receiver across a network, the two stations need a mechanism to synchronize their clocks so that they both know when a bit starts and when it stops. This means that clocking information needs to be sent as well as the data, so that the receiving station can synchronize its own clock with the sender's clock.

Synchronous connections use a separate clocking line to transmit clocking information. Look at the diagram below, where a '0' data bit is represented by 0V and a '1' data bit is represented by a high voltage. It is easy to deduce that the first 3 bits are 1 0 1. Thereafter, all we can tell is that there is a sequence of 0's. Can you tell how many?

It's difficult to tell, isn't it! However, if a clocking signal is used along with the data signal, it is much easier to deduce where one bit starts and ends. The diagram below illustrates this. A separate clock signal is shown, where a series of pulses is sent at regular intervals. The clock signal can be used to figure out where the bit boundaries are. So, by referring to the clock pulses, we can deduce that after the first three bits in the data signal, there are eight '0' data bits in sequence.

However, Ethernet is asynchronous and does not have a separate line that could be used to send clocking information. When a frame is transmitted, the series of 1's and 0's making up the preamble is used to initially synchronize the sender and receiver clocks. But an Ethernet frame can carry up to 1500 bytes of data after this initial preamble, and the clocks could easily drift out of sync while the data is being transmitted.
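A rough calculation shows why the preamble alone is not enough. The sketch below assumes the two clocks differ by 200 parts per million; that figure is an assumption chosen for illustration, not a value taken from these notes, and the function name is hypothetical.

```python
def drift_in_bit_times(payload_bytes, ppm_mismatch):
    """Timing error, in bit periods, built up while payload_bytes are sent."""
    bits = payload_bytes * 8
    return bits * ppm_mismatch / 1_000_000


# 1500 bytes of data, clocks assumed to differ by 200 parts per million
print(drift_in_bit_times(1500, 200))  # 2.4
```

Under that assumption the receiver would be more than two whole bits out of step by the end of a full-size frame if it never resynchronized.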

Since Ethernet does not have a separate clocking line that could be used to synchronize the clocks continuously, another method is needed for maintaining clock synchronization. In fact, clock synchronization is maintained through signal encoding. Different Ethernet technologies (10BASE-X, 100BASE-TX etc.) use different encoding methods, designed such that clock synchronization can be maintained. Of course, the encoding method is not chosen just for its clocking properties. It also needs to exhibit other desirable properties, such as an acceptable maximum signal frequency, data rate and signal-to-noise ratio.

We will see later how encoding schemes can send clocking information within the data signal itself, without the need to send additional clock pulses.

~~ Bit Rate vs Frequency ~~

The rate at which data is transmitted is measured in bits per second. So 100Mbps means that 100 million bits are transmitted every second. Each Ethernet technology is limited in the number of bits that can be transmitted every second. Legacy Ethernet can transfer data at a maximum rate of 10Mbps, Fast Ethernet is limited to 100Mbps, Gigabit Ethernet is limited to 1000Mbps and 10 Gigabit Ethernet to 10Gbps.

This is due to the limitations in the frequencies a transmission medium can tolerate before noise renders a signal unreadable. For example, Category 3 UTP cable - used on 10BASE-T implementations - is rated at 16 MHz, while Category 5 UTP cable - used on 100BASE-TX implementations - is rated at 100 MHz.

The frequency of a signal is related to its cycle. Look at the sine wave below:-

The frequency of a signal is measured in Hertz (Hz). So 100MHz means 100 million cycles are completed every second.

One would think that the rate at which bits are transmitted over a medium is the same as the carrier frequency; i.e. if data is transmitted at 100Mbps, does this mean the frequency is 100MHz? The data rate and the signal frequency are actually rarely the same on Ethernet.

The relationship between Mbps and MHz for a network cabling system depends on the signal encoding used for the data as well as the desired data rate. The diagram below illustrates this. Two different methods of encoding data are shown. In both cases, the bit rate is the same. However, the first encoding method produces a signal frequency twice that of the second encoding method.

On 10BASE-T Ethernet systems the encoding method used is such that the data rate (10Mbps) and the maximum signal frequency (10MHz) are actually the same number. The Fast Ethernet standard 100BASE-TX uses a different signal encoding scheme which enables it to transmit 100Mbps with a maximum signal frequency of 31.25 MHz.

To sum up, for each Ethernet technology, an encoding method was chosen such that it was suitable in terms of data rate, frequency, signal-to-noise ratio and clocking.

Encoding on 10BASE-X Legacy Ethernet

Legacy Ethernet transmits data at a bit rate of 10Mbps. The connection is asynchronous which means that the clock rate needs to be deduced from the data signal. The simple NRZ encoding method cannot be used because of the difficulty of determining clocking information if a long sequence of '1's or '0's are transmitted.

So, for 10BASE-5, 10BASE-2 and 10BASE-T, another encoding method called Manchester encoding is used because it provides a more reliable clocking signal. On 10BASE-5 Ethernet, the Manchester encoded signal is sent over thick coaxial cable. On 10BASE-2 Ethernet, the encoded signal is sent over thin coaxial cable. On 10BASE-T Ethernet, the encoded signal can be sent over Category 3 UTP cable,although any new installations implementing 10BASE-T today would probably use Category 5 or 5e.

~~ Manchester Encoding ~~

Manchester encoding is an encoding method commonly used on Legacy Ethernet networks.

There are two rules to follow using this encoding method...

To send a logic '0' data bit, decrease the voltage from +V to 0 in the middle of the bit period.
To send a logic '1' data bit, increase the voltage from 0 to +V in the middle of the bit period.

The diagram below illustrates this:-

We can see that a high-to-low transition represents a logic '0' data bit and a low-to-high transition represents a logic '1' data bit.
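As a minimal sketch, the encoder below splits every bit period into two halves and uses the convention just stated: high-to-low for '0' and low-to-high for '1'. The function names and the use of 1 for +V and 0 for 0V are choices made for this example only.

```python
# Each data bit becomes two half-bit levels (1 = +V, 0 = 0V).
HALVES = {"0": (1, 0),   # high-to-low transition in mid-bit
          "1": (0, 1)}   # low-to-high transition in mid-bit


def manchester_encode(bits):
    """Return the list of half-bit levels for a string of data bits."""
    out = []
    for b in bits:
        out.extend(HALVES[b])
    return out


def manchester_decode(levels):
    """Recover the data bits by looking at each pair of half-bit levels."""
    bits = []
    for first, second in zip(levels[0::2], levels[1::2]):
        if first == second:
            raise ValueError("missing mid-bit transition")
        bits.append("1" if (first, second) == (0, 1) else "0")
    return "".join(bits)


print(manchester_encode("1011"))  # [0, 1, 1, 0, 0, 1, 0, 1]
print(manchester_decode([0, 1, 1, 0, 0, 1, 0, 1]))  # 1011
```

Every pair of half-bit levels contains a transition, whatever the data, which is the property the receiver relies on.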

Advantages

Regardless of the data sequence, there is a consistent voltage level transition in the middle of each bit-time. These frequent voltage level transitions allow the receiver to easily determine the beginning and end of each bit, since each mid-bit transition can be used as a clock reference. Thus, Manchester encoding is effectively self-clocking, which is why it is often preferred over NRZ.

So...

A separate clock signal is not required
There are no long strings of logic '0' or logic '1' levels to cause the clock to drift.

Disadvantages

When a series of alternating 0's and 1's is sent, the frequency of the signal is 5MHz. When a series of 0's or 1's is sent, an extra transition occurs where the signal is pulled either up or down in preparation for the next bit. This actually doubles the frequency of the signal to 10MHz.

For Manchester encoding on Legacy Ethernet, the average frequency will be 5MHz but in the worst case, the frequency may rise to 10MHz.
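The 5MHz and 10MHz figures can be checked by counting transitions. The sketch below assumes a bit pattern that repeats endlessly at 10Mbps and uses the high-to-low = '0', low-to-high = '1' convention; the helper names are made up for this example.

```python
def manchester_halves(bits):
    """Two half-bit levels per data bit: '0' -> high, low; '1' -> low, high."""
    table = {"0": (1, 0), "1": (0, 1)}
    return [level for b in bits for level in table[b]]


def fundamental_mhz(bits, bit_rate_mbps=10):
    """Frequency of the square wave made by repeating `bits` for ever."""
    levels = manchester_halves(bits)
    n = len(levels)
    # Compare each half-bit with the next one, wrapping round to the start.
    transitions = sum(1 for i in range(n) if levels[i] != levels[(i + 1) % n])
    cycles = transitions / 2  # two transitions make one complete cycle
    return cycles * bit_rate_mbps / len(bits)


print(fundamental_mhz("01"))    # 5.0
print(fundamental_mhz("1111"))  # 10.0
```

Alternating data produces one cycle every two bit periods, whereas a run of identical bits produces one cycle per bit period.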

So...

The signal frequency may reach twice that of NRZ
It needs more complex decoding circuitry

Encoding on 100Mbps Fast Ethernet

10Mbps Legacy Ethernet uses Manchester encoding where the clock signal can be deduced from every data bit. As explained earlier, the transmission frequency can double from 5MHz to 10MHz due to extra transitions occurring where the signal is pulled either up or down in preparation for the next bit.

If Manchester encoding were used on 100Mbps Fast Ethernet, the frequency harmonics would be such that some frequencies might reach 375MHz. Since CAT 5 UTP is rated at a 100MHz frequency, this encoding method cannot be used. So, two alternative encoding methods were adopted: MLT-3 for 100BASE-TX and NRZ-I for 100BASE-FX.

Advantages

Both the MLT-3 and NRZ-I encoding methods reduce the range of signal frequencies to an acceptable level, below 100MHz.

Disadvantages

On the negative side, both encoding methods run the risk of losing clock signal information. This is similar to the problem with NRZ, where there are no voltage level transitions when a sequence of 0's or 1's is sent. In the case of MLT-3 or NRZ-I, there are no voltage level transitions if a sequence of 0's is sent.

To combat this transition problem, an additional encoding procedure is used. The data is first encoded using a method called 4B/5B. Then this 4B/5B encoded data is encoded further using MLT-3 for 100BASE-TX and NRZ-I for 100BASE-FX. This double encoding procedure ensures that there are no transition problems and that the signal frequency is at an acceptable level, below 100MHz. In other words, the 4B/5B encoding takes care of the transition problem and the MLT-3 or NRZ-I takes care of the frequency problem.

~~ 4B/5B Encoding ~~

4B/5B is an encoding method that replaces every group of 4 bits by a 5-bit code. The code ensures that at least one voltage level transition occurs for every 5 bits. The codes are shown below:-

	0	1	2	3	4	5	6	7
4-bit Nibble	0000	0001	0010	0011	0100	0101	0110	0111
5-bit Code	11110	01001	10100	10101	01010	01011	01110	01111

	8	9	10	11	12	13	14	15
4-bit Nibble	1000	1001	1010	1011	1100	1101	1110	1111
5-bit Code	10010	10011	10110	10111	11010	11011	11100	11101

As an example of 4B/5B encoding, let's encode the data stream 0111010000100000. As shown below, the data is first grouped into 4-bit nibbles. Then a 5-bit code is applied to each separate 4-bit group.

Data stream:	0 1 1 1 0 1 0 0 0 0 1 0 0 0 0 0
4 bit nibbles:	0111	0100	0010	0000
5-bit stream:	01111	01010	10100	11110
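The same lookup can be written as a short Python sketch. The dictionary simply copies the code table above; the function name is made up for this example.

```python
# 4-bit nibble -> 5-bit code, copied from the table above.
FOUR_B_FIVE_B = {
    "0000": "11110", "0001": "01001", "0010": "10100", "0011": "10101",
    "0100": "01010", "0101": "01011", "0110": "01110", "0111": "01111",
    "1000": "10010", "1001": "10011", "1010": "10110", "1011": "10111",
    "1100": "11010", "1101": "11011", "1110": "11100", "1111": "11101",
}


def encode_4b5b(bits):
    """Split a bit string into nibbles and replace each with its 5-bit code."""
    if len(bits) % 4:
        raise ValueError("input length must be a multiple of 4 bits")
    return [FOUR_B_FIVE_B[bits[i:i + 4]] for i in range(0, len(bits), 4)]


print(encode_4b5b("0111010000100000"))
# ['01111', '01010', '10100', '11110']
```

The output matches the 5-bit stream worked out by hand above.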

Once 4B/5B encoding has been applied to a data stream, the physical layer then applies MLT-3 or NRZ-I encoding depending on whether copper or fiber is being used - 100BASE-TX or 100BASE-FX.

~~ MLT-3 Encoding ~~

MLT-3 is an encoding method used on Fast Ethernet 100BASE-TX networks. Like NRZ-I (described later), it represents a logic '1' by a voltage transition and a logic '0' by no change. However, whereas a Manchester encoded signal uses a two-state waveform (0V or +V), an MLT-3 encoded signal uses a three-state waveform (-V, 0V or +V).

The MLT-3 encoding method uses a pattern +V, 0V, -V, 0V and the rules to follow are...

To send a logic '1' data bit, change the voltage level to the next level in the pattern
To send a logic '0' data bit, keep the voltage level the same as the previous voltage level.

For example, suppose the voltage level is currently on +V and we wish to send a logic '1'. Referring to the pattern +V, 0V, -V, 0V the voltage level needs to be lowered to 0V. To send another logic '1', the voltage level needs to be lowered to -V, and so on.

The diagram below illustrates this:-

The first encoding example above shows that voltage level transitions do NOT occur when a series of 0's are sent.

To illustrate the fact that 4B/5B encoding, when applied before MLT-3, ensures transitions occur even when a series of 0's is sent, let's encode the data stream 0 0 0 0 0 0 0 0.

Data stream:	0 0 0 0 0 0 0 0
4 bit nibbles:	0000	0000
4B/5B Stream:	11110	11110
MLT-3 Stream:	+0-00	+0-00

The MLT-3 stream is also illustrated in the diagram below:-

You can see that 4B/5B encoding ensures that transitions still occur when the data is further encoded using MLT-3, even for a long sequence of 0's.
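The MLT-3 rules can be sketched as a small state machine stepping round the pattern +V, 0V, -V, 0V. The starting position in the pattern is an assumption: it is chosen here so that the first '1' moves the line to +V, as in the worked example. The function name is made up for this illustration.

```python
def mlt3_encode(bits, start_index=3):
    """Encode a bit string with MLT-3.

    start_index is the position in the pattern before the first bit.
    The default (the second 0V) is an assumption for this example.
    """
    pattern = ["+", "0", "-", "0"]
    idx = start_index
    out = []
    for b in bits:
        if b == "1":
            idx = (idx + 1) % 4  # a '1' moves to the next level in the pattern
        out.append(pattern[idx])  # a '0' keeps the previous level
    return "".join(out)


# The 4B/5B stream for eight 0 data bits is 11110 11110.
print(mlt3_encode("1111011110"))  # +0-00+0-00
print(mlt3_encode("00000000"))    # 00000000 (no transitions without 4B/5B)
```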

Using MLT-3, one complete waveform cycle (+V, 0V, -V, 0V) carries four or more bits, because each step in the cycle needs a '1' bit and '0' bits cause no change. This means the signal frequency is at most 1/4 of the bit rate, i.e. for 100Mbps the frequency could be 25MHz. However, on 100BASE-TX, 4B/5B encoding is applied first, before the MLT-3 line encoding. Since 5 bits are transmitted for every 4 bits of data due to the 4B/5B encoding, the line bit rate is actually 125Mbps for 100Mbps data throughput. When MLT-3 line encoding is applied on top of the 4B/5B encoded data, the maximum signal frequency is therefore 1/4 of 125Mbps, which is 31.25MHz.
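The arithmetic can be written out as follows. The function name is hypothetical; note that 125 divided by 4 works out to 31.25.

```python
def mlt3_max_frequency_mhz(data_rate_mbps):
    """Highest MLT-3 signal frequency for 4B/5B encoded data."""
    line_rate_mbps = data_rate_mbps * 5 / 4  # 5 line bits for every 4 data bits
    return line_rate_mbps / 4                # one full cycle needs four '1' bits


print(100 * 5 / 4)                  # 125.0
print(mlt3_max_frequency_mhz(100))  # 31.25
```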

~~ NRZ Inverted ~~

Non-Return-to-Zero Inverted (NRZ-I) is an encoding method used on Fast Ethernet 100BASE-FX networks.

There are two rules to follow using this encoding method...

To send a logic '1' data bit, invert the voltage state from whatever it was before, in the middle of the bit period.
To send a logic '0' data bit, leave the voltage state as it is.

The diagram below illustrates this:-

So, to send a logic '1' you just change the voltage state so it's the opposite of the previous voltage state. If the previous voltage was 0V then change it to +V. If the previous voltage was +V then change it to 0V. This means that only a '1' bit changes the voltage level. A '0' bit has no effect on the voltage; the voltage level is kept the same as the previous voltage level. You should also note that the encoding can be different for the same binary pattern depending on the voltage starting point.
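A minimal sketch of these two rules is shown below, using 1 for +V and 0 for 0V. The function name and the start_level parameter are made up for this example; running it with both starting points shows how the same bits give two different waveforms.

```python
def nrzi_encode(bits, start_level=0):
    """NRZ-I: a '1' inverts the level, a '0' leaves it unchanged."""
    level = start_level
    out = []
    for b in bits:
        if b == "1":
            level ^= 1  # flip between 0 (0V) and 1 (+V)
        out.append(level)
    return out


print(nrzi_encode("10110", start_level=0))  # [1, 1, 0, 1, 1]
print(nrzi_encode("10110", start_level=1))  # [0, 0, 1, 0, 0]
```

The two outputs are mirror images of each other, but the transitions fall in the same places, and it is the transitions that carry the data.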

NRZ-I is not used on 100BASE-TX copper implementations because the signal frequency is slightly too high for CAT 5 UTP cable. With fiber, the higher frequency of NRZ-I is acceptable because fiber does not suffer from EMI noise.

Encoding on 1000Mbps Gigabit Ethernet

Gigabit Ethernet runs at ten times the speed of Fast Ethernet. Issues to be considered when choosing suitable encoding methods are higher signal frequencies, SNR (signal to noise ratio), frequent enough voltage transitions for clocking and DC balance if fiber optic cables are used.

For copper cable based Gigabit Ethernet (1000BASE-T), a pair of encoding methods was chosen, 8B1Q4 and 4D-PAM5. For fiber optic based Gigabit Ethernet (1000BASE-X), a different pair of encoding methods was chosen, 8B10B and NRZ.

~~ 8B1Q4 and 4D-PAM5 Encoding on 1000BASE-T ~~

Gigabit 1000BASE-T Ethernet was designed to operate in full-duplex mode at a speed of 1000Mbps. To achieve this, all four wire pairs must be used simultaneously in parallel when transmitting or receiving. Wire pairs are no longer separated into a pair for transmitting and a pair for receiving - like on Fast Ethernet. Any wire pair can be used for transmitting or receiving - sending and receiving at the same time if necessary. Since any wire pair can transmit and receive at the same time, this means that there are permanent collisions on the wire. Hybrid circuits at the ends of each wire pair can separate out transmission signals from receive signals.

To achieve the 1000Mbps data rate, each wire pair must be able to transmit or receive at a data rate of 250Mbps. You can imagine the wire pairs acting as separate lanes, sharing the data between them. Effectively, any data to be transmitted is split into quarters and each lane transmits its quarter at 250Mbps. However, CAT 5 and 5E UTP cable are limited in the frequency range that can be used. To achieve the 250Mbps data rate, a pair of encoding methods is used.

Just as data is doubly encoded on Fast Ethernet, a pair of encoding methods is also used on Gigabit Ethernet over copper; the signal is first encoded using 8B1Q4 and then the 4D-PAM5 encoding method is applied.

The 8B1Q4 encoding method converts each group of 8 data bits to four quinary symbols. Each quinary symbol is then line encoded using 4D-PAM5, which is a system that uses five voltage levels (similar to MLT-3, which uses three levels).

Since each quinary symbol represents 2 bits and symbols are clocked out at 125MHz (125 million symbols per second) on each pair, this gives 250Mbps of data per twisted pair and therefore 1000Mbps for the whole cable.
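The figures above can be checked with a few lines of arithmetic. The constant names are made up for this sketch. The last line also shows that four five-level symbols give more combinations than the 256 values of an 8-bit group, so every byte can be given its own set of four symbols.

```python
PAIRS = 4              # wire pairs used in parallel
SYMBOL_RATE_M = 125    # million symbols per second on each pair
BITS_PER_SYMBOL = 2    # data bits represented by each quinary symbol
LEVELS = 5             # voltage levels in PAM5

per_pair_mbps = SYMBOL_RATE_M * BITS_PER_SYMBOL
total_mbps = per_pair_mbps * PAIRS
combinations = LEVELS ** PAIRS  # distinct groups of four quinary symbols

print(per_pair_mbps, total_mbps)  # 250 1000
print(combinations, 2 ** 8)       # 625 256
```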

~~ 8B10B and NRZ Encoding on 1000BASE-X ~~

Gigabit Ethernet over fiber was designed to operate in full-duplex mode at a speed of 1000Mbps. Even though full-duplex mode is used with this technology, collisions do not occur because separate transmit and receive fiber lines are used. There are two fiber optic technologies designed to operate at 1000Mbps: 1000BASE-SX and 1000BASE-LX. 1000BASE-SX transmits light pulses with short wavelengths over multimode fiber (the S in 1000BASE-SX stands for short-haul or short wavelength). 1000BASE-LX transmits light pulses with long wavelengths over multimode or single-mode fiber (the L in 1000BASE-LX stands for long-haul or long wavelength).

The 4B5B / NRZ-I encoding used on Fast Ethernet fiber was rejected for Gigabit Ethernet over fiber because of its lack of DC balance. Maintaining DC balance is important because if a transmitter sends more 1's than 0's, you may end up with heating of lasers, resulting in higher error rates. To achieve DC balance as well as other desirable properties, a pair of encoding methods is used.

Just as data is doubly encoded on Fast Ethernet, a pair of encoding methods is also used on Gigabit Ethernet over fiber; the signal is first encoded using 8B10B and then the simple NRZ encoding method is applied.

The 8B10B encoding method is similar to 4B5B in that a group of bits is replaced with a code word. The difference is that with 8B10B, each group of 8 bits is replaced with a 10-bit code word. Since a 10-bit code word replaces each 8 bits of data, the line speed actually has to be 1.25Gbps for a data speed of 1Gbps.

The 10-bit code words were designed to limit the sequence of 1's or 0's that could occur, to maintain clocking. Additionally, the code words were designed to maintain DC balance. A calculation called the Running Disparity calculation is used to try to keep the number of '0's transmitted the same as the number of '1's transmitted.
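The idea behind the Running Disparity calculation can be sketched as follows. This is a simplified illustration, not the full 8B10B procedure: it uses only one code word pair, the two forms of the special character usually called K28.5, and the function names and starting value are assumptions made for this example.

```python
def disparity(code_word):
    """Number of 1s minus number of 0s in a code word."""
    return code_word.count("1") - code_word.count("0")


# The two forms of K28.5: the first is sent when the running disparity
# is negative, the second when it is positive.
K28_5 = ("0011111010", "1100000101")


def send(variants, running_disparity):
    """Pick the form that pulls the running total back the other way."""
    word = variants[0] if running_disparity < 0 else variants[1]
    return word, running_disparity + disparity(word)


rd = -1  # assumed starting value
sent = []
for _ in range(3):
    word, rd = send(K28_5, rd)
    sent.append(word)

print(sent)  # ['0011111010', '1100000101', '0011111010']
print(rd)    # 1
```

Because the transmitter alternates between a form with two extra 1's and a form with two extra 0's, the running total never moves away from -1 or +1.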

Since the 8B10B encoding eliminates any possibility of long sequences of 1's or 0's and also eliminates DC bias, the simple NRZ encoding method can be used as the second encoding method. Remember, NRZ used on its own exhibits undesirable qualities - clocking cannot be maintained properly and there can be a lack of DC balance.

Here is a reminder of the NRZ encoding method:-

Encoding on 10Gbps Ethernet

10 Gigabit Ethernet (10GbE) operates over fiber at a data rate of 10Gbps in full-duplex mode. Collisions do not occur because separate transmit and receive fiber lines are used. There are various fiber optic technologies designed to operate at 10Gbps. Letters are used to indicate the fiber optic wavelengths each technology uses as well as the type of signal encoding used.

An "S" stands for an 850 nanometer (nm) short wavelength, an "L" stands for a 1310 nm long wavelength and an "E" stands for a 1550 nm very long wavelength. The letter "X" denotes 8B/10B signal encoding, while "R" denotes 64B/66B encoding and "W" denotes the WIS interface that encapsulates Ethernet frames for transmission over a SONET OC-192 (STS-192c) channel.

So we have...

10GBASE-SR: Uses a short wavelength and 64B/66B encoding

10GBASE-LX4: Uses a long wavelength multiplexed into four wavelengths of light transmitted simultaneously over a single pair of fiber optic cables. 8B/10B encoding is used.

10GBASE-LR and 10GBASE-ER: Use long and very long wavelengths respectively, with 64B/66B encoding

10GBASE-SW, 10GBASE-LW and 10GBASE-EW: Work with SONET OC-192 (STS-192c) equipment.
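The naming scheme can be summed up in a small lookup. The sketch below is hypothetical and covers only the fiber names listed in these notes; the dictionary and function names are made up for this example.

```python
WAVELENGTH = {
    "S": "850 nm short wavelength",
    "L": "1310 nm long wavelength",
    "E": "1550 nm very long wavelength",
}
CODING = {
    "X": "8B/10B",
    "R": "64B/66B",
    "W": "WIS (SONET)",
}


def decode_10gbase(name):
    """Split a name such as '10GBASE-SR' into wavelength and coding."""
    suffix = name.split("-", 1)[1]  # e.g. 'SR' or 'LX4'
    return WAVELENGTH[suffix[0]], CODING[suffix[1]]


print(decode_10gbase("10GBASE-SR"))
# ('850 nm short wavelength', '64B/66B')
print(decode_10gbase("10GBASE-LX4"))
# ('1310 nm long wavelength', '8B/10B')
```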

The maximum transmission distance depends on the type of fiber being used.

10GBASE-SR can be used over multimode fiber for distances up to 82m. 10GBASE-LX4 can be used over multimode fiber for distances up to 300m, or over single-mode fiber for longer distances of up to 10km. 10GBASE-LR can also be used over single-mode fiber for distances up to 10km. 10GBASE-EW can be used over single-mode fiber for really long distances - up to 40km.

Summary

On completing these notes you should have learned the following key points:-

Each Ethernet technology uses an encoding method such that the signals exhibit desirable data rate, frequency, signal-to-noise ratio and clocking characteristics.
Manchester encoding is used on Legacy Ethernet
Fast Ethernet 100BASE-TX data is encoded twice; 4B5B encoding is used first then MLT-3 line encoding is applied.
Fast Ethernet 100BASE-FX data is encoded twice; 4B5B encoding is used first then NRZ-I line encoding is applied.
Gigabit Ethernet 1000BASE-T data is encoded twice; 8B1Q4 encoding is used first then 4D-PAM5 line encoding is applied.
Gigabit Ethernet 1000BASE-X data is encoded twice; 8B10B encoding is used first then NRZ line encoding is applied.
10 Gigabit Ethernet 10GBASE-X data is encoded twice. Sometimes 8B10B encoding is used first, sometimes 64B/66B. The variety of line encoding applied second depends on the fiber used.

