Chia (Janet) Wu
janetwu@mit.edu
The Open Systems Interconnection (OSI) Reference Model, developed by the International Organization for Standardization (ISO), is a popular model of protocol layering. Figure 1 illustrates how the different layers relate to each other.
The application layer contains the user programs that require network services (e.g. telnet, FTP). The transport layer disassembles data from the application layer into packets and reassembles received packets into data streams for the application layer. Packet delivery and forwarding are handled by the network layer. The data link layer turns a packet into a frame, a data structure that can be transmitted through the physical layer and recovered at the receiver; it also provides error detection.
In protocol layering, data packets from higher layers in the model are black-boxed and encapsulated in lower layer packets. In Figure 2, packet data in the transport layer (data with the transport header, TH) are completely abstracted away in the network layer, so that the network layer cannot differentiate the transport header from the rest of the data.
Byte stuffing is required when the PPP encapsulated packet contains special characters, like the flag byte, 0x7E. A flag byte found in the middle of an HDLC packet would confuse the receiver and cause a false end-of-packet condition. Stuffing replaces a flag byte in the PPP packet with a two-byte sequence: 0x7D and 0x5E (0x7E XORed with 0x20). The receiver XORs the byte following any 0x7D with 0x20 to recover the original value. Since this makes 0x7D a special character, it must also be stuffed, so 0x7Ds found in the PPP packet are replaced with the sequence 0x7D 0x5D.
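The stuffing rule above can be illustrated with a short software sketch. This is a byte-level model only, not the VHDL design; the function and constant names are chosen here for illustration, and only the two special characters named in the text (0x7E and 0x7D) are escaped.

```python
FLAG = 0x7E      # HDLC flag byte
ESCAPE = 0x7D    # escape byte that introduces a stuffed value
XOR_MASK = 0x20  # value XORed into a stuffed byte

def stuff(payload: bytes) -> bytes:
    """Replace each flag or escape byte with ESCAPE followed by byte XOR 0x20."""
    out = bytearray()
    for byte in payload:
        if byte in (FLAG, ESCAPE):
            out += bytes([ESCAPE, byte ^ XOR_MASK])
        else:
            out.append(byte)
    return bytes(out)

def destuff(stream: bytes) -> bytes:
    """Undo stuff(): drop each ESCAPE and XOR the following byte with 0x20."""
    out = bytearray()
    escaped = False
    for byte in stream:
        if escaped:
            out.append(byte ^ XOR_MASK)
            escaped = False
        elif byte == ESCAPE:
            escaped = True
        else:
            out.append(byte)
    return bytes(out)
```

After stuffing, the byte 0x7E can no longer appear inside the packet, so the receiver can treat every 0x7E it sees as a packet boundary.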
The receive module, on the other hand, implements the reverse function of the transmit module. It receives a byte stream from the physical layer device. Since the physical layer has no concept of packet structure, the HDLC controller must determine the boundaries of packets in the byte stream. Once a packet has been extracted from the byte stream, HDLC headers are stripped from the packet. The stripped packet (Figure 3) is then transmitted up to the higher level (PPP) device.
For the implementation of the state machine, one-hot encoding was used. We felt that this approach would give the best results in terms of speed and area, because the target device is an FPGA. FPGA devices tend to be rich in flip-flops, so using one-hot encoding should not be expensive.
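As a rough illustration of the trade-off, the sketch below assigns one-hot codes to a small state list and compares register widths against a binary encoding. IDLE and DEFAULT_FIELDS are state names from this report; the remaining names are hypothetical placeholders, and the code is a software illustration rather than the VHDL encoding itself.

```python
# IDLE and DEFAULT_FIELDS appear in the text; the other names are
# hypothetical placeholders used only to fill out the example.
STATES = ["IDLE", "DEFAULT_FIELDS", "DATA", "STUFF", "FCS"]

def one_hot_codes(states):
    """Assign each state a code with exactly one bit set."""
    return {name: 1 << index for index, name in enumerate(states)}

def flip_flops_needed(num_states):
    """Return (one-hot width, binary width) for a given number of states."""
    binary = max(1, (num_states - 1).bit_length())
    return num_states, binary

codes = one_hot_codes(STATES)
for name, code in codes.items():
    # Print each code as a bit string as wide as the state list.
    print(f"{name:15s} {code:0{len(STATES)}b}")
```

For a 30-state machine this gives 30 flip-flops for one-hot against 5 for binary. The extra flip-flops buy simpler next-state logic, since testing for a state means testing a single bit.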
When byte-stuffing, the transmit state machine requires two clock cycles to transmit one byte of PPP data. Similarly, when the controller transmits data from the FCS generator instead of from the PPP source (there may be a packet queued up at the input to the HDLC controller), the state machine requires a way to let the PPP source know that it is not ready to receive any data. A not-ready signal, then, is asserted by the state machine when it will not clock PPP data to the physical layer (it is otherwise deasserted).
The default state is IDLE (Figure 6), where the state machine waits for the PPP source to signal the start of a new packet. While in IDLE, the state machine transmits 0x7E to the physical layer. Successive 0x7Es represent "empty" packets. When a start-of-packet is detected, the state machine also checks if default HDLC address and control fields will be used, via the default-fields signal. If default fields are used, then the PPP source only needs to provide the packet shown in Figure 3. Otherwise, the PPP source must supply the address and control fields.
In state DEFAULT FIELDS, the state machine transmits the default values for the address and control fields in the HDLC packet: 0xFF and 0x03, respectively. After they are transmitted, the state machine starts transmitting the PPP packet, performing byte-stuffing when necessary.
If default address and control fields are not used, then the PPP source must present its own address and control fields. This can happen when different values are negotiated between network nodes. In any event, the PPP source presents its own fields on the same port as PPP packet data.
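The byte sequence that the transmit side places on the line can be sketched in software as follows. This is a simplified model under stated assumptions: the function name and the default_fields argument are illustrative (the argument mirrors the default-fields signal), and Python's zlib.crc32 is used as a stand-in for the FCS generator, since the 32-bit FCS of RFC 1662 is the same CRC-32 that zlib computes.

```python
import zlib

FLAG, ESCAPE = 0x7E, 0x7D
DEFAULT_ADDRESS, DEFAULT_CONTROL = 0xFF, 0x03

def build_frame(ppp_data: bytes, default_fields: bool = True) -> bytes:
    """Return one frame: flag, address/control, PPP data, FCS, flag."""
    # With default fields the controller supplies 0xFF 0x03; otherwise the
    # PPP source is assumed to have put its own fields first in ppp_data.
    body = bytes([DEFAULT_ADDRESS, DEFAULT_CONTROL]) if default_fields else b""
    body += ppp_data
    # The FCS covers the unstuffed fields and is sent least significant
    # byte first.
    body += zlib.crc32(body).to_bytes(4, "little")
    frame = bytearray([FLAG])
    for byte in body:
        if byte in (FLAG, ESCAPE):
            frame += bytes([ESCAPE, byte ^ 0x20])  # byte stuffing
        else:
            frame.append(byte)
    frame.append(FLAG)  # the flag sent on returning to IDLE closes the frame
    return bytes(frame)
```

Calling build_frame(data) and build_frame(b"\xff\x03" + data, default_fields=False) yields the same frame, which is the sense in which the two modes differ only in who supplies the address and control bytes.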
The generator can be implemented in a number of different ways. Our choice was a byte-oriented method using an XOR tree with a 32 bit register to hold the value of the FCS. A multiplexor selects which byte of the FCS to transmit back to the state machine.
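A software model of a byte-oriented FCS generator might look like the sketch below. It assumes the 32-bit FCS of RFC 1662 (reflected polynomial 0xEDB88320, register preset to all ones, result complemented and sent least significant byte first); the function names and the select argument are illustrative and are not taken from the VHDL source.

```python
POLY = 0xEDB88320  # reflected CRC-32 polynomial (assumed: RFC 1662 FCS-32)
INIT = 0xFFFFFFFF  # preset value of the 32-bit FCS register

def fcs_update(fcs: int, byte: int) -> int:
    """Advance the 32-bit FCS register by one input byte."""
    fcs ^= byte
    for _ in range(8):  # in hardware these 8 steps unroll into an XOR tree
        fcs = (fcs >> 1) ^ POLY if fcs & 1 else fcs >> 1
    return fcs

def fcs_byte(fcs: int, select: int) -> int:
    """Multiplexor: return byte 0 (least significant) to 3 of the complemented FCS."""
    return ((fcs ^ 0xFFFFFFFF) >> (8 * select)) & 0xFF

def fcs_bytes(data: bytes) -> bytes:
    """Run the register over data and return the 4 FCS bytes in transmit order."""
    fcs = INIT
    for byte in data:
        fcs = fcs_update(fcs, byte)
    return bytes(fcs_byte(fcs, select) for select in range(4))
```

The loop in fcs_update processes one bit per iteration; a byte-wide XOR tree computes the same function in a single clock by expanding those eight iterations into combinational logic.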
Like the transmit state machine, the receive state machine is implemented using one-hot encoding. Unlike the transmit side, however, the receive side uses a 6-stage pipeline at its output. Since the state machine does not know a priori when a packet ends, the pipeline guarantees that the end-of-packet condition will be detected before the last 4 bytes of the packet reach the PPP layer (Figure 9). (The last 4 bytes are the FCS bytes, which are part of HDLC only and should not be transmitted to the PPP layer; see Figure 4.)
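The role of the output pipeline can be modeled in software as a delay line. The sketch below is a behavioral approximation only: the class and method names are hypothetical, and the flush is done in one step instead of one byte per clock as a hardware pipeline would.

```python
from collections import deque

FCS_LENGTH = 4  # trailing FCS bytes that must not reach the PPP layer

class OutputPipeline:
    """Delay line that holds back the most recent bytes of a packet."""

    def __init__(self, depth: int = 6):
        self.depth = depth
        self.stages = deque()

    def push(self, byte: int):
        """Clock one destuffed byte in; return the byte leaving, if any."""
        self.stages.append(byte)
        if len(self.stages) > self.depth:
            return self.stages.popleft()
        return None

    def end_of_packet(self) -> bytes:
        """Flush held bytes on end-of-packet, discarding the trailing FCS."""
        held = bytes(self.stages)
        self.stages.clear()
        return held[:-FCS_LENGTH]

def deliver(packet_with_fcs: bytes, depth: int = 6) -> bytes:
    """Pass a destuffed packet through the pipeline; return what PPP receives."""
    pipe = OutputPipeline(depth)
    out = bytearray()
    for byte in packet_with_fcs:
        leaving = pipe.push(byte)
        if leaving is not None:
            out.append(leaving)
    out += pipe.end_of_packet()
    return bytes(out)
```

In this model the depth must be at least 4; with a shallower pipeline, FCS bytes would already have left for the PPP layer by the time the closing flag is seen.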
A new packet received in END causes the state machine to go to PACKET2, because the pipeline still contains data from the previous packet. When the pipeline is flushed, the state machine resumes normal behavior, transitioning back to PACKET.
In state machine verification, outputs are carefully monitored while inputs change. By asserting the appropriate combinations of inputs, the entire state machine (states and transitions) can be traversed. Furthermore, in traversing the entire transmit state machine, byte stuffing and packet encapsulation were tested, along with control signals sent to both the PPP module and the FCS; similarly, byte destuffing was tested in the receive state machine.
The FCS modules are straightforward to test: the Internet document Request for Comments (RFC) 1662 (PPP in HDLC-like Framing) includes a software implementation of the CRC that uses table lookup. To test the hardware (XOR tree) FCS, the same input vectors can be processed through both implementations and the results compared.
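A cross-check of this kind can be sketched in software. The example below builds a lookup table in the style of the RFC 1662 software CRC and compares its output with a second implementation on random vectors. Here zlib.crc32 stands in for the hardware model, which is an assumption of this sketch; the constant 0xDEBB20E3 is the "good FCS" value given in RFC 1662 for the 32-bit FCS, and the function names are illustrative.

```python
import random
import zlib

def make_table():
    """Build the 256-entry lookup table for the reflected CRC-32 polynomial."""
    table = []
    for value in range(256):
        for _ in range(8):
            value = (value >> 1) ^ 0xEDB88320 if value & 1 else value >> 1
        table.append(value)
    return table

TABLE = make_table()
GOOD_FCS = 0xDEBB20E3  # register value after a correct packet plus its FCS

def fcs32(data: bytes, fcs: int = 0xFFFFFFFF) -> int:
    """Table-lookup FCS: one table access per input byte."""
    for byte in data:
        fcs = (fcs >> 8) ^ TABLE[(fcs ^ byte) & 0xFF]
    return fcs

def cross_check(trials: int = 200, seed: int = 1) -> bool:
    """Compare both implementations on random vectors; True if all agree."""
    rng = random.Random(seed)
    for _ in range(trials):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 64)))
        sent = (fcs32(data) ^ 0xFFFFFFFF).to_bytes(4, "little")
        if sent != zlib.crc32(data).to_bytes(4, "little"):
            return False  # the two implementations disagree
        if fcs32(data + sent) != GOOD_FCS:
            return False  # the receive-side check would reject a good packet
    return True
```

The second check in cross_check mirrors what a receive FCS generator does: it runs over the data and the received FCS bytes together and compares the final register value against a constant.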
Most errors found in the code resulted from carelessness in deasserting control signals. Also found was output contention between different processes.
Once individual modules have been tested, integration testing can proceed. A PPP packet generator was written for testing the transmit data path. The generator, implemented as a state machine, sent test vectors to the transmit state machine and could also receive and respond to control signals sent by the state machine back to the PPP layer. The packet generator model was required to effectively simulate the PPP-HDLC transmit interface. While not attempted, a physical layer device model could be written for the receive interface. However, the receive state machine uses far fewer control signals, and none of them route back to the physical layer.
System testing could have been done in either schematics, where VHDL entities become schematic symbols, or in VHDL, using component declaration and instantiation. We chose the latter, due to the simplicity of the process and ease of simulation. Each path was tested by linking together the state machine and the FCS generator; the transmit path was also connected to the PPP model mentioned previously. See appendices for timing diagrams of system testing.
The final step is the synthesis and fitting of the HDLC controller design to an FPGA or CPLD. This process would normally include, according to the Viewlogic documentation:
omega@sunpal2 84 % vsyn -opt e_rcv_state
ViewSynthesis - V6.0.4; Powerview 6.0 (041896)
c Copyright 1985,1996 by Viewlogic Systems, Inc.
-- ViewSynthesis built on Apr 19 1996
Project directory is /u21e/students/omega/thesis_view/synth/e_rcv_state
Reading existing ref file /u21e/students/omega/thesis_view/synth/e_rcv_state/top.ref
optimizing entity e_rcv_state
e_rcv_state: optimizing partition 1 with 3883 gates
e_rcv_state: optimizing partition 2 with 2734 gates
e_rcv_state: optimizing partition 3 with 32 gates
e_rcv_state: optimizing partition 4 with 32 gates
e_rcv_state: optimizing partition 5 with 32 gates
e_rcv_state: optimizing partition 6 with 32 gates
e_rcv_state: optimizing partition 7 with 952 gates
These results indicate that a 25-30 state one-hot encoded state machine with low to moderate complexity results in almost 7700 gates. We believe that the implementation of the FCS generators may equate to several hundred gates each, because of the sheer size of the XOR tree (32 bits wide and 12+ gates deep without optimization).
Without a successful and complete run of the synthesis tool, we are unable to comment on the performance and actual size of the final product. Furthermore, the synthesis and timing analysis processes are both potentially time-consuming.
Transmit state machine
Transmit FCS generator
Receive state machine
Receive FCS generator
PPP module written to test transmit data path
Linked PPP and (transmit) HDLC modules for system testing
Revision History 971215 dtl