Reducing data dependent jitter utilising adaptive FIR pre-emphasis in 0.18 µm CMOS

Marius Goosen\* and Saurabh Sinha

Department of Electrical, Electronic and Computer Engineering, University of Pretoria, 0002 Pretoria, South Africa.

**Abstract** 

Due to advances of technology in multimedia applications in recent years, the demand for high user end bandwidth point to point links has increased significantly. Jitter requirements have become ever more stringent with the increase in high speed serial link data rates. The introduced jitter severely degrades the performance of the high speed serial link. This paper introduces an adaptive FIR pre-emphasis technique as a means to alleviate the problem of limited off-chip bandwidth introducing data dependent jitter. Mathematical as well as SPICE simulation results are presented, together with the implemented integrated circuit layouts of the novel 0.18 µm CMOS implementation. Limited results from the experimentally tested IC are also presented and discussed. The adaptive pre-emphasis technique employed results in a simulated data dependent jitter reduction to less than 12.5 % of a unit interval at a data rate of 5 Gb/s and a modelled 30" FR-4 backplane copper channel.

Keywords: High speed serial link, FIR pre-emphasis, adaptive pre-emphasis, 0.18 μm CMOS, data dependent jitter, backplane serial link

Article outline

1. Introduction

2. Mathematical modelling

2.1 Jitter

2.2 FIR pre-emphasis filtering

3. Adaptive pre-emphasis engine

4. CMOS implementation

5. Results

5.1 Mathematical simulation results

5.2 SPICE simulation results

5.3 Experimental results

6. Conclusion

Acknowledgments

References

Appendix A: Detailed mathematical expressions for each of the applied FIR filter taps

\*Corresponding author. Tel: +27 12 420-2177. E-mail address: mgoosen@ieee.org (M.E. Goosen)

### 1. Introduction

The increase in high user end bandwidth applications, such as high definition multimedia streaming, has brought forth a need for higher speed, higher fidelity serial links. High speed serial links are the communication method of choice for such applications because they combine high bandwidth with a low pin count [1, 2], resulting in an unrivalled bandwidth per pin.

Jitter requirements have become more stringent with this rapid increase of backplane serial link data rates. On-chip bandwidth has increased with transistor scaling, in accordance with the well known Moore's law, and has produced higher speed, higher bandwidth serial links. The corresponding off-chip bandwidth, however, does not scale at the same rate. The channel bandwidth limitation hence introduces a type of timing jitter, called data dependent jitter (DDJ), which is one of the main contributors to the total system jitter [3]. The channel exhibits a low pass filter effect, attenuating the different frequency components within the data signal by different amounts. Higher data rates thus result in lower amplitude signals at the receiver, as well as received pulses exhibiting a long "tail" that directly interferes with adjacent transmitted data bits. This leads to uncertainty in the position of the pulse edge relative to the optimal sampling instant. To alleviate the DDJ imposed by the backplane channel, the transmitted data is pre-distorted by an FIR pre-emphasis filter [4 - 10].

Section 2 introduces the mathematical modelling of jitter and FIR pre-emphasis filtering. Section 3 continues with a detailed discussion of adaptive FIR pre-emphasis, in particular the pilot signalling and peak detection method implemented in this paper. Section 4 discusses the CMOS implementation of the system, followed in Section 5 by the mathematical and SPICE simulations as well as the experimental results achieved. Section 6 concludes the paper and summarises the results.

### 2. Mathematical modelling

### 2.1 Jitter

Jitter can aptly be defined as a deviation of a timing event from its ideal, expected or intended occurrence in time. This deviation introduces distortion, degrading the fidelity of the serial link and ultimately resulting in erroneous received bits. Jitter can be subdivided into two main categories, namely random jitter and deterministic jitter. Random jitter is mainly caused by thermal vibrations, semiconductor doping and process variations, which include thermal noise, flicker (1/f) noise, shot noise and power supply noise [11, 12]. Deterministic jitter is mainly caused by crosstalk, switching noise, insufficient power delivery, electromagnetic interference, duty cycle distortion (DCD), inter-symbol interference and discontinuities in the transmission path [11].

The most common way of analysing jitter is by statistical means. The random jitter component follows, as expected, the well known Gaussian distribution. The deterministic jitter, on the other hand, is characterised by a dual-Dirac (double delta) distribution, which can be expressed as:

$$f_{DJ}(t) = \frac{1}{2} \left[ \delta \left( t - \frac{D}{2} \right) + \delta \left( t + \frac{D}{2} \right) \right]$$

where *D* is the distance between the two Dirac delta functions. The total system jitter can then be determined as the convolution of the random and deterministic jitter probability density functions (PDFs).
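Because convolving a Gaussian with a Dirac delta simply shifts the Gaussian, the total jitter PDF has a closed form: two half-weight Gaussians centred at ±*D*/2. The following is a minimal sketch of that result. The function name `total_jitter_pdf`, the random jitter standard deviation `sigma` and the numerical values are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def total_jitter_pdf(t, sigma, D):
    """Gaussian random jitter convolved with dual-Dirac deterministic jitter.

    The result is the sum of two half-weight Gaussians centred at -D/2 and +D/2.
    """
    def gauss(x):
        return np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return 0.5 * (gauss(t - D / 2.0) + gauss(t + D / 2.0))

# Illustrative values only (assumed): sigma = 2 ps, D = 25 ps
t = np.linspace(-50.0, 50.0, 2001)          # time offset from the ideal edge, in ps
pdf = total_jitter_pdf(t, sigma=2.0, D=25.0)
area = float(np.sum(pdf) * (t[1] - t[0]))   # numerical check that the PDF integrates to 1
```

With these assumed values the PDF is bimodal, with one lobe on either side of the ideal edge position.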

The convolution of the two PDFs results in a unique distribution around the pulse edges of the transmitted data signal. The distribution is illustrated in Fig. 1. From this distribution it can now be intuitively seen that the pulse edge location shifts around statistically. An error region is created in the event that the total system jitter completely closes the horizontal eye opening.



Fig. 1: Total jitter PDF situated around the pulse edges of the transmitted data signal.

This paper focuses on the alleviation of the data dependent jitter introduced by the limited off-chip bandwidth. The assumption is made that the data dependent jitter dominates the total system jitter, so that the other deterministic jitter contributions, as well as the random jitter, can be ignored. This assumption holds specifically for backplane copper channels.

### 2.2 FIR pre-emphasis filtering

A typical transmission line, such as the one used as the backplane copper channel, has a low pass frequency response. This is due to skin effect, dielectric losses and reflections. One of two methods can be used to overcome the resulting channel and package bandwidth limitations: pre-emphasis at the transmitter or equalisation at the receiver [13]. Both methods of overcoming the DDJ rely on multiplying the channel transfer function (including the chip and bonding wires) with a compensating transfer function to obtain, ideally, a flat magnitude frequency response and a linear phase response. The channel used in this paper is a 30" FR-4 copper transmission line, and its response is illustrated in Fig. 2.
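The flat magnitude and linear phase requirement can be written compactly. As a sketch, let $H_{CHN}(f)$ denote the channel transfer function, $H_{EQ}(f)$ the pre-emphasis or equalisation transfer function, and $K$ and $\tau$ an arbitrary constant gain and delay (all four symbols are introduced here for illustration only):

$$H_{CHN}(f)\,H_{EQ}(f) = K e^{-j 2 \pi f \tau} \quad \Rightarrow \quad \left| H_{CHN}(f)\,H_{EQ}(f) \right| = K, \quad \angle \left[ H_{CHN}(f)\,H_{EQ}(f) \right] = -2 \pi f \tau$$

Over the band occupied by the data signal, a combined response of this form only scales and delays the transmitted pulses, so no tail is left to interfere with adjacent bits.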



Fig. 2: Channel response used for SPICE simulations. The channel response is a combination of the package parasitic frequency response as well as the copper backplane channel frequency response.

Implementing a transmitter pre-shaping FIR filter requires only that weighted adjacent bits be added to the transmitted signal to cancel the tail of the channel impulse response. The transmitter does not require a faster technology to operate properly [7]. One way to implement an *N*-tap FIR filter is to make use of a digital to analogue converter (DAC) and a digital CMOS counter to adjust the FIR filter tap coefficients [7], as was implemented in this paper and is discussed in Section 4.
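The addition of weighted adjacent bits can be illustrated with a short behavioural sketch. The function name `fir_pre_emphasis`, the mapping of bits to ±1 symbols and the two tap weights are assumptions chosen for illustration. They do not represent the tap values of the implemented circuit.

```python
import numpy as np

def fir_pre_emphasis(bits, weights):
    """Behavioural N-tap FIR pre-emphasis on a bit stream.

    out[n] = sum_k weights[k] * symbols[n - k], with symbols before n = 0 taken as 0.
    """
    symbols = 2.0 * np.asarray(bits, dtype=float) - 1.0   # map 0/1 to -1/+1
    return np.convolve(symbols, weights)[:len(symbols)]

# Assumed example: main tap 1.0 and one post-cursor tap of -0.25
tx = fir_pre_emphasis([1, 1, 0, 1], [1.0, -0.25])
# tx is [1.0, 0.75, -1.25, 1.25]
```

In this example a repeated bit is transmitted at reduced amplitude (0.75), while each transition is transmitted at increased amplitude (1.25), which is the pre-distortion that counteracts the low pass channel.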

A restriction of most widely implemented FIR pre-emphasis transmitters is that the filter coefficients have been designed to be either fixed or externally adjustable [4, 7, 14, 15]. Adaptive pre-emphasis is sought after because it adjusts the filter tap coefficients automatically, providing an easy to use system with high data integrity. The FIR filter should be adjusted to the tap coefficients that provide an open eye diagram at the far end of the link. In this paper an adaptive pre-emphasis technique called pilot signalling and peak detection is presented, and implemented in the IBM 7WL 0.18 µm SiGe BiCMOS process. It is important to note that only CMOS circuits were utilised in the implementation, and the choice of process was due to the availability of the sponsored MPW run.

### 3. Adaptive pre-emphasis engine

Two methods of adaptive pre-emphasis are presented in [16], of which the most feasible is pilot signalling and peak detection, introduced in [17]. Pilot signalling and peak detection works on the basis that, by transmitting a pilot signal corresponding to a specific filter tap, the received peak voltage can be compared to an "ideal" received voltage [17]. A decision can then be made regarding the specific filter tap coefficient, which is adjusted accordingly until the optimum value is reached. Each pilot signal or sequence is chosen such that it corresponds directly to a specific filter tap. The first pilot signal transmitted is used to adjust the first filter tap. Once the first filter tap is at its optimum value, the second pilot signal can be transmitted, which corresponds to the first and second filter taps. Since the first filter tap is now a fixed value, the second filter tap can be adjusted accordingly. Fig. 3 illustrates the pilot signalling and peak detection process by means of a flowchart.

To start the adaptation process all filter taps are initialized to zero. A recursive loop is then entered in which the filter taps are adjusted. The first filter tap is adjusted to be a maximum, thus a value larger than the ideal received value will be detected at the receiver when transmitting its corresponding pilot signal. The received value (expected to be larger than the ideal value) is compared to the ideal value to produce an error value (the difference between the two values) and the filter tap coefficient is decreased until the detected peak value falls below the ideal value (Error > 0). The "ideal" value used for comparison is externally applied and corresponds directly to the wanted eye amplitude at the receiver. Hence the eye amplitude is adjustable from the maximum design value of 300 - 350 mV down to one least significant bit (LSB) of the current-mode DAC depending on the attenuation of the channel.



Fig. 3: Flowchart showing the pilot signalling and peak detection adaptation method. Four pilot signals corresponding to the first four filter taps are also shown. Adaptation engine adapted from the FPGA implementation presented in [17].

A low frequency return path communicates with the transmitter to count down the tap coefficient value until an end-of-conversion signal is received. The next filter tap is then initialized to its maximum, while the previous tap is kept at its determined value. Each adjustment decreases the filter tap by one LSB, which results in a 1.58 % decrease in the maximum achievable swing. This step size was deemed negligible compared to the ease of implementing the iterative structure in CMOS, as well as to the tolerance of the terminating 50 $\Omega$ on-chip resistor. Using the same procedure, the optimal coefficient is determined for the new filter tap. This iterative process continues until all the optimal filter tap coefficients have been obtained.
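The adaptation loop described above can be sketched behaviourally as follows. This is a minimal sketch under stated assumptions and not the implemented circuit: the callable `measure_peak`, which stands in for transmitting a pilot signal and detecting the received peak, the 6-bit code range and the toy link model `toy_peak` are all hypothetical.

```python
from typing import Callable, List

MAX_CODE = 63  # assumed full-scale code of a 6-bit DAC

def adapt_taps(measure_peak: Callable[[List[int], int], float],
               ideal_mv: float, n_taps: int = 6) -> List[int]:
    """Train the taps one at a time by counting down from the maximum code."""
    codes = [0] * n_taps                     # all filter taps start at zero
    for k in range(n_taps):
        codes[k] = MAX_CODE                  # tap under training starts at maximum
        # Count down one LSB at a time while the detected peak for pilot k
        # still exceeds the externally applied ideal value.
        while codes[k] > 0 and measure_peak(codes, k) > ideal_mv:
            codes[k] -= 1
        # Leaving the loop corresponds to end-of-conversion; tap k is now held.
    return codes

# Toy link model (hypothetical): the peak for pilot k depends only on tap k.
LSB_MV = 350.0 / MAX_CODE
TOY_GAIN = [0.13, 0.125, 0.11, 0.097, 0.083, 0.071]

def toy_peak(codes: List[int], k: int) -> float:
    return codes[k] * LSB_MV * TOY_GAIN[k]

codes = adapt_taps(toy_peak, ideal_mv=20.0)
```

Each tap ends at the largest code whose detected peak does not exceed the ideal value. In a real link the measured peak for pilot *k* also depends on the previously trained taps, which the toy model deliberately ignores.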

The channel and the filter can be represented by their impulse responses, under the assumption that the channel can be modelled as a linear time-invariant (LTI) system. For a low pass channel response, as is the case for copper backplane channels, the first channel coefficient will always be the largest in the channel impulse response. The first filter tap is therefore determined first and then used in determining the following tap, after which the iterative process continues. The final received data is calculated by taking the time domain convolution of the FIR filter impulse response, the channel impulse response and the specific data sequence. This can be expressed as:

$$\begin{aligned} y_{FINAL}(n) &= I_{CHN} \otimes I_{FIR} \otimes D_{sequence} \\ &= y(n) \otimes D_{sequence} \end{aligned}$$

where  $y_{FINAL}$  is the final received data,  $I_{CHN}$  the channel impulse response,  $I_{FIR}$  the filter impulse response and  $D_{sequence}$  is the transmitted data sequence. The data sequences used for the adaptation process can now be used in evaluation of the pilot signalling and peak detection method of adaptive FIR pre-emphasis. The data sequences for all six filter taps follow the same pattern as was indicated in Fig. 3. The resulting received data streams are expressed in Appendix A.
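The chained convolution can be checked numerically with a small example. The two-coefficient channel and filter responses below are hypothetical values chosen for illustration, and the data sequence of two consecutive ones is likewise an assumption.

```python
import numpy as np

c = np.array([1.0, 0.5])     # hypothetical channel impulse response I_CHN
h = np.array([1.0, -0.4])    # hypothetical FIR filter impulse response I_FIR
d = np.array([1.0, 1.0])     # assumed data sequence: two consecutive ones

y = np.convolve(c, h)        # combined filter and channel response y(n)
y_final = np.convolve(y, d)  # final received samples

# y       is [1.0, 0.1, -0.2]
# y_final is [1.0, 1.1, -0.1, -0.2]
```

The first two samples, $h_0c_0$ and $h_0c_0 + h_1c_0 + h_0c_1$, agree with the corresponding entries of $y_{FINAL}(1)$ in Appendix A. The later samples of the full convolution contain the additional $h_1c_1$ term, which the appendix expression does not list.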

Pilot signalling and peak detection is easier to implement than a least mean squares (LMS) convergence engine, since it uses a low frequency return path and not two identical master-slave transceivers and channels, which is the more popular method for implementing adaptive pre-emphasis [16]. The low frequency feedback path hence requires no careful matching and layout considerations when implementing such a link on a printed circuit board. The tap adjustments are accomplished by adjusting the tail current source, which is a 6-bit current-mode DAC. The different current components representing the filtered signal to be transmitted are inherently added together in the current-mode logic (CML) transmitter.
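As a rough check on the step size, and assuming that the quoted 1.58 % decrease per LSB refers to one LSB of this 6-bit DAC relative to its full-scale code, the relative step and the corresponding share of a 350 mV maximum swing work out as:

$$\frac{1}{2^{6} - 1} = \frac{1}{63} \approx 1.587\,\%, \qquad \frac{350\ \text{mV}}{63} \approx 5.56\ \text{mV}$$

The millivolt value is only indicative, since it ignores channel attenuation and resistor tolerances.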

Pilot signalling and peak detection has the disadvantage that the data stream needs to be interrupted while the optimal tap coefficients are found, which is not the case for a master-slave convergence strategy. The number of filter taps implemented is also limited by the available real estate on the die, since each tap requires a DAC (for tail current control), a CML XOR gate (for tap weight sign control) and a CML D flip-flop (for time delay control). The real estate limitation is, however, also a consideration when implementing the more popular LMS engine. This work implemented six FIR filter taps for reducing the DDJ, of which the first filter tap acts as the pre-driver.

### 4. CMOS implementation

The novel CMOS implemented system is illustrated in Fig. 4. The multiplexer selects whether the data or the training sequences are transmitted, depending on the current state of the adaptation engine. The training sequences are generated by a pilot signal generator, comprising a CMOS read only memory for storing the specific training sequences and a CML parallel-load, serial-out shift register for transmitting the training sequence to the FIR pre-emphasis driver.



Fig. 4: Complete implemented system block diagram.

The state of the circuit is controlled by three main control signals, namely ADJUST, EOC and SHIFT. The ADJUST control signal adjusts the current-mode DAC by decreasing its value by one least significant bit (LSB). The SHIFT control signal specifies that the pilot signal generator should load and shift the current training sequence. When a SHIFT control signal is received without an ADJUST signal, the third control signal, end of conversion (EOC), is generated. This specifies that the current filter tap is at its optimum value and that the state can now change to the next FIR filter tap. This process continues until all filter taps have been trained and the state changes to accept and transmit the applied data. The receiver amplifies the received pulses and compares them to the externally applied threshold voltage, which dictates the vertical eye amplitude at the receiver.
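The interaction of the three control signals can be sketched as a small behavioural state machine. The class `TransmitterControl`, its fields and the 6-bit code range are hypothetical names and assumptions made for illustration. The sketch models only the behaviour described in the text and not the implemented CMOS logic.

```python
from dataclasses import dataclass, field
from typing import List

MAX_CODE = 63   # assumed full-scale code of a 6-bit DAC
N_TAPS = 6

@dataclass
class TransmitterControl:
    # Assumed starting point: first tap at maximum, remaining taps at zero
    codes: List[int] = field(default_factory=lambda: [MAX_CODE] + [0] * (N_TAPS - 1))
    tap: int = 0            # index of the tap currently being trained
    training: bool = True   # False once normal data is passed to the filter

    def step(self, shift: bool, adjust: bool) -> bool:
        """Apply one set of return-path signals; return True when EOC is generated."""
        if not self.training:
            return False
        if adjust:
            # ADJUST: decrease the DAC of the current tap by one LSB
            self.codes[self.tap] = max(0, self.codes[self.tap] - 1)
            return False
        if shift:
            # SHIFT without ADJUST: end of conversion for the current tap
            self.tap += 1
            if self.tap == N_TAPS:
                self.training = False            # all taps trained, transmit data
            else:
                self.codes[self.tap] = MAX_CODE  # next tap starts at its maximum
            return True
        return False

tc = TransmitterControl()
tc.step(shift=True, adjust=True)           # one LSB down on the first tap
eoc = tc.step(shift=True, adjust=False)    # first tap done, move to the second
```

After these two steps the first tap holds code 62 and the second tap is initialized to its maximum, ready for training.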

The implemented MOS CML FIR filter structure is illustrated in Fig. 5. The FIR filter tap weights are adjusted with the current-mode DAC, while all the current contributions of the filtering action are inherently added in the  $50 \Omega$  termination resistors.



Fig. 5: MOS CML FIR filter implementation.

Another important aspect of the design is the ability to generate the pilot signals and transmit them through the channel for filter adaptation. The pilot signal generator shown in Fig. 6 combines the high frequency capability of CML gates, to handle the fast data throughput, with the logic implementation ability of CMOS gates. The SPICE simulated pilot signals at 10 Gb/s are also shown in Fig. 6, one after the other.



Fig. 6: Block diagram of the implemented pilot signal generator combining CML and CMOS gates.

The pilot signal generator is controlled by CMOS logic as indicated in Fig. 6. A 3-bit counter keeps track of which pilot signal to select in the read-only memory to be placed on the CML parallel-load shift register. The noise in the pilot signals is due to the clock leaking through the CML D flip-flops. A CML multiplexer is used to allow either pilot signals or data to be transmitted to the FIR pre-emphasis filter.
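The structure of the pilot signal generator (counter, read-only memory and parallel-load shift register) can be modelled behaviourally as follows. The class `PilotGenerator`, the 8-bit word width and the ROM contents are placeholders assumed for illustration. The actual pilot patterns are those shown in Fig. 3 and are not reproduced here.

```python
from typing import List

# Placeholder ROM contents (hypothetical), one word per 3-bit address
ROM = [0b10000000, 0b11000000, 0b10100000, 0b10010000,
       0b10001000, 0b10000100, 0b00000000, 0b00000000]

class PilotGenerator:
    def __init__(self, rom: List[int], width: int = 8) -> None:
        self.rom = rom
        self.width = width
        self.address = 0      # 3-bit counter selecting the pilot signal
        self.register = 0     # parallel-load, serial-out shift register

    def load(self) -> None:
        """Parallel-load the selected ROM word into the shift register."""
        self.register = self.rom[self.address]

    def shift_out(self) -> List[int]:
        """Shift the register out serially, most significant bit first."""
        bits = []
        for _ in range(self.width):
            bits.append((self.register >> (self.width - 1)) & 1)
            self.register = (self.register << 1) & ((1 << self.width) - 1)
        return bits

    def next_pilot(self) -> None:
        """Advance the 3-bit counter, wrapping around after address 7."""
        self.address = (self.address + 1) % 8

gen = PilotGenerator(ROM)
gen.load()
first = gen.shift_out()    # [1, 0, 0, 0, 0, 0, 0, 0]
gen.next_pilot()
gen.load()
second = gen.shift_out()   # [1, 1, 0, 0, 0, 0, 0, 0]
```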

### 5. Results

This section contains the mathematical and SPICE circuit level simulations as well as the experimental results achieved. The mathematical simulations were done using MATLAB, while the SPICE simulations were done using Cadence Virtuoso and the 0.18 µm IBM 7WL SiGe BiCMOS process.

### 5.1 Mathematical simulation results

The eye diagrams for one to six filter taps are illustrated together with the corresponding DDJ situated around the pulse edges. Fig. 7 through Fig. 9 illustrate the improvement made with each added filter tap. With only one filter tap implemented, the eye diagram at the receiver is completely closed. This is also clear from the jitter situated around the pulse edges having a distribution wider than a unit interval. The DDJ reduces with each added filter tap, reaching within 10 % of a unit interval with six filter taps at a data rate of 10 Gb/s. The data integrity can be improved further with the addition of equalisation in the receiver.



Fig. 7: Eye diagrams for one and two filter taps as well as their corresponding DDJ situated around their pulse edges.



Fig. 8: Eye diagrams for three and four filter taps as well as their corresponding DDJ situated around their pulse edges.



Fig. 9: Eye diagrams for five and six filter taps as well as their corresponding DDJ situated around their pulse edges.

### 5.2 SPICE simulation results

With the application of 6-tap FIR pre-emphasis the eye diagram is dramatically improved. The filter was adapted for an eye amplitude of 30 mV at the receiver. The simulated channel compares well to implemented 30" FR-4 channels and can be regarded as a worst-case channel. The improved eye diagram at the receiver, utilising six FIR filter taps, is illustrated in Fig. 10(f).

The total DDJ present in the received pulses presented in Fig. 10 versus the number of FIR filter taps implemented is summarised in Table 1. The improved eye diagram at the receiver, with six active filter taps, has a vertical eye amplitude of 20 mV and a horizontal eye opening of 175 ps. With the simulated data rate of 5 Gb/s, the eye diagram is sufficiently open for clock and data recovery to follow. The DDJ present in the received signal is 25 ps, which equates to 12.5 % of the unit interval. Using an ideal channel without any bandwidth limitations, the eye opening should approach the maximum designed swing of the CML FIR filter, namely 350 mV. Due to the worst-case channel used for simulation, the eye opening is severely attenuated, by approximately 23 dB, but still remains open, which demonstrates the effectiveness of the FIR filter structure. The eye amplitude attenuation is comparable to the channel loss exhibited at 2.5 GHz.
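The jitter percentages follow directly from the unit interval at the simulated data rate. The short sketch below reproduces the quoted figures. The function names are illustrative, and the horizontal eye opening is assumed to be the unit interval minus the peak-to-peak DDJ, which is consistent with the values given.

```python
def unit_interval_ps(rate_gbps: float) -> float:
    """Unit interval in picoseconds for a data rate given in Gb/s."""
    return 1000.0 / rate_gbps

def ddj_percent_ui(ddj_ps: float, rate_gbps: float) -> float:
    """DDJ expressed as a percentage of the unit interval."""
    return 100.0 * ddj_ps / unit_interval_ps(rate_gbps)

ui = unit_interval_ps(5.0)           # 200 ps at 5 Gb/s
share = ddj_percent_ui(25.0, 5.0)    # 12.5 % of the unit interval for 25 ps of DDJ
opening = ui - 25.0                  # 175 ps horizontal eye opening (assumed relation)
```

The same function gives 25 % for the 50 ps entries and 14 % for the 28 ps entries of Table 1.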





Fig. 10: Improved eye diagrams at the receiver with the application of FIR filter taps. (a) 1-tap FIR pre-emphasis applied. (b) 2-tap FIR pre-emphasis applied. (c) 3-tap FIR pre-emphasis applied. (d) 4-tap FIR pre-emphasis applied. (e) 5-tap FIR pre-emphasis applied. (f) 6-tap FIR pre-emphasis applied.

### 5.3 Experimental results

The experimental results show a completely functional 1<sup>st</sup> filter tap of the transmitter. Because of the low quality of the pulses received from the pulse generation circuits, caused by process and current driving capability variations, the CMOS logic is unable to change the transmitter state. The transmitter is hence stuck in the default reset state. However, basic functionality of the first filter tap was shown. Averaging over three tested integrated circuits (ICs) showed an output voltage swing deviation of less than 14 %. Considering the tolerances of the on-chip termination resistors as well as the off-chip biasing resistors, a 14 % deviation is quite acceptable and can be corrected with fine tuning. Fig. 11 illustrates random data transmitted at a low data rate of 10 Mb/s. As expected, at such a low data rate the copper cables used do not have an effect on the DDJ in the system.



Fig. 11: Differential output signal of the transmitter on the first state, the reset state. The data rate used for the functionality tests was 10 Mb/s.

As already mentioned, the pulses generated are of insufficient quality to switch the CMOS logic. The same pulse generators are used in the receiver for generating the lower frequency pulses for the return path. The return pulses are generated by connecting the transmitter directly to the receiver. The low quality return pulses are illustrated in Fig. 12.



Fig. 12: Close-up view of the pulses sent back to the transmitter. The designed pulse widths are overlaid to distinguish between the two pulses.

The pulses shown in Fig. 12 are of insufficient amplitude to properly switch the CMOS logic. It is, however, noticed that the first pulse, which is used to trigger the second pulse, can just about trigger the CMOS logic before decaying. The pulses are generated whenever the differential voltage received is larger than the set threshold value, which is the case as illustrated.

The power consumption measured with only one active filter tap ( $P_{meas} = 36$  mW) corresponds to the simulated value ( $P_{sim} = 32.5$  mW) with one active filter tap. These values are consistent with literature on similar systems and data rates [1, 2]. The receiver, on the other hand, being fully functional except for the low quality pulses generated, dissipates 19.8 mW compared to the simulated value of 18 mW. The deviations can be attributed to the off-chip biasing in both the transmitter and the receiver, which is not taken into account in the simulated power values.

### 6. Conclusion

This paper dealt with the problem of limited off-chip bandwidth causing severe DDJ at the far end of a high speed serial link implemented on a conventional copper backplane. Adaptive FIR pre-emphasis was demonstrated as a means to alleviate this problem by extending the -3 dB cut-off frequency of the total channel frequency response. The pilot signalling and peak detection method of adaptive FIR pre-emphasis was implemented in the 0.18 µm IBM 7WL SiGe BiCMOS process. The technique was implemented using high speed CML circuits operating in excess of 5 Gb/s whilst still maintaining an internal voltage swing of at least 300 - 350 mV. The experimental results have shown an output swing deviation of less than 14 % when averaged over three tested ICs. The voltage swing at the output of the transmitter is hence large enough for data transmission over a typical copper channel. Each of the six implemented FIR filter taps consumes less than 10 mW when driven at maximum tail current. This brings the total power dissipation of the FIR filter to less than 60 mW when all filter taps are driven at their maximum. Since this will never be the case in practice, the average power consumption is approximately 45-50 mW. For the experimental results presented, the power consumed in the FIR filter transmitter, including all biasing circuits, with the first filter tap active was 36 mW from a 1.8 V supply. The power consumption of the transmitter is quite high due to the CML implementation approach. However, all peripheral circuits that perform no function while the filter taps are constant, which includes the receiver, can be switched off to save power.

The adaptive FIR pre-emphasis technique employed reduced the DDJ situated around the pulse edges of the received signal to below 15 % of a unit interval, corresponding to a high fidelity signal. The specification of 15 % jitter was already met with only three implemented filter taps, showing the ability to adapt for even worse channels or a higher speed than was used in the SPICE simulation. It should further be noted that the transversal filter structure implemented requires a bit delay between successive filter taps, which is controlled by a clock signal. Any phase noise on the clock signal will translate directly into random jitter in the transmitted pulse sequence. Hence the filter structure as implemented can only reduce the DDJ in the system and, through proper design, minimise the DCD. The final circuit layouts are presented in Fig. 13, overlaid on the MPW run IC for identification purposes, and consume less than 1 mm<sup>2</sup> of silicon real estate.



Fig. 13: MPW run IC sponsored by the MOSIS educational programme (MEP), shared with two other projects. The layouts of the transmitter and the receiver are overlaid for easy identification.

### Acknowledgements

The authors would like to thank ARMSCOR, the Armaments Corporation of South Africa Ltd, (Act 51 of 2003) and the Council for Scientific and Industrial Research (CSIR) for funding this research. The authors would further like to thank MOSIS for accepting this project into their educational program (MEP) and allowing for a free multi-project wafer (MPW) run.

### References

- [1] M. Bichan and A.C. Carusone, "A 6.5 Gb/s backplane transmitter with 6-tap FIR Equaliser and variable tap spacing", *Proc. of IEEE Custom Integrated Circuits Conf.*, pp. 611-614, San Jose, 13-16 Sept. 2008.
- [2] D. Tonietto, J. Hogeboon, E. Bensoudane, S. Sadeghi, H. Khor, P. Krotnev, "A 7.5Gb/s transmitter with self-adaptive FIR", *Proc. of IEEE Symp. on VLSI Circuits Digest of Technical Papers*, Honolulu, pp. 198-199, 18-22 Jun. 2008.
- [3] A. Kuo, R. Rosales, T. Farahmand, S. Tabatabaei, and A. Ivanov, "Crosstalk bounded uncorrelated jitter (BUJ) for high speed interconnects", *IEEE Trans. on instrumentation and measurement*, Vol. 54, No. 5, Oct. 2005.

- [4] M. Li, T. Kwasniewski, S. Wang and Y. Tao, "A 10 Gb/s transmitter with multi-tap FIR pre-emphasis in 0.18µm CMOS technology", *Proc. of the 2005 IEEE Asia South Pacific design automation conf.*, Shanghai, pp 679-682, 18-21 Jan. 2005.
- [5] C.H. Lin, C.H. Wang and S.J. Jou, "5 Gbps serial link transmitter with pre-emphasis", *Proc. of the 2003 IEEE Asia South Pacific design automation conf.*, Kitakyushu, pp. 795-800, 21-24 Jan. 2003.
- [6] F. Weiss, D. Kehrer and A.L. Scholtz, "Transmitter and receiver circuits for serial data transmission over lossy copper channels for 10 Gb/s in 0.13 μm CMOS", *Proc. of the IEEE radio frequency integrated circuits (RFIC) symp.*, San Francisco, pp. 397-400, 11-13 Jun. 2006.
- [7] R. Farjad-Rad, C.K.K. Yang, M.A. Horowitz and T.H. Lee, "A 0.4 µm CMOS 10 Gb/s 4-PAM pre-emphasis serial link transmitter", *IEEE J. of solid-state circuits*, Vol. 34, No. 5, pp. 580-585, May 1999.
- [8] K. Yoo, G. Han and S. Park, "A 10 Gbps analog adaptive equaliser and pulse shaping circuit for backplane interface", *Proc. of the 5<sup>th</sup> World scientific and engineering academy and society Int. conf. on circuits, systems, electronics, control & signal processing*, Dallas, pp. 225-229, 1-3 Nov. 2006.
- [9] S. Rylov, S. Reynolds, D. Storaska, B. Floyd, M. Kapur, T. Zwick, S. Gowda, and M. Sorna, "10+ Gb/s 90-nm CMOS serial link demo in CBGA package", *IEEE J. of solid-state circuits*, Vol. 40, No. 9, Sept. 2005.
- [10] R. Farjad-Rad, C.K. Yang, M.A. Horowitz, and T. Lee, "A 0.3-μm CMOS 8-Gb/s 4-PAM serial link transceiver", *IEEE J. of solid-state circuits*, Vol. 35, No. 5, May 2000.
- [11] M. Cases, D.N. de Araujo and E. Matoglu, "Electrical design and specification challenges for high speed serial links", *Proc. of IEEE Electronics Packaging Tech. Conf.*, Vol. 1, Singapore, pp. 29-33, 7-9 Dec. 2005.
- [12] A. Kuo, T. Farahmand, N. Ou, S. Tabatabaei and A. Ivanov, "Jitter models and measurement methods for high-speed serial interconnects", *Proc. of Int. Test Conf. (ITC)*, Charlotte, pp. 1295-1302, 26-28 Oct. 2004.
- [13] P. K. Hanumolu, G. Wei and U. Moon, "Equalizers for high speed serial links", *Int. J. of high speed electronics and systems*, Vol. 15, No. 2, Feb. 2005.
- [14] C.Y. Yang and Y. Lee, "A 0.18 µm CMOS 1 Gb/s serial link transceiver by using PWM and PAM techniques", *Proc. of the IEEE Int. symp. on circuits and systems*, Vol. 2, pp. 1150-1153, Kobe, 23-26 May 2005.
- [15] M. Li, T. Kwasniewski, S. Wang and Y. Tao, "FIR filter optimization as pre-emphasis of high speed backplane data transmission", *Proc of the Int. IEEE conf. on communications, circuits and systems*, Chengdu, Vol. 2, pp. 773-776, 27-29 Jun. 2004.
- [16] D. Tonietto, J. Hogeboon, E. Bensoudane, S. Sadeghi, H. Khor, P. Krotnev, "A 7.5Gb/s transmitter with self-adaptive FIR", *Proc. of IEEE Symp. on VLSI Circuits Digest of Technical Papers*, Honolulu, pp. 198-199, 18-22 Jun. 2008.
- [17] K. Yoo and G. Han, "An adaptation method for FIR pre-emphasis filter on backplane channel", *Proc. of IEEE Int. Symp. on Circuits and Systems*, Island of Kos, pp. 5151-5154, 21-24 May 2006.

### Appendix A: Detailed mathematical expressions for each of the applied FIR filter taps

$$y_{FINAL}(0) = [h_0c_0]$$

$$y_{FINAL}(1) = [h_0c_0, h_0c_0 + h_1c_0 + h_0c_1, h_1c_0 + h_0c_1]$$

$$= [h_0c_0, h_0(c_0 + c_1) + h_1c_0, h_1c_0 + h_0c_1]$$

$$y_{FINAL}(2) = \begin{bmatrix} h_0c_0, h_1c_0 + h_0c_1, h_0c_0 + h_2c_0 + h_1c_1 + h_0c_2, h_1c_0 + h_0c_1, \\ h_2c_0 + h_1c_1 + h_0c_2 \end{bmatrix}$$

$$= \begin{bmatrix} h_0c_0, h_1c_0 + h_0c_1, h_0(c_0 + c_2) + h_2c_0 + h_1c_1, h_1c_0 + h_0c_1, \\ h_2c_0 + h_1c_1 + h_0c_2 \end{bmatrix}$$

$$y_{FINAL}(3) = \begin{bmatrix} h_0c_0, h_1c_0 + h_0c_1, h_2c_0 + h_1c_1 + h_0c_2, h_3c_0 + h_2c_1 + h_1c_2 + h_0c_3 + h_0c_0, \\ h_1c_0 + h_0c_1, h_2c_0 + h_1c_1 + h_0c_2, h_3c_0 + h_2c_1 + h_1c_2 + h_0c_3 \end{bmatrix}$$

$$= \begin{bmatrix} h_0c_0, h_1c_0 + h_0c_1, h_2c_0 + h_1c_1 + h_0c_2, h_3c_0 + h_2c_1 + h_1c_2 + h_0(c_3 + c_0), \\ h_1c_0 + h_0c_1, h_2c_0 + h_1c_1 + h_0c_2, h_3c_0 + h_2c_1 + h_1c_2 + h_0c_3 \end{bmatrix}$$

$$y_{FINAL}(4) = \begin{bmatrix} h_0c_0, h_1c_0 + h_0c_1, h_2c_0 + h_1c_1 + h_0c_2, h_3c_0 + h_2c_1 + h_1c_2 + h_0c_3, \\ h_4c_0 + h_3c_1 + h_2c_2 + h_1c_3 + h_0c_4 + h_0c_0, h_1c_0 + h_0c_1, \\ h_2c_0 + h_1c_1 + h_0c_2, h_3c_0 + h_2c_1 + h_1c_2 + h_0c_3, \\ h_4c_0 + h_3c_1 + h_2c_2 + h_1c_3 + h_0c_4 \end{bmatrix}$$

$$= \begin{bmatrix} h_0c_0, h_1c_0 + h_0c_1, h_2c_0 + h_1c_1 + h_0c_2, h_3c_0 + h_2c_1 + h_1c_2 + h_0c_3, \\ h_4c_0 + h_3c_1 + h_2c_2 + h_1c_3 + h_0(c_4 + c_0), h_1c_0 + h_0c_1, \\ h_2c_0 + h_1c_1 + h_0c_2, h_3c_0 + h_2c_1 + h_1c_2 + h_0c_3, \\ h_4c_0 + h_3c_1 + h_2c_2 + h_1c_3 + h_0c_4 \end{bmatrix}$$

$$y_{FINAL}(5) = \begin{bmatrix} h_0c_0, h_1c_0 + h_0c_1, h_2c_0 + h_1c_1 + h_0c_2, h_3c_0 + h_2c_1 + h_1c_2 + h_0c_3, \\ h_4c_0 + h_3c_1 + h_2c_2 + h_1c_3 + h_0c_4, \\ h_5c_0 + h_4c_1 + h_3c_2 + h_2c_3 + h_1c_4 + h_0c_5 + h_0c_0, \\ h_1c_0 + h_0c_1, h_2c_0 + h_1c_1 + h_0c_2, h_3c_0 + h_2c_1 + h_1c_2 + h_0c_3, \\ h_4c_0 + h_3c_1 + h_2c_2 + h_1c_3 + h_0c_4, \\ h_5c_0 + h_4c_1 + h_3c_2 + h_2c_3 + h_1c_4 + h_0c_5 \end{bmatrix}$$

$$= \begin{bmatrix} h_0c_0, h_1c_0 + h_0c_1, h_2c_0 + h_1c_1 + h_0c_2, h_3c_0 + h_2c_1 + h_1c_2 + h_0c_3, \\ h_4c_0 + h_3c_1 + h_2c_2 + h_1c_3 + h_0c_4, \\ h_5c_0 + h_4c_1 + h_3c_2 + h_2c_3 + h_1c_4 + h_0(c_5 + c_0), \\ h_1c_0 + h_0c_1, h_2c_0 + h_1c_1 + h_0c_2, h_3c_0 + h_2c_1 + h_1c_2 + h_0c_3, \\ h_4c_0 + h_3c_1 + h_2c_2 + h_1c_3 + h_0c_4, \\ h_5c_0 + h_4c_1 + h_3c_2 + h_2c_3 + h_1c_4 + h_0c_5 \end{bmatrix}$$

Table 1: DDJ present in the received signal.

| FIR taps | Jitter (ps) | % of UI | Eye amplitude (mV) |
|----------|-------------|---------|--------------------|
| 1        | -           | ≥ 100   | -                  |
| 2        | 50          | 25      | 14                 |
| 3        | 50          | 25      | 16                 |
| 4        | 28          | 14      | 19                 |
| 5        | 28          | 14      | 20                 |
| 6        | 25          | 12.5    | 20                 |