# Lawrence Berkeley Laboratory

UNIVERSITY OF CALIFORNIA

## Physics Division

Presented at the IEEE 1995 Nuclear Science Symposium and Medical Imaging Conference, San Francisco, CA, October 21–28, 1995, and to be published in the Proceedings

A CMOS Delay Locked Loop and Sub-Nanosecond Time-to-Digital Converter Chip

D.M. Santos, S.F. Dow, and M.E. Levi

December 1995

#### DISCLAIMER

This document was prepared as an account of work sponsored by the United States Government. While this document is believed to contain correct information, neither the United States Government nor any agency thereof, nor The Regents of the University of California, nor any of their employees, makes any warranty, express or implied, or assumes any legal responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by its trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof, or The Regents of the University of California. The views and opinions of authors expressed herein do not necessarily state or reflect those of the University of California.

Ernest Orlando Lawrence Berkeley National Laboratory is an equal opportunity employer.

## A CMOS Delay Locked Loop And Sub-Nanosecond Time-to-Digital Converter Chip

Dinis M. Santos
Aveiro University, 3810 Aveiro, Portugal

Scott F. Dow, Michael E. Levi
E.O. Lawrence Berkeley National Laboratory, University of California, Berkeley, CA 94720

This work was supported by the Director, Office of Energy Research, Office of High Energy and Nuclear Physics, Division of High Energy Physics, of the U.S. Department of Energy under Contract No. DE-AC03-76SF00098.

#### Abstract

Many high energy physics and nuclear science applications require sub-nanosecond time resolution measurements over many thousands of detector channels. Phase-locked loops have been employed in the past to obtain accurate time references for these measurements. An alternative solution, based on a delay-locked loop (DLL), is described. This solution allows for a very high level of integration yet still offers resolution in the sub-nanosecond regime. Two variations on this solution are outlined. In the first, a novel phase detector, based on the Muller C-element, is used to implement a charge pump where the injected charge approaches zero as the loop approaches lock on the leading edge of an input clock reference. This greatly reduces timing jitter. In the second variation the loop locks to both the leading and trailing clock edges. In this second implementation, software-coded layout generators are used to automatically lay out a highly integrated, multi-channel time-to-digital converter (TDC). Complex clock generation can be achieved by taking symmetric taps off the delay elements. The two circuits, DLL and TDC, were implemented in 1.2 µm and 0.8 µm CMOS technologies, respectively. Test results show a timing jitter of less than 35 ps for the DLL circuit and better than 135 ps resolution for the TDC circuit.

#### I. INTRODUCTION

The generation of precise time delays with very low time jitter is useful in many time measurement applications in nuclear and high energy physics. Most solutions use phase-locked loops (PLLs).
Delay-locked loops (DLLs) may, however, be preferable due to their inherently better stability. In designing DLLs, one of the main problems is the phase comparator. Unlike the situation in synchronization circuits for telecommunications, the phase comparator must distinguish a positive (leading) from a negative (lagging) delay between the relevant features of the two waveforms to be compared. Several solutions have been proposed for this problem [1,2]. This design is proposed as a robust alternative when only one frequency of operation is of interest. The circuit lends itself to use as a kernel in a number of applications, such as clock pulse alignment and de-skewing, time interval generation for serialization and de-serialization schemes in telecommunications, and time-to-digital conversion.

Direct time-to-digital conversion, as opposed to time-to-amplitude conversion followed by a multichannel analyzer, requires excessive clock speeds if sub-nanosecond time resolutions are to be achieved. An improvement can be made if the period of the clock is finely divided into smaller intervals and then edges, not pulses, are counted. In this work a tapped, electrically controllable delay line was used to obtain edges delayed by exactly 1/32 of the clock period. Precise spacing of the successive pulse edges was achieved by means of a feedback scheme, forming a delay-locked loop (DLL).

#### II. DELAY LOCKED LOOP

An overall block diagram of the DLL circuit is shown in Fig. 1.

Fig. 1 Block diagram of the DLL chip

The tapped delay chain is controlled by a DC voltage generated by the phase detector, the operating principle of which ensures that feedback is always negative. The multiplexer is built along conventional lines and will not be discussed here. The critical components of the diagram in Fig. 1 are the delay chain and the phase detector.

#### A. Delay Chain

The delay chain is a modified version of the one used by Arai et al. [3]. One cell is depicted in Fig. 2.
During the time when the input of the circuit in Fig. 2 is high, the current available to discharge the parasitic capacitance C<sub>1</sub> seen from point B to ground is the current through M5. This in turn determines the time delay between turning on M2 and turning off M4. It can be readily observed that this variable delay acts on the trailing edge of the waveform at the output of the inverter M2-M4. A similar line of thought applies to M6 with respect to the charging rate of the parasitic capacitance C<sub>2</sub>, with the difference that now the gate voltage of M6 controls the delay of the leading edge of the waveform at the output of the circuit. If this cell is part of a chain of $n$ similar cells, a pulse delayed by $nt$ with respect to the input appears at the output, where $t$ is the delay of the cell represented in Fig. 2. This delay is controlled by the gate voltages of M5 and M6. The range of delay can be set by sizing the transistors in the circuit.

Fig. 2 Delay element unit cell.

#### B. Phase Detector

The causal relationship between the two waveforms to be compared (A and B in Fig. 4) makes it necessary for the phase detector to distinguish between a phase lag of $\alpha$ and a phase lead of $2n\pi - \alpha$. This is distinct from the situation in a PLL, where the two waveforms whose phase is to be compared are uncorrelated. The P and N pulses needed to drive the charge pump of Fig. 3 are generated by a phase detector based on the Muller C-element [4].

Fig. 3 Complete phase detector

Fig. 4 shows the relevant idealized waveforms for the complete phase detector circuit. It is readily observed in Fig. 4 that the output of the C-element follows whichever of the inputs, A or B, lags the other. The only condition for this is that the A and B pulses must overlap. The addition of some simple logic permits the generation of the P and N signals to control the pump that generates the control voltage $V_{\text{C}}$ for the delay chain.
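The statement that the C-element output follows the lagging input can be checked with a small behavioral model. The sketch below is only illustrative: it assumes an ideal, zero-delay C-element sampled at discrete time steps, and the function names and sample waveforms are hypothetical rather than taken from the chip. The logic that derives the P and N signals is not modeled.

```python
def c_element(a: int, b: int, prev: int) -> int:
    """Ideal Muller C-element: copy the inputs when they agree, otherwise hold."""
    return a if a == b else prev


def simulate(a_wave, b_wave, initial=0):
    """Return the C-element output for two sampled binary waveforms."""
    out, state = [], initial
    for a, b in zip(a_wave, b_wave):
        state = c_element(a, b, state)
        out.append(state)
    return out


# Hypothetical sampled pulses: B is A delayed by two samples, and the pulses overlap.
A = [0, 1, 1, 1, 1, 0, 0, 0, 0]
B = [0, 0, 0, 1, 1, 1, 1, 0, 0]

C = simulate(A, B)
print(C)  # prints [0, 0, 0, 1, 1, 1, 1, 0, 0], identical to the lagging input B
```

Because the element is symmetric in its inputs, swapping the arguments gives the same output, so the lagging waveform is reproduced regardless of which input it is applied to.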
As implemented, the circuit operates on the falling edge of waveforms A and B. Feedback is negative, as needed: a positive change in $V_{\text{C}}$ decreases the delay. Under lock, P will stay at 1 and N at 0. $V_{\text{C}}$ only has to change in order to compensate for the charge lost from the holding capacitor $C_h$ through leakage. A very low jitter is thus to be expected.

Fig. 4 Idealized waveforms of the phase detector circuit showing the proportional control.

The DLL circuit was implemented in a 1.2 µm, double-metal, single-poly technology through MOSIS. A 16-element delay chain was built. Care was taken to ensure that the load presented at each tap of the chain was similar. This is guaranteed by a decoder using conventional gate logic. Appropriate termination of the delay chain was achieved by proper sizing of the load on the last cell. A reset-on-power-up circuit was also added in order to provide for proper initialization. This circuit ensures that the chain starts at minimum delay (approximately 18 ns), so that the A and B waveforms in Fig. 4 are certain to overlap.

#### III. TDC CIRCUIT

#### A. Dual Edge Phase Detector

For the TDC circuit a pair of Muller C-element based phase detectors is used. The two phase detectors lock separately, one to the leading edge and one to the falling edge of the reference clock. The dual edge control allows very long delay chains to be constructed without concern for signal dispersion or extinction due to potentially different propagation speeds of the two clock edges. Along with some logic and two charge pumps, this circuit generates, with a minimum number of components, the two control voltages Vc1 and Vc2 needed to control the delay chain. It can be shown that the two resulting feedback loops, being of first order, are inherently stable. Similar TDC circuits have been previously reported; however, these are based on single-edge control [3,5,6].

#### B. TDC Implementation

The simplified block diagram of the TDC circuit is shown in Fig. 5.
The critical blocks are the DLL and the dynamic latches. The delay chain shown is a chain of double inverters with a dual phase detector for separately controlling the delay on the leading and the falling edge. In this variation the control voltages Vc1 and Vc2 separately connect to the control points on M4 and M5 in the delay element. The simulation results for the time delay of the delay element unit cell are shown in Fig. 6. The result shown is given for Vc1=Vc2.

Fig. 5 Block diagram of the TDC chip

#### C. Latching Scheme

Operation of the TDC output latches is shown in Fig. 5. The first set of latches (register) fulfills the function of storing the state of the STOP line along the 32 time intervals into which the clock period is divided. This information must, however, be stored until at least the end of the cycle (encoder).

Fig. 6 Simulation of the delay through a unit cell as a function of the control voltage for Vc1=Vc2.

#### IV. TEST RESULTS FOR THE DLL CIRCUIT

A chip was produced and tests were carried out using a pulse generator and a time interval counter. The loop locked for clock periods between 17.97 and 30.08 ns (≈33 to 55 MHz). Maximum deviation of the measured delay per cell with respect to the expected value was 0.89%. Lock time was 2.5 μs. Measured jitter using the above setup was ≈50 ps. However, the manufacturer quotes 37 ps for the inherent jitter of the interval counter. When this is taken into account (if the two contributions are assumed independent and to add in quadrature, $\sqrt{50^2 - 37^2} \approx 34$ ps), a reasonable value for the jitter of the system should be below 35 ps.

#### V. TEST RESULTS FOR TDC CIRCUIT

A demonstration chip was built using a standard 0.8 µm digital technology, with a single polysilicon layer and three metal layers. The chip was then tested using a fast pulse generator, a high resolution (<37 ps) time interval counter and a logic analyzer, along with some standard test equipment.
Under the stated conditions, the loop was stable for clock periods between 16.89 ns (59.2 MHz) and 25.15 ns (39.8 MHz), corresponding to a time resolution of 0.53 ns and 0.79 ns respectively, i.e. 1/32 of the clock period. Time jitter, defined as the rms delay deviation from the expected value, was below 80 ps throughout the measured range of operation, even when no allowance is made for the time jitter of the experimental setup itself. Differential and integral non-linearity showed values of the same order of magnitude, both staying below one third of a least significant bit throughout the range. Plots of worst case INL and DNL are shown in Fig. 7.

Fig. 7 Plot of TDC Integral and Differential Non-Linearity (plot title: Worst Case DNL And INL).

#### VI. CONCLUSIONS

Loop performance is limited at the high end of the frequency range by the minimum delay achievable within the available technology, and at the low end by the degree to which the current-control transistors M4 and M5 can be starved. If a very long chain is desired, the inevitable asymmetries in the delay times for the leading and the falling edge of the clock pulse (assumed above to be a symmetrical square wave) will eventually lead to pulse extinction. This difficulty can be circumvented by using a double loop configuration, where one loop controls the leading edge of the delayed pulse and the other the falling edge.

#### REFERENCES

- [1] Waizman, A., "A Delay Line Loop for Frequency Synthesis of De-Skewed Clock," ISSCC95 Conf. Abstracts, pg. 298.
- [2] Lee, T., et al., "A 2.5V Delay-Locked Loop for an 18Mb 500 MB/s DRAM," ibidem, pg. 300.
- [3] Arai, Y., Matsumura, T., Endo, K., "A CMOS Four-Channel ×1K Time Memory LSI with 1-ns/b Resolution," *IEEE J. Solid-State Circuits*, Vol. 27, No. 3, pp. 359-364, March 1992.
- [4] See, e.g., Mead & Conway, Introduction to VLSI Systems, Addison Wesley 1980, p. 250, 254.
- [5] Christiansen, J., "An Integrated CMOS 0.15 ns Digital Timing Generator for TDC's and Clock Distribution Systems," *IEEE Trans. Nucl. Sci.*, Vol. 42, No. 4, August 1995.
- [6] Arai, Y., Ikeno, M., "A Time Digitizer CMOS Gate-Array with a 250 ps Time Resolution," KEK Preprint 95-97, June 1995.