Time Resolution Improvement Using Dual Delay Lines for Field-Programmable-Gate-Array-Based Time-to-Digital Converters with Real-Time Calibration

This paper presents a time-to-digital converter (TDC) based on a field-programmable gate array (FPGA) with a tapped delay line (TDL) architecture. The converter employs dual delay lines (DDLs) to enable real-time calibration: the proposed DDL-TDC measures the statistical distribution of delays to permit the calibration of nonuniform delay cells in FPGA-based TDC designs. The DDLs also alternate their calibration runs, enabling environmental effects to be accounted for immediately. Experimental results revealed that, relative to a conventional TDL-TDC, the proposed DDL-TDC reduced the maximum differential nonlinearity by 26% and the maximum integral nonlinearity by 30%. A root-mean-squared value of 32 ps was measured by feeding a constant delay source into the proposed DDL-TDC. The proposed scheme also maintained excellent linearity across a range of temperatures.

Time resolution and time linearity are crucial issues in any FPGA-based TDC design. In 1997, Kalisz et al. presented a calibration method for an FPGA-based TDC implemented on a QuickLogic pASIC FPGA device; that TDC reached a time resolution of 200 ps with a conversion range of 43 ns [13,14]. Dedicated carry lines of an FPGA are used as delay cells to perform time interpolation within the system clock period and to realize fine time measurements [15]. A TDC with calibration implemented in an Altera EP1K50TC144-1 FPGA has a resolution of 65 ps, and a TDC implemented in a Xilinx XC2V4000-6BF957 FPGA has a resolution of 46.2 ps; the authors of those works likewise exploited the dedicated resources of FPGA devices to design TDCs on FPGA platforms [16,17]. Another factor that affects time linearity in FPGA-based TDC designs is the presence of ultra-wide bins (UWBs) when dedicated resources are used in the FPGA device. Because wave union TDCs subdivide the original bin size, they can mitigate UWB effects and improve time linearity [18–21]. However, the wave union method is not applicable to high-end FPGAs, in particular the Xilinx UltraScale FPGA manufactured with 20-nm process technology. Therefore, a dual-sampling TDL architecture and a bin decimation method were proposed for Xilinx UltraScale FPGAs to make the delay elements as small and uniform as possible, so that the implemented TDCs can achieve time resolutions beyond the intrinsic cell delay [23,24]. Recently, selected, divided, or interpolated versions of the delay bin have become more popular for managing nonuniform delay cells in FPGA-based TDCs and thus improving time resolution and time linearity [22–31].
Two-stage inner interpolation methods that employ an eight-phase clock and an equivalent-time coding delay line within a 28-nm FPGA chip have been reported [25]. Although such methods enable high time resolution, the designs are highly complex. Time histogram bin counts have also been used to calibrate nonlinearity in TDL-TDC delay bins [26–31], and two-sample sums and two-sample differences of a quantized histogram can be used to minimize the influence of nonlinear quantization on the histogram [26]. Unfortunately, this quantization-and-nonlinearity minimization method depends on a highly complex mathematical analysis [26]. A multi-path delay line can select the most equal cells as delay bins in an FPGA device to improve its linearity [27]. A merged delay line was also presented to mitigate the UWB effect [28]. Unfortunately, the methods in [27,28] must pre-run the code density test to obtain the time information of each bin and then re-design the TDC according to that information during calibration; the process is thus remarkably time consuming and impractical, especially in real-time systems. A simpler approach was proposed in [29], wherein a code density test was used to calibrate the delay cells, and the conversion output bits were adjusted to improve differential nonlinearity (DNL) and integral nonlinearity (INL) values. Nonetheless, that nonlinearity correction was still processed in the same manner as an off-line program-based correction. In [30], an auto-calibration circuit was implemented in a single FPGA; however, the calibration only ran when the FPGA was started. This underlines the importance of developing a straightforward system that can implement run-time corrections to resist environmental effects. In [31], the authors presented a dual-phase tapped delay line with on-the-fly calibration, implemented on a 40-nm FPGA.
The results demonstrated that the characteristics of the delay cells tend to change at different temperatures. During a temperature shift from 10 to 50 °C, their TDC with on-the-fly calibration maintained a 12.8-ps root-mean-squared (RMS) standard uncertainty [31]. Several works have reported delay cells that are sensitive to environmental changes, especially temperature effects [13,14,16,19,23,24,30,31]. Moreover, Table 1 compares the performance of the proposed DDL-TDC with that of existing works. This study proposes a novel architecture that employs two TDLs for an FPGA-based TDC, referred to as a dual delay line (DDL) TDC. This architecture can be implemented within the calibration circuit of an FPGA chip to enable calibration operations in real time. A finite state machine (FSM) is used to manage both the calibration delay line and the conversion delay line, which allows (1) nonuniform delay cells to be calibrated and (2) time differences to be converted based on data held in calibration mapping memory. Experimental results demonstrated the superiority of the proposed DDL-TDC over conventional TDL-TDCs in terms of time linearity and maximum INL and DNL values, despite variations in temperature. This makes the proposed DDL-TDC highly robust for the measurement of time differences in a broad range of external environments.
The remainder of this paper is organized as follows. The proposed FPGA-based DDL-TDC is outlined in Section 2. Details concerning the TDL-TDC, FPGA platforms, and DNL calibration circuits are presented in Section 2. Experiments aimed at evaluating the proposed scheme are described in Section 3. Conclusions are drawn in Section 4.

Method of TDL-TDC
The TDL is the most popular structure used in TDC implementation (Figure 1); however, a lack of uniformity in the delay cells tends to undermine time linearity, particularly in FPGA-based TDC devices. Figure 1 shows the architecture of the TDL-TDC, and the histogram illustrates the time distribution of the delay cells in the delay chain. It is easy to observe that the delay cells do not achieve equal delay times along the chain; this impairs the linearity of the TDC, and correcting this nonlinearity effect is the aim of this work. A calibration method presented in [29] provides a simple bin calibration scheme for FPGA-based TDC designs, with which the nonuniform delay cells in an FPGA-based TDC can be calibrated and the linearity improved. Unfortunately, the method in [29] calibrated the TDC output by using a software program after the TDL-TDC output had been measured; that scheme therefore cannot calibrate the nonuniform delay cells immediately during a run-time conversion. Unlike off-line methods such as that proposed in [29], we adopted a statistical approach to the calibration of an FPGA-based TDC [30] to address time linearity in real time.

Calibration of TDL-TDC
As the code density test is adopted, the time distribution (effective bin width) of each delay cell can be expressed as follows:

w_i = (h_i / C) × T, for i = 1, 2, …, M,

where h_i indicates the number of hits for the i-th delay cell, C is the total number of hits in this calibration run, T is the range of the calibration signal, and M delay cells are allocated in this chain. The time conversion is calculated from the cells that the signal has passed through. As the signal reaches the N-th cell, the time conversion result T_N can be expressed as follows:

T_N = Σ_{i=1}^{N} w_i = (T / C) × Σ_{i=1}^{N} h_i.

The TDC conversion output can be further expressed through the quantization operation Q(·):

Output = Q(T_N).

Based on this expression, the time conversion can be obtained according to the hit distribution of each cell, which is called code density statistical calibration. Because statistical bin calibration requires considerable time to obtain the time distribution of each delay cell in the chain, two delay lines are adopted in the proposed DDL-TDC to achieve run-time calibration; the calibration thus alternates between the two delay lines. The two delay lines are assigned to the dedicated fast lookahead carry logic, referred to as CARRY4 cells in the Xilinx Virtex-5 FPGA [32], in two vertical columns by using the RLOC and LOC constraints of the Xilinx ISE 14.7 software (Xilinx Inc., San Jose, CA, USA). The dedicated fast lookahead carry logic (CARRY4) has a small time delay among the Virtex-5 FPGA logic resources, which yields a higher time resolution than the other logic resources. In each CARRY4, four delay cells are implemented, and 512 delay cells are allocated in each vertical delay line. Figure 2 illustrates the layout of the two delay chains in the FPGA chip and the details of the CARRY4 cells. Figure 3 illustrates the architecture of the proposed DDL-TDC, in which two TDLs are incorporated in the calibration circuit and a real-time FSM is implemented upstream of the TDL-TDC output. The proposed FSM has two calibration mapping memories; these memories control whether each of the two delay lines performs calibration or runs TDC conversion.
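The code density calibration described above can be sketched numerically. The following is a minimal Python illustration of the statistical bin calibration, not the authors' RTL implementation; the function and variable names are hypothetical:

```python
import numpy as np

def code_density_calibrate(hits, T):
    """Map a code-density histogram onto estimated cell delays.

    hits : per-cell hit counts h_i from the calibration run
    T    : time range covered by the whole delay line (here, ns)
    """
    C = hits.sum()                 # total hits C of this calibration run
    widths = hits / C * T          # estimated bin width w_i of each delay cell
    cum_delay = np.cumsum(widths)  # delay T_N accumulated up to each cell
    return widths, cum_delay

def convert(N, cum_delay, lsb):
    """Quantization step Q(.): map the calibrated delay of cell N to a code."""
    return int(round(cum_delay[N] / lsb))

# Toy example: 8 deliberately nonuniform cells over a 6-ns range.
hits = np.array([100, 150, 90, 200, 120, 110, 130, 100])
widths, cum = code_density_calibrate(hits, 6.0)
```

Because the bin widths are derived from the measured hit distribution rather than assumed equal, the conversion compensates for nonuniform cells automatically once the mapping memory is filled.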
Architecture of DDL-TDC

For this reason, the DDL-TDC can run calibration and conversion based on the control signal from the proposed FSM. Figure 4 presents a flowchart for the two FSMs that manage calibration and output. The calibration FSM (FSM_Cal) has three states:

• Init: As the TDL enters calibration mode, the FSM transitions to the Init state. In this state, the calibration memory is reset to zero for all addresses, and the system begins to run code density counts. The calibration memories, Cal_Mem A and Cal_Mem B, comprise 512 8-bit words; thus, 9-bit addresses must be reset.
• Cal_Run: Code density counts are run in this state to calibrate the time distribution of the delay line being calibrated. For this, 2^17 counts are run in the Cal_Run state to obtain the time distribution based on the code density test scheme [33].
• Cal_Num: After Cal_Run, the Cal_Num state sums the hit counts for each delay cell, and the time distribution is then converted to a time delay. These items of time information are stored in the calibration memory (Cal_Mem A or Cal_Mem B) appropriate for the relevant cell delay. Thus, the TDC output code can be calibrated from this calibration mapping memory.

Figure 5 illustrates the timing diagram of the proposed DDL calibration. In the beginning, TDL-A takes 2^9 = 512 cycles to reset Cal_Mem A and then uses 2^17 = 131,072 cycles to count the hits of code density testing in the Cal_Run state of the TDL-A calibration FSM. Finally, TDL-A uses 2^9 = 512 cycles to obtain the calibration map and stores the parameters in Cal_Mem A. For these (2^9 + 2^17 + 2^9) cycles, TDL-TDC A executes calibration, and TDL-TDC B runs the time conversion at the same time. The operation swaps for the next (2^9 + 2^17 + 2^9) cycles; in those cycles, TDL-TDC A runs the time conversion, and TDL-TDC B executes calibration. As TDL-TDC A executes the calibration, cal_start and cal_stop connect to Start_A and Stop_A, and the input signals Trigger 1 and Trigger 2 connect to the inputs of TDL-TDC B (Start_B and Stop_B). In this work, we chose 6 ns as our clock period. Thus, the calibration time of TDL-TDC A is (2^9 + 2^17 + 2^9) × 6 ns = 792,576 ns ≈ 0.8 ms. In this 0.8 ms, TDL-TDC A executes the calibration, and TDL-TDC B runs the time conversion; in the next 0.8 ms, TDL-TDC B executes the calibration, and TDL-TDC A runs the time conversion. Consequently, the proposed DDL-TDC performs a calibration every 0.8 ms. This interval is short enough to adapt to changes in the environment, and the uninterrupted conversion is suitable for ToF LiDAR applications.
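The cycle budget above can be checked with a short arithmetic sketch. This is only a model of the alternation schedule described in the text, not the FPGA logic; the helper name is hypothetical:

```python
# Cycle budget of one calibration pass (values from the text).
CLK_PERIOD_NS = 6          # one 166-MHz system clock period
INIT_CYCLES = 2**9         # Init: reset 512 calibration-memory addresses
RUN_CYCLES = 2**17         # Cal_Run: code density hit counting
MAP_CYCLES = 2**9          # Cal_Num: write the calibration map

cal_cycles = INIT_CYCLES + RUN_CYCLES + MAP_CYCLES
cal_time_ns = cal_cycles * CLK_PERIOD_NS   # duration of one calibration pass

def converting_line(t_ns):
    """Which TDL runs conversion at time t_ns: A and B swap every pass.

    During the first pass TDL-A calibrates, so TDL-B converts.
    """
    return "B" if (int(t_ns) // cal_time_ns) % 2 == 0 else "A"
```

The pass duration evaluates to 792,576 ns, i.e. roughly 0.8 ms, matching the calibration period quoted in the text.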

Experiment Setup
To demonstrate the proposed FPGA-based DDL-TDC, the architecture was implemented on a Xilinx XC5VLX110T FPGA chip, using the Xilinx ISE 14.7 tool to synthesize the RTL Verilog code and perform placement and routing, and its performance was evaluated in terms of time linearity. Figure 6 shows the experimental setup of the DDL-TDC measurement system. The Trigger 1 and Trigger 2 166-MHz clock signals were generated using Agilent (Agilent Technologies, Santa Clara, CA, USA) 81130A signal generators with two asynchronous sources. The calibration signals cal_start and cal_stop were generated from two on-board 100-MHz and 200-MHz oscillators, using discrete on-chip phase-locked loops (PLLs) to produce two 166-MHz clock signals. Based on a clock signal of 166 MHz, the time conversion range was 6 ns. The resource utilization of the proposed DDL-TDC is shown in Table 2. The DDL-TDC required an additional 2% of the slice registers, 3% of the slice LUTs, 5% of the occupied slices, two PLLs, and two block RAM units, a modest resource overhead relative to the TDL-TDC; the power consumption of the DDL-TDC increased by only 7% relative to the TDL-TDC. The 6-ns interval was chosen because we measured the average delay time of a vertical CARRY4 delay chain. A common technique to set the (average or total) delay of the cells is to lock the total delay of the line to a reference. For this circuit, 6 ns was a suitable delay time for 512 delay cells: a longer time would have required delay cells beyond a single vertical chain, whereas a shorter time would have made the conversion range too small. Thus, we chose 6 ns (166 MHz) as the clock cycle in this work. This choice is specific to the FPGA used here; a different FPGA would call for a different frequency.
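The 6-ns budget can be sanity-checked against the 512-cell chain. The per-cell figure below is a nominal average implied by the chosen clock period, not a measured datasheet value:

```python
# Nominal delay budget of one vertical CARRY4 chain (back-of-the-envelope).
CELLS_PER_LINE = 512          # four delay cells per CARRY4 primitive
CONVERSION_RANGE_NS = 6.0     # one 166-MHz clock period

carry4_primitives = CELLS_PER_LINE // 4
avg_cell_delay_ps = CONVERSION_RANGE_NS * 1000 / CELLS_PER_LINE
```

This gives 128 CARRY4 primitives per line and an average cell delay of roughly 11.7 ps, consistent with locking the total line delay to one clock period.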

Results and Discussion
The proposed DDL-TDC output a 7-bit to 5-bit digital code over a conversion range of 6 ns, such that the least significant bit (LSB) of the DDL-TDC ranged from 46.875 ps (= 6 ns/128) to 187.5 ps (= 6 ns/32). Table 3 shows the linearity improvement of the DDL-TDC compared with the TDL-TDC, presented as percentages without any calibration. The DDL-TDC clearly outperformed the TDL-TDC in terms of INL and DNL, demonstrating the efficacy of the DDL-TDC in terms of time linearity. To verify the time resolution of the proposed DDL-TDC, a constant time interval was fed as Trigger 2, which was constantly delayed relative to the Trigger 1 signal. Figure 10 presents the histogram of the measured data after performing 10 × 2^17 measurements of the constant delay time. The measured mean time was 252.8 ps, and the RMS value was 32.41 ps. Thus, the single-shot precision of the DDL-TDC was 22.9 ps (= 32.41/√2 ps). Figure 11 also illustrates the RMS values across temperatures for the real-time calibration TDC (DDL-TDC) and for the TDL-TDC with a constant mapping calibration table in software. The proposed DDL-TDC exhibited considerable robustness against fluctuations in temperature, as indicated by the RMS values in Figure 11. Consequently, the proposed DDL-TDC not only improved DNL and INL performance but also achieved high-resolution measurements in an FPGA-based TDC design. Although the time conversion range was only 6 ns, a 32-ps resolution was achieved. The DDL-TDC could be augmented with an additional coarse counter to extend the conversion range for applications that require larger range measurements; the fine time measurements would still use the DDL-TDC to maintain high resolution.
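The LSB and single-shot figures quoted above follow from simple arithmetic, sketched below. The division by √2 reflects that the measured spread combines two conversions, as stated in the text; the function names are hypothetical:

```python
import math

CONVERSION_RANGE_NS = 6.0   # full range of the fine TDC

def lsb_ps(bits):
    """LSB of an n-bit output code over the 6-ns conversion range."""
    return CONVERSION_RANGE_NS * 1000 / 2**bits

def single_shot_ps(rms_ps):
    """Single-shot precision: the measured RMS contains two conversions,
    so one conversion contributes RMS / sqrt(2)."""
    return rms_ps / math.sqrt(2)
```

For a 7-bit code this yields 46.875 ps per LSB, for a 5-bit code 187.5 ps, and a 32.41-ps RMS corresponds to a single-shot precision of about 22.9 ps.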

Conclusions
This paper proposes an FPGA-based DDL-TDC that uses FSMs to calibrate the delay of each TDC delay cell and thereby minimize DNL values. The two delay lines in the DDL circuit perform calibrations alternately, which allows calibrations to be made in real time. Experimental results demonstrated the efficacy of the proposed DDL-TDC in terms of DNL and INL values. The proposed scheme enables the DDL-TDC to manage the nonuniform delay distribution of dedicated FPGA resources and thereby maintain a high degree of linearity in real time.