Article

0.13 μm CMOS Traveling-Wave Power Amplifier with On- and Off-Chip Gate-Line Termination

Aleksandr Vasjanov 1,2,* and Vaidotas Barzdenas 1,2

1 Department of Computer Science and Communications Technologies, Vilnius Gediminas Technical University, 10221 Vilnius, Lithuania; vaidotas.barzdenas@vgtu.lt
2 Micro and Nanoelectronics Systems Design and Research Laboratory, Vilnius Gediminas Technical University, 10257 Vilnius, Lithuania
* Correspondence: aleksandr.vasjanov@vgtu.lt; Tel.: +370-688-64-458

Received: 22 November 2019; Accepted: 6 January 2020; Published: 10 January 2020

Abstract: Broadband amplifiers are essential building blocks used in high data rate wireless, radar, and instrumentation systems, as well as in optical communication systems. Only a traveling-wave amplifier (TWA) provides sufficient bandwidth for broadband applications without reducing modern linearization techniques. A TWA requires gate-line and drain-line termination, which can be implemented on- and off-chip. This article compares the performance of identical 0.13 μm CMOS TWAs, differing only in gate-line termination placement. Measurement results revealed that the designed TWAs with on- and off-chip termination have a bandwidth of 10 GHz with a maximum gain of 15 dB and a power-added efficiency (PAE) of 5%–22% in the whole operating frequency range. Placing the gate-line termination off-chip results in an $S_{21}$ flatness reduction, compared to the gain of a TWA with on-chip termination. Gain flatness over frequency degrades by 4–8 dB when the termination resistor is placed as an external circuit.

Keywords: 5G; distributed; power amplifier; RF; traveling-wave; TWA; wireless

1. Introduction

Internet of Things (IoT), machine-to-machine (M2M) communication, software-defined radios (SDR), vehicle-to-everything (V2X) communication, multiple-input multiple-output (MIMO) systems, and cloud-based services, all of which employ high-speed wireless data transmission, are currently in the spotlight of modern research, with the first of these concepts being integrated in the most common household items [1–3]. Broadband amplifiers are the essential building blocks used in high data rate wireless, radar, instrumentation, as well as optical communication systems [4]. Multiple architectures are currently available for the system architect to choose from, which include different modifications of envelope tracking, Doherty, traveling-wave amplifier (TWA), and outphasing power amplifiers [5]. Only the TWA provides sufficient bandwidth for broadband applications without reducing modern linearization techniques, such as analog or digital predistortion [6]. Due to cost and integration considerations, CMOS offers a higher level of integration at a lower cost compared with other high-speed III-V group semiconductor technologies, such as GaAs and SiGe. The mature deep submicron (0.35–0.11 μm) CMOS technology [7] provides a high price to performance ratio for high-speed active devices along with on-chip passive components, necessary when designing broadband amplifiers. Distributed amplification is considered a major technique for broadband power amplifiers (PA), and with the scaling of CMOS processes, the achievable unity power gain frequency $f_t$ tops 100 GHz, in turn, allowing us to design microwave or millimeter-wave amplifiers [8–10]. The TWA architecture has been evolving since 1950, from vacuum tube amplifiers to modern miniature devices, which incorporate different design solutions. The latter include uniform and non-uniform single-stage, multi-stage, and matrix topologies, as well as ones that employ transformer coupling or noise figure reduction techniques [4]. A TWA requires gate-line and drain-line termination, which can be implemented on- and off-chip. The aim of this article is to compare the performance of identical TWAs, differing only in gate-line termination placement.

The structure of this article is as follows: Section 2 provides the essentials of designing a TWA and reviews different reported design solutions, Section 3 presents the designed TWAs and their simulation setup, and Section 4 presents and discusses the simulation and measurement results; the conclusions are given in Section 5.

2. The Traveling-Wave Power Amplifier Architecture

The conventional TWA topology (Figure 1a) is based on a principle of a ladder low-pass filter, forming a transmission line with a characteristic impedance \( Z_0 \) (typically 50 \( \Omega \)), although amplifiers based on bandpass filters are also valid [7]. Since interconnects with a typical length less than a few hundred microns are not considered as full-scale transmission lines, the transmission lines are artificially constructed using a ladder of integrated lumped-element inductors and capacitors.

![Figure 1. Traveling-wave amplifier (TWA): (a) concept and (b) small-signal model.](image)

The transmission lines in an integrated circuit (including CMOS) technology are artificially constructed from a ladder of inductors and transistor gate-source (or base-emitter in the case of bipolar devices) parasitics acting as capacitors with low equivalent series resistance (ESR), as presented in Figure 1b. As a result, this structure acts as a periodic low-pass filter and does not provide unlimited bandwidth. The maximum bandwidth of a TWA is limited by the cutoff frequency \( \omega_c \) of the artificial gate and drain transmission lines, which, together with the characteristic impedance \( Z_0 \), can be generalized using the following expressions,

\[
\omega_c = \sqrt{\frac{1}{LC}} \quad (1)
\]

\[
Z_0 = \sqrt{\frac{L}{C}} \quad (2)
\]

where \( L \) and \( C \) are the values of the inductors and capacitors of the artificial transmission lines. Assuming that the overall size and power dissipation of the amplifier are fixed and that the lines are terminated in their characteristic impedance, the smaller each stage is and the higher the number of stages, the larger the \( \omega_c \) and the larger the TWA bandwidth. In reality, the bandwidth is further limited by the ubiquitous presence of losses throughout the whole pseudo-transmission line structure (this includes series resistance in the inductors, series resistance in the transistor gates, and transistor output conductance limitations) [9,11]. It is possible to reduce the losses in the transmission line inductors by simply reducing their size, but referring to Equation (2), this changes the characteristic impedance. In order to compensate for this, one can also reduce the capacitance, but only down to the lowest value of the FET parasitic capacitance [11]. A TWA is called uniform if all transistors are identical. In the case of a uniform TWA, the optimum power load is constant for each drain-line section [12]. It is also worthwhile noting that the artificial transmission lines are formed both at the TWA input and output, and depending on the number of identical sections, the overall area can be quite large. Both the drain and the gate transmission lines are terminated appropriately at their ends to avoid reflections and achieve maximum gain with a flat response [13]. The voltage and power gains of the conventional TWA are expressed as,

\[ A_v = \frac{1}{2} n g_m Z_0, \quad (3) \]

\[ A_P = \frac{1}{4} n^2 g_m^2 Z_0^2, \quad (4) \]

where \( n \) is the number of distributed stages, \( g_m \) denotes the transconductance of each stage, and \( Z_0 \) is the drain-line termination. In order to increase the gain of the conventional TWA, one should increase the number of stages, but due to attenuation inflicted by the artificial transmission lines, it is limited to the optimum number of stages [14], which is given by,

\[ n_{\text{opt}} \leq \frac{2}{r_g \omega^2 C_{GS}^2 Z_0}, \quad (5) \]

where \( r_g \) is the parasitic resistance of the gate and the inductors, \( \omega \) is the highest frequency of interest, \( C_{GS} \) is transistor gate-source parasitic capacitance, and \( Z_0 \) is the characteristic impedance. Usually, the optimal TWA segment number lies in the region of \( n_{\text{opt}} = [3...5] \) due to a non-proportional gain in bandwidth compared to the losses in efficiency. The theoretical maximum power-added efficiency (PAE) of the conventional TWA can be expressed [13] as,

\[ \eta_{\text{PAE,max}} < \left( 1 - \frac{1}{A_v} \right) \frac{1}{8} n Z_0 \frac{Z_0}{R_L}. \quad (6) \]

Equation (6) indicates that increased periphery per stage helps to increase efficiency if this increased periphery does not adversely affect the gain at the highest frequency of interest [13,15].
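The voltage-gain, power-gain, and optimum-stage-count expressions above can be evaluated numerically to get a feel for the trade-off. The following is a minimal sketch, not the authors' design procedure: the transconductance and gate resistance values are hypothetical placeholders, and the 253 fF capacitance is borrowed from the design described later in this article on the assumption that it approximates \( C_{GS} \).

```python
import math

def voltage_gain(n, g_m, z0):
    """Linear voltage gain of a conventional TWA: A_v = n * g_m * Z0 / 2."""
    return 0.5 * n * g_m * z0

def power_gain(n, g_m, z0):
    """Linear power gain of a conventional TWA: A_P = (n * g_m * Z0)^2 / 4."""
    return 0.25 * (n * g_m * z0) ** 2

def n_opt_upper_bound(r_g, f_max_hz, c_gs, z0):
    """Upper bound on the stage count: 2 / (r_g * omega^2 * C_GS^2 * Z0)."""
    omega = 2.0 * math.pi * f_max_hz
    return 2.0 / (r_g * omega ** 2 * c_gs ** 2 * z0)

# Hypothetical example values (assumptions, not measured design data).
N_STAGES = 4       # number of distributed stages
G_M = 0.03         # S, assumed transconductance per stage
Z0 = 50.0          # ohm, line impedance and termination
R_G = 30.0         # ohm, assumed gate plus inductor series resistance
C_GS = 253e-15     # F, assumed equal to the per-segment input capacitance
F_MAX = 10e9       # Hz, highest frequency of interest

a_v = voltage_gain(N_STAGES, G_M, Z0)
a_p = power_gain(N_STAGES, G_M, Z0)
print(f"A_v = {20.0 * math.log10(a_v):.2f} dB")
print(f"A_P = {10.0 * math.log10(a_p):.2f} dB")
print(f"n_opt bound = {n_opt_upper_bound(R_G, F_MAX, C_GS, Z0):.2f}")
```

Because \( A_P = A_v^2 \), both gains yield the same value in decibels. The stage-count bound is very sensitive to the assumed \( r_g \), which is why the loss estimate matters more than the gain estimate in practice.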

Due to the superior performance in gain, bandwidth, stability, and input-output isolation, the cascode configuration is employed in gain stages in most reported papers that have reviewed TWAs [16,17]. However, a conventional distributed amplifier has the disadvantages that half of the input power is wasted in the left termination of the drain transmission line, and that each FET operates under different efficiency conditions. Another issue in designing TWAs is input termination noise. For maximum power transfer from the antenna to the front-end circuit, a 50 Ω passive resistor is usually employed to terminate the input transmission line [18,19]. Various TWA design and performance improvements have been published, and the papers that the authors consider most interesting are discussed below.

Different variations of non-uniform TWAs, where inductor values, or transistor sizing, or both, are progressively changed, and various solutions on terminating both the gate- and the drain-lines, have been proposed. One of the reported variations of the conventional TWA, called “tapered” TWA, involves progressively decreasing the inductor value along the output (drain) line. The drain transmission line tapering maximizes the forward-traveling wave while minimizing the unwanted reverse wave. The signals at each FET input are equalized since tapering compensates for the line attenuation. The power gain of a tapered distributed amplifier will be quadrupled if both the load current and the FET voltage gain are doubled. Theoretically, the tapered distributed amplifier totally eliminates the reverse waves on the drain-lines and has a class A efficiency approaching 50% [18]. Moreover, [13] proposes to increase the already high gain bandwidth of the TWA by sizing each T-section (if the transmission line is simplified to a low-pass LCL filter) \( K \) times smaller than the previous one. As reported, as the signals travel down the lines toward the load end, the cutoff frequency becomes $K$ times larger and the attenuation becomes $K$ times smaller from each stage to the next. This results in close to an exponential improvement in bandwidth, compared to the conventional TWA using the same number of stages. However, a drawback exists: the gain decreases almost linearly from each stage to the next.
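The scaling idea reported in [13] can be illustrated with ideal lumped elements. The sketch below assumes that "K times smaller" means both the inductance and the capacitance of each section are divided by \( K \) relative to the previous one; this reading, the value \( K = 1.2 \), and the starting element values (borrowed from the gate line of the design in this article) are illustrative assumptions rather than data from [13].

```python
import math

def scaled_sections(l0, c0, k, n):
    """Return n (L, C) pairs, each section k times smaller than the previous."""
    return [(l0 / k ** i, c0 / k ** i) for i in range(n)]

def cutoff_hz(l, c):
    """Ideal section cutoff, f_c = 1 / (2 * pi * sqrt(L * C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l * c))

def impedance_ohm(l, c):
    """Ideal section impedance, Z0 = sqrt(L / C)."""
    return math.sqrt(l / c)

K = 1.2  # hypothetical scaling factor
sections = scaled_sections(770e-12, 253e-15, K, 4)

for i, (l, c) in enumerate(sections, start=1):
    print(f"section {i}: f_c = {cutoff_hz(l, c) / 1e9:.2f} GHz, "
          f"Z0 = {impedance_ohm(l, c):.1f} ohm")
```

Under this assumption, the cutoff frequency grows by the factor \( K \) per section while the section impedance \( \sqrt{L/C} \) stays constant, which is what keeps the line matched as it is scaled.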

Another paper [16] reported a 5-segment non-uniform TWA design in which the center (third) segment is sized 2.5 times larger than the other segments to boost the transconductance, while the other segments are designed for the required cutoff frequency of the synthetic lines. They reported a pass-band gain of 9.5 dB over a 3 dB bandwidth of 32 GHz in a 0.18 μm CMOS process.

A paper [19] reported a technique of improving noise figure (NF) and enhancing gain without consuming additional power. The proposed topology is based on adding a common gate (CG) transistor at the input transmission line for impedance matching, with the transconductance of the transistor equal to $1/Z_0$. One significant feature of the proposed network is that the noise contribution of the matching CG transistor appears at its source and drain with a 180-degree phase shift while the signal appears with the same polarity at these nodes. This feature is employed in the proposed topology to reduce the noise contribution of the CG matching element as well as to increase the gain. The results of the proposed architecture show an average NF of 1.8 dB and a gain of 16 dB in a bandwidth of 12 GHz in 0.18 μm CMOS technology.

A conventional TWA reported in [20] proposed a gate transmission line termination circuit to reduce NF. The adopted RLC terminal network can reduce the NF to a flat and low level over the whole low-to-medium frequency range of DC–7 GHz at the expense of little input matching $S_{11}$ degradation. As a result, $S_{11}$ in the range of 5–17.7 dB, $S_{21}$ of 10.5 ± 1.4 dB, and NF in the range of 3.2 ± 0.3 dB were achieved over the DC–10.5 GHz band.

Usually, a TWA utilizes a single-stage design, but several papers propose different combinations, including parallel [15], cascaded single-stage (CSSDA) [17], multistage, matrix [21], and even a distributed Doherty amplifier [22,23].

An interesting approach of increasing efficiency without compromising bandwidth has been proposed in [24], demonstrating a gate-drain transmission line folded design approach, which also acts as a transformer. As a result, a well-controlled and defined coupling coefficient was achieved. The paper demonstrated two designs, one in a 0.18 μm and the other in a 90 nm CMOS process. With the respective gains of 9.5 dB and 7 dB, the TWAs had a bandwidth of 32 GHz and 61.3 GHz with a power dissipation of 71 mW and 60 mW.

The discussion above presents different approaches for improving TWA parameters, whereas, to the authors' knowledge, the difference in TWA performance when the gate termination resistor is placed either on- or off-chip has not been reported. A summary of papers that report submicron CMOS TWA parameters, alongside the achieved results discussed in this article, is presented in Table 1. This summary is used as a basis of comparison for the research results presented in this paper.

<table>
<thead>
<tr>
<th>Ref.</th>
<th>Process</th>
<th>$V_{DD}$, V</th>
<th>$\Delta f$, GHz</th>
<th>$A_{vp}$, dB</th>
<th>PAE, %</th>
<th>Topology</th>
</tr>
</thead>
<tbody>
<tr>
<td>[25]</td>
<td>0.18 μm CMOS</td>
<td>2.6</td>
<td>1–17.2</td>
<td>9</td>
<td>3.6–6.2 @ 2–16 GHz</td>
<td>Non-uniform</td>
</tr>
<tr>
<td>[26]</td>
<td>0.18 μm CMOS</td>
<td>2.8</td>
<td>1–23.8</td>
<td>11.9</td>
<td>2.7–10 @ 1–22 GHz</td>
<td>Uniform, two stage</td>
</tr>
<tr>
<td>[27]</td>
<td>0.13 μm CMOS</td>
<td>3.5</td>
<td>2–10</td>
<td>10</td>
<td>9–19 @ 2–10 GHz</td>
<td>Uniform</td>
</tr>
<tr>
<td>[28]</td>
<td>0.13 μm CMOS</td>
<td>3.5</td>
<td>2–16</td>
<td>10</td>
<td>9–17 @ 2–16 GHz</td>
<td>Uniform</td>
</tr>
<tr>
<td>[29]</td>
<td>0.18 μm CMOS</td>
<td>2</td>
<td>3–12.9</td>
<td>10.46</td>
<td>N/A</td>
<td>Uniform</td>
</tr>
<tr>
<td>[30]</td>
<td>0.13 μm CMOS</td>
<td>N/A</td>
<td>DC–8</td>
<td>6–15</td>
<td>N/A</td>
<td>Uniform</td>
</tr>
<tr>
<td>This work</td>
<td>0.13 μm CMOS</td>
<td>1.6</td>
<td>1–10</td>
<td>7–13</td>
<td>5–22 @ 1–10 GHz</td>
<td>Uniform</td>
</tr>
</tbody>
</table>

Only the designed TWA with on-chip gate termination is included in Table 1, as it provides a flatter gain response compared to that of a traveling-wave amplifier with off-chip gate termination. The proposed TWA with on-chip gate termination provides bandwidth and gain parameters which are on par with published amplifiers, but operates at a lower supply voltage and achieves an overall higher PAE. The main goal set by the authors is to quantify the impact of gate termination resistor placement on the performance of a uniform cascode TWA topology, with the schematic design, simulation, and measurement results discussed in the following sections.

3. Fabricated Traveling-Wave Amplifier Integrated Circuit Design Analysis

The designed application-specific integrated circuit (ASIC) contains two independent TWAs with the simplified schematics presented in Figure 2 and the layout alongside the fabricated IC microphotograph given in Figure 3. The geometry of both TWAs is identical except for the gate bias and termination circuits. TWA2 contains an integrated bias and RF termination circuit, whereas TWA1 contains an external one. Both the gate and the drain termination circuits present a 50 Ω load for AC signals but are an open circuit for DC to lower the power consumption. The series capacitance for the termination resistors is formed by four capacitors of different values in parallel, in order to provide the lowest impedance to ground at different frequencies, thus widening the matching bandwidth. The TWA itself contains four identically sized segments of transistors in cascode configuration, which are shown in Figure 2 as double gate NMOS devices. The input LC transmission line has been designed by adjusting the inductance to the 253 fF input capacitance for each of the four TWA segments according to Equations (1) and (2). These input capacitances, together with 770 pH inductors, form an LC circuit with a corner frequency of 11.5 GHz.
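As a rough cross-check of the values quoted above, Equations (1) and (2) can be evaluated for a single gate-line section. The sketch below assumes ideal, lossless lumped elements and ignores the bondwire, ESD, and pad parasitics that were taken into account during the actual optimization; the function names are illustrative.

```python
import math

def cutoff_frequency_hz(l_henry, c_farad):
    """Equation (1), omega_c = 1 / sqrt(L * C), converted to Hz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))

def characteristic_impedance_ohm(l_henry, c_farad):
    """Equation (2), Z0 = sqrt(L / C)."""
    return math.sqrt(l_henry / c_farad)

L_GATE = 770e-12  # H, gate-line inductor value quoted in the text
C_GATE = 253e-15  # F, per-segment input capacitance quoted in the text

f_c = cutoff_frequency_hz(L_GATE, C_GATE)
z_0 = characteristic_impedance_ohm(L_GATE, C_GATE)
print(f"f_c = {f_c / 1e9:.2f} GHz, Z0 = {z_0:.1f} ohm")
```

With these ideal values, the corner frequency evaluates to about 11.4 GHz, in line with the quoted 11.5 GHz, and the section impedance to about 55 Ω, reasonably close to the 50 Ω target before parasitics are considered.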

Figure 2. IBM 0.13 µm CMOS traveling-wave amplifier chip simplified schematic.

The estimated gate and drain inductor \( L \) values were further optimized during the simulation stage in order to achieve the flattest gain response over the widest frequency range. Moreover, the input transmission line component values were optimized, taking into account the parasitic parameters of the package (bondwires, ESD, and pad capacitance).

The inputs of both TWAs are DC-blocked using 10 pF capacitors. The inputs and outputs of both TWAs are connected to the package using double bondwires in order to achieve a transmission line impedance closer to 50 \( \Omega \). The supply voltage for both designed TWAs is 1.6 V. \( V_{\text{cascode}} \) is used as gain control for both TWAs.

The designed TWAs were simulated under three operating condition sets: the typical TT corner (40 °C operating conditions, typical NMOS and PMOS transistor parameter models), the slow SS corner (80 °C operating conditions, worst-case fabricated NMOS and PMOS transistor parameter models), and the fast FF corner (−40 °C operating conditions, best-case fabricated NMOS and PMOS transistor parameter models).

4. Fabricated Traveling-Wave Amplifier Integrated Circuit Simulation and Measurement Results

Simulation results of the TWA with off-chip gate termination (TWA1) are presented in Figure 4. Small-signal S-parameter simulation results under different operating corner conditions are presented in Figure 4a. The operating bandwidth of the TWA with an off-chip gate termination resistor, defined by \( S_{11} \leq -5 \) dB, spans from 1 GHz to around 10.8 GHz. The \( S_{21} \) gain is above 10 dB, with a \( \pm 1 \) dB offset over different corners, up to around 7 GHz.

Figure 4. Traveling-wave amplifier chip simulation and measurement results: (a) TWA1 with off-chip gate termination S-parameter (SP) simulation and measurement results, (b) TWA1 with off-chip gate termination harmonic balance (HB) simulation results.

The gain response is not flat because the termination resistor is connected through a single bondwire, which introduces reflections into the TWA gate transmission line; the initial ideal-model simulation results, by contrast, revealed a flat gain response up to 10 GHz. The noise figure NF is in the region from 5 dB to 7.5 dB up to 8 GHz, and the shape of the curve follows the shape of the $S_{21}$ gain curve. The stability curve is not presented, as the stability factor $K_f > 6$ in the whole frequency region of interest. The nominal bias voltage for the designed TWA1 is $V_{bias} = 660$ mV.
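The effect of a bondwire in series with an off-chip termination resistor can be illustrated with a first-order reflection-coefficient estimate. The sketch below is an illustration only: the 1 nH bondwire inductance is an assumed order-of-magnitude value that is not taken from the fabricated design, and pad and ESD capacitances are ignored.

```python
import math

def reflection_coefficient(z_load, z0=50.0):
    """Gamma = (Z_L - Z0) / (Z_L + Z0) for a line of impedance Z0."""
    return (z_load - z0) / (z_load + z0)

def termination_gamma_db(f_hz, r_term=50.0, l_bond=1e-9, z0=50.0):
    """|Gamma| in dB for a resistor reached through a series bondwire."""
    z_load = r_term + 1j * 2.0 * math.pi * f_hz * l_bond
    return 20.0 * math.log10(abs(reflection_coefficient(z_load, z0)))

# Assumed 1 nH bondwire; frequencies chosen inside the 1-10 GHz band.
for f_ghz in (1, 5, 10):
    print(f"{f_ghz} GHz: |Gamma| = {termination_gamma_db(f_ghz * 1e9):.1f} dB")
```

Under these assumptions, the termination is well matched at 1 GHz but reflects a substantial part of the incident wave near 10 GHz, which is consistent with the gain ripple attributed to the bondwire above.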

Large-signal harmonic balance simulation results over the full operating bandwidth at three corner conditions are presented in Figure 4b. The average output-referred compression point (OR-P1dB) is around 11 dBm. The PAE is comparable to that of a classical class-AB PA up to around 5 GHz. The simulations were conducted with a constant 33 nH RF choke inductor. This inductor provides sufficient gain, and therefore PAE, up to 5 GHz. At higher frequencies, the gain of the transistor naturally reduces, and therefore the PAE drops. In order to increase the PAE above 5 GHz, a smaller value inductor is required.
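For reference, PAE follows the standard definition \( (P_{out} - P_{in}) / P_{DC} \). The sketch below shows the conversion from dBm readings to a PAE percentage; the 11 dBm output power and 1.6 V supply are taken from the text, whereas the input power and supply current are hypothetical placeholders, not simulated or measured values.

```python
def dbm_to_watt(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10.0 ** (p_dbm / 10.0) / 1000.0

def pae_percent(p_out_dbm, p_in_dbm, v_dd, i_dd):
    """Power-added efficiency in percent: 100 * (P_out - P_in) / P_DC."""
    p_dc = v_dd * i_dd
    return 100.0 * (dbm_to_watt(p_out_dbm) - dbm_to_watt(p_in_dbm)) / p_dc

P_OUT_DBM = 11.0   # dBm, average OR-P1dB quoted in the text
P_IN_DBM = 0.0     # dBm, hypothetical drive level
V_DD = 1.6         # V, supply voltage quoted in the text
I_DD = 0.040       # A, hypothetical supply current

print(f"PAE = {pae_percent(P_OUT_DBM, P_IN_DBM, V_DD, I_DD):.1f} %")
```

Since the input power is subtracted in the numerator, PAE falls as the gain drops at higher frequencies even when the output power and DC consumption stay the same.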

Simulation results of the TWA with on-chip gate termination (TWA2) are presented in Figure 5. Small-signal S-parameter simulation results under different operating corner conditions are presented in Figure 5a and are similar to the results presented in Figure 4a. The on-chip termination leads to a much flatter gain response, a better $S_{11}$ value, and lower noise levels over the frequency range of interest, compared to that of TWA1. Large-signal harmonic balance simulation results over the full operating bandwidth at three corner conditions are presented in Figure 5b.

**Figure 5.** Traveling-wave amplifier chip simulation and measurement results: (a) TWA2 with on-chip gate termination SP simulation and measurement results, (b) TWA2 with on-chip gate termination HB simulation results.

The large-signal simulation results are similar for both TWA1 and TWA2, with a PAE value close to that of a classical PA and dropping below 20% at frequencies above 5 GHz. The TWA with an on-chip termination circuit is also unconditionally stable over the whole frequency range; therefore, the stability factor $K_f$ is not shown. The nominal bias current for the designed TWA2 is \( I_{bias} = 0.55 \text{ mA} \).
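A stability check of this kind can be reproduced from two-port S-parameters. The sketch below assumes that the stability factor used here corresponds to the standard Rollett factor, which the text does not state explicitly; the S-parameter values in the example are arbitrary placeholders, not data from the designed TWAs.

```python
def rollett_k(s11, s12, s21, s22):
    """Rollett factor K = (1 - |S11|^2 - |S22|^2 + |D|^2) / (2 |S12 S21|)."""
    delta = s11 * s22 - s12 * s21
    numerator = 1.0 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2
    return numerator / (2.0 * abs(s12 * s21))

def is_unconditionally_stable(s11, s12, s21, s22):
    """Unconditional stability requires K > 1 and |D| < 1."""
    delta = s11 * s22 - s12 * s21
    return rollett_k(s11, s12, s21, s22) > 1.0 and abs(delta) < 1.0

# Placeholder linear S-parameters at a single frequency point.
example = (0.5 + 0.0j, 0.1 + 0.0j, 2.0 + 0.0j, 0.5 + 0.0j)
print(f"K = {rollett_k(*example):.3f}, "
      f"stable = {is_unconditionally_stable(*example)}")
```

In practice, the check would be repeated at every simulated or measured frequency point, since unconditional stability has to hold over the whole band.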

An Agilent E4432B signal generator, a 7 GHz bandwidth Agilent N9010A spectrum analyzer, a Keysight E3631A laboratory power supply, an Agilent U3402A bench multimeter, and a 6 GHz bandwidth HP8753E vector network analyzer were used during the designed dual-TWA ASIC measurements.

\( P1dB \) and PAE measurements at frequencies of 1 GHz, 2 GHz, and 2.9 GHz for both designed TWAs are presented in Table 2. Summarizing the presented results, the manufactured TWAs performed 2.3% to 3.1% less efficiently than those simulated with a maximum overall PAE reaching 22%.

**Table 2.** Designed TWA output power and efficiency measurement summary.

<table>
<thead>
<tr>
<th>TWA1 (External Bias)</th>
<th>TWA2 (Internal Bias)</th>
</tr>
</thead>
<tbody>
<tr>
<td>( f ), GHz</td>
<td>( IR-P1dB ), dBm</td>
</tr>
<tr>
<td>Sim.</td>
<td>1.0</td>
</tr>
<tr>
<td>Meas.</td>
<td>1.0</td>
</tr>
<tr>
<td>Sim.</td>
<td>2.0</td>
</tr>
<tr>
<td>Meas.</td>
<td>2.0</td>
</tr>
<tr>
<td>Sim.</td>
<td>2.9</td>
</tr>
<tr>
<td>Meas.</td>
<td>2.9</td>
</tr>
</tbody>
</table>

S-parameter measurement results are presented alongside simulation results in Figures 4a and 5a. The overall shape of the S-parameter responses for both TWAs differs from the simulated ones. The shape of the \( S_{21} \) gain response for TWA1 is close to that simulated under the SS corner condition, although the \( S_{11} \) matching response is worse due to the impedance mismatch at the off-chip gate termination resistor. The \( S_{21} \) gain response for TWA2 with on-chip gate termination is shifted to the lower frequency range while maintaining the simulated curve shape. On-chip gate termination provides a better \( S_{11} \) response, compared to that of TWA1. This is due to the smaller impact of the bondwire on the RF input transmission line. The measured frequency ranges do not fully correspond to the simulated ones due to the signal generator range (up to 3 GHz) and the VNA range (up to 6 GHz). It is also to be noted that the measured $S$-parameter curves in Figures 4a and 5a contain noise at the edge of the supported bandwidth.
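Gain flatness of the kind compared above can be quantified as the peak-to-peak variation of $S_{21}$ within a chosen band. The following is a minimal sketch; the sample points are made-up placeholders for illustration and do not represent the measured curves.

```python
def gain_ripple_db(freqs_hz, s21_db, f_lo_hz, f_hi_hz):
    """Peak-to-peak S21 variation (dB) between f_lo_hz and f_hi_hz."""
    in_band = [g for f, g in zip(freqs_hz, s21_db) if f_lo_hz <= f <= f_hi_hz]
    if not in_band:
        raise ValueError("no samples inside the requested band")
    return max(in_band) - min(in_band)

# Placeholder data, not measurement results.
freqs = [1e9, 2e9, 3e9, 4e9, 5e9, 6e9]
s21 = [12.0, 13.0, 11.5, 10.0, 12.5, 9.0]

print(f"ripple 1-6 GHz: {gain_ripple_db(freqs, s21, 1e9, 6e9):.1f} dB")
```

Applying the same band limits to both amplifiers keeps the comparison fair, because the ripple figure depends strongly on where the band edges are placed.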

5. Conclusions

A comparison of traveling-wave amplifiers with on- and off-chip gate-line termination, which are suitable for modern high-bandwidth high-speed radio frequency systems, has been presented in this article. TWA architecture analysis and various existing configurations and design solutions are included alongside the reviewed TWA articles designed in submicron CMOS technologies. Both presented TWAs provide more than an octave of operating bandwidth while providing a $PAE$ of up to 22%, which is similar to that of classical PAs. Both presented TWAs, one of which has an on-chip gate-line termination and the other an off-chip one, have been designed using the IBM 0.13 μm CMOS process with a bandwidth of 10 GHz and a maximum gain of 15 dB. Placing the gate-line termination off-chip results in an $S_{21}$ flatness reduction, compared to the gain of a TWA with on-chip termination. Gain flatness over frequency degrades by 4–8 dB when the termination resistor is placed as an external circuit due to impedance mismatches, which are introduced to the gate-line by the bondwires, ESD, and pad capacitances at their resonant frequencies.

Author Contributions: All authors contributed to the present paper with the same effort in finding available literature resources, as well as writing the paper. All authors have read and agreed to the published version of the manuscript.

Funding: This research received no external funding.

Conflicts of Interest: The authors declare no conflict of interest.

References


