# A Sub-GHz Wireless Transmitter Utilizing a Multi-Class-Linearized PA and Time-Domain Wideband-Auto I/Q-LOFT Calibration for IEEE 802.11af WLAN

Ka-Fai Un, Student Member, IEEE, Wei-Han Yu, Student Member, IEEE, Chak-Fong Cheang, Student Member, IEEE, Gengzhen Qi, Pui-In Mak, Senior Member, IEEE, and Rui P. Martins, Fellow, IEEE

Abstract—Broadband dynamic spectrum access in the sub-GHz band emerges as a potential solution for deploying low-cost range-enhanced wireless connectivity, suitable for under/less-developed countries. This paper describes a sub-GHz wireless transmitter (TX) with an integrated multi-class-linearized power amplifier (PA) compliant with the IEEE 802.11af wireless local-area network. It features a wideband in-phase/quadrature (I/Q) modulator exploiting two-stage 6-/14-path harmonic-rejection mixers plus $G_m - C$ low-pass filters to manage the spur emission induced by hard-switched mixing. The entailed 8/16-phase local oscillator (LO) is generated by injection-locked phase correctors plus frequency dividers to relax the frequency and tuning range of the reference LO. The linearized PA features overdriven-class-A/B/C cells to balance the power efficiency and linearity; a dual-gate input pair to enlarge the linear gain range; and a wideband low-impedance ground at the second harmonic to suppress the harmonic distortion and ground bounces. The wideband I/Q imbalance and LO feedthrough are resolved by automatic digital calibration, which incorporates time-domain parameter estimation for better computational efficiency. Benchmarked against the recent art, this TX + PA solution fabricated in 65-nm CMOS exhibits higher system power efficiency (from 7.4% to 18.5%) and 1-dB compression point ($OP_{1\ dB}$: from +12.5 to +16.3 dBm).
When delivering a 64-QAM orthogonal frequency division multiplexing signal at > +10 dBm, the chip demonstrates a sufficiently low noise floor (-143 dBc/Hz), adjacent channel leakage ratio (< -40 dB), and error vector magnitude (< 3.7%) to fulfill the specifications.

Index Terms—Adjacent channel leakage ratio (ACLR), CMOS, digital calibration, $G_m-C$ low-pass filter, harmonic rejection mixer (HRM), harmonic rejection ratio (HRR), IEEE 802.11af, image rejection ratio (IRR), in-phase/quadrature (I/Q) imbalance, local oscillator (LO), LO feedthrough (LOFT), LO-leakage rejection ratio (LRR), power amplifier (PA), transmitter (TX), wideband.

Manuscript received March 24, 2015; revised July 03, 2015; accepted July 25, 2015. Date of publication August 14, 2015; date of current version October 02, 2015. This work was supported by the University of Macau (MYRG2015-00040-FST) and by the Macao Science and Technology Development Fund (FDCT)-SKL Fund. K.-F. Un, W.-H. Yu, C.-F. Cheang, G. Qi, and P.-I. Mak are with the State-Key Laboratory of Analog and Mixed-Signal VLSI, and Faculty of Science and Technology, Department of Electrical and Computer Engineering, University of Macau, Macao, China (e-mail: pimak@umac.mo). R. P. Martins is with the State-Key Laboratory of Analog and Mixed-Signal VLSI, and Faculty of Science and Technology, Department of Electrical and Computer Engineering, University of Macau, Macao, China, and also with the Instituto Superior Técnico, Universidade de Lisboa, 1649-004 Lisbon, Portugal (e-mail: rmartins@umac.mo). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TMTT.2015.2462815

### I. INTRODUCTION

As sub-GHz band propagation features lower path loss and better penetration through obstacles such as walls and floors, IEEE 802.11af [1] has emerged as a low-cost range-enhanced wireless local-area network (WLAN) for opportunistic use of the sub-GHz band to supplement the legacy WLAN at the 2.4- and 5-GHz bands. For instance, at a transmitter (TX) output power of 16 dBm, the sub-GHz band WLAN can theoretically cover a distance of 600 m for the receiver to have an 8-dB signal-to-noise ratio, which is much longer than the 250 m at 2.4 GHz and 150 m at 5 GHz [2]. Yet, due to the coexistence of many incumbent users distributed over a 10×-wide (from 54 to 862 MHz) sub-GHz band spectrum, it is challenging to develop a wideband TX with high output-spectrum purity, especially with an integrated power amplifier (PA) that has a tight tradeoff between the power efficiency and linearity. This paper introduces a number of wideband TX and PA design techniques, aiming to minimize the unwanted spur emission with less external filtering, while fulfilling the noise floor, spectral mask, and error vector magnitude (EVM) specifications without sacrificing the power efficiency.

The spectrum purity of a wideband TX can be limited by the presence of various circuit nonidealities. The nature and frequency relationship of those unwanted spurs with respect to the fundamental signal $(f_{\rm sig})$ are depicted in Fig. 1, where the techniques used to address them are outlined. The in-band spurs are mainly the image leakage at $f_{\text{img}}$ and LO feedthrough (LOFT) at $f_{LO}$ associated with the wideband in-phase/quadrature (I/Q) modulator (MOD). Both the image rejection ratio (IRR) and LO-leakage rejection ratio (LRR) can be addressed by digital calibration, as the image is mainly due to I/Q imbalances, whereas the LOFT originates from component mismatches and LO coupling.
To be described later, we propose a time-domain parameter estimation scheme together with a simple envelope detector to allow closed-loop I/Q-LOFT calibration, automatically correcting the frequency-dependent mismatches throughout the sub-GHz band. Fig. 1. Spectrum purity of a wideband TX+PA under a single-tone output (at $f_{\rm sig}$ ). The spurs are due to various circuit nonidealities and are addressed here by combining wideband circuit and auto-calibration techniques. In the MOD, the third-order nonlinearity (MOD-HD<sub>3</sub>) of the mixer and baseband (BB) circuitry can generate a sideband spur at $f_{\text{MOD-HD3}}$ (and its image at $f_{\text{MOD-HD3,img}}$ ). Together with the AM-AM and AM-PM distortion characteristics of the PA (not included in Fig. 1 for clarity) that affects both adjacent channel leakage ratio (ACLR) and EVM when delivering a modulation signal, a one-shot digital predistortion (DPD) scheme for the sub-GHz band is proposed. It is based on the Volterra-series model to address the memory-effect-like interaction between counter-intermodulation [3], frequency-dependent I/Q imbalance [4], and PA nonlinearities [5] in one combined process. The theory is detailed in [6]. The PA nonlinearity can also create out-band third-order harmonic distortion (e.g., PA-HD<sub>3</sub>), posing a tight tradeoff between the power efficiency and spectrum purity. To address this, a multi-class-linearized PA (MCL-PA) [7], [8] combining overdriven-class-A, class-B, and class-C cells is proposed. It also features a dual-gate input to enhance the linear gain range, and a wideband low-impedance (low-Z) ground at the second harmonic [9], [10] to suppress the PA-HD<sub>3</sub> and ground bounces. The out-band spurs at $f_{HR3}$ , $f_{HR5}$ and $f_{HR7}$ are induced by the hard-switching mixers of the MOD, which can be alleviated via a multi-path harmonic-rejection mixer (HRM). 
Expanding the number of HRM paths can reject more harmonics, but penalizing the circuit complexity, while the harmonic rejection ratio (HRR) can be limited by both gain and phase mismatches. For instance, a 6-path HRM (6P-HRM) [11] can suppress the critical third and fifth harmonics by $\sim$ 35 dB under typical 1% gain and 1° LO phase errors. The 18-path HRM [12] can suppress harmonics up to the 15th by 40 dB, but requiring 18-path BB inputs and an 18-phase 33%-duty-cycle LO, which degrades the power and area efficiencies. Although a GHz-bandwidth (BW) digital-to-analog converter (DAC) [13] can serve as a wideband TX to avoid harmonic mixing, I/Q mismatch, and LOFT, a high-speed I/O interface (9 bits × 8 channels at 625 MS/s) is required to achieve a clean spectrum ( $IM_3 < -58$ dBc), while a high output power (+11 dBm) consumes substantial static power (375 mW at a 2.2-GHz BW). Moreover, wideband RF DPD calibration is required to achieve high linearity over the entire frequency band [14]. In this work, a cooperative use of the HRM and RF filter secures a high HRR throughout the sub-GHz band, while relaxing the required LO phase accuracy. The RF filter also suppresses the third harmonic of the LOFT at $f_{LO3}$ that can be harmful owing to possible component mismatches among the mixer cells. This work achieves an HRR of 60 dB, which outperforms [15] with harmonic rejection mixer and active LC filter, and [16] with polyphase mixing and RC filter at the output stage. Comparing with [17], this work integrates an on-chip PA to deliver higher power and removes the capacitor-inductor-capacitor (CLC) filter to save chip area without sacrificing the harmonic rejection and linearity. By utilizing the $G_m$ -linearization technique and wideband low-Z ground, this work achieves better linearity and power efficiency than [16], while [15] also relies on an off-chip PA to deliver a +10-dBm output power. 
Comparing with the prior art [15]–[17], this work advances most metrics while meeting the noise floor, spectral mask, and EVM of the IEEE 802.11af standard. This paper is organized as follows. The TX architecture and frequency plan are presented in Section II, and the key building blocks are detailed in Section III. The I/Q-LOFT calibration algorithm is described in Section IV and the noise analysis is given in Section V. The experimental results are summarized in Section VI, and the conclusions are drawn in Section VII. ### II. PROPOSED TX ARCHITECTURE AND FREQUENCY PLAN Fig. 2 depicts the proposed TX architecture. The signal paths are differential to stem all even harmonics. The BB data ( $I_{\rm BB}$ and $Q_{\rm BB}$ ) are inputted to the MOD via two off-chip DACs. To manage the MOD's spurs emission, the sub-GHz band is divided into lower (from 54 to 432 MHz) and upper (from 432 to 864 MHz) sub-bands, served by two-stage 14P-HRM and 6P-HRM, respectively. The two-stage HRM realization [18] desensitizes the HRR to gain mismatch thanks to the multiplicative effect of stage errors. The first stage involves pre-gain ratios embedded into the BB first-order passive-RC low-pass filters (LPF<sub>BB</sub>), whereas the second stage features post-gain ratios realized with the RF second-order $G_m - C$ low-pass filters (LPF<sub>LB/HB</sub>). To generate a staircase LO waveform with certain harmonics not present, the two-stage 14P-HRM is driven by a 16-phase LO for 22.5° phase-shift resolution. With it, LO harmonics up to the 13th can be rejected $(13 \times 54 \text{ MHz} = 702 \text{ MHz})$ . The selected pre-gain ratios are [1/5:1/2:5/8:3/5:5/8:1/2:1/5] to fit the realization of passive-RC LPF<sub>BB</sub> with internal resistive dividers [17]. The required post-gain ratios at RF are integer values [4:3:4] easing the component matching. 
They together approximate the irrational gain ratios $[\cos(3\pi/8):\cos(\pi/4):\cos(\pi/8):1:\cos(\pi/8):\cos(\pi/4):\cos(3\pi/8)]$ with <0.1% relative error. Similarly, the pre-gain ratios are [2/5:3/5:2/5] and post-gain ratios are [5:7:5] for the two-stage 6P-HRM. Together they approach the irrational gain ratios $[\cos(\pi/4):1:\cos(\pi/4)]$ with <0.1% relative error. Driven by an 8-phase LO with 45° phase resolution, the LO harmonics up to the fifth can be rejected $(5 \times 432 \text{ MHz} = 2.16 \text{ GHz})$.

Fig. 2. Proposed wideband TX + PA with automatic I/Q-LOFT calibration in a closed-loop form.

To relax the phase accuracy of the required 8-/16-phase LO generator (LOG) and handle the remaining harmonics (e.g., intrinsic HRR$_{15}\approx 24$ dB), post-RF filtering by LPF$_{LB/HB}$ is employed; simulation results show that it contributes another 19-dB (45-dB) rejection to HRR$_3$ (HRR$_{15}$). Hence, for an overall HRR of 60 dB, the matching accuracy of the LOG is relaxed to secure only 41-dB HRR. Two chains of direct-injection-locked 4-/8-phase phase correctors (4PC/8PC) are incorporated with even-ratio-only frequency dividers to cover the sub-GHz band, while relaxing the frequency and tuning range of the reference LO (LO$_{\rm ref}$). The six-stage 8PC chain generates the 8-phase LO for the upper sub-band, whereas the five-stage 4PC chain with division offers the 16-phase LO for the lower sub-band. To meet the relaxed HRR$_3$ target of 41 dB, the required root-mean-square (rms) phase error for the whole band is $<2.0^{\circ}$.

The $\mathrm{HD_3}$ generated by the MOD can be handled by $\mathrm{LPF_{LB/HB}}$, rendering the overall $\mathrm{HD_3}$ dominated by the PA. To suppress it, a highly linear inverter-based driver amplifier ($\mathrm{DA_{PA}}$) followed by an MCL-PA is proposed. The latter employs a specific low-Z ground at the second harmonic to further suppress its $\mathrm{HD_3}$.
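Returning to the two-stage 6P-HRM ratios quoted earlier in this section, the <0.1% figure can be sanity-checked numerically. The sketch below is only an illustration under a stated assumption: it treats the two stages as combining by discrete convolution of the pre-gain and post-gain weight sequences (the usual model for two-stage harmonic-rejection mixers), and compares the resulting side-to-centre weight with $\cos(\pi/4)$. The function names are hypothetical, and the combination rule is not confirmed by the text.

```python
import math


def convolve(a, b):
    """Discrete convolution of two weight sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def six_path_ratio_error():
    """Return (achieved side/centre ratio, relative error vs cos(pi/4))."""
    pre = [2 / 5, 3 / 5, 2 / 5]  # BB pre-gain ratios quoted in the text
    post = [5, 7, 5]             # RF post-gain ratios quoted in the text
    eff = convolve(pre, post)    # ASSUMPTION: stages combine by convolution
    # The middle three taps sit at -45, 0 and +45 degrees of LO phase.
    side, centre = eff[1], eff[2]
    achieved = side / centre     # equals 29/41
    ideal = math.cos(math.pi / 4)
    return achieved, abs(achieved - ideal) / ideal


if __name__ == "__main__":
    ratio, err = six_path_ratio_error()
    print(f"side/centre = {ratio:.6f}, relative error = {100 * err:.4f}%")
```

Under this assumption the effective ratio is 29:41:29, whose relative error against $\cos(\pi/4)$ is about 0.03%, in line with the <0.1% quoted. The two outer taps of the convolution fall at ±90°, where the weight seen by every odd harmonic is zero, so they are ignored here.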
To validate the performance of the MOD, it is followed by another driver amplifier $(DA_{\rm MOD})$ before output buffering. Also, to jointly calibrate the I/Q mismatch and LOFT in a closed-loop form, an envelope detector is added to sense the output of $DA_{\rm MOD}$ and deliver $OUT_{\rm DET}$ to the digital back-end implemented in the field-programmable gate array (FPGA). The correction parameters are applied to the BB source data, which can further include DPD to enhance the ACLR and EVM. The key building blocks are introduced below.

### III. CIRCUIT IMPLEMENTATION

# A. $G_m - C$ Low-Pass Filter (LPF<sub>LB</sub>/LPF<sub>HB</sub>)

The circuit implementation of the MOD has been detailed in [17], except the RF filters (LPF<sub>LB/HB</sub>), which are replaced here by second-order $G_m-C$ low-pass filters (Fig. 3). Their frequency response can be derived as

$$G_{Gm-C}(s) = \frac{G_{m1}G_{m3}}{C_{F1}C_{F2}s^2 + G_{m4}C_{F1}s + G_{m2}G_{m3}}$$ (1)

where the BW $(f_0)$ and quality factor (Q) are given by, respectively,

$$f_0 = \frac{1}{2\pi} \sqrt{\frac{G_{m2}G_{m3}}{C_{F1}C_{F2}}} \quad Q = \frac{1}{G_{m4}} \sqrt{\frac{G_{m2}G_{m3}C_{F2}}{C_{F1}}}.$$ (2)

To meet 19-dB HRR$_3$ at different RF frequencies while keeping a constant Q, the LPF$_{\rm LB}$ features a tunable BW from 60 to 524 MHz with $C_{F1}$ and $C_{F2}$ switched between 1.0 and 13.3 pF and between 0.828 and 6.4 pF, respectively. For the LPF$_{\rm HB}$, the tunable BW is from 520 to 1030 MHz, realized with $C_{F1}$ and $C_{F2}$ switched between 0.556 and 1.5 pF and between 82 and 393 fF, respectively. With four control bits, the 16 BW steps can be linked with the LO bands for automatic selection.

Fig. 3. Block diagram of the $G_m$-C filter for LPF<sub>LB/HB</sub>. $G_{m1}$ features a triple-input structure to realize the post-gain ratios required for the MOD.

$G_{m1-4}$ are common-source amplifiers, while $G_{m1}$ and $G_{m4}$ are resistively degenerated to improve the linearity.
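As a rough illustration of (1) and (2), the sketch below evaluates $f_0$ and $Q$ from a set of transconductances and capacitors, and the attenuation of the normalized second-order response at a multiple of $f_0$. All numerical values are hypothetical: the text gives neither the $G_m$ values nor the filter $Q$, so a Butterworth $Q = 1/\sqrt{2}$ and a cutoff placed at the LO frequency are assumed purely for the example.

```python
import math


def f0_and_q(gm2, gm3, gm4, cf1, cf2):
    """Evaluate Eq. (2): returns (f0 in Hz, Q)."""
    f0 = math.sqrt(gm2 * gm3 / (cf1 * cf2)) / (2 * math.pi)
    q = math.sqrt(gm2 * gm3 * cf2 / cf1) / gm4
    return f0, q


def rejection_db(freq_ratio, q):
    """Attenuation in dB (relative to dc) of Eq. (1) at f = freq_ratio * f0.

    Normalizing (1) by its dc gain gives w0^2 / (s^2 + (w0/Q) s + w0^2).
    """
    x = freq_ratio
    mag = 1.0 / math.sqrt((1 - x * x) ** 2 + (x / q) ** 2)
    return -20 * math.log10(mag)


if __name__ == "__main__":
    # Hypothetical element values, chosen only to exercise the formulas.
    f0, q = f0_and_q(gm2=10e-3, gm3=10e-3, gm4=10e-3, cf1=3e-12, cf2=3e-12)
    print(f"f0 = {f0 / 1e6:.1f} MHz, Q = {q:.2f}")
    # Assumed Butterworth Q with the cutoff at the LO frequency.
    print(f"rejection at 3*f0: {rejection_db(3, 1 / math.sqrt(2)):.1f} dB")
```

With these assumptions the third harmonic is attenuated by roughly 19 dB, the same order as the HRR$_3$ contribution targeted for the filter; the actual cutoff placement and $Q$ of the fabricated filter may differ.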
To support the two-stage 6P-/14P-HRM, $G_{m1}$ features a triple-input structure to realize the post-gain ratios for the lower and upper sub-bands. Unlike [17] entailing three passive-RC LPFs, here the matching is simplified to involve only three input transistors of $G_{m1}$ , minimizing the routing mismatches in layout. Although the passive-RC filter in [17] seems more power efficient (7.7 mW, plus its buffer), it induces a passband loss of $\sim$ 3 dB. The $G_m-C$ filter (22 mW) here avoids the passband loss, resulting in better TX power efficiency and linearity. For instance, at the nominal output power the $\mathrm{IM}_3$ of the passive-RC filter in [17] is -45 dBc, limited by the MOS switches for BW tuning, as they experience a larger signal swing than the proposed $G_m-C$ filter. The proposed $G_m-C$ filter shows a simulated $\mathrm{IM}_3$ of -51 dBc, while its in-band group-delay variation is compensated by the memory DPD. # B. Driver Amplifier ( $DA_{MOD}$ ) The $\mathrm{DA_{MOD}}$ is a common-source cascode amplifier with a resistive load. To deliver a -15-dBm output power with $\mathrm{IM_3} = -51$ dBc, the output $\mathrm{IP_3}(\mathrm{OIP_3})$ of the $G_m - C$ filter can be relaxed to +13 dBm thanks to the gain (10.8 dB) of the DA. For testability, the DA is followed by an on-chip buffer ( $\mathrm{Buf_{MOD}}$ ) to drive the 50- $\Omega$ equipment. The simulated $\mathrm{OIP_3}$ of the DA<sub>MOD</sub> is from +17.9 to +18.3 dBm, which implies that the linearity of the MOD + DA<sub>MOD</sub> is primarily limited by the DA<sub>MOD</sub>. # C. Multi-Stage PA-Driver Amplifier (DA<sub>PA</sub>) and MCL-PA Fig. 4(a) shows the proposed multi-stage PA. It is composed of a $\mathrm{DA_{PA}}$ followed an MCL-PA. Inside the $\mathrm{DA_{PA}}$ , it further includes a pre-driver amplifier (PDA), large-gain ( $\mathrm{DA_{LG}}$ ), and small-gain ( $\mathrm{DA_{SG}}$ ) driver amplifiers; all are inverter-like amplifiers self-biased with resistive feedback [19]. 
The HD$_3$ of the DAs is designed to be <-60 dBc at the desired output level through the feedback mechanism so as not to degrade the overall linearity of the PA [20]. The gain of the DA$_{LG}$ can be adjusted by the transconductance ratio without extra power consumption. The simulated voltage gains of the large- and small-gain paths are 19.7 and 14.9 dB, respectively.

The linearity of the output stage is guaranteed by the MCL-PA, where optimum class-A, class-B, and class-C cells are parallelized to flatten the transconductance over a large input swing, as shown in Fig. 4(b). Each transconductance is simulated with all the PA cells connected, since the cells affect each other at the shared drain node. Typically, the simulated HD<sub>3</sub> of the MCL-PA is <-50 dBc at 15-dBm output power. With 500-run PVT Monte Carlo simulations, only 0.2% of the samples have an HD<sub>3</sub> >-35 dBc. However, through a 1-bit ($v_{bB3}=1.2/1.4$ V) calibration for the thick-oxide bias, HD<sub>3</sub> <-40 dBc can be achieved for all 500 runs. Careful layout is also essential for matching the three cells. These results confirm the robustness of the MCL-PA.

Fig. 4. (a) Proposed PA features a $\mathrm{DA}_{\mathrm{PA}}$ to drive the MCL-PA. (b) Composition of the MCL-PA's transconductance.

For better power efficiency, one might use only class-B/C cells in parallel for the MCL-PA. Yet, the MCL-PA cannot be effectively linearized without a class-A cell since the harmonic distortion is mainly affected by the flatness of the transconductance at small input swing [5], [21]. To improve the power efficiency of the class-A cell, it is overdriven by increasing its input swing using the DA<sub>LG</sub>, so that the transistors in the class-A cell can be smaller. Thus, the class-A cell enters the triode region faster due to the larger overdrive voltage, which reduces its dc current. This configuration is frequently applied in constant-envelope applications [22].
For the proposed nonconstant-envelope MCL-PA, the overdriven-class-A cell also enters the triode region faster, and the linear amplification is handled by the class-B and class-C cells at large input swing. Here, the MCL-PA comprises overdriven-class-A, class-B, and class-C cells to extend the linear transconductance range [see Fig. 5(a)]. The required voltage gain of the DA<sub>LG</sub> is $1.76 \times$ that of the DA<sub>SG</sub>. Compared with the typical class-A cell, the channel widths of transistors $M_{A1p}(400 \rightarrow 150 \ \mu\text{m})$ and $M_{A2p}(270 \rightarrow 100 \ \mu\text{m})$ can be significantly reduced for the overdriven-class-A cell. By applying the overdriven-class-A cell, the simulated dc current of the MCL-PA can be reduced from 73.3 to 54.6 mA at a +10-dBm output power, and from 69.1 to 49.1 mA at the quiescent condition.

Fig. 5. (a) Schematic of the proposed MCL-PA. It features a novel dual-input structure to enhance and enlarge the linear gain range of the MCL-PA as plotted in (b) where the drain currents under one- and dual-gate inputs are compared.

The dual-gate input enhances the transconductance and enlarges the linear region of the MCL-PA at the expense of input capacitance (more tolerable in this work for sub-GHz operation). The two cascode thin-oxide transistors are both driven by DA<sub>LG</sub> and DA<sub>SG</sub>. Considering the overdriven-class-A cell, the transconductance ratio $\kappa_{dg-og}$ of the dual-gate input $G_{mAdg}$ to the one-gate input $G_{mAog}$ is calculated as

$$\kappa_{dg-og} = \frac{G_{mAdg}}{G_{mAog}} = 1 + \frac{g_{mA2p}r_{oA2p}}{g_{mA1p}r_{oA1p} + g_{mA1p}r_{oA1p}g_{mA2p}r_{oA2p}}$$ (3)

where $g_{mA1p,2p}$ is the transconductance of $M_{A1p,2p}$, and $r_{oA1p,2p}$ is the output resistance of $M_{A1p,2p}$.
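Equation (3) is straightforward to evaluate numerically. The short sketch below plugs in the simulated small-signal parameters reported for the linear region (41.34 mS, 35.83 mS, 68.83 Ω, and 82.05 Ω); the function name is hypothetical and used only for illustration.

```python
def kappa_dg_og(gm1, ro1, gm2, ro2):
    """Evaluate Eq. (3): dual-gate to one-gate transconductance ratio."""
    a1 = gm1 * ro1  # intrinsic gain of M_A1p
    a2 = gm2 * ro2  # intrinsic gain of M_A2p
    return 1 + a2 / (a1 + a1 * a2)


if __name__ == "__main__":
    # Simulated values quoted in the text (siemens and ohms).
    kappa = kappa_dg_og(gm1=41.34e-3, ro1=68.83, gm2=35.83e-3, ro2=82.05)
    print(f"kappa_dg-og = {kappa:.3f}")
```

The result rounds to 1.26, the calculated value stated in the text.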
The simulated small-signal parameters $g_{mA1p}$ , $g_{mA2p}$ , $r_{oA1p}$ , and $r_{oA2p}$ are 41.34 mS, 35.83 mS, 68.83 $\Omega$ , and 82.05 $\Omega$ at the linear region, respectively, so that $\kappa_{dg-og}=1.26$ . The simulated drain currents of the MCL-PA with dual-gate and one-gate inputs are plotted in Fig. 5(b). The transconductance ratio is 1.33 at the linear region, which matches to the calculated $\kappa_{dg-og}$ . Moreover, the linear region is also increased by the dual-gate inputs since it can prevent the stacked MOSs $(M_{A2p}, M_{A2n})$ from entering the cutoff region while keeping the bias $(V_{bA2})$ low to reduce the dc current. With a larger transconductance of the MCL-PA, the output swing of the DA<sub>PA</sub> can be reduced to improve the overall linearity. For output-impedance matching over the sub-GHz band, an off-chip wideband transformer with an impedance ratio of 1:1 was employed. The differential PA can boost up the output power and minimize the ground noise generated by the source current. To further enlarge the output power, the supply voltage $V_{\rm DDPA}$ can be elevated according to the time-dependent dielectric breakdown condition [23], i.e., the rms voltage across each transistor must not exceed the rated values given in the Process $Design\ Rule\ Manual$ . Herein the triple-cascode topology eases the use of an elevated $V_{\rm DDPA}$ . The rms voltage of the full-swing single-ended output $(v_{opap})$ is calculated for reliability concerns. From simulations, $v_{opap}$ at angular frequency $\omega_s$ can be expressed as $$\mathbf{v}_{\text{opap}} = \begin{cases} V_{\text{OPA}} - V_{\text{OPA}} \sin \omega_s t, & 0 \le t \le \frac{\pi}{\omega_s} \\ V_{\text{OPA}}, & \frac{\pi}{\omega_s} \le t \le \frac{2\pi}{\omega_s} \end{cases}$$ (4) where $V_{\mathrm{OPA}}$ is the amplitude of the differential output voltage. The rms value of $v_{opap}$ is $0.783V_{\mathrm{OPA}}$ . 
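The rms factor can be reproduced by integrating (4) over one period. The sketch below does this numerically with a midpoint rule, and also evaluates the mean (dc) value of the waveform, which the text that follows uses to relate $V_{\rm OPA}$ to $V_{\rm DDPA}$. The function name and the number of integration points are arbitrary choices for illustration.

```python
import math


def waveform_stats(n=100_000):
    """Midpoint-rule mean and rms of v_opap / V_OPA over one period of Eq. (4)."""
    total = 0.0
    total_sq = 0.0
    for k in range(n):
        theta = 2 * math.pi * (k + 0.5) / n  # omega_s * t
        # First half period: 1 - sin(theta); second half: constant 1.
        v = 1 - math.sin(theta) if theta <= math.pi else 1.0
        total += v
        total_sq += v * v
    return total / n, math.sqrt(total_sq / n)


if __name__ == "__main__":
    mean, rms = waveform_stats()
    print(f"rms / V_OPA       = {rms:.3f}")
    print(f"dc / V_OPA        = {mean:.3f}")
    print(f"V_OPA / V_DDPA    = {1 / mean:.3f}")   # when dc value equals V_DDPA
    print(f"rms / V_DDPA      = {rms / mean:.3f}")
```

In closed form, the mean is $1 - 1/\pi$ and the rms is $\sqrt{(5\pi/2 - 4)/(2\pi)}$ times $V_{\rm OPA}$, which agree with the 0.783, 1.467, and 1.149 factors quoted in the text.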
$V_{\mathrm{DDPA}}$ (2 V) is equal to the dc value of $v_{opap}$ so that $V_{\mathrm{OPA}}$ is $1.467~V_{\mathrm{DDPA}}$. Thus, the maximum rms output voltage is $1.149~V_{\mathrm{DDPA}}$. With two thin-oxide devices ($M_{A1p}$ and $M_{A2p}$) and a thick-oxide device ($M_{A3p}$) stacked together, the maximum rms drain-source voltage can be theoretically up to $4.9~\mathrm{V}$ [24].

# D. Wideband Low-Z Ground

The differential PA induces a strong second-harmonic current $(I_{2nd})$ into its ground, which results in a second-harmonic voltage over the bondwire that degrades the gain and linearity of the MCL-PA, while inducing more ground bounces to the substrate. Unlike the narrowband second-harmonic termination in [10], the proposed wideband low-Z ground [see Fig. 6(a)] involves four dc and four ac grounds in parallel to lower the ground impedance at high frequency so as to reduce the ground effect at the second harmonic of the transmitted signal. Each dc ground is connected off-chip via a bondwire, whereas each ac ground features a MOS capacitor in series with the bondwire, forming a low-impedance path for $I_{2nd}$. The bondwire is modeled as an inductor $(L_{\text{bond}})$, and thus the equivalent model of the low-Z ground is shown in Fig. 6(b). The LC networks introduce four notches at 110, 450, 850, and 1150 MHz to achieve a wideband-flat impedance, as shown in Fig. 6(c), where the result obtained with eight dc grounds is also plotted for comparison. For the proposed configuration, the low-frequency ground impedance is raised due to fewer dc grounds, but more importantly, the high-frequency ground impedance is flattened and reduced.

Fig. 6. (a) Schematic of the low-Z ground and (b) its equivalent circuit. Simulated: (c) impedance of the low-Z ground and (d) HD<sub>3</sub> using four ac-GNDs + four dc-GNDs (proposed) and eight dc-GNDs (typical).

The simulated HD<sub>3</sub> of the MCL-PA is plotted in Fig. 6(d).
Note that the input amplitude of the MCL-PA has to be increased with frequency to secure an output power >+10 dBm. The simulated HD<sub>3</sub> of the MCL-PA with low-Z ground ranges from -51.5 to -37.5 dBc, and is improved by 4.9 dB at 800 MHz in comparing with the eight dc grounds. # IV. I/Q-LOFT CALIBRATION ALGORITHM I/Q imbalance and LOFT can be calibrated by sensing the impairments via a receiver [25] or an envelope detector [26], [27]. The accuracy of the former is inherently limited by the I/Q imbalance of the receiver, while the latter requires taking fast Fourier transform (FFT) of the detected signal to extract the frequency domain information. For each calibration step, a 2048-point FFT has to be performed by an added-on DSP, complicating the realization. Moreover, a 2-D lookup table (LUT) has to be trained for calibration, which demands large power and long training time. In contrast, the proposed I/Q-LOFT calibration algorithm (Fig. 7) estimates the impairments directly in the time domain. Before the training stage, the estimated parameters of gain $(\alpha)$ and phase $(\theta)$ imbalances, and the magnitude $(\sigma)$ and phase $(\varphi)$ of the LOFT, are preset. $z^{-D1}$ indicates the total delay of the MOD and the envelope detector. A single-tone I/Q BB signal (I+jQ) is generated to train the I/Q imbalance and LOFT of the MOD. The distorted signal $w_{\rm dis}(n)$ can be modeled in BB as $$\begin{bmatrix} w_I(n) \\ w_Q(n) \end{bmatrix} = \begin{bmatrix} 1 & -\alpha \sin \theta \\ 0 & \alpha \cos \theta \end{bmatrix} \begin{bmatrix} I(n) \\ Q(n) \end{bmatrix} + \begin{bmatrix} \sigma \cos \varphi \\ \sigma \sin \varphi \end{bmatrix}. \quad (5)$$ Fig. 7. Block diagram of the digital I/Q-LOFT calibration. In order to accurately detect the small envelope signal generated by I/Q imbalance and LOFT, the envelope detector (Fig. 8) senses the envelope of $v_{\rm DAMOD}$ with a voltage gain of 11.8 dB over a 50- $\Omega$ load. 
The LOFT and I/Q image are mapped to $f_{\rm BB}$ and $2f_{\rm BB}$ in the envelope, respectively. $M_{d1}$ and $M_{d2}$ generate the envelope of the input signal and are followed by a common-source amplifier and a passive-RC low-pass filter. The 3-dB BW of the latter is set at $\sim 1$ MHz to provide >65-dB rejection at the LO's second harmonic.

Fig. 8. Envelope detector extracts the I/Q imbalance and LOFT (OUT<sub>DET</sub> shown before and after calibration).

A proper bias level $(V_{db2})$ ensures the detector is working in the linear region, and it can be modeled as

$$s(n) = |v_{\text{DAMOD}}(n)|^2. \tag{6}$$

Expanding (6) with (5) gives

$$s(n) = I(n)^{2} + \alpha^{2} Q(n)^{2} + \sigma^{2} + 2\alpha \sigma Q(n) \sin(\varphi - \theta) + 2\sigma I(n) \cos \varphi - 2\alpha I(n) Q(n) \sin \theta. \tag{7}$$

The error term $e(n)$ between the detected envelope and the input is defined as

$$e(n) = s(n) - I(n)^{2} - [\chi(n)][\eta(n)]^{T} \tag{8}$$

where $\chi(n)=[\alpha^2, \sigma\cos\varphi, \alpha\sin\theta, \sigma^2, \alpha\sigma\sin(\varphi-\theta)]$ and $\eta(n)=[Q(n)^2, 2I(n), -2I(n)Q(n), 1, 2Q(n)]$. The "1" in $\eta(n)$ accounts for the magnitude of the LOFT. The vector $\chi(n)$ is trained by the least-mean-square (LMS) algorithm with step size $\mu$ to minimize the cost function $J(n)=E[e(n)e(n)^*]$, where $(\cdot)^*$ represents the complex conjugate. The update can be expressed as

$$\chi(n+1) = \chi(n) + \mu e(n)^* \eta(n). \tag{9}$$

The impairment parameters are calculated as

$$\alpha = \sqrt{\chi_1} \quad \theta = \sin^{-1} \left( \frac{\chi_3}{\sqrt{\chi_1}} \right) \quad \sigma = \sqrt{\chi_4} \quad \varphi = \cos^{-1} \left( \frac{\chi_2}{\sqrt{\chi_4}} \right). \tag{10}$$

Note that only the term $\sigma^2$ in $\chi(n)$ is uncorrelated with the BB input, which implies that the HD<sub>3</sub> of the detector output has most of its projection on $\sigma^2$ at the LMS training stage.
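The training loop of (5)-(10) can be sketched in a few lines. The example below is a floating-point illustration only, not the FPGA implementation: it assumes an ideal square-law detector, ignores the loop delay $z^{-D1}$ and the 15-bit quantization, and orders $\chi$ as $[\alpha^2, \sigma\cos\varphi, \alpha\sin\theta, \sigma^2, \alpha\sigma\sin(\varphi-\theta)]$, which is the ordering that makes (8) and (10) consistent with (5). The impairment values and the 16-sample tone period are hypothetical; the step size 1/128 and the 12 000 training steps follow the text.

```python
import math


def simulate_lms(alpha=1.05, theta=math.radians(3.0), sigma=0.05,
                 phi=math.radians(40.0), mu=1 / 128, steps=12000, period=16):
    """Estimate (alpha, theta, sigma, phi) from the detected envelope via LMS."""
    chi = [1.0, 0.0, 0.0, 0.0, 0.0]  # presets: ideal modulator, no LOFT
    for n in range(steps):
        wt = 2 * math.pi * n / period
        i_bb, q_bb = math.cos(wt), math.sin(wt)  # single-tone BB training signal
        # Impaired modulator output, Eq. (5).
        w_i = i_bb - alpha * math.sin(theta) * q_bb + sigma * math.cos(phi)
        w_q = alpha * math.cos(theta) * q_bb + sigma * math.sin(phi)
        s = w_i * w_i + w_q * w_q  # ideal square-law detector, Eq. (6)
        eta = [q_bb * q_bb, 2 * i_bb, -2 * i_bb * q_bb, 1.0, 2 * q_bb]
        # Error between detected envelope and current model, Eq. (8).
        e = s - i_bb * i_bb - sum(c * h for c, h in zip(chi, eta))
        # LMS update, Eq. (9); all signals are real, so e* = e.
        chi = [c + mu * e * h for c, h in zip(chi, eta)]
    # Parameter recovery, Eq. (10).
    alpha_hat = math.sqrt(chi[0])
    theta_hat = math.asin(chi[2] / alpha_hat)
    sigma_hat = math.sqrt(max(chi[3], 0.0))
    phi_hat = math.acos(chi[1] / sigma_hat)
    return alpha_hat, theta_hat, sigma_hat, phi_hat


if __name__ == "__main__":
    a, t, s, p = simulate_lms()
    print(f"alpha = {a:.4f}, theta = {math.degrees(t):.2f} deg")
    print(f"sigma = {s:.4f}, phi   = {math.degrees(p):.2f} deg")
```

In this noise-free setting the estimates settle close to the injected impairments. The slowest-converging component is the one that separates $\alpha^2 Q(n)^2$ from the constant $\sigma^2$ term, since $Q(n)^2$ itself has a nonzero mean. Note that the inverse sine and cosine in (10) restrict the recoverable ranges of $\theta$ and $\varphi$.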
Thus, the HD<sub>3</sub> of the detector mainly influences the accuracy of the LOFT estimation, but not that of the I/Q imbalance.

The timing diagram for the entire calibration is shown in Fig. 9. $v_{\mathrm{DAMOD}}(n)$ and the delayed BB signal are the inputs of the LMS algorithm. $\chi(n)$ requires 15-bit resolution to estimate the impairment parameters. $\mu$ determines the convergence speed under a stable condition. The step size $\mu$ is designed to be 1/128 to reduce the computational complexity. Updating $\chi$ in each training step entails eight cycles of an 80-MHz clock. $\chi$ converges after 12 000 training steps (1.2 ms), which is considerably faster than the adaptive decorrelation method that requires from 3 to 4 ms [28]. Two Coordinate Rotation Digital Computer (CORDIC) [29] operators are exploited to calculate the square roots of $\chi_1$ and $\chi_4$, as stated in (10). For hardware savings, the CORDICs are reused twice to calculate the two divisions, arcsine and arccosine.

### V. OUTPUT NOISE FLOOR

The output noise of the MOD + DA$_{\rm MOD}$ is simulated. The in-band noise floor is -149 dBc/Hz when delivering a -5-dBm RF at 54 MHz. It is dominated by the source-follower buffers (Buf$_{\rm LB}$ and Buf$_{\rm HB}$), and the output noise induced by MOD + DA$_{\rm MOD}$ can be derived as

$$\overline{v_{n,\text{MOD+DAMOD}}^2} = A_{\text{DAMOD}}^2 \overline{v_{n,\text{MOD}}^2} = A_{\text{DAMOD}}^2 \frac{16kT}{3} \frac{(g_{mi} + g_{ml})}{(g_{mi} + r_{Oi}^{-1} + r_{Ol}^{-1})^2}$$ (11)

where $A_{\rm DAMOD}$ is the gain of the ${\rm DA_{MOD}}$, and $g_{mi}(g_{ml})$ and $r_{Oi}(r_{Ol})$ are the transconductance and output resistance of the input (load) transistors of the ${\rm Buf_{LB/HB}}$, respectively. The calculated output noise of ${\rm MOD}+{\rm DA_{MOD}}$ is -152 dBc/Hz at T=293 K. The output noise of the PA is dominated by the PDA.
From simulations, the in-band noise floor of the MOD + PA is -146 dBc/Hz when delivering a +10-dBm RF at 54 MHz, of which $\sim 56\%$ is contributed by Inv<sub>1</sub> ($M_{D1}$ and $M_{D2}$). The output-referred noise of the MOD + PA is expressed as

$$\overline{v_{n,\text{MOD+PA}}^{2}} = A_{\text{PA}}^{2} \overline{v_{n,\text{MOD}}^{2}} + A_{SF1}^{2} A_{\text{DA+MCL-PA}}^{2} \frac{8kT(g_{mD1} + g_{mD2})}{3(g_{mDL1} + g_{mDL2})^{2}}$$ (12)

where $A_{\rm PA}$ is the gain of the PA, $A_{\rm DA+MCL-PA}$ is the gain from $V_{\rm DA,in}$ to ${\rm OUT_{PA}}$ (see Fig. 4), $A_{SF1}$ is the gain of $SF_1$, and $g_{mD1,2}$ and $g_{mDL1,2}$ are the transconductances of $M_{D1,2}$ and $M_{DL1,2}$, respectively. The output noise of the MOD + PA is calculated to be $-149~{\rm dBc/Hz}$ at $T=293~{\rm K}$.

Fig. 9. Timing diagram of the proposed I/Q-LOFT calibration using time-domain parameter estimation.

Fig. 10. Chip micrograph of the fabricated TX in 65-nm CMOS.

### VI. EXPERIMENTAL RESULTS

Prototypes of the TX were fabricated in ST 65-nm CMOS. The chip micrograph is depicted in Fig. 10. The active areas are 0.54 mm$^2$ for MOD + DA$_{\rm MOD}$, and 1.03 mm$^2$ for MOD + PA. The test setup is shown in Fig. 11. In-band HRR, output $P_{\rm 1~dB}$ ($OP_{\rm 1~dB}$), and power efficiency were measured by single-tone tests, whereas $OIP_{\rm 3}$ was measured by two-tone tests. The two tones are located at 500 and 600 kHz. The BB I/Q test signals are provided by the two off-chip DACs, with a 100-MHz sampling frequency for tone generation, or a 96-MHz sampling frequency for 6-MHz BW 64-QAM OFDM signal generation. The latter input signal is used to measure the drain efficiency (DE), EVM, and ACLR. The LO$_{\rm ref}$ is generated by the Agilent E4438C signal generator. The wideband output spectrum and noise floor are measured by the Agilent N9030A signal analyzer, and the EVM is computed in MATLAB, with the data captured from the Agilent DSO91304A oscilloscope.

Fig. 11. Test setup of the MOD + PA.

Fig. 12. Measured HRR over 12 available samples.

# A. MOD + DA$_{\rm MOD}$

The MOD + DA$_{\rm MOD}$ consumes 63.8 mW at 54 MHz, and up to 87.2 mW at 864 MHz. The frequency coverage of 6P-HRM and 14P-HRM overlaps at $\sim$432 MHz. To save power, 14P-HRM was chosen to cover the 432- to 500-MHz range, at the cost of a wider LO$_{\rm ref}$ frequency range (from 0.432 to 1 GHz). The upper limit of 500 MHz is related to the stopband attenuation of the $G_m - C$ filter. The uncalibrated in-band HRR measures >59.8 dB over 12 available samples (Fig. 12). The HRR is dominated by the seventh and ninth harmonics for 54 and 108 MHz, as they are up-converted by the reference LOFT with their amplitudes proportional to that of the LO<sub>ref</sub>. The HRR is dominated by the third harmonic for frequencies >216 MHz since the LO phase accuracy degrades at higher frequencies, as discussed.

Fig. 13. Measured far-out spectrum of the $\mathrm{MOD} + \mathrm{DA_{MOD}}$ at 54 MHz.

Fig. 14. Measured IRR (upper) and LRR (lower) with and without calibration.

The wideband spectrum of the $MOD + DA_{MOD}$ at 54 MHz is shown in Fig. 13. A 55-dB-clean output spectrum is achieved, which is limited by the HD<sub>3</sub>. The displayed signal power in Fig. 13 corresponds to -5 dBm after de-embedding the losses of the test buffer, transformer (JTX-2-10T), and printed circuit board (PCB). The measured $OIP_3$ and $OP_{1 \text{ dB}}$ range from +12.6 to +17.0 dBm and from +4.3 to +7.8 dBm, respectively. The power efficiency at $OP_{1 \text{ dB}}$ is from 3.1% to 9.4%. The EVM is 1.0% at 54 MHz, 1.4% at 216 MHz, and 1.5% at 804 MHz at the output power of -5 dBm, where the memory effect of the $MOD + DA_{MOD}$ is pre-compensated in the digital BB data. The IRR and LRR were measured under 1-MHz BB differential I/Q signals generated by 12-bit DACs and are plotted in Fig. 14.
One-shot I/Q-LOFT calibration is adequate for the lower sub-band since the phase of the LOG is more accurate under the configuration of 8PC/4PC plus frequency divider. Moreover, since the LO is frequency-divided from $LO_{\rm ref}$, the LOFT is not influenced by the PCB coupling of $LO_{\rm ref}$. Thus, the detector can sense the LOFT tone accurately for calibration. In contrast, for the upper sub-band, the LOFT is influenced by the nonlinearity of the detector and the frequency-dependent direct $LO_{\rm ref}$ coupling, necessitating the use of individual calibration for different frequencies.

Fig. 15. ${\rm HD_3}$ of the envelope detector in relationship with the noncalibrated LRR (${\rm LRR_{non-cal}}$) and calibrated LRR (${\rm LRR_{cal}}$) for: (a) lower sub-band and (b) upper sub-band.

Fig. 16. Measured far-out spectrum of the MOD + PA at 54 MHz.

Fig. 17. Measured and simulated $OP_{1\ dB}$ and $OIP_{3}$ at different RF frequencies.

Fig. 18. Measured $P_{\rm out}$ and DE at: (a) 54 MHz, (b) 300 MHz, and (c) 600 MHz under a single-tone input.

TABLE I. PERFORMANCE ESTIMATION OF THE I/Q-LOFT CALIBRATION ALGORITHM IN 65-nm CMOS AT 1.2 V AND 25 $^{\circ}\text{C}$

| FPGA Operation | Algorithm | Number of Operators Used | Leakage Power (nW) | Switching Power (nW) | Number of Gates | Area (µm²) | Number of Clock Cycles |
|----------------|-----------|--------------------------|--------------------|----------------------|-----------------|------------|------------------------|
| Updating Block | LMS | 1 | 340.07 | 234879.18 | 1099 | 4954.56 | 8 × 12000 |
| Parameter Estimator and Compensator | CORDIC | 2 | 182.20 | 264631.89 | 2104 | 8107.32 | 25 × 6 |
| | Division | 2 | 102.99 | 128233.72 | 421 | 1597.44 | 9 |
| | Multiplication | 4 | 362.12 | 90595.89 | 371 | 1714.44 | 1 |
| | Add | 4 | 7.31 | 10360.43 | 36 | 149.76 | 1 |
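The area and clock-cycle columns of Table I can be tallied to see how the per-block estimates add up to a total add-on area and calibration time. The sketch below is only a bookkeeping aid under two stated assumptions: each area entry already covers all operators of that row (so rows are summed once), and all blocks run back to back on the same 80-MHz clock quoted for the LMS update.

```python
# (algorithm, area in um^2, clock cycles), transcribed from Table I.
TABLE_I = [
    ("LMS", 4954.56, 8 * 12000),
    ("CORDIC", 8107.32, 25 * 6),
    ("Division", 1597.44, 9),
    ("Multiplication", 1714.44, 1),
    ("Add", 149.76, 1),
]

CLOCK_HZ = 80e6  # assumption: the 80-MHz LMS clock drives every block


def total_area_mm2(rows):
    """Sum the per-row areas and convert um^2 to mm^2."""
    return sum(area for _, area, _ in rows) * 1e-6


def total_time_ms(rows, clock_hz=CLOCK_HZ):
    """Sum the clock cycles (sequential execution assumed) and convert to ms."""
    return sum(cycles for _, _, cycles in rows) / clock_hz * 1e3


print(f"area  = {total_area_mm2(TABLE_I):.4f} mm^2")
print(f"time  = {total_time_ms(TABLE_I):.4f} ms")
```

Under these assumptions the LMS training (8 × 12 000 cycles) accounts for almost all of the calibration time, while the CORDIC operators account for roughly half of the area.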
After calibration, both the IRR (from 18.9 to 29.0 dB $\rightarrow$ 41.3 to 51.1 dB) and the LRR (from 20.4 to 31.7 dB $\rightarrow$ 37.9 to 45.4 dB) are significantly improved, and safely meet the LRR specification (22 dB). As discussed before, the $HD_3$ of the detector influences the accuracy of the LOFT estimation, and is therefore related to both the noncalibrated LRR (LRR<sub>non-cal</sub>) and the calibrated LRR (LRR<sub>cal</sub>). Their relationships are plotted in Fig. 15(a) and (b). The gain of the detector at the upper sub-band is $\sim 1$ dB smaller than that at the lower sub-band, such that the $HD_3$ of the detector is <-35 dBc and the calibrated LRR is $\sim 40$ dB. Measured over multiple chips, the calibrated LRRs are in good agreement with the simulations. The algorithm was not implemented on chip, but the required power, area, and calibration time were estimated in Cadence Encounter and are given in Table I. The digital power for the LMS training and parameter estimation is 1.09 mW, which can be powered down after calibration. The estimated add-on area is 0.02 mm<sup>2</sup>.

Fig. 19. Measured $P_{\rm out}$ and DE for ACLR = -30 dBc with a 6-MHz BW 64-QAM OFDM input.

# B. MOD + PA

The measured wideband power spectral density (PSD) at 54 MHz is plotted in Fig. 16. Without any off-chip filtering, all harmonic distortions are <-40.0 dBc when an RF power of +15.1 dBm is delivered. Similar to [16], those harmonics may reduce the white spaces usable by other users, and system-level optimization with spectrum sensing would be required to keep them from jamming the incumbent users [30]. To this end, off-chip supply decoupling is also critical to reduce the second-harmonic current at the ground node. The overall power efficiency calculated at $OP_{1 \text{ dB}}$ is from 7.4% to 18.5%. The in-band output noise is -143 dBc/Hz (specification: -129 dBc/Hz). The measured in-band $OIP_3$ (+16.1 to +27.0 dBm) and $OP_{1 \text{ dB}}$ (+12.5 to +16.3 dBm) are plotted in Fig. 17.
They largely match the simulations, with $\sim$2-dB differences that could be caused by the PA's output impedance matching condition or process variations.

Single-tone tests measure the output power and DE of the MCL-PA, as shown in Fig. 18(a)–(c). The saturated output power is from +16.3 to +20.0 dBm and its corresponding DE ranges from 17.5% to 44.0%. The DE drops from 44.0% to 17.1% when the output power has a 6-dB back-off at 54 MHz. Note that all PA measurements are carried out up to 600 MHz since the available transformer for PA output matching at 50 $\Omega$ is from 50 to 500 MHz.

Fig. 19 plots the average output power and DE for the OFDM-signal tests, which are measured with the ACLR held at <-30.0 dBc for a 6-MHz offset. The average output power is from +10.3 to +11.7 dBm and its corresponding average DE is from 7.7% to 10.6%. The PSDs at >+10-dBm output power meet the IEEE 802.11af spectral mask, as shown in Fig. 20(a)–(c). With DPD, both the first ACLR (from -30.2 to -33.2 dB $\rightarrow$ -40.2 to -41.7 dB) and the second ACLR (from -44.0 to -49.1 dB $\rightarrow$ -45.4 to -49.2 dB) are improved, and the constellation diagrams show <3.7% EVM.

Fig. 20. Measured PSD (left) and EVM after DPD (right) at the PA's output under a 6-MHz BW 64-QAM OFDM signal (2048 subcarriers and PAPR $\approx 8.5$ dB) transmitted at: (a) 54 MHz, (b) 300 MHz, and (c) 600 MHz. All ACLR results meet the spectral mask at >+10-dBm output power and are further improvable with DPD.

Fig. 21. Relationship of EVM versus back-off power at the PA's output (6-MHz BW 64-QAM OFDM signal).
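To put the measured back-off efficiency at 54 MHz (44.0% at saturation, 17.1% at a 6-dB back-off) in context, it can be compared with the textbook scaling of ideal class-A and class-B stages. The sketch below is an illustrative bound only: it assumes the ideal relations (class-A efficiency proportional to output power, class-B efficiency proportional to output amplitude) and uses the measured saturated DE as the reference point. The MCL-PA combines several differently biased cells, so neither ideal curve is expected to describe it exactly.

```python
def db_to_power_ratio(db):
    """Convert a level in dB to a linear power ratio."""
    return 10.0 ** (db / 10.0)


def backoff_efficiency(peak_eff_pct, backoff_db, pa_class):
    """Ideal drain efficiency (%) at a given output back-off.

    Class A: DC power is constant, so efficiency scales with output power.
    Class B: DC current tracks the output amplitude, so efficiency scales
             with the square root of the output power.
    """
    ratio = db_to_power_ratio(-backoff_db)
    if pa_class == "A":
        return peak_eff_pct * ratio
    if pa_class == "B":
        return peak_eff_pct * ratio ** 0.5
    raise ValueError("pa_class must be 'A' or 'B'")


PEAK_DE = 44.0      # measured saturated DE at 54 MHz (%)
MEASURED_DE = 17.1  # measured DE at 6-dB back-off (%)

for cls in ("A", "B"):
    print(f"ideal class {cls}: {backoff_efficiency(PEAK_DE, 6.0, cls):.1f} %")
print(f"measured      : {MEASURED_DE:.1f} %")
```

With these inputs the measured value falls between the two ideal curves, which is at least consistent with a PA that mixes class-A-like and reduced-conduction-angle cells.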
TABLE II. CHIP SUMMARY AND PERFORMANCE BENCHMARK WITH THE STATE-OF-THE-ART SUB-GHZ TXs

| | This Work | This Work | JSSC'14 [16] | JSSC'13 [17] | TMTT'11 [15] |
|---|---|---|---|---|---|
| TX Architecture | Two-Stage 6P/14P-HRM + MTPA + Auto I/Q-LOFT Calibration | | 8P-HRM + Duty Cycle Control | Two-Stage 6P/14P-HRM | One-Stage 4P/6P-HRM |
| On-Chip Filtering | 2<sup>nd</sup>-order G<sub>m</sub>-C Filter | | 1<sup>st</sup>-order *RC*-Filter | Low-Q Passive-RC/CLC Filter | Active LPF w/ Tunable *LC* Notch |
| Integration Level | MOD + DA<sub>MOD</sub> | MOD + PA | MOD + DA<sub>MOD</sub> | MOD + DA<sub>MOD</sub> | MOD + DA<sub>MOD</sub> |
| RF Range (MHz) | 54 to 864 | 54 to 600 | 100 to 800 | 54 to 864 | 54 to 862 |
| Required LO<sub>REF</sub> (MHz) | 432 to 864 | 432 to 864 | 400 to 3200 | 432 to 864 | 1149 to 1724 |
| Total Power Consumption (mW) @ MHz | 63.8 @ 54<br>87.2 @ 864 | 230.4 @ 54<br>240.3 @ 600 | 129 @ 100<br>151 @ 500 | 53.1 @ 54<br>75.2 @ 864 | 171 @ 54<br>131.4 @ 862 |
| OP<sub>1dB</sub> (dBm) | 4.3 to 7.8 | 12.5 to 16.3 | 9.0 to 10.8 | -8.7 to -1.3 | 6.4 to 8.8 |
| Overall Power Efficiency at OP<sub>1dB</sub> (%) @ MHz | 7.5 @ 54<br>3.1 @ 864 | 18.5 @ 54<br>7.4 @ 600 | 9.3 @ 100<br>5.2 @ 500 | 1.4 @ 54<br>0.2 @ 864 | 4.4 @ 54<br>3.3 @ 862 |
| HRR (dB) | >59.8, 12 chips<br>(no cal.) | >59.8, 12 chips<br>(no cal.) | >40, 1 chip<br>(manual cal.) | >59.3, 16 chips<br>(no cal.) | >42, 1 chip<br>(no cal.) |
| All Harmonics (dBc) @ 1-tone O/P Power (dBm) | <-55 @ -5 | <-41 @ 15 | <-40 @ 10 | <-40 @ -2 | <-42 @ -3 |
| OIP<sub>3</sub> (dBm) | 10.6 to 17.0 | 16.1 to 27.0 | 18 to 21 | 5.5 to 8.3 | 15.9 to 21.7 |
| IRR/LRR (dB) | >37.9<br>(auto cal.) | >37.9<br>(auto cal.) | >45<br>(manual cal.) | >40.0<br>(manual cal.) | >41 |
| 6-MHz BW 64-QAM OFDM: EVM (%) @ MHz RF | 1.0 @ 54<br>1.5 @ 804<br>(P<sub>out</sub> = -5 dBm) | 3.7 @ 54<br>2.8 @ 600<br>(P<sub>out</sub> >10 dBm) | 3.2 @ 128<br>(P<sub>out</sub> = 4.6 dBm) | 2.9 @ 96<br>4.0 @ 600<br>(P<sub>out</sub> = -14 dBm) | Not Available |
| 6-MHz BW 64-QAM OFDM: ACLR (dBc) @ MHz offset | <-47 @ 6<br><-67 @ 12<br>(P<sub>out</sub> = -5 dBm) | <-40 @ 6<br><-45 @ 12<br>(P<sub>out</sub> >10 dBm) | <-35 @ 9.14<br>(P<sub>out</sub> = 4.6 dBm) | <-46 @ 6<br><-43 @ 12<br>(P<sub>out</sub> = -14 dBm) | Not Available |
| Noise (dBc/Hz) | -149 (Simulated) | -143 | -153 | -141 | -122 |
| Active Area (mm²) | 0.54 | 1.03 | 0.32 | 0.93 | 2.35 |
| Supply Voltage (V) | 1.2, 1.8 | 1.2, 2 | 1.5 | 1.2, 2 | 1.8 |
| Technology | 65 nm CMOS | | 160 nm CMOS | 65 nm CMOS | 180 nm CMOS |

Although the proposed MOD + PA is relatively linear, the EVM can still be limited by the memory effect and I/Q mismatch. The EVMs versus the back-off power with and without DPD are plotted in Fig. 20. The EVM is 7.1% at 54 MHz, 8.2% at 300 MHz, and 8.1% at 600 MHz without DPD under a 2-dB back-off from $OP_{1 \text{ dB}}$. The EVM under a <6-dB back-off has a slope of 1%/dB, limited by the nonlinearity of the PA. The EVM under a >8-dB back-off is flat and is limited by the memory effect. Unlike the multilevel-LUT DPD in [31] that requires accurate fractional delay estimation and only works for memoryless nonlinearities, the proposed DPD also addresses the memory effect, as the signal can experience significant group-delay variation at BB.
On the other hand, as the wideband AM-AM/AM-PM responses of the MOD + PA are rather consistent (confirmed by simulations), the linearity profile extracted by a one-shot DPD training at 300-MHz RF (mid-band) can be applied to the whole band. This strategy provides a promising alternative to ease the calibration (see Fig. 21). Throughout the sub-GHz band, all EVMs meet the specification (4.5%) from a 0-dB back-off onward. The slope of the EVM is 0.5% per dB under a <6-dB back-off. The minimum EVM is 1.2% under a >6-dB back-off, which is limited by the resolution of the oscilloscope when capturing the data.

# C. Performance Benchmarks

The chip summary and performance benchmark are given in Table II. This work features an integrated PA and automatic I/Q-LOFT calibration. The 8P-HRM TX [16] with RF filtering can reject all harmonics by 40 dB. However, the enforced 8-path BB input requires the DAC to interleave at $8\times$ the BB sampling frequency. Also, in order to sufficiently reject the noncancelled seventh and ninth harmonics, the duty cycle control of the LO requires manual calibration via monitoring the RF spectrum. In [15], an active-LC filter incorporated with a 6P-HRM can extend the HRR to 42 dB, but consumes 171 mW for high linearity. It also relies on an off-chip PA to deliver a +10-dBm output power. This work (MOD + PA) outperforms recent works in terms of the integration level and most performance metrics. This work also meets the key specifications of [1] such as IRR, output noise, EVM, and ACLR, as summarized in Table III.
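The efficiency, power consumption, and $OP_{1\ dB}$ entries of Table II can be cross-checked against each other. The short sketch below assumes that the overall power efficiency is defined as the RF output power at $OP_{1\ dB}$ divided by the total DC power, and it pairs the end points of the reported $OP_{1\ dB}$ ranges with the frequencies at which the efficiencies are quoted; that pairing is an assumption made here, since the table lists only ranges.

```python
import math

# (label, efficiency in %, total DC power in mW, reported OP1dB in dBm)
# Values are taken from Table II; the frequency pairing is assumed.
CASES = [
    ("MOD+PA @ 54 MHz", 18.5, 230.4, 16.3),
    ("MOD+PA @ 600 MHz", 7.4, 240.3, 12.5),
    ("MOD+DA_MOD @ 864 MHz", 3.1, 87.2, 4.3),
]


def implied_output_dbm(eff_pct, p_dc_mw):
    """RF output power implied by efficiency = P_out / P_DC, in dBm."""
    p_out_mw = eff_pct / 100.0 * p_dc_mw
    return 10.0 * math.log10(p_out_mw)


for label, eff, p_dc, reported in CASES:
    implied = implied_output_dbm(eff, p_dc)
    print(f"{label:22s} implied {implied:5.2f} dBm, reported {reported:5.1f} dBm")
```

Under these assumptions the implied output powers agree with the reported $OP_{1\ dB}$ values to within about 0.1 dB, so the three table rows are mutually consistent.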
TABLE III. KEY PERFORMANCES OF THE CHIP MEET THE IEEE 802.11AF SPECIFICATIONS [1]

| | This Work | IEEE 802.11af<br>Specifications |
|--------------------------|----------------------------------|---------------------------------|
| O/P Power (dBm) | 12.5 to 16.3 \*<br>10.3 to 11.6 # | <20 |
| IRR (dB) | 37.9 | 22 |
| O/P Noise<br>(dBc/Hz) | -143 | -129 |
| EVM (%)<br>@ MHz RF | 3.7 @ 54<br>2.8 @ 600 | 4.5 |
| ACLR (dBc) @ MHz offset | <-40 @ 6<br><-45 @ 12 | -25 @ 6<br>-40 @ 12 |

<sup>\*</sup> Test w/ single tone

<sup>#</sup> Test w/ 64-QAM OFDM (6-MHz BW)

### VII. CONCLUSION

A 65-nm CMOS TV-band white-space TX with an integrated PA for IEEE 802.11af WLAN has been described. The uncalibrated HRR reaches 59.8 dB by exploiting two-stage 6P-/14P-HRMs driven by an 8/16-phase LO, plus on-chip $G_m-C$ low-pass filtering. The MCL-PA with a dual-gate input and a wideband low-Z ground at the second harmonic enhances its power efficiency while reducing the harmonic distortion and ground bounces. The time-domain wideband-auto I/Q-LOFT calibration handles the wideband IRR/LRR (>37.9 dB). The TX + PA shows an $OP_{1\ dB}$ from +12.5 to +16.3 dBm with the corresponding power efficiency ranging from 7.4% to 18.5%. When delivering a >+10-dBm 64-QAM OFDM output, the first (second) ACLR is <-40 dBc (-45 dBc) and the EVM is within 2.8%-3.7% after DPD. The achieved performance metrics compare favorably with the state-of-the-art.

# REFERENCES

- [1] Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications Amendment 5: TV White Spaces Operation, IEEE Standard 802.11af, Dec. 2013.
- [2] A. Flores, R. Guerra, E. Knightly, P. Ecclesine, and S. Pandey, "IEEE 802.11af: A standard for TV white space spectrum sharing," *IEEE Commun. Mag.*, vol. 51, no. 10, pp. 92–100, Oct. 2013.
- [3] M. Collados, H. Zhang, B. Tenbroek, and H.-H. Chang, "A low-current digitally predistorted direct-conversion transmitter with 25% duty-cycle passive mixer," *IEEE Trans. Microw. Theory Techn.*, vol. 62, no. 4, pp. 726–731, Apr. 2014.
- [4] H. Cao, A. S. Tehrani, C. Fager, T. Eriksson, and H. Zirath, "I/Q imbalance compensation using a nonlinear modeling approach," *IEEE Trans. Microw. Theory Techn.*, vol. 57, no. 3, pp. 513–518, Mar. 2009.
- [5] A. Zhu, P. J. Draxler, J. J. Yan, T. J. Brazil, D. F. Kimball, and P. M. Asbeck, "Open-loop digital predistorter for RF power amplifiers using dynamic deviation reduction-based Volterra series," *IEEE Trans. Microw. Theory Techn.*, vol. 56, no. 7, pp. 1524–1534, Jul. 2008.
- [6] C.-F. Cheang, K.-F. Un, W.-H. Yu, P.-I. Mak, and R. P. Martins, "A combinatorial impairment-compensation digital predistorter for a sub-GHz IEEE 802.11af-WLAN CMOS transmitter covering a 10x-wide RF bandwidth," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 62, no. 4, pp. 1025–1032, Apr. 2015.
- [7] A. Behzad *et al.*, "A 4.92-5.845 GHz direct-conversion CMOS transceiver for IEEE 802.11a wireless LAN," in *Proc. IEEE RF Integr. Circuits Symp.*, Jun. 2004, pp. 335–338.
- [8] A. Behzad *et al.*, "A fully integrated MIMO multiband direct conversion CMOS transceiver for WLAN applications (802.11n)," *IEEE J. Solid-State Circuits*, vol. 42, no. 12, pp. 2795–2808, Dec. 2007.
- [9] J. Kang, K. Lee, J. Yoon, Y. Chung, S. Hwang, and B. Kim, "Differential CMOS linear power amplifier with 2nd harmonic termination at common source node," in *Proc. IEEE RF Integr. Circuits Symp.*, Jun. 2005, pp. 443–446.
- [10] J. Kang *et al.*, "A highly linear and efficient differential CMOS power amplifier with harmonic control," *IEEE J. Solid-State Circuits*, vol. 41, no. 6, pp. 1314–1322, Jun. 2006.
- [11] J. Weldon *et al.*, "A 1.75-GHz highly integrated narrow-band CMOS transmitter with harmonic-rejection mixers," *IEEE J. Solid-State Circuits*, vol. 36, no. 12, pp. 2003–2015, Dec. 2001.
- [12] R. Shrestha, E. Mensink, E. Klumperink, G. Wienk, and B. Nauta, "A multipath technique for canceling harmonics and sidebands in a wideband power upconverter," in *Int. Solid-State Circuits Conf. Tech. Dig.*, Feb. 2006, pp. 1800–1809.
- [13] S. Spiridon *et al.*, "A 375 mW multimode DAC-based transmitter with 2.2 GHz signal bandwidth and in-band IM3 < -58 dBc in 40 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 48, no. 7, pp. 1595–1604, Jul. 2013.
- [14] M. Alavi, R. Staszewski, L. Vreede, and J. Long, "A wideband 2 × 13-bit all-digital I/Q RF-DAC," *IEEE Trans. Microw. Theory Techn.*, vol. 62, no. 4, pp. 732–752, Apr. 2014.
- [15] J. Kim, S. Lee, S. Kim, J. Ha, Y. Eo, and H. Shin, "A 54–862-MHz CMOS transceiver for TV-band white-space device applications," *IEEE Trans. Microw. Theory Techn.*, vol. 59, no. 4, pp. 966–977, Apr. 2011.
- [16] S. Subhan, E. Klumperink, A. Ghaffari, G. Wienk, and B. Nauta, "A 100–800 MHz 8-path polyphase transmitter with mixer duty-cycle control achieving < -40 dBc for all harmonics," *IEEE J. Solid-State Circuits*, vol. 49, no. 3, pp. 595–607, Mar. 2014.
- [17] K.-F. Un, P.-I. Mak, and R. P. Martins, "A 53-to-75-mW, 59.3-dB HRR, TV-band white-space transmitter using a low-frequency reference LO in 65-nm CMOS," *IEEE J. Solid-State Circuits*, vol. 48, no. 8, pp. 2078–2089, Sep. 2013.
- [18] Z. Ru, N. A. Moseley, E. Klumperink, and B. Nauta, "Digitally enhanced software-defined radio receiver robust to out-of-band interference," *IEEE J. Solid-State Circuits*, vol. 44, no. 12, pp. 3359–3375, Dec. 2009.
- [19] G.-Z. Qi, K.-F. Un, W.-H. Yu, P.-I. Mak, and R. P. Martins, "A wideband multi-stage inverter-based driver amplifier for IEEE 802.22 WRAN transmitters," in *IEEE 5th Asia Quality Electron. Design Symp.*, Aug. 2013, pp. 6–9.
- [20] Z. Ru, E. Klumperink, C. Saavedra, and B. Nauta, "A 300–800 MHz tunable filter and linearized LNA applied in a low-noise harmonic rejection RF-sampling receiver," *IEEE J. Solid-State Circuits*, vol. 45, no. 5, pp. 967–978, May 2010.
- [21] C. Lee, W. Ma, and N. Wang, "Averaging and cancellation effect of high-order nonlinearity of a power amplifier," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 54, no. 12, pp. 2733–2740, Dec. 2007.
- [22] S. Cripps, *RF Power Amplifiers for Wireless Communications*, 2nd ed. Norwood, MA, USA: Artech House, 2006.
- [23] P.-I. Mak and R. P. Martins, "A $2 \times V_{DD}$-enabled mobile-TV RF front-end with TV-GSM interoperability in 1-V 90-nm CMOS," *IEEE Trans. Microw. Theory Techn.*, vol. 58, no. 7, pp. 1664–1676, Jul. 2010.
- [24] P.-I. Mak and R. P. Martins, *High-/Mixed-Voltage Analog and RF Circuit Techniques for Nanoscale CMOS*. Berlin, Germany: Springer, 2012.
- [25] W.-H. Yu *et al.*, "A nonrecursive digital calibration technique for joint elimination of transmitter and receiver I/Q imbalances with minimized add-on hardware," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 60, no. 8, pp. 462–466, Aug. 2013.
- [26] C. P. Lee *et al.*, "A highly-linear direct-conversion transmit mixer transconductance stage with local oscillator feedthrough and I/Q imbalance cancellation scheme," in *Int. Solid-State Circuits Conf. Tech. Dig.*, Feb. 2006, pp. 1450–1459.
- [27] E. Lopelli, S. Spiridon, and J. van der Tang, "A 40 nm wideband direct-conversion transmitter with sub-sampling-based output power, LO feedthrough and I/Q imbalance calibration," in *Int. Solid-State Circuits Conf. Tech. Dig.*, Feb. 2011, pp. 424–425.
- [28] I. Elahi, K. Muhammad, and P. T. Balsara, "I/Q mismatch compensation using adaptive decorrelation in a low-IF receiver in 90-nm CMOS process," *IEEE J. Solid-State Circuits*, vol. 41, no. 2, pp. 395–404, Feb. 2006.
- [29] R. Andraka, "A survey of CORDIC algorithms for FPGAs," in *Proc. ACM ISFPDA*, Mar. 1998, pp. 191–200.
- [30] D. H. Mahrof, E. Klumperink, J. Haartsen, and B. Nauta, "On the effect of spectral location of interferers on linearity requirements for wideband cognitive radio receivers," in *IEEE New Frontiers Dynam. Spectrum Symp.*, Apr. 2010, pp. 1–9.
- [31] K. Kwon, H. Li, Y. Chang, R. Tseng, and Y. Chiu, "CMOS RF transmitter with integrated power amplifier utilizing digital equalization," in *Proc. IEEE Custom Integ. Circuits Conf.*, Sep. 2009, pp. 403–406.

**Ka-Fai Un** (S'09) received the B.Sc. degree in electrical engineering from National Taiwan University (NTU), Taipei, Taiwan, in 2007, and the M.Sc. degree in electrical and electronics engineering and Ph.D. degree from the University of Macau (UM), Macao, China, in 2009 and 2014, respectively. He is currently a Post-Doctoral Researcher with the Faculty of Science and Technology (ECE), State-Key Laboratory of Analog and Mixed-Signal VLSI, UM. His research interests are switched-capacitor circuits and wireless circuits design. Dr. Un was the recipient of the 2003 Macau Mathematics Olympics and represented Macau in the Chinese Mathematics Olympics (CMO) and the International Mathematics Olympics (IMO), in Changsha, China, and Tokyo, Japan, respectively. He was also the recipient of the 2008 APCCAS Merit Student Paper Certificate.

**Wei-Han Yu** (S'09) received the B.Sc. and M.Sc. degrees in electrical and electronics engineering from the University of Macau (UM), Macao SAR, China, in 2010 and 2012, respectively, and is currently working toward the Ph.D. degree at UM. He is currently with the Faculty of Science and Technology (ECE), State-Key Laboratory of Analog and Mixed-Signal VLSI, UM. His research focus is on RF and millimeter-wave transmitters, power amplifiers, digital predistortion, and electromagnetic (EM) modeling for next-generation mobile communications.

**Chak-Fong Cheang** (S'13) received the B.Sc. and M.Sc. degrees in engineering science from National Cheng Kung University (NCKU), Tainan, Taiwan, in 2008 and 2010, respectively, and is currently working toward the Ph.D. degree at the University of Macau (UM), Macao, China. He is currently with the Faculty of Science and Technology (ECE), State-Key Laboratory of Analog and Mixed-Signal VLSI, UM.
His research interests are digital predistortion and digital mitigation of RF impairments, and field-programmable gate-array (FPGA)-based embedded signal processing.

**Gengzhen Qi** received the B.Sc. and M.Sc. degrees in electrical and electronics engineering from the University of Macau (UM), Macao SAR, China, in 2011 and 2013, respectively, and is currently working toward the Ph.D. degree at UM. He is currently with the Faculty of Science and Technology (ECE), State-Key Laboratory of Analog and Mixed-Signal VLSI, UM. His research focuses on wideband-tunable CMOS RF transmitters, receivers, and power amplifiers.

**Pui-In Mak** (S'00–M'08–SM'11) received the Ph.D. degree from the University of Macau (UM), Macao, China, in 2006. He is currently an Associate Professor with the Faculty of Science and Technology, Electrical and Computer Engineering (ECE), and Associate Director (Research) of the State-Key Laboratory of Analog and Mixed-Signal VLSI, UM. His research interests are analog and RF circuits and systems for wireless, biomedical, and physical chemistry applications. His group contributed seven state-of-the-art chips at the International Solid-State Circuits Conference: wideband receivers (2011, 2014, 2015), micro-power amplifiers (2012, 2014), and ultra-low-power receivers (2013, 2014). The team also pioneered the world's first intelligent digital microfluidic technology (iDMF) with micro-nuclear magnetic resonance ($\mu$NMR) and polymerase chain reaction (PCR) capabilities. He has coauthored *Analog-Baseband Architectures and Circuits for Multistandard and Low-Voltage Wireless Transceivers* (Springer, 2007), *High-/Mixed-Voltage Analog and RF Circuit Techniques for Nanoscale CMOS* (Springer, 2012), and *Ultra-Low-Power and Ultra-Low-Cost Short-Range Wireless Receivers in Nanoscale CMOS* (Springer, 2015). Dr. Mak has served the IEEE in numerous ways, including as an Editorial Board member of IEEE Press (2014–2016), an IEEE Distinguished Lecturer (2014–2015), a member of the Board-of-Governors, IEEE Circuits and Systems Society (2009–2011), a senior editor of the *IEEE Journal on Emerging and Selected Topics in Circuits and Systems* (2014–2015), guest editor of the *IEEE RFIC Virtual Journal* (2014), and an associate editor of the *IEEE Transactions on Circuits and Systems—I: Regular Papers* (2010–2011, 2014–2015) and the *IEEE Transactions on Circuits and Systems—II: Express Briefs* (2010–2013). He is the Technical Program Committee (TPC) vice co-chair of ASP-DAC 2016. He was the corecipient of the DAC/ISSCC Student Paper Award (2005), the CASS Outstanding Young Author Award (2010), the SSCS Pre-Doctoral Achievement Awards (2014 and 2015), the National Scientific and Technological Progress Award (2011), and the Best Associate Editor of the *IEEE Transactions on Circuits and Systems—II: Express Briefs* (2012–2013). In 2005, he was bestowed the Honorary Title of Value for scientific merits by the Macau Government.

**Rui P. Martins** (M'88–SM'99–F'08) was born on April 30, 1957. He received the Bachelor (five years), Masters, and Ph.D. degrees, as well as the Habilitation for Full-Professor in electrical engineering and computers from the Department of Electrical and Computer Engineering (DECE), Instituto Superior Técnico (IST), Technical University (TU) of Lisbon, Lisbon, Portugal, in 1980, 1985, 1992, and 2001, respectively. Since October 1980, he has been with the DECE, IST, TU of Lisbon. Since 1992, he has also been with the Faculty of Science and Technology (FST), Department of Electrical and Computer Engineering, University of Macau (UM), Macao, China, where, since August 2013, he has been a Chair-Professor. From 1994 to 1997, he was the Dean of the FST. Since 1997, he has been Vice-Rector of the UM, and since 2008, the Vice-Rector (Research).
In 2003, he created the Analog and Mixed-Signal VLSI Research Laboratory, UM, which in January 2011 was elevated to the State-Key Laboratory of China (the first in Engineering in Macao, China), and of which he is the Founding Director. Prof. Rui Martins was the founding chairman of the IEEE Macau Section (2003–2005) and the IEEE Macau Joint-Chapter on CAS/COM (2005–2008) (2009 World Chapter of the Year of the IEEE Circuits and Systems Society). He was vice-president for Region 10 of the IEEE Circuits and Systems Society (2009–2011) and vice-president (World) regional activities and membership of the IEEE Circuits and Systems Society (2012–2013). He was an associate editor of the *IEEE Transactions on Circuits and Systems—II: Express Briefs* (2010–2013) and was nominated as best associate editor for 2012–2013. He was a member of the IEEE Circuits and Systems Society Fellow Evaluation Committee (Classes of 2013 and 2014). He was the recipient of two government decorations: the Medal of Professional Merit from the Macao Government (Portuguese Administration) in 1999 and the Honorary Title of Value from the Macao SAR Government (Chinese Administration) in 2001. In July 2010, he was unanimously elected as corresponding member of the Portuguese Academy of Sciences (in Lisbon), being the only Portuguese Academician living in Asia.