# A 0.016-mm<sup>2</sup> 144- $\mu$ W Three-Stage Amplifier Capable of Driving 1-to-15 nF Capacitive Load With >0.95-MHz GBW

Zushu Yan, Student Member, IEEE, Pui-In Mak, Senior Member, IEEE, Man-Kay Law, Member, IEEE, and Rui P. Martins, Fellow, IEEE

Abstract—A 0.016-mm<sup>2</sup> 144- $\mu$ W three-stage amplifier capable of driving 1-to-15-nF capacitive load ( $C_{\rm L}$ ) is described. It is optimized via combining *current-buffer Miller compensation and parasitic-pole cancellation* (via an active left-half-plane zero circuit) to extend the  $C_{\rm L}$  drivability with small power and area. Fabricated in 0.35- $\mu$ m CMOS, the minimum gain-bandwidth product (GBW), slew rate (SR) and phase margin measured over 1-to-15-nF  $C_{\rm L}$  are 0.95 MHz, 0.22 V/ $\mu$ s and 52.3°, respectively. The results at 15-nF  $C_{\rm L}$  correspond to 2.02x-improved small-signal FOM<sub>S</sub> (= GBW  $\cdot C_{\rm L}$ /Power), and 1.44x-improved large-signal FOM<sub>L</sub> (= SR  $\cdot C_{\rm L}$ /Power) with respect to prior art. The sizing and optimization are systematically guided by *Local Feedback Loop Analysis*. It is an insightful control-centric method allowing the pole-zero placements to be more analyzable and comparable at the system level.

*Index Terms*—Active LHP zero, CMOS, current buffer, current buffer Miller compensation, frequency compensation, Miller compensation, pole-zero cancellation, three-stage amplifier.

#### I. INTRODUCTION

HIGH-COLOR-DEPTH LCD drivers demand an extensive number of amplifiers to buffer the Gamma-corrected reference voltages, which have to be stabilized by nF-range capacitors to handle the glitch energy during the digital-to-analog conversion. To deal with such a large capacitive load ($C_{\rm L}$), most commercial buffer amplifiers require an external resistor (e.g., 20 $\Omega$ for $C_{\rm L} = 10$ nF [1]) in series with the output for ringing reduction. This regrettably penalizes the cost, settling time and high-frequency gain droop.

Three-stage amplifiers are a suitable candidate for precision buffering for their speed, power and area efficiencies at low voltage [2]–[10]. Among the reported solutions, the one with damping factor control [2] has shown the highest $C_{\rm L}$ drivability, up to 1 nF, but at the cost of substantial power (426 $\mu$W) and area (0.16 mm<sup>2</sup>). Although recent works feature advanced gain-bandwidth product (GBW) and slew rate (SR), the $C_{\rm L}$ drivability has not been improved (e.g., 0.15 nF in [8], 0.8 nF in [9], and 0.5 nF in [10]). This paper describes a three-stage amplifier [11] that drives a particularly large and wide range of $C_{\rm L}$ (1 to 15 nF) with optimized power (144 $\mu$W) and die size (0.016 mm<sup>2</sup>), making it well suited to compact LCD drivers [12] with different resolution targets.

Manuscript received July 26, 2012; revised September 25, 2012; accepted October 10, 2012. Date of publication January 08, 2013; date of current version January 24, 2013. This paper was approved by Associate Editor Boris Murmann. This work was supported by the Macao Science and Technology Development Fund (FDCT) and the Research Committee of University of Macau.

Z. Yan, P.-I. Mak, and M.-K. Law are with the State Key Laboratory of Analog and Mixed-Signal VLSI, University of Macau, Macao, China (e-mail: pimak@umac.mo).

R. P. Martins is with the State Key Laboratory of Analog and Mixed-Signal VLSI, University of Macau, Macao, China, and also with the Instituto Superior Técnico, TU of Lisbon, Portugal.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JSSC.2012.2229070

Conventionally, the design of frequency compensation using Direct Circuit Analysis hinges on analyzing the amplifier's open-loop transfer function once a potential topology is conceived. Yet, this approach cannot explicitly correlate the effect of each circuit element with the pole-zero composition of the transfer function. Return Ratio Analysis is an alternative that is particularly suitable for feedback circuits with unapparent feedforward and feedback networks [13]. However, when multiple feedback loops are present, different return ratio expressions may be generated for an identical loop, complicating the stability analysis. Also, at the circuit level the well-established frequency compensation techniques based on the feedback model [14] cannot be applied. In this paper, a control-centric Local Feedback Loop (LFL) Analysis expanded from [14] is described, which enables effective analysis, comparison and design of three-stage amplifiers at the system level. In particular, one crucial observation from this work is that the first non-dominant pole of an amplifier is determined by the unity-gain bandwidth (UGB) of the dominant LFL, which will be the key guiding the frequency compensation.

Guided by LFL analysis, an optimized scheme combining current-buffer Miller compensation (CBMC) and parasitic-pole cancellation is developed. Compared with prior art in Fig. 1, the achieved small-signal FOM<sub>S</sub> (GBW $\cdot C_{\rm L}$/Power) and large-signal FOM<sub>L</sub> (SR $\cdot C_{\rm L}$/Power) are improved by at least 2.02x and 1.44x, respectively.

This paper is organized as follows. Section II compares LFL analysis with direct circuit analysis. In Section III, additional design insights that are not apparent from direct circuit analysis are revealed through LFL analysis on two recent works [8], [9] with their design tradeoffs outlined. The proposed three-stage amplifier is detailed in Section IV, and the experimental results are given in Section V. Section VI concludes the paper.

#### II. DIRECT CIRCUIT ANALYSIS VERSUS LFL ANALYSIS

Emerging applications call for improved frequency compensation to deal with stringent resistive-load and capacitive-load drivabilities with small power and area. This section introduces LFL analysis as a better option for the design of three-stage amplifiers, after describing the restrictions of direct circuit analysis.

Fig. 1. Benchmark of this work against the state of the art in terms of $FOM_S$ and $FOM_L$.

#### A. Direct Circuit Analysis

Direct circuit analysis is a block-level, verification-based design approach built on Kirchhoff's laws. After deducing the I/O transfer function of the amplifier, a full description of its small-signal dynamics is obtained [15]. Investigating the pole-zero composition and frequency response determines the merits and deficiencies of different frequency compensation schemes [16]. Each trial of a new potential scheme has to undergo the same steps, rendering the whole process more trial-and-error than systematic. In particular, repeating the calculation of a high-order (e.g., $\geq 3$) transfer function is tedious, while the gained insights are limited. This inefficiency arises because the correlation between the transfer function and the circuit topology is weak in complicated frequency compensation. Although direct circuit analysis can determine the pole-zero positions, it conveys little information on how to associate them with the circuit elements. Besides, direct circuit analysis can hardly differentiate certain schemes that are architecturally equivalent but have different transfer functions. A better analysis should guide the pole-zero placements and eventually lead to the corresponding topology, not the other way around as in direct circuit analysis.

#### B. Local Feedback Loop (LFL) Analysis

In most amplifiers internal feedback loops are present, but they are seldom treated from the viewpoint of classic feedback control [14]. LFL analysis proceeds from the inner loop toward the outer loop one at a time, which is a common methodology in the design of multi-loop control systems [17]. Evaluating the *LFL's transfer function* is easier than evaluating the *amplifier's transfer function* because fewer elements are involved. More importantly, the pole-zero composition of each LFL can be directly linked to its circuit topology. Pinpointing the pole or zero to the circuit elements becomes straightforward, providing crucial insights that are not offered by direct circuit analysis.

Several key concepts of LFL analysis [18] are summarized as follows: 1) the LFL's unity-gain bandwidth (UGB) determines up to which frequency the feedback path of the LFL can still control the AC response; 2) the stability margins of the inner LFL reflect the high-frequency behavior of the next outer LFL; 3) the UGB of the upper-level LFL is mainly governed by the UGB of its inner LFL. In particular, the UGB of the dominant LFL reveals which non-dominant pole limits the amplifier's GBW. A design example is given next to illustrate the above concepts.

#### C. Design Example

Fig. 2(a) shows a two-stage amplifier with standard Miller compensation. If direct circuit analysis is employed, the I/O transfer function can be established by solving two KCL equations (at the outputs of the 1st and 2nd stages), showing that the amplifier's GBW is limited by the $G_{\rm mL}/C_{\rm L}$-pole [19]. Alternatively, LFL analysis can be applied. The LFL is cut at the input of the 2nd stage. Calculating its transfer function [$T_{\rm SMC}(j\omega)$] indicates that the LFL's UGB ($\omega_{u,SMC}$) is $G_{mL}/C_{L}$, which is exactly the limiting pole obtained by direct circuit analysis. However, interpreting the $G_{\rm mL}/C_{\rm L}$-pole as the LFL's UGB is more informative than treating it only as a limiting pole. Below $\omega_{u,SMC}$, the magnitude of $T_{SMC}(j\omega)$ is larger than unity and thus the amplifier's AC response is well controlled by the feedback path of the LFL. Beyond $\omega_{u,SMC}$ the LFL gradually becomes inactive and the feedforward path takes over. Thus, the LFL's UGB should be the factor to be maximized when conceiving new frequency compensation. Note that the other unity-gain frequency ($g_{oL}g_{o1}/G_{mL}C_{a}$) can be ignored when dealing with the GBW, since it affects only the low-frequency band of the amplifier.

Fig. 2. Two-stage amplifier and its LFL magnitude response. (a) Standard Miller compensation. (b) CBMC.
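The two unity-gain frequencies of the standard Miller compensation LFL can be checked numerically. The sketch below is only an illustration: it assumes a simplified two-pole loop model, $T_{\rm SMC}(s) = -sC_aG_{mL}/[g_{o1}g_{oL}(1+sC_L/g_{oL})(1+sC_a/g_{o1})]$, which is not written out in the text, and all component values are hypothetical.

```python
import numpy as np

# Hypothetical two-stage values (not taken from the paper), chosen so that
# both loop poles sit well below the upper unity-gain frequency.
G_mL = 640e-6   # output-stage transconductance [S]
g_o1 = 0.13e-6  # first-stage output conductance [S]
g_oL = 6.67e-6  # output-stage output conductance [S]
C_a = 2e-12     # Miller capacitor [F]
C_L = 100e-12   # load capacitor [F]

def loop_gain_mag(w):
    """|T_SMC(jw)| for the assumed two-pole loop model."""
    s = 1j * w
    T = -s * C_a * G_mL / (g_o1 * g_oL * (1 + s * C_L / g_oL) * (1 + s * C_a / g_o1))
    return np.abs(T)

w = np.logspace(1, 9, 200001)            # frequency grid [rad/s]
above = loop_gain_mag(w) > 1.0           # True where the loop is active
edges = np.flatnonzero(np.diff(above.astype(int)))  # unity-gain crossings
w_low, w_high = w[edges[0]], w[edges[-1]]

print(f"lower crossing {w_low:.3e} rad/s, estimate {g_o1 * g_oL / (G_mL * C_a):.3e}")
print(f"upper crossing {w_high:.3e} rad/s, estimate {G_mL / C_L:.3e}")
```

Under these assumptions the upper crossing lands within about 1% of $G_{mL}/C_L$ and the lower one within about 1% of $g_{oL}g_{o1}/G_{mL}C_a$, which is the behavior described for Fig. 2(a).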

CBMC can perform better frequency compensation than standard Miller compensation, as shown in Fig. 2(b). Since there are no loading and voltage-division effects from $C_{\rm a}$ and $C_{\rm p1}$, $\omega_{\rm u,CBMC}$ surpasses $\omega_{\rm u,SMC}$ by a factor of $C_{\rm a}/C_{\rm p1}$ under the same power, area and $C_{\rm L}$ [20]. Specifically, as shown in Fig. 3(a), the same GBW as that of standard Miller compensation with identical $C_{\rm L}$ can be maintained by proportionally reducing $G_{m1}$ and $C_a$. Although $\omega_{u,CBMC}$ is sacrificed, CBMC still shows better stability, power and area efficiencies than its standard Miller compensation counterpart, as evidenced in [20]. Alternatively, a smaller $C_{\rm a}$ can be selected to increase the GBW with smaller area, as shown in Fig. 3(b). As long as $\omega_{u,CBMC} > \omega_{u,SMC}$, the phase margin (PM) of CBMC is still higher than that of standard Miller compensation. Furthermore, $\omega_{u,CBMC}$ can be deliberately reduced in exchange for higher $C_L$ drivability, as illustrated in Fig. 3(c).
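The $C_{\rm a}/C_{\rm p1}$ advantage and the UGB-for-load exchange of Fig. 3(c) can be put into numbers. This is a minimal sketch with hypothetical component values; the function names are illustrative only.

```python
import math

def ugb_smc(G_mL, C_L):
    """LFL unity-gain bandwidth with standard Miller compensation [rad/s]."""
    return G_mL / C_L

def ugb_cbmc(G_mL, C_L, C_a, C_p1):
    """LFL unity-gain bandwidth with CBMC: larger by the factor C_a / C_p1."""
    return (C_a / C_p1) * G_mL / C_L

# Hypothetical values, not taken from the paper
G_mL, C_a, C_p1, C_L = 640e-6, 2e-12, 0.4e-12, 100e-12

w_smc = ugb_smc(G_mL, C_L)
w_cbmc = ugb_cbmc(G_mL, C_L, C_a, C_p1)

# Fig. 3(c) style trade: the load at which the CBMC UGB drops back to the SMC value
C_L_equal = C_L * C_a / C_p1

print(f"UGB ratio CBMC/SMC: {w_cbmc / w_smc:.2f}")
print(f"Load with equal UGB: {C_L_equal * 1e12:.0f} pF")
```

With these example values the UGB advantage is a factor of five, which can alternatively be spent on a five-times larger load at the original UGB.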

#### III. LFL ANALYSIS ON RECENT WORKS

In this section, we apply LFL analysis to two recent works [8], [9] to examine their inner pole-zero composition and reveal their pros and cons, which finally lead to the proposed solution to be described in Section IV.

Fig. 3. Design flexibilities: (a) CBMC for low power and small area; (b) CBMC with large GBW and small area; (c) CBMC with high $C_{\rm L}$ drivability.

#### A. Dual Active-Capacitive Feedback Compensation (DACFC) [9]

The Dual Active Capacitive-Feedback Compensation (DACFC) scheme is shown in Fig. 4(a). $G_{m1}$, $G_{m2}$, and $G_{mL}$ denote the three gain stages. The output conductance and lumped parasitic capacitance of each stage are $g_{\rm o1-2,L}$ and $C_{\rm p1-3}$, respectively. $C_{\rm p3}$ is embedded into $C_{\rm L}$. $G_{\rm mf1}$ and $G_{\rm mf2}$ are two feedforward stages. The two LFLs are built around $C_{\rm m}$, $G_{\rm ma1}$ and $G_{\rm ma2}$.

Fig. 4. (a) DACFC three-stage amplifier and (b) its LFL Bode plot with and without $C_1$.

LFL analysis can be first performed from the inner loop, which demonstrates that the inner LFL is almost inactive (the inner LFL with $G_{ma2}$ shows a loop gain < 1). Otherwise, the $C_{\rm L}$ drivability will be reduced to a level similar to that of a two-stage CBMC amplifier, due to the limitation from the inner LFL's UGB ($\omega_{\rm u,CBMC} = C_{\rm m}G_{\rm mL}/C_{\rm p2}C_{\rm L}$). The key merit of DACFC is the outer LFL made up of CBMC, benefiting its UGB ($\omega_{\mu,DACFC}$). However, $C_1$ has to be added to ensure the stability of the outer LFL, offsetting the increment of $\omega_{\mu,DACFC}$ offered by CBMC. While this requirement is mathematically derived in [9], it can be explained via the outer LFL transfer function [$T_{DACFC}(s)$] as

$$T_{DACFC}(s) \approx -\frac{sG_{m2}G_{mL}C_m \left(1 + s\frac{G_{mf2}C_{p2}}{G_{mL}G_{m2}}\right)}{g_{o1}g_{o2}g_{oL} \left(1 + s\frac{C_L}{g_{oL}}\right) \left(1 + s\frac{C_1}{g_{o1}}\right) \left(1 + s\frac{C_{p2}}{g_{o2}}\right) (1 + sR_mC_m)}. \tag{1}$$

During the calculation of $T_{\text{DACFC}}(s)$, three common assumptions are made: 1) the DC gain of each stage is $\gg 1$; 2) $C_{\text{L}} \gg (C_{\rm m}, C_1) \gg (C_{\rm p1}, C_{\rm p2})$; and 3) $G_{\rm ma1} = 1/R_{\rm m}$. The impact of the inner loop is negligible when $C_{\rm L}$ is sufficiently large. The outer LFL has four LHP poles, ($\omega_{\rm p1} = g_{\rm o1}/C_1$), ($\omega_{\rm p2} = g_{\rm o2}/C_{\rm p2}$), ($\omega_{\rm p3} = g_{\rm oL}/C_{\rm L}$) and ($\omega_{\rm p4} = 1/R_{\rm m}C_{\rm m}$), and two LHP zeros. One zero ($\omega_{\rm z1} = 0$) is at the origin. The other ($\omega_{\rm z2} = G_{\rm mL}G_{\rm m2}/G_{\rm mf2}C_{\rm p2}$) is at a very high frequency and can be ignored. $\omega_{\mu,\rm DACFC}$ can be computed from (1) and is given as

$$\omega_{\mu,DACFC} = \frac{G_{m2}}{g_{o2}} \frac{C_m}{C_1} \frac{G_{mL}}{C_L}. \tag{2}$$

The resultant LFL gain response is depicted in Fig. 4(b). If $C_1$ is not present, $\omega_{p1}$ will be placed at a much higher position ($g_{o1}/C_{p1}$), which will substantially degrade the LFL's PM as $\omega_{p2}$ cannot be shifted up. A degraded LFL PM will result in a pair of complex conjugate poles with a small damping factor in the amplifier's transfer function, sacrificing the transient response [15]. Also, $C_1$ cannot be oversized, as $\omega_{\mu,DACFC}$ will become smaller than the amplifier's GBW, reducing the PM. Because of the lowered $\omega_{p1}$, $\omega_{\mu,DACFC}$ decreases by a factor of $C_1/C_{p1}$ when compared with the ideal case (dashed line), and is ultimately bounded by $\omega_{p2}$. Pushing up $\omega_{p2}$ can only be achieved at the transistor level, e.g., by reducing $C_{p2}$ or increasing $g_{o2}$ via the gain reduction of $G_{m2}$ [4]. Since the impact of increasing $C_L$ is the same as that of increasing $C_1$ [as seen from (2)], the $C_L$ drivability of DACFC is constrained by the limited $\omega_{\mu,DACFC}$.
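The role of $C_1$ can be illustrated by evaluating (2) with and without it. The sketch below uses the block-level values listed in Table I; treating $C_{p1}$ as the "no $C_1$" capacitance follows the ideal case described above, and the variable names are illustrative.

```python
# Block-level values as listed in Table I
G_m2, G_mL = 90e-6, 640e-6          # [S]
g_o2 = 0.5e-6                       # [S]
C_m, C_1 = 1.2e-12, 5e-12           # [F]
C_p1, C_p2, C_L = 0.4e-12, 0.08e-12, 10e-9  # [F]

def ugb_dacfc(c_node1):
    """Eq. (2), with c_node1 being the capacitance that sets the first LFL pole."""
    return (G_m2 / g_o2) * (C_m / c_node1) * (G_mL / C_L)

w_u = ugb_dacfc(C_1)          # with C_1 added
w_u_ideal = ugb_dacfc(C_p1)   # ideal case without C_1 (dashed line in Fig. 4(b))
w_p2 = g_o2 / C_p2            # parasitic pole that bounds the UGB

print(f"UGB with C_1   : {w_u:.3e} rad/s")
print(f"UGB without C_1: {w_u_ideal:.3e} rad/s (factor {w_u_ideal / w_u:.1f})")
print(f"omega_p2       : {w_p2:.3e} rad/s")
```

With these values the ideal UGB would lie above $\omega_{p2}$, whereas adding $C_1$ lowers it by $C_1/C_{p1} = 12.5$ to a position below $\omega_{p2}$, consistent with the stability argument above.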

#### B. Impedance Adapting Compensation (IAC) [8]

The Impedance Adapting Compensation (IAC) scheme is shown in Fig. 5(a). Only one LFL is built by standard Miller compensation. Its key feature is a series network consisting of  $R_a$  and  $C_a$ , being attached at the 2nd stage's output. With similar assumptions as DACFC, as well as  $g_{o2} \ll 1/R_a$  and  $C_a \gg C_{p2}$ , the LFL transfer function  $[T_{IAC}(s)]$  is calculated as

$$T_{IAC}(s) \approx -\frac{sG_{m2}G_{mL}C_m \left(1 + sR_aC_a\right) \left(1 + s\frac{G_{mf}C_{p2}}{G_{mL}G_{m2}}\right)}{g_{o1}g_{o2}g_{oL} \left(1 + s\frac{C_L}{g_{oL}}\right) \left(1 + s\frac{C_m}{g_{o1}}\right) \left(1 + s\frac{C_a}{g_{o2}}\right) \left(1 + sR_aC_{p2}\right)}. \tag{3}$$

There are four poles and three zeros in  $T_{IAC}(s)$ . If the  $g_{o1}/C_m$ -pole is cancelled by the  $1/R_aC_a$ -zero and the effect of the high-frequency zero  $(G_{mL}G_{m2}/G_{mf}C_{p2})$  is neglected, (3) can be simplified into

$$T_{IAC}(s) \approx -\frac{sG_{m2}G_{mL}C_m}{g_{o1}g_{o2}g_{oL}\left(1 + s\frac{C_L}{g_{oL}}\right)\left(1 + s\frac{C_a}{g_{o2}}\right)(1 + sR_aC_{p2})}. \tag{4}$$

The magnitude response of  $T_{IAC}(s)$  is shown in Fig. 5(b). Also included for comparison are: 1) DACFC; 2) IAC without the series *RC* network (i.e., single Miller capacitor compensation in [4]), and 3) the case that the  $g_{o2}/C_{p2}$ -pole is eliminated. The series network not only generates a LHP zero  $(1/R_aC_a)$  to cancel the  $g_{o1}/C_m$ -pole, but also pushes the original  $g_{o2}/C_{p2}$ -pole to a lower position  $(g_{o2}/C_a)$  that increases the stability margin of the LFL. From (4), the LFL's UGB is given by

$$\omega_{\mu,IAC} = G_{m2} R_a \frac{G_{mL}}{C_L},\tag{5}$$



which reveals that $\omega_{\mu,\text{IAC}}$ can be increased by selecting a large $R_{\rm a}$. Although $R_{\rm a}$ cannot be oversized due to the third $1/R_{\rm a}C_{\rm p2}$-pole, it decouples the UGB boosting factor ($G_{\rm m2}/g_{\rm o2}$) from the limiting $g_{\rm o2}/C_{\rm p2}$-pole in DACFC, which allows $\omega_{\mu,\text{IAC}}$ to surpass $\omega_{\mu,\text{DACFC}}$ under the same design parameters. Besides, increasing $R_{\rm a}$ involves no power penalty. However, IAC still suffers from two key pitfalls: 1) since the LFL utilizes standard Miller compensation rather than CBMC, the LFL's UGB is lower at the outset. Even among standard Miller compensation schemes, owing to the introduction of the RC network, $\omega_{\mu,\text{IAC}}$ is significantly degraded in comparison with the scheme described by the dashed curve in Fig. 5(b); 2) when $R_{\rm a}$ is comparable to $1/g_{\rm o2}$, the LHP zero generation is destroyed.

Fig. 5. (a) IAC three-stage amplifier and (b) its LFL Bode plot. The DACFC's LFL and IAC's LFL without the series *RC* network are added for comparison.
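The tradeoff in sizing $R_{\rm a}$ can be seen by sweeping it in (5) and tracking the $1/R_{\rm a}C_{\rm p2}$-pole. The sketch below uses Table I values for the remaining parameters. The swept $R_{\rm a}$ values other than 1.5 MΩ are hypothetical, and the results are first-order estimates only: with $g_{o2} = 0.5$ µS and $1/R_{\rm a} \approx 0.67$ µS, the Table I sizing only loosely satisfies the $g_{o2} \ll 1/R_a$ assumption behind (3) to (5).

```python
# Table I values for the fixed parameters
G_m2, G_mL = 90e-6, 640e-6   # [S]
C_p2, C_L = 0.08e-12, 10e-9  # [F]

def iac_estimates(R_a):
    """Return (UGB from eq. (5), third pole 1/(R_a*C_p2)), both in rad/s."""
    w_u = G_m2 * R_a * G_mL / C_L
    w_p = 1.0 / (R_a * C_p2)
    return w_u, w_p

for R_a in (0.5e6, 1.0e6, 1.5e6):   # 1.5 MOhm is the Table I value
    w_u, w_p = iac_estimates(R_a)
    print(f"R_a = {R_a / 1e6:.1f} MOhm: UGB ~ {w_u:.2e} rad/s, "
          f"pole = {w_p:.2e} rad/s, pole/UGB = {w_p / w_u:.2f}")
```

The UGB estimate grows linearly with $R_{\rm a}$ while the pole drops as $1/R_{\rm a}$, so their ratio shrinks with $R_{\rm a}^2$; this is the sense in which $R_{\rm a}$ cannot be oversized.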

#### IV. PROPOSED FREQUENCY-COMPENSATION SCHEME

Based on the analysis given in Section III, the proposed solution is intended to combine the benefit of CBMC in DACFC, while avoiding the bandwidth reduction originated from the RC network in IAC. As shown in Fig. 6(a), the outer LFL is built upon CBMC, whereas the parasitic  $g_{o2}/C_{p2}$ -pole is cancelled by a properly generated LHP zero for parasitic-pole cancellation. The proposed active LHP zero circuit for this parasitic-pole cancellation transforms the low-pass RC network into high-pass via negative feedback, offering the desired LHP zero without introducing unwanted low-frequency poles.  $G_{mb2}$  offers V-to-I conversion for driving  $G_{mL}$ , as well as isolation between  $V_2$  and  $V_3$  nodes, imposing  $C_{pb}$  to be much smaller than  $C_{p2}$  (i.e., shift up the parasitic pole). In this way, the problem of degraded  $\omega_{\mu,IAC}$  owing to the passive LHP zero circuit can be solved.

#### A. LFL Transfer Function and Comparison

With assumptions similar to those for DACFC and IAC, the LFL transfer function of the proposed scheme is obtained as given in (6). Comparing (6) with (1) and (4), two new poles $P_1$ ($G_{\rm mb1}/C_z$) and $P_2$ ($1/R_zC_{\rm pb}$) are introduced by the active LHP zero block. The LFL's UGB $\omega_{\mu,\rm Proposed}$ is expressed by

$$\omega_{\mu,\text{Proposed}} = G_{m2} R_z \frac{C_m}{C_{p1}} \frac{C_z}{C_{p2}} \frac{G_{mL}}{C_L}. \tag{7}$$

From (2), (5) and (7), $\omega_{\mu,\text{Proposed}}$ is expected to exceed $\omega_{\mu,\text{DACFC}}$ and $\omega_{\mu,\text{IAC}}$ by one to two orders of magnitude under the same conditions (e.g., transconductance, output conductance, parasitic, compensation and load capacitance). This extended $\omega_{\mu,\text{Proposed}}$ can be exchanged for a higher $C_{\text{L}}$ drivability without power and area penalty. According to the magnitude responses given in Fig. 6(b), the non-dominant poles of the proposed amplifier are discussed and compared with those of DACFC and IAC as follows.

$P_1$ is the main pole constraining $\omega_{\mu,\text{Proposed}}$. It should be located beyond the counterpart limiting poles, ($g_{o2}/C_{p2}$) in DACFC and ($1/R_a C_{p2}$) in IAC, when maximizing $\omega_{\mu,\text{Proposed}}$. Compared with DACFC, the $g_{o2}/C_{p2}$-pole in our amplifier is cancelled by the active $1/R_zC_z$-zero, which is much lower than $P_1$ to maintain the high-pass characteristic. Compared with IAC, $P_1$ is also much larger than its $1/R_aC_{p2}$-pole, since a large $\omega_{\mu,\text{IAC}}$ necessitates a big $R_{\text{a}}$, penalizing the position of the $1/R_{\rm a}C_{\rm p2}$-pole under a large $C_{\rm L}$. For $P_2$, a very small $C_{\rm pb}$ makes its effect on $\omega_{\mu,\text{Proposed}}$ negligible, as it stays at a rather high frequency (5x to 10x of $\omega_{\mu,\text{Proposed}}$). Compared with IAC, $P_2$ is also much higher than its limiting $1/R_a C_{p2}$-pole if $R_z = R_a$. Finally, $P_3$ ($G_{\rm ma}/C_{\rm m}$) can be pushed sufficiently high too (e.g., 10x to 20x of $\omega_{\mu,\text{Proposed}}$) by employing a tailored wideband current buffer (to be discussed in Section IV-E). All these facts are verified by simulations next.

$$T_{\text{Proposed}}(s) \approx -\frac{sG_{m2}G_{mL}C_m}{g_{o1}g_{o2}g_{oL}\left(1+s\frac{C_L}{g_{oL}}\right)\left(1+s\frac{C_{p1}}{g_{o1}}\right)\left(1+s\frac{C_z}{G_{mb1}}\right)\left(1+sR_zC_{pb}\right)\left(1+s\frac{C_m}{G_{ma}}\right)} \tag{6}$$
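The pole placement discussed above can be checked against the block-level sizing. The sketch below evaluates (7) and the poles $P_1$ to $P_3$ using the Table I values; it is a first-order calculation and not a substitute for the simulated responses.

```python
# Block-level values as listed in Table I
G_m2, G_mL, G_mb1, G_ma = 90e-6, 640e-6, 120e-6, 500e-6   # [S]
g_o2 = 0.5e-6                                             # [S]
C_m, C_z, C_p1 = 1.2e-12, 0.8e-12, 0.4e-12                # [F]
C_p2, C_pb, C_L = 0.08e-12, 0.01e-12, 10e-9               # [F]
R_z = 240e3                                               # [Ohm]

# Eq. (7): UGB of the proposed LFL
w_u = G_m2 * R_z * (C_m / C_p1) * (C_z / C_p2) * (G_mL / C_L)

# Non-dominant LFL poles
P1 = G_mb1 / C_z
P2 = 1.0 / (R_z * C_pb)
P3 = G_ma / C_m

# Parasitic pole and the active LHP zero intended to cancel it
w_par = g_o2 / C_p2
w_zero = 1.0 / (R_z * C_z)

print(f"UGB = {w_u:.3e} rad/s")
for name, p in (("P1", P1), ("P2", P2), ("P3", P3)):
    print(f"{name} = {p:.3e} rad/s ({p / w_u:.1f}x UGB)")
print(f"zero/pole ratio for the cancellation: {w_zero / w_par:.2f}")
```

With this sizing $P_1$ sits at roughly 3.6x the UGB while $P_2$ and $P_3$ both sit at about 10x, and the active zero falls within about 20% of the parasitic pole it targets.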



Fig. 6. (a) Proposed scheme using CBMC plus parasitic-pole cancellation, and its (b) LFL bode plot. The DACFC's LFL and IAC's LFL are added for comparison.

TABLE I. BLOCK-LEVEL SIZING PARAMETERS OF IAC, DACFC, AND THE PROPOSED SCHEME

| G<sub>m1</sub> = 10 µS   | G<sub>ma</sub> = 500 µS  | g<sub>o2</sub> = 0.5 µS   | C<sub>pb</sub> = 0.01 pF | C<sub>z</sub> = 0.8 pF  |
|---------------------------|---------------------------|---------------------------|---------------------------|--------------------------|
| G<sub>m2</sub> = 90 µS   | G<sub>ma1</sub> = 500 µS | g<sub>oL</sub> = 6.67 µS  | C<sub>m</sub> = 1.2 pF   | R<sub>m</sub> = 2 kΩ    |
| G<sub>mL</sub> = 640 µS  | G<sub>mb1</sub> = 120 µS | g<sub>ob</sub> = 0.5 µS   | C<sub>L</sub> = 10 nF    | R<sub>a</sub> = 1.5 MΩ  |
| G<sub>mf</sub> = 640 µS  | G<sub>mb2</sub> = 120 µS | C<sub>p1</sub> = 0.4 pF   | C<sub>1</sub> = 5 pF     | R<sub>z</sub> = 240 kΩ  |
| G<sub>mf2</sub> = 640 µS | g<sub>o1</sub> = 0.13 µS | C<sub>p2</sub> = 0.08 pF  | C<sub>a</sub> = 8 pF     |                          |

#### B. Block-Level Simulation Verification

Block-level simulations are employed to compare the performances between DACFC, IAC and the proposed schemes. The parameters are sized under the same power budget,  $C_{\rm L}$  and LFL's PM, as summarized in Table I.

Unlike DACFC and IAC, which require large-sized components $(C_1 = 5 \text{ pF}, C_a = 8 \text{ pF} \text{ and } R_a = 1.5 \text{ M}\Omega)$, the proposed scheme uses manageably smaller ones $(C_z = 0.8 \text{ pF} \text{ and } R_z = 240 \text{ k}\Omega)$. The LFL gain and phase responses of the three schemes are plotted in Fig. 7. With < 8° PM difference, $\omega_{\mu,\text{Proposed}}$ exceeds $\omega_{\mu,\text{DACFC}}$ and $\omega_{\mu,\text{IAC}}$ by 14x and 10x, respectively. As mentioned earlier, $\omega_{\mu,\text{Proposed}}$, $\omega_{\mu,\text{DACFC}}$ and $\omega_{\mu,\text{IAC}}$ are the limiting factors of their corresponding amplifier's GBW. Thus, compared with DACFC and IAC, the proposed scheme should show higher $C_{\text{L}}$ drivability under the same GBW, or a larger GBW under the same $C_{\text{L}}$.

#### C. Design Equations

For simplicity, the influences of  $P_2$  and  $P_3$  on the LFL are first ignored. Assuming that K is the ratio of  $P_1$  to  $\omega_{\mu,\text{Proposed}}$ , the LFL's PM can be approximately given by [21],

$$PM_{LFL} \approx 90^{\circ} - \arctan \frac{\omega_{\mu, \text{Proposed}}}{P_1} = 90^{\circ} - \arctan \frac{1}{K}. \tag{8}$$
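As a quick numerical reading of (8), the short sketch below tabulates the LFL phase margin for a few values of $K$; the chosen $K$ values are only examples.

```python
import math

def pm_lfl_deg(K):
    """Eq. (8): LFL phase margin in degrees for K = P1 / UGB (P2, P3 ignored)."""
    return 90.0 - math.degrees(math.atan(1.0 / K))

for K in (1, 2, 4, 10):
    print(f"K = {K:>2}: PM_LFL = {pm_lfl_deg(K):.1f} deg")
```

For $K = 4$ this gives about 76°.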

Fig. 7. Simulated LFL gain and phase responses of DACFC, IAC, and the proposed scheme.

The amplifier's transfer function can be obtained with the aid of the signal-flow graph (SFG) and driving-point impedance (DPI) methodology [22] as given by (9), where $A_{\rm f}$ is ($G_{\rm m2}G_{\rm mL}/G_{\rm mf}g_{\rm o2}$), $A_{\rm DC}$ is the DC gain ($G_{\rm m1}G_{\rm m2}G_{\rm mL}/g_{\rm o1}g_{\rm o2}g_{\rm oL}$), and $\omega_{\rm pd}$ is the dominant pole ($g_{\rm o1}g_{\rm o2}g_{\rm oL}/C_{\rm m}G_{\rm m2}G_{\rm mL}$). Hence, the GBW is ($G_{\rm m1}/C_{\rm m}$). The damping factor $\zeta$ and natural frequency $\omega_{\rm n}$ of the second-order polynomial in the denominator of (9) can be characterized by the LFL parameters $\omega_{\mu,\rm Proposed}$ and $K$ as

$$\zeta = \frac{\sqrt{K}}{2} \tag{10}$$

$$\omega_n = \sqrt{K} \omega_{\mu, \text{Proposed}}. \tag{11}$$

The exact relationship among GBW,  $\zeta$ , and  $\omega_n$  can be determined by a proper set of coefficients for the denominator of the 3rd-order closed-loop transfer function, which is obtained by configuring the amplifier in unity-gain feedback (e.g., Butterworth coefficients). Alternatively, a more design-oriented approach is to link up the LFL parameters ( $\omega_{\mu, Proposed}$  and  $PM_{LFL}$ ) to those of the amplifier (GBW and PM) as given by

$$\begin{aligned}
PM_{\text{Overall}} &\approx 90^{\circ} - \arctan \frac{2\zeta \left(\frac{GBW}{\omega_n}\right)}{1 - \left(\frac{GBW}{\omega_n}\right)^2} \\
&= 90^{\circ} - \arctan \frac{\frac{GBW}{\omega_{\mu,\text{Proposed}}}}{1 - \frac{1}{\tan(PM_{LFL})} \left(\frac{GBW}{\omega_{\mu,\text{Proposed}}}\right)^2}.
\end{aligned} \tag{12}$$
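The two forms of (12) can be cross-checked numerically. The sketch below evaluates both for a normalized UGB, using (10) and (11) for $\zeta$ and $\omega_n$; the normalization and the example point ($K = 4$, UGB equal to four times the GBW) are choices made here for illustration.

```python
import math

def zeta_and_wn(K, w_u):
    """Eqs. (10) and (11): damping factor and natural frequency."""
    return math.sqrt(K) / 2.0, math.sqrt(K) * w_u

def pm_from_zeta(gbw, zeta, w_n):
    """First form of (12), in degrees."""
    r = gbw / w_n
    return 90.0 - math.degrees(math.atan2(2.0 * zeta * r, 1.0 - r * r))

def pm_from_lfl(gbw, w_u, pm_lfl_deg):
    """Second form of (12), in degrees, written with the LFL parameters."""
    x = gbw / w_u
    inv_k = 1.0 / math.tan(math.radians(pm_lfl_deg))
    return 90.0 - math.degrees(math.atan2(x, 1.0 - inv_k * x * x))

K = 4.0
w_u = 1.0            # normalized LFL UGB
gbw = w_u / 4.0      # UGB set to 4x the GBW
zeta, w_n = zeta_and_wn(K, w_u)
pm_lfl = 90.0 - math.degrees(math.atan(1.0 / K))   # eq. (8)

pm_a = pm_from_zeta(gbw, zeta, w_n)
pm_b = pm_from_lfl(gbw, w_u, pm_lfl)
print(f"zeta = {zeta:.2f}, PM_LFL = {pm_lfl:.2f} deg")
print(f"PM_Overall = {pm_a:.2f} deg (zeta form), {pm_b:.2f} deg (LFL form)")
```

Both forms agree and give a phase margin close to 76° at this design point. `atan2` is used so that the expression stays well behaved if the denominator becomes non-positive.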

With the given GBW, $PM_{LFL}$ and $PM_{Overall}$, it is possible to determine $\omega_{\mu,Proposed}$ from (12). Other parameters should be optimized to achieve the desired GBW by pushing up other LFL non-dominant poles ($P_{1-3}$). Here, to achieve 76° $PM_{LFL}$ and $PM_{Overall}$, $P_1$ is located 4x higher than $\omega_{\mu,Proposed}$ ($\zeta = 1$), and $\omega_{\mu,Proposed}$ is set as 4x of the GBW. If $P_2$ ($P_3$) is 5x (10x) beyond $\omega_{\mu,Proposed}$, $R_z$ can be determined by the estimated $C_{\rm pb}$. $G_{\rm ma}$ is set as 40x of $G_{\rm m1}$. Although this arrangement degrades ${\rm PM}_{\rm LFL}$ by 17.1°, the impact on ${\rm PM}_{\rm Overall}$ is only 4.3° as long as $\omega_{\mu,{\rm Proposed}}$ is 4x of the GBW. The optimization of $G_{\rm m2}$ and $G_{\rm mL}$ involves the power tradeoff between the 2nd and 3rd stages, and can be obtained by the estimated $C_{\rm p1}$ and $C_{\rm p2}$. Finally, $G_{\rm mf}$ should match $G_{\rm mL}$ for realizing a symmetric output stage.

$$\begin{aligned}
A_{\rm Proposed}(s) &\approx \frac{A_{DC} \left(1 + \frac{s}{A_f \cdot P_1} + \frac{s^2}{A_f \cdot P_1 \cdot P_2}\right) \left(1 + \frac{s}{P_3}\right)}{\left(1 + \frac{s}{\omega_{pd}}\right) \left(1 + \frac{s}{\omega_{\mu,\rm Proposed}} + \frac{s^2}{\omega_{\mu,\rm Proposed} \cdot P_1} + \frac{s^3}{\omega_{\mu,\rm Proposed} \cdot P_1 \cdot P_2} + \frac{s^4}{\omega_{\mu,\rm Proposed} \cdot P_1 \cdot P_2 \cdot P_3}\right)} \\
&\approx \frac{A_{DC}}{\left(1 + \frac{s}{\omega_{pd}}\right) \left(1 + 2\zeta \left(\frac{s}{\omega_n}\right) + \left(\frac{s}{\omega_n}\right)^2\right)} \\
&= \frac{A_{DC}}{\left(1 + \frac{s}{\omega_{pd}}\right) \left(1 + \frac{s}{\omega_{\mu,\rm Proposed}} + \frac{s^2}{K \cdot \omega_{\mu,\rm Proposed}^2}\right)}
\end{aligned} \tag{9}$$

Fig. 8. Schematic of the proposed three-stage amplifier.
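The phase-margin penalties quoted above can be estimated with a simple phase budget. The sketch below assumes that each extra pole contributes $\arctan(\omega/P)$ of phase lag at the frequency of interest, which is a first-order approximation introduced here and not a formula stated in the text.

```python
import math

def extra_phase_deg(w, poles):
    """Phase lag in degrees contributed at frequency w by real poles (first-order estimate)."""
    return sum(math.degrees(math.atan(w / p)) for p in poles)

w_u = 1.0                       # normalized LFL UGB
P2, P3 = 5.0 * w_u, 10.0 * w_u  # pole positions stated in the text
gbw = w_u / 4.0                 # UGB set to 4x the GBW

d_lfl = extra_phase_deg(w_u, (P2, P3))      # penalty seen by the LFL at its UGB
d_overall = extra_phase_deg(gbw, (P2, P3))  # penalty seen by the amplifier at GBW

print(f"PM_LFL penalty     ~ {d_lfl:.1f} deg")
print(f"PM_Overall penalty ~ {d_overall:.1f} deg")
```

Under this approximation the penalties come out at roughly 17° for the LFL and 4.3° for the overall amplifier, in line with the figures quoted above.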

#### D. Transient Response

The transient response includes the slewing and linear settling periods [23]. The SR of the proposed amplifier is mainly constrained by those of the first and final stages since the lumped parasitic capacitance  $C_{p2}$  is much smaller than  $C_m$  and  $C_L$ . Like most three-stage amplifiers reported [2]–[10], the SR is not limited by the push-pull output stage if  $C_L < 5 \text{ nF}$  (in the designed amplifier) as given by

$$SR \approx \frac{I_1}{C_m}, \tag{13}$$

where $I_1$ is the (dis)charging current for $C_m$. If $C_L$ is further increased, the SR of the output stage dominates, as its dynamic current is not adequate to support fast slewing [21], which is in line with the measured SR data (Section V). Thus, for large $C_L$ the SR of the proposed amplifier can be expressed as

$$SR \approx \frac{I_{o,\max}}{C_L}, \tag{14}$$

where  $I_{o,max}$  denotes the maximum output current available to (dis)charge  $C_{\rm L}$ .
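Equations (13) and (14) can be combined into a single estimate by taking the smaller of the two limits. The sketch below does this with hypothetical currents; `I_1` and `I_o_max` are not values given in the text and were picked only so that the crossover falls at 5 nF.

```python
import math

def slew_rate(C_L, I_1, C_m, I_o_max):
    """Smaller of the first-stage limit (13) and the output-stage limit (14), in V/s."""
    return min(I_1 / C_m, I_o_max / C_L)

# Hypothetical values for illustration only
I_1 = 0.72e-6      # first-stage (dis)charging current [A]
C_m = 1.2e-12      # compensation capacitor [F]
I_o_max = 3e-3     # maximum output current [A]

# Load at which the two limits are equal
C_L_cross = I_o_max * C_m / I_1

for C_L in (1e-9, 5e-9, 15e-9):
    sr = slew_rate(C_L, I_1, C_m, I_o_max)
    print(f"C_L = {C_L * 1e9:4.0f} nF: SR ~ {sr / 1e6:.2f} V/us")
print(f"crossover load ~ {C_L_cross * 1e9:.1f} nF")
```

Below the crossover load the estimate is flat at $I_1/C_m$; above it, the SR falls inversely with $C_L$.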

In parasitic-pole cancellation, any component variation can lead to pole-zero mismatch. As a consequence, if the resulting doublet is located well below the unity-gain frequency of the amplifier, it will introduce a slow-settling component whose magnitude is proportional to the doublet frequency, and inversely proportional to the doublet spacing [23]. Since the parasitic-pole cancellation is applied within the LFL, the doublet spacing is roughly compressed by the LFL's loop gain at the doublet frequency [24], which is 20 dB for $C_{\rm L} = 10$ nF and increases as $C_{\rm L}$ decreases in the designed amplifier. Hence, the impact of the parasitic-pole cancellation on the transient response is greatly suppressed. After the impact of the doublet is ignored, the simplified 3rd-order transfer function (9) can help to analyze the linear settling behavior, which can be fully determined by the three open-loop parameters: GBW, $\zeta$, and $\omega_n$ [25]. As the gain margin (GM) of the amplifier can be given by

$$GM_{\text{Overall}} \approx 20 \log \frac{2\zeta}{\left(\frac{GBW}{\omega_n}\right)}, \tag{15}$$

together with the $PM_{\text{Overall}}$ (12) and the GBW, they set the pattern for the linear settling. Specifically, for a given ratio of GBW to $\omega_n$, a large $GM_{\text{Overall}}$ implies a large $\zeta$, thus introducing less ringing in the step response.
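The link between $\zeta$ and the gain margin in (15) is easy to tabulate. The sketch below evaluates it for a fixed GBW-to-$\omega_n$ ratio; the ratio of 1/8 corresponds to the earlier design point ($K = 4$, UGB equal to 4x the GBW), and the second $\zeta$ value is an arbitrary comparison case.

```python
import math

def gm_overall_db(gbw, zeta, w_n):
    """Eq. (15): gain margin of the amplifier in dB."""
    return 20.0 * math.log10(2.0 * zeta / (gbw / w_n))

w_n = 2.0    # natural frequency for K = 4 with the UGB normalized to 1
gbw = 0.25   # GBW with the UGB set to 4x the GBW

for zeta in (1.0, 0.5):
    print(f"zeta = {zeta:.1f}: GM ~ {gm_overall_db(gbw, zeta, w_n):.1f} dB")
```

Halving $\zeta$ at the same GBW-to-$\omega_n$ ratio costs about 6 dB of gain margin, which matches the statement that a larger gain margin implies a larger $\zeta$.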

#### E. Circuit Implementation

Fig. 8 depicts the schematic of the proposed three-stage amplifier with the bias currents as shown. The 1st gain stage $G_{m1}$ ($M_{1-10}$) is a folded cascode structure featuring a PMOS input differential pair ($M_{1-2}$). A wideband current buffer $G_{\rm ma}$ ($M_{3-8}$, $R_{1-2}$) is embedded inside $G_{\rm m1}$. The active LHP zero circuit ($R_z$, $C_z$, $M_{13-14}$) is embodied in the 2nd gain stage $G_{m2}$ ($M_{11-14}$) to enhance the power efficiency. $M_{13}$ and $M_{14}$ realize $G_{mb1}$ and $G_{mb2}$, respectively. Connecting the gate of $M_9$ to that of $M_{12}$ results in a push-pull 2nd stage enhancing the SR at the output of $G_{m2}$ [26], and it also forms an undesired inverting current buffer ($M_5$, $M_7$, $M_9$, $M_{12}$). However, since the signal strength fed back from $V_0$ to the source of $M_7$ is much smaller than that at the source of $M_8$, the impact of the inverting current buffer can be safely ignored. Besides, although the feedforward gain stage ($M_1$, $M_9$, $M_{12}$) can create an LHP zero, its location is far beyond the amplifier's GBW. The 3rd gain stage $G_{mL}$ ($M_{15}$) is combined with another feedforward stage $G_{\rm mf}$ ($M_{16}$) to form a push-pull structure. The optimization of $G_{ma}$ and the active LHP zero are discussed as follows:

1) Wideband Current Buffer $G_{ma}$: A very large $G_{ma}$ is desired to push $P_3$ to a high frequency. The simple current buffer $(M_8)$ in Fig. 9(a) will draw considerable power to achieve the required $G_{ma}$. The regulated current buffer $(M_6 \text{ and } M_8)$ in Fig. 9(b) boosts $G_{ma}$ by a factor of $(g_{m6}r_{o6} + 1)$ as a result of the LFL formed by $M_6$, but it is hard to maintain a large $G_{ma}$ at high frequencies due to the limited bandwidth of the LFL (i.e., the parasitic pole associated with the drain of $M_6$ is significant) [20]. The employed $G_{\rm ma}$ in Fig. 8 balances the tradeoff between $G_{\rm ma}$ and bandwidth [27]. The LFL ($M_{5-6}$ and $R_{1-2}$) can provide a better controlled LFL gain ($2g_{\rm m5}R_1 + 1$) with moderately sized $R_1$, while pushing the parasitic pole beyond the LFL's UGB. The drain output impedance of $M_8$ is also boosted by the LFL gain.

Fig. 9. Possible embedded current buffers for the PMOS folded cascode input stage. (a) Common gate. (b) Regulated.
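As a rough numerical check of the LFL gain expression $(2g_{\rm m5}R_1 + 1)$, the following worked value assumes that the $g_{m5}$ and $R_{1,2}$ entries of Table III can be substituted directly; it is a first-order estimate only and ignores any loading or body-effect terms that the full circuit may introduce:

$$
2 g_{m5} R_1 + 1 \approx 2 \times (20.1\ \mu\text{S}) \times (85.7\ \text{k}\Omega) + 1 \approx 4.45
$$

Under this assumption the LFL gain is a modest single-digit factor, in line with the stated aim of a well-controlled gain that keeps the parasitic pole beyond the LFL's UGB.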

2) Active LHP Zero Circuit: Locating $P_2$ at a high frequency requires minimizing $C_{\rm pb}$ (a relatively large $R_z$ is necessary to generate the $1/R_zC_z$ zero), and therefore the active LHP zero circuit should be as compact as possible. To accomplish this, both $G_{\rm mb1}(M_{13})$ and $G_{\rm mb2}(M_{14})$ are embodied in $G_{\rm m2}$ to avoid extra parasitic capacitance. A current mirror ratio of 2:3 is designed for $M_{14}$:$M_{13}$ so as to minimize the parasitic capacitance induced by $M_{14}$ while shifting up $P_1$. As mentioned before, for nF-range $C_{\rm L}$, the slew rate of the amplifier is dominated by the maximum charging or discharging current at the output stage $(M_{15-16})$. The output stage can attain certain resistive drivability (e.g., 25 k$\Omega$) by a proper increment of its quiescent current (e.g., +44%).
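To put numbers on the two quantities discussed above, the short sketch below evaluates the $1/R_zC_z$ zero and the load current implied by slewing a 15-nF $C_{\rm L}$. It assumes the $R_z$ and $C_z$ values of Table III and the measured average SR of 0.22 V/$\mu$s; the function names are illustrative, and the ideal relation $I = C_{\rm L} \cdot dV/dt$ is used as a simplification.

```python
import math

# Values quoted in the paper (Table III and the measured average SR at 15 nF).
R_Z = 239.9e3      # ohms
C_Z = 1.219e-12    # farads
SR = 0.22e6        # V/s, i.e. 0.22 V/us
C_L = 15e-9        # farads


def lhp_zero_hz(r_z: float, c_z: float) -> float:
    """Frequency in Hz of a zero located at 1/(R_z * C_z) rad/s."""
    return 1.0 / (2.0 * math.pi * r_z * c_z)


def slewing_current(sr: float, c_load: float) -> float:
    """Current in A needed to slew c_load at rate sr, from I = C * dV/dt."""
    return sr * c_load


f_z = lhp_zero_hz(R_Z, C_Z)
i_slew = slewing_current(SR, C_L)
print(f"zero at about {f_z / 1e3:.0f} kHz")
print(f"slewing current of about {i_slew * 1e3:.1f} mA")
```

With these inputs the zero lands near 544 kHz, and the output stage must source or sink roughly 3.3 mA while slewing, which is far above the quiescent current and illustrates why the push-pull output stage sets the SR.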

#### F. Performance Over PVT Variations

The effect of temperature and process variations on the amplifier's performance with $C_{\rm L} = 15$ nF has been investigated via post-layout corner simulations. The results are summarized in Table II. At 27°C, the GBW variation remains within 17.4% of its typical value, while the phase and gain margins over the corners deviate by about 3° and 1 dB, respectively. The corresponding deviations of the average SR and settling time from their nominal values are 39.5% and 30.4%, respectively, whereas the quiescent current $I_{\rm Q}$ changes by less than 14% from its typical value. Roughly the same percentage variations relative to the typical values are observed at $-40^{\circ}$C and $125^{\circ}$C. On the other hand, a 10% reduction of the supply voltage has no significant impact on the overall performance. The minimum supply voltage, around 1.7 V, is limited by the proper operation of the current buffer.
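The percentages quoted above can be reproduced from the 27°C block of Table II. The sketch below is one way to do so, assuming that "average SR" and "average settling time" mean the arithmetic mean of the rising and falling values and that deviations are taken relative to the TT corner; the data layout and function names are illustrative.

```python
# Post-layout values at T = 27 C, transcribed from Table II.
# Tuple order: GBW (MHz), SR+ (V/us), SR- (V/us), Ts+ (us), Ts- (us), I_Q (uA)
CORNERS_27C = {
    "TT":   (0.92, 0.18, 0.20, 5.17, 5.71, 69.2),
    "FF":   (1.08, 0.26, 0.27, 3.54, 4.40, 78.6),
    "SS":   (0.78, 0.14, 0.15, 6.92, 7.27, 63.2),
    "SNFP": (1.04, 0.21, 0.16, 5.24, 5.79, 78.0),
    "FNSP": (0.82, 0.16, 0.25, 5.57, 5.58, 63.7),
}


def metrics(row):
    """Collapse a table row into GBW, average SR, average Ts and I_Q."""
    gbw, sr_pos, sr_neg, ts_pos, ts_neg, i_q = row
    return {
        "GBW": gbw,
        "SR": (sr_pos + sr_neg) / 2,
        "Ts": (ts_pos + ts_neg) / 2,
        "IQ": i_q,
    }


def worst_case_deviation(corners, typical="TT"):
    """Largest absolute deviation (in %) from the typical corner, per metric."""
    ref = metrics(corners[typical])
    worst = {key: 0.0 for key in ref}
    for name, row in corners.items():
        if name == typical:
            continue
        current = metrics(row)
        for key in ref:
            deviation = abs(current[key] - ref[key]) / ref[key] * 100
            worst[key] = max(worst[key], deviation)
    return worst


worst = worst_case_deviation(CORNERS_27C)
for key, value in worst.items():
    print(f"{key}: {value:.1f}%")
```

Under these assumptions the worst-case deviations come out at about 17.4% (GBW), 39.5% (average SR), 30.4% (average settling time) and 13.6% ($I_{\rm Q}$), matching the figures stated in the text.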

#### V. EXPERIMENTAL RESULTS

The aim of this work is to maximize the $C_{\rm L}$ drivability while maintaining the power and area comparable with the recent works [8], [9]. The optimized circuit parameters and the size of each device are given in Tables III and IV, respectively. The

TABLE II Post-Layout Simulations of the Proposed Amplifier at  $C_{\rm L}=15~{\rm nF}$ Over Process and Temperature Variations

| T = 27°C        |           |           |           |           |           |
|-----------------|-----------|-----------|-----------|-----------|-----------|
| Corner          | TT        | FF        | SS        | SNFP      | FNSP      |
| GBW (MHz)       | 0.92      | 1.08      | 0.78      | 1.04      | 0.82      |
| PM (Degree)     | 52.5      | 50.3      | 55.6      | 50.6      | 55.2      |
| GM (dB)         | 19.5      | 20.2      | 19.3      | 19.1      | 20.3      |
| SR+/SR- (V/µs)  | 0.18/0.20 | 0.26/0.27 | 0.14/0.15 | 0.21/0.16 | 0.16/0.25 |
| 1% Ts+/Ts- (µs) | 5.17/5.71 | 3.54/4.40 | 6.92/7.27 | 5.24/5.79 | 5.57/5.58 |
| $I_{\rm Q}$ (µA) | 69.2      | 78.6      | 63.2      | 78.0      | 63.7      |

| T = -40°C       |           |           |           |           |           |
|-----------------|-----------|-----------|-----------|-----------|-----------|
| Corner          | TT        | FF        | SS        | SNFP      | FNSP      |
| GBW (MHz)       | 0.98      | 1.17      | 0.83      | 1.11      | 0.87      |
| PM (Degree)     | 54.0      | 51.8      | 57.0      | 52.1      | 56.7      |
| GM (dB)         | 20.7      | 21.2      | 20.6      | 20.4      | 21.3      |
| SR+/SR- (V/µs)  | 0.18/0.26 | 0.26/0.36 | 0.14/0.20 | 0.21/0.22 | 0.16/0.33 |
| 1% Ts+/Ts- (µs) | 5.89/4.92 | 4.07/3.80 | 7.08/6.81 | 4.38/4.86 | 6.33/4.81 |
| $I_{\rm Q}$ (µA) | 62.9      | 72.1      | 57.2      | 71.5      | 57.7      |

| T = 125°C       |           |           |           |           |           |
|-----------------|-----------|-----------|-----------|-----------|-----------|
| Corner          | TT        | FF        | SS        | SNFP      | FNSP      |
| GBW (MHz)       | 0.82      | 0.94      | 0.70      | 0.92      | 0.72      |
| PM (Degree)     | 52.3      | 50.4      | 55.5      | 50.7      | 55.0      |
| GM (dB)         | 18.9      | 19.9      | 18.5      | 18.3      | 20.0      |
| SR+/SR- (V/µs)  | 0.18/0.14 | 0.27/0.20 | 0.14/0.11 | 0.22/0.12 | 0.15/0.18 |
| 1% Ts+/Ts- (µs) | 5.04/7.17 | 3.59/5.51 | 6.78/9.02 | 7.34/7.47 | 5.73/6.80 |
| $I_{\rm Q}$ (µA) | 78.0      | 89.2      | 71.7      | 87.1      | 73.6      |

TT = Typical; FF = fast NMOS/fast PMOS; SS = slow NMOS/slow PMOS; SNFP = slow NMOS/fast PMOS; FNSP = fast NMOS/slow PMOS.



Fig. 10. Chip micrograph.

prototype fabricated in 0.35- $\mu$ m CMOS occupies 0.016-mm<sup>2</sup> die size (Fig. 10).

#### A. AC and Step Responses

The measured AC responses are plotted in Fig. 11. $C_{\rm L}$ can be as large as 15 nF with an 18.1-dB gain margin and a 52.3° phase margin, and as small as 1 nF with a 9.8-dB gain margin and an 83.2° phase margin. The extrapolated DC gain is > 100 dB. The GBW is 0.95 MHz at 15-nF $C_{\rm L}$. For the step responses (Fig. 12), the averaged SR and 1% settling time ($T_{\rm S}$) measured in unity-gain configuration at 15-nF $C_{\rm L}$ are 0.22 V/$\mu$s and 4.49 $\mu$s, respectively. The overshoot appearing at 15-nF $C_{\rm L}$ is due to the SR limitation of the output stage [21].



Fig. 11. (a) Measured AC responses at $C_{\rm L} = 1 \text{ nF}$ and 15 nF. (b) Measured variation of gain and phase margins versus $C_{\rm L}$.



Fig. 12. Measured step responses at (a)  $C_{\rm L} = 1 \text{ nF}$  and (b)  $C_{\rm L} = 15 \text{ nF}$ .

TABLE III Circuit Parameters for the Proposed Amplifier

| $G_{m1} = 11\ \mu\text{S}$    | $G_{mb1} = 96.4\ \mu\text{S}$ | $g_{m5} = 20.1\ \mu\text{S}$ | $C_{\rm L} = 15\ \text{nF}$        |
|-------------------------------|-------------------------------|------------------------------|------------------------------------|
| $G_{m2} = 93.8\ \mu\text{S}$  | $G_{mb2} = 64.2\ \mu\text{S}$ | $C_m = 1.424\ \text{pF}$     | $R_z = 239.9\ \text{k}\Omega$      |
| $G_{mL} = 638.8\ \mu\text{S}$ | $g_{m8} = 46.8\ \mu\text{S}$  | $C_z = 1.219\ \text{pF}$     | $R_{1,2} = 85.7\ \text{k}\Omega$   |
| $G_{mf} = 629\ \mu\text{S}$   | $g_{mb8} = 9.6\ \mu\text{S}$  |                              |                                    |

TABLE IV Transistor Sizes

| Device         | Size (µm/µm) | Device   | Size (µm/µm)  |
|----------------|--------------|----------|---------------|
| $M_1/M_2$      | 12/2 (x8)    | $M_{13}$ | 1.5/0.6 (x3)  |
| $M_3/M_4$      | 4/2 (x3)     | $M_{14}$ | 1.5/0.6 (x2)  |
| $M_5/M_6$      | 1/0.35 (x3)  | $M_{15}$ | 1.5/0.6 (x20) |
| $M_7/M_8$      | 1/0.35 (x4)  | $M_{16}$ | 3/1 (x40)     |
| $M_9/M_{10}$   | 3/1 (x2)     | $M_{b1}$ | 6/2           |
| $M_{11}$       | 3/1 (x6)     | $M_{b1}$ | 6/2 (x2)      |
| $M_{12}$       | 3/1 (x4)     |          |               |

#### B. Noise, PSRR and Unity-Gain Responses

With the amplifier configured in unity-gain feedback, the measured output noise density spectrum [Fig. 13(a)] shows that the 1/f noise corner is close to 4 kHz and the white noise is $174 \text{ nV}/\sqrt{\text{Hz}}$ at 100 kHz, which is in good agreement with the simulated result. The discrepancy at low frequency (< 30 Hz) is due to the AC coupling capacitor (100 $\mu$F) in the test setup. From simulations, $M_{3-4}$ and $M_{9-10}$ (Fig. 8) are the major contributors to the noise, with 52.6% and 32.4%, respectively, at 100 kHz. The PSRR is around 80 dB at 1 kHz [Fig. 13(b)]. The unity-gain magnitude responses at 1-nF and 15-nF $C_{\rm L}$ are shown in Fig. 13(c). The -3-dB bandwidth at 15-nF $C_{\rm L}$ is larger due to the existence of the complex poles.

#### C. Stability Versus $C_{\rm L}$ Variability

Although the measured gain (7.8 dB) and phase (79.5°) margins are not inferior when $C_{\rm L}$ is downsized to 0.5 nF, a small ($\sim 0.9 \text{ mV}_{pp}$), long-lasting, high-frequency ($\sim 12 \text{ MHz}$) ringing appears in the step response [Fig. 14(a)], which suggests that the closed-loop transfer function has a second-order polynomial with a very small damping factor and a high damping frequency. From an LFL analysis perspective, this can be explained as follows: when $C_{\rm L}$ is significantly reduced, the damping factor in (9) decreases considerably, and so does the closed-loop damping factor. For a sufficiently reduced $C_{\rm L}$, a long-lasting ringing occurs in the step response. The degradation of the LFL's PM and GM captures the reduction in the damping factor from (9), since they are indirect indicators of the ringing. When $C_{\rm L}$ is further downsized to 0.1 nF [Fig. 14(b)], the amplifier becomes unstable both internally (LFL) and externally (unity-gain feedback), owing to the RHP poles appearing in the amplifier's transfer function. This observation is consistent with the simulated gain and phase margins of $T_{\text{proposed}}(j\omega)$ as shown in Fig. 14(c). Consequently, the lower bound of $C_{\rm L}$ should be determined by the LFL stability, while the upper bound of $C_{\rm L}$ should be judged by the stability margins of the amplifier's transfer function. This criterion cannot be drawn from conventional direct circuit analysis.
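The link between a small damping factor and long-lasting ringing can be illustrated with the textbook underdamped second-order step response. The sketch below is generic and is not derived from the amplifier's transfer function: the $\zeta$ values are hypothetical, the function name is illustrative, and the $1/\sqrt{1-\zeta^2}$ prefactor of the decay envelope is ignored for simplicity.

```python
import math


def ringing_cycles(zeta: float, tol: float = 0.01) -> float:
    """Approximate number of ringing cycles before the decay envelope
    exp(-zeta * w_n * t) of an underdamped second-order step response
    drops below `tol`."""
    if not 0.0 < zeta < 1.0:
        raise ValueError("zeta must lie in (0, 1) for an underdamped response")
    # The envelope reaches tol at t = ln(1/tol) / (zeta * w_n), and one ringing
    # period is 2*pi / (w_n * sqrt(1 - zeta^2)), so w_n cancels in the ratio.
    return math.log(1.0 / tol) * math.sqrt(1.0 - zeta ** 2) / (2.0 * math.pi * zeta)


# Hypothetical damping factors, chosen only to show the trend.
for zeta in (0.5, 0.1, 0.02):
    print(f"zeta = {zeta:<4} -> about {ringing_cycles(zeta):.1f} cycles to reach 1%")
```

The cycle count grows roughly as $1/\zeta$, so a loop whose damping factor collapses at small $C_{\rm L}$ keeps ringing for many periods even if the ringing amplitude is small.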



Fig. 13. Measured (a) output noise density, (b) PSRR, and (c) gain response in unity-gain feedback.



Fig. 14. (a) Measured step response at $C_{\rm L} = 0.5 \text{ nF}$. A high-frequency, small-amplitude ringing is superimposed on the step response, which is due to the reduced LFL stability. (b) Unstable step response at $C_{\rm L} = 0.1 \text{ nF}$. (c) Simulated gain and phase margins of $T_{\rm proposed}$.

#### D. Performance Benchmark and Robustness of Results

Table V summarizes the performance of one chip measured over different $C_{\rm L}$ and benchmarks it against three recent works.

This work not only succeeds in extending the $C_{\rm L}$ drivability to 15 nF, but also shows improved FOM<sub>S</sub> (> 2.02x) and FOM<sub>L</sub> (> 1.44x). The merits also hold for the supply-current FOM versions, i.e., IFOM<sub>S</sub> (> 3.36x) and IFOM<sub>L</sub> (> 2.4x). The

|                                                  | [9] [S. Guo JSSC Feb'11] | [9] [S. Guo JSSC Feb'11] | [8] [X. Peng JSSC Feb'11] | [10] [C. Chong JSSC Sep'12] | This Work   | This Work   | This Work   | This Work   |
|--------------------------------------------------|--------------------------|--------------------------|---------------------------|-----------------------------|-------------|-------------|-------------|-------------|
| Load $C_{\rm L}$ (pF) [// $R_{\rm L}$ (k$\Omega$)] | 500 // 25                | 800 // 25                | 150                       | 500                         | 1,000       | 5,000       | 10,000      | 15,000      |
| GBW (MHz)                                        | 4                        | 3.6                      | 4.4                       | 2                           | 1.37        | 1.24        | 1.06        | 0.95        |
| Phase Margin (°)                                 | 70                       | 58                       | 57                        | 52                          | 83.2        | 69.8        | 57.2        | 52.3        |
| Gain Margin (dB)                                 | 14*                      | 16*                      | 5*                        | 8*                          | 9.8         | 16.6        | 17.0        | 18.1        |
| Average SR (V/µs)                                | 2.2                      | 1.7                      | 1.8                       | 0.65                        | 0.59        | 0.50        | 0.30        | 0.22        |
| Average 1% Ts (µs)                               | 0.6                      | 0.7                      | 1.9                       | 1.23                        | 1.28        | 1.71        | 3.66        | 4.49        |
| DC Gain (dB) (extrapolated)                      | >100                     | >100                     | 110                       | >100                        | >100        | >100        | >100        | >100        |
| Power (µW) @ V<sub>DD</sub>                      | 260 @ 2 V                | 260 @ 2 V                | 30 @ 1.5 V                | 20.4 @ 1.2 V                | 144 @ 2 V   | 144 @ 2 V   | 144 @ 2 V   | 144 @ 2 V   |
| Output Noise Density (nV/√Hz @ 10 kHz)           | N/A                      | N/A                      | N/A                       | N/A                         | 172         | 172         | 172         | 172         |
| Total Capacitance C<sub>t</sub> (pF)             | 2.2                      | 2.2                      | 1.6                       | 1.15                        | 2.6         | 2.6         | 2.6         | 2.6         |
| Chip Area (mm<sup>2</sup>)                       | 0.014                    | 0.014                    | 0.02                      | 0.0088                      | 0.016       | 0.016       | 0.016       | 0.016       |
| Technology                                       | 0.35µm CMOS              | 0.35µm CMOS              | 0.35µm CMOS               | 65nm CMOS                   | 0.35µm CMOS | 0.35µm CMOS | 0.35µm CMOS | 0.35µm CMOS |
| FOM<sub>S</sub> [(MHz · pF)/mW]                  | 7,692                    | 11,077                   | 22,000                    | 49,020                      | 9,514       | 43,056      | 73,611      | 98,958      |
| FOM<sub>L</sub> [(V/µs · pF)/mW]                 | 4,231                    | 5,231                    | 9,000                     | 15,931                      | 4,097       | 17,361      | 20,833      | 22,917      |
| LC-FOM<sub>S</sub> (MHz/mW)                      | 3,497                    | 5,035                    | 13,750                    | 42,626                      | 3,659       | 16,560      | 28,311      | 38,061      |
| LC-FOM<sub>L</sub> [(V/µs)/mW]                   | 1,923                    | 2,378                    | 5,625                     | 13,853                      | 1,576       | 6,677       | 8,013       | 8,814       |
| IFOM<sub>S</sub> [(MHz · pF)/mA]                 | 15,384                   | 22,154                   | 33,000                    | 58,823                      | 19,028      | 86,112      | 147,222     | 197,916     |
| IFOM<sub>L</sub> [(V/µs · pF)/mA]                | 8,462                    | 10,462                   | 13,500                    | 19,118                      | 8,194       | 34,722      | 41,666      | 45,834      |

TABLE V Performance Summary and Benchmark

\* denotes extracted values from plots

$\mathrm{FOM_S} = \frac{\mathrm{GBW} \cdot C_{\rm L}}{\mathrm{Power}}$; $\mathrm{FOM_L} = \frac{\mathrm{SR} \cdot C_{\rm L}}{\mathrm{Power}}$; $\mathrm{LC\text{-}FOM_S} = \frac{\mathrm{GBW}}{\mathrm{Power}} \cdot \frac{C_{\rm L}}{C_{\rm t}}$; $\mathrm{LC\text{-}FOM_L} = \frac{\mathrm{SR}}{\mathrm{Power}} \cdot \frac{C_{\rm L}}{C_{\rm t}}$; $\mathrm{IFOM_S} = \frac{\mathrm{GBW} \cdot C_{\rm L}}{I_{\rm dd}}$; $\mathrm{IFOM_L} = \frac{\mathrm{SR} \cdot C_{\rm L}}{I_{\rm dd}}$
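The FOM definitions listed under Table V can be checked numerically. The sketch below recomputes them for this work at 15-nF $C_{\rm L}$ and for [10], using only values from Table V. It assumes that $I_{\rm dd}$ equals Power divided by $V_{\rm DD}$, which is consistent with the tabulated IFOM values but is not stated explicitly; the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Amplifier:
    gbw_mhz: float       # gain-bandwidth product
    sr_v_per_us: float   # average slew rate
    c_load_pf: float     # load capacitance C_L
    power_uw: float      # power consumption
    vdd_v: float         # supply voltage
    c_total_pf: float    # total on-chip capacitance C_t

    @property
    def power_mw(self) -> float:
        return self.power_uw / 1000.0

    @property
    def idd_ma(self) -> float:
        # Assumption: supply current taken as Power / V_DD.
        return self.power_mw / self.vdd_v


def figures_of_merit(amp: Amplifier) -> dict:
    """Evaluate the six FOMs in the units used by Table V."""
    return {
        "FOM_S": amp.gbw_mhz * amp.c_load_pf / amp.power_mw,
        "FOM_L": amp.sr_v_per_us * amp.c_load_pf / amp.power_mw,
        "LC-FOM_S": amp.gbw_mhz / amp.power_mw * amp.c_load_pf / amp.c_total_pf,
        "LC-FOM_L": amp.sr_v_per_us / amp.power_mw * amp.c_load_pf / amp.c_total_pf,
        "IFOM_S": amp.gbw_mhz * amp.c_load_pf / amp.idd_ma,
        "IFOM_L": amp.sr_v_per_us * amp.c_load_pf / amp.idd_ma,
    }


this_work = Amplifier(0.95, 0.22, 15000, 144, 2.0, 2.6)
ref_10 = Amplifier(2.0, 0.65, 500, 20.4, 1.2, 1.15)

fom_this = figures_of_merit(this_work)
fom_ref = figures_of_merit(ref_10)
ratios = {name: fom_this[name] / fom_ref[name] for name in fom_this}

for name in ("FOM_S", "FOM_L", "IFOM_S", "IFOM_L"):
    print(f"{name}: {fom_this[name]:,.0f} vs {fom_ref[name]:,.0f} ({ratios[name]:.2f}x)")
```

Under the stated assumption, the ratios against [10] come out at roughly 2.02x (FOM<sub>S</sub>), 1.44x (FOM<sub>L</sub>), 3.36x (IFOM<sub>S</sub>) and 2.40x (IFOM<sub>L</sub>), in agreement with the improvement factors quoted in the text.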

TABLE VI Measurement Results Over 20 Samples

| 20 chips, C <sub>L</sub> =15nF     | Mean    | σ      | $\frac{\sigma}{Mean}$ x 100% |
|------------------------------------|---------|--------|------------------------------|
| GBW (MHz)                          | 0.85    | 0.062  | 7.3%                         |
| Phase Margin (°)                   | 53.2    | 2.64   | 5.0%                         |
| Gain Margin (dB)                   | 19.96   | 1.42   | 7.1%                         |
| Average SR (V/μs)                  | 0.21    | 0.014  | 6.7%                         |
| Average 1% Ts (µs)                 | 4.77    | 0.21   | 4.4%                         |
| Power (µW)                         | 140     | 14     | 10.0%                        |
| FOM <sub>s</sub> [(MHz · pF)/mW]   | 89,290  | 10,888 | 12.2%                        |
| FOM <sub>L</sub> [(V/µs · pF)/mW]  | 22,528  | 2,920  | 13.0%                        |
| LC-FOM <sub>s</sub> (MHz/mW)       | 34,342  | 4,188  | 12.2%                        |
| LC-FOM <sub>L</sub> [(V/µs)/mW]    | 8,661   | 1,123  | 13.0%                        |
| IFOM <sub>s</sub> [(MHz · pF)/mA]  | 178,580 | 21,776 | 12.2%                        |
| IFOM <sub>L</sub> [(V/µs · pF)/mA] | 45,056  | 5,840  | 13.0%                        |

robustness of the measured results over 20 samples has been confirmed. At 15-nF  $C_{\rm L}$ , the standard deviation ( $\sigma$ ) of each key performance parameter is < 13% of its mean (Table VI).
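The last column of Table VI is the coefficient of variation, $\sigma/\text{Mean} \times 100\%$. A minimal sketch that recomputes it from the tabulated means and standard deviations is given below; the dictionary layout and names are illustrative, and small differences from the printed percentages may arise from rounding of the tabulated inputs.

```python
# (mean, sigma) pairs transcribed from Table VI (20 chips, C_L = 15 nF).
TABLE_VI = {
    "GBW (MHz)": (0.85, 0.062),
    "Phase Margin (deg)": (53.2, 2.64),
    "Gain Margin (dB)": (19.96, 1.42),
    "Average SR (V/us)": (0.21, 0.014),
    "Average 1% Ts (us)": (4.77, 0.21),
    "Power (uW)": (140, 14),
    "FOM_S": (89290, 10888),
    "FOM_L": (22528, 2920),
}


def coefficient_of_variation(mean: float, sigma: float) -> float:
    """Standard deviation expressed as a percentage of the mean."""
    return 100.0 * sigma / mean


spread = {name: coefficient_of_variation(*pair) for name, pair in TABLE_VI.items()}
for name, value in spread.items():
    print(f"{name}: {value:.1f}%")
print(f"largest spread: {max(spread.values()):.2f}%")
```

Every recomputed entry stays below 13%, with the FOM<sub>L</sub> row being the largest at just under that bound.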

#### VI. CONCLUSIONS

The design and implementation of a power-efficient (144 $\mu$W) and compact (0.016 mm<sup>2</sup>) three-stage amplifier with large-and-wide $C_{\rm L}$ drivability (1 to 15 nF) have been presented. The employed LFL analysis is much more insightful than traditional direct circuit analysis in terms of topology selection, pole-zero placement, sizing of parameters and judging of $C_{\rm L}$ variability. The optimized frequency compensation scheme is CBMC plus parasitic-pole cancellation. Its transistor-level implementation is made particularly effective via a wideband current buffer and an active LHP zero circuit. The fabricated prototype exhibits advanced small-signal FOM<sub>S</sub> (> 2.02x) and large-signal FOM<sub>L</sub> (> 1.44x) with respect to the state of the art. Robust results have been achieved over 20 available samples.

#### REFERENCES

- [1] "4-Channel, Rail-to-Rail, CMOS Buffer Amplifier," in *Rev. B Texas Instruments*, Jul. 2004 [Online]. Available: http://www.ti.com/product/buf04701
- [2] K. N. Leung, P. K. T. Mok, W. H. Ki, and J. K. O. Sin, "Damping-factor-control frequency compensation technique for low-voltage low-power large capacitive load applications," in *IEEE ISSCC Dig. Tech. Papers*, 1999, pp. 158–159.
- [3] X. Peng and W. Sansen, "AC boosting compensation scheme for low-power multistage amplifiers," *IEEE J. Solid-State Circuits*, vol. 39, no. 11, pp. 2074–2079, Nov. 2004.
- [4] X. Fan, C. Mishra, and E. Sanchez-Sinencio, "Single Miller capacitor frequency compensation technique for low-power multistage amplifiers," *IEEE J. Solid-State Circuits*, vol. 40, no. 3, pp. 584–592, Mar. 2005.
- [5] A. D. Grasso, G. Palumbo, and S. Pennisi, "Three-stage CMOS OTA for large capacitive loads with efficient frequency compensation scheme," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 53, no. 10, pp. 1044–1048, Oct. 2006.
- [6] A. D. Grasso, D. Marano, G. Palumbo, and S. Pennisi, "Improved reversed nested Miller frequency compensation with voltage buffer and resistor," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 54, no. 5, pp. 382–386, May 2007.
- [7] A. D. Grasso, G. Palumbo, and S. Pennisi, "Advances in reversed nested Miller compensation," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 54, no. 7, pp. 1459–1470, Jul. 2007.

- [8] X. Peng, W. Sansen, L. Hou, J. Wang, and W. Wu, "Impedance adapting compensation for low-power multistage amplifiers," *IEEE J. Solid-State Circuits*, vol. 46, no. 2, pp. 445–451, Feb. 2011.
- [9] S. Guo and H. Lee, "Dual active-capacitive-feedback compensation for low-power large-capacitive-load three-stage amplifiers," *IEEE J. Solid-State Circuits*, vol. 46, no. 2, pp. 452–464, Feb. 2011.
- [10] S. S. Chong and P. K. Chan, "Cross feedforward cascode compensation for low-power three-stage amplifier with large capacitive load," *IEEE J. Solid-State Circuits*, vol. 47, no. 9, pp. 2227–2234, Sep. 2012.
- [11] Z. Yan, P.-I. Mak, M.-K. Law, and R. Martins, "A 0.016 mm<sup>2</sup> 144 μW three-stage amplifier capable of driving 1-to-15 nF capacitive load with > 0.95 MHz GBW," in *IEEE ISSCC Dig. Tech. Papers*, 2012, pp. 368–369.
- [12] C. W. Lu, P. Y. Yin, C. W. Hsiao, and M. C. F. Chang, "A 10b resistor-resistor-string DAC with current compensation for compact LCD driver ICs," in *IEEE ISSCC Dig. Tech. Papers*, 2011, pp. 318–319.
- [13] E. S. Kuh and R. A. Rohrer, *Theory of Linear Active Networks*. San Francisco, CA: Holden-Day, 1967.
- [14] K. H. Lundberg, "Internal and external Op-Amp compensation: A control-centric tutorial," in *Proc. Amer. Control Conf.*, Jun. 2004, pp. 5197–5211.
- [15] K. Ogata, *Modern Control Engineering*, 2nd ed. Englewood Cliffs, NJ: Prentice-Hall, 1990.
- [16] K. N. Leung and P. K. T. Mok, "Analysis of multistage amplifier-frequency compensation," *IEEE Trans. Circuits Syst. I*, vol. 48, no. 9, pp. 1041–1056, Sep. 2001.
- [17] B. C. Kuo and F. Golnaraghi, *Automatic Control Systems*, 8th ed. New York: Wiley, 2003.
- [18] J. K. Roberge, *Operational Amplifiers: Theory and Practice*. New York: Wiley, 1975.
- [19] J. E. Solomon, "The monolithic op amp: A tutorial study," *IEEE J. Solid-State Circuits*, vol. SSC-9, no. 6, pp. 314–332, Dec. 1974.
- [20] Z. Yan, P.-I. Mak, and R. Martins, "Two-stage operational amplifiers: power- and area-efficient frequency compensation for driving a wide range of capacitive load," *IEEE Circuits Syst. Mag.*, vol. 12, no. 1, pp. 26–42, Jan.–Mar. 2011.
- [21] G. Palumbo and S. Pennisi, "Design methodology and advances in nested Miller compensation," *IEEE Trans. Circuits Syst. I*, vol. 49, no. 7, pp. 893–903, Jul. 2002.
- [22] A. Ochoa, "A systematic approach to the analysis of general and feedback circuits and systems using signal flow graphs and driving-point impedance," *IEEE Trans. Circuits Syst. II*, vol. 45, no. 2, pp. 187–195, Feb. 1998.
- [23] B. Y. Kamath, R. G. Meyer, and P. R. Gray, "Relationship between frequency response and settling time of operational amplifiers," *IEEE J. Solid-State Circuits*, vol. SSC-9, no. 6, pp. 347–352, Dec. 1974.
- [24] R. J. Apfel and P. R. Gray, "A fast settling monolithic operational amplifier using doublet compression techniques," *IEEE J. Solid-State Circuits*, vol. SSC-9, no. 6, pp. 332–340, Dec. 1974.
- [25] R. Nguyen and B. Murmann, "The design of fast-settling three-stage amplifiers using the open-loop damping factor as a design parameter," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 57, no. 6, pp. 1244–1254, Jun. 2010.
- [26] H. Lee and P. K. T. Mok, "Advances in active-feedback frequency compensation with power optimization and transient improvement," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 51, no. 9, pp. 1690–1696, Sep. 2004.
- [27] U. Dasgupta, "Ahuja compensation circuit for operational amplifiers," U.S. patent 7,646,247, Jan. 12, 2010.



**Zushu Yan** (S'10) received the B.S. degree in communications engineering from Beijing University of Posts and Telecommunications (BUPT), Beijing, China, and the M.S. degree in microelectronics from Beijing Microelectronics Technology Institute (with honors), Beijing, China, in 2003 and 2006, respectively. He is currently working towards the Ph.D. degree in electrical and computer engineering at University of Macau, Macao, China.

From April 2006 to August 2009, he was an analog IC engineer and team lead at Beijing Microelectronics Technology Institute, developing high-performance low-dropout (LDO) regulators and switched-mode DC-DC converters in CMOS, BiCMOS, and BCD technologies. His current research interests include low-voltage low-power analog circuit techniques, particularly, high-performance CMOS amplifiers, and analog techniques for wireless applications.



**Pui-In Mak** (S'00–M'08–SM'11) received the Ph.D. degree from University of Macau (UM), Macao, China, in 2006.

He is currently an Associate Professor at UM. He has been with the UM State Key Laboratory of Analog and Mixed-Signal VLSI as Research Assistant ('03-'06), Invited Research Fellow ('06-'08) and Research Line Coordinator ('08–present) of wireless and biomedical areas. He was a short-term worker in Chipidea Microelectronics ('03), and was a Visiting Fellow/Scholar at University of Cambridge, UK ('09), INESC-ID, Portugal ('09) and University of Pavia, Italy ('10). His current research interests are on analog, mixed-signal and RF circuits and systems for wireless, biomedical and physical chemistry, and engineering education.

Prof. Mak has authored two books: Analog-Baseband Architectures and Circuits for Multistandard and Low-Voltage Wireless Transceivers (Springer, 2007), and High-/Mixed-Voltage Analog and RF Circuit Techniques for Nanoscale CMOS (Springer, 2012) and 100+ papers in journals and conferences. He holds four U.S. patents.

Prof. Mak has served in the following positions: Appointed/Elected Member of IEEE Circuits and Systems Society (CASS) Board of Governors ('07-'11); Associate Editor of IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS ('10-'11); IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS ('10-'13), IEEE Circuits and Systems Society Newsletter ('10-present) and IEEE POTENTIALS ('12-'14); Member of CASS Publication Activities ('09-'11); Member of IEEE CASS CASCOM ('08-present) and CASEO ('09-present) Technical Committees; Organization or Technical Committee Member of AVLSIWS'04, APCCAS'08, PrimeAsia'09-11, ISCAS'10, VLSI-SoC'11-'12, RFIT'11, SENSORS'11, APCCAS'12, and ISCAS'15. He co-initiated the three special GOLD sessions in ISCAS'09-'11. His paper awards include: ASICON Student Paper Award'03; MWSCAS Student Paper Award'04; IEEJ VLSI Workshop Best Paper Award'04, DAC/ISSCC Student Paper Award'05 and CASS Outstanding Young Author Award'10. He was the (co)-recipient of University of Cambridge Visiting Fellowship'09; IEEE MGA GOLD Achievement Award'09; CASS Chapter-of-the-Year Award'09; UM Research Award'10; UM Academic Staff Award'11 and the National Scientific and Technological Progress Award'11. Prof. Mak was decorated with the Honorary Title of Value for scientific merits by the Macau Government in 2005.



**Man-Kay Law** (M'11) received the B.Sc. degree in computer engineering and the Ph.D. degree in electronic and computer engineering from Hong Kong University of Science and Technology (HKUST), in 2006 and 2011, respectively. During his Ph.D. study, he performed research on ultra-low power/energy harvesting CMOS sensor designs for wireless sensing platforms.

In February 2011, he joined HKUST as a Visiting Assistant Professor. He is currently an Assistant Professor in the State Key Laboratory of Analog and Mixed-Signal VLSI, Faculty of Science and Technology, University of Macau, Macao. His research interests are on the development of ultra-low power energy harvesting and sensing circuits for wireless sensing and biomedical systems, specializing in smart CMOS temperature sensors, CMOS image sensors, ultra-low power analog design techniques, and integrated energy harvesting techniques. Related applications include RFID with embedded sensors, energy harvesting systems and passively powered biomedical implants.

Dr. Law serves as a member of the IEEE Circuits and Systems Society (CASS) committee on Sensory Systems as well as Biomedical Circuits and Systems.



**Rui P. Martins** (M'88–SM'99–F'08) was born on April 30, 1957. He received the Bachelor (5 years), the Masters, and the Ph.D. degrees, as well as the *Habilitation* for Full Professor in electrical engineering and computers from the Department of Electrical and Computer Engineering, Instituto Superior Técnico (IST), TU of Lisbon, Portugal, in 1980, 1985, 1992 and 2001, respectively.

He has been with the Department of Electrical and Computer Engineering (DECE)/IST, TU of Lisbon, since October 1980. Since 1992, he has been on leave from IST, TU of Lisbon, and is also with the Department of Electrical and Computer Engineering, Faculty of Science and Technology (FST), University of Macau (UM), Macao, China, where he has been a Full Professor since 1998. In FST he was the Dean of the Faculty from 1994 to 1997 and he has been Vice-Rector of the University of Macau since 1997. From September 2008, after the reform of the UM Charter, he was nominated after open international recruitment as Vice-Rector (Research) until August 31, 2013. Within the scope of his teaching and research activities he has taught 21 bachelor and master courses and has supervised (or co-supervised) 26 theses, Ph.D. (11) and Masters (15). He has published 12 books, co-authoring (5) and co-editing (7), plus 5 book chapters; 242 refereed papers, in scientific journals (50) and in conference proceedings (192); as well as other 70 academic works, in a total of 324 publications. He has co-authored 4 US Patents (1 issued in 2009, 2 in 2011, and 1 in 2012) and has also submitted other 6 that are currently under processing. He created the Analog and Mixed-Signal VLSI Research Laboratory of UM: http://www.fst.umac.mo/en/lab/ans\_vlsi/website/index.html, elevated in January 2011 to State Key Lab of China (the 1st in Engineering in Macao), being its Founding Director.

Prof. Martins was the Founding Chairman of the IEEE Macau Section from 2003 to 2005, and of the IEEE Macau Joint-Chapter on Circuits And Systems (CAS)/Communications (COM) from 2005 to 2008 [2009 World Chapter of the Year of the IEEE Circuits And Systems Society (CASS)]. He was the General Chair of the 2008 IEEE Asia-Pacific Conference on Circuits And Systems (APCCAS'2008), and was the Vice-President for the Region 10 (Asia, Australia, the Pacific) of the IEEE Circuits And Systems Society (CASS), for the period of 2009 to 2011. He is now the Vice-President (World) Regional Activities and Membership also of the IEEE CAS Society for the period 2012 to 2013. He is Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, since 2010 and until 2013. In addition, he is a member of the IEEE CASS Fellow Evaluation Committee (Class of 2013). He was the recipient of two government decorations: the Medal of Professional Merit from Macao Government (Portuguese Administration) in 1999, and the Honorary Title of Value from Macao SAR Government (Chinese Administration) in 2001. In July 2010, he was unanimously elected as Corresponding Member of the Portuguese Academy of Sciences (in Lisbon), being the only Portuguese Academician living in Asia.