
Differentiation Techniques: Product, Quotient, and Chain Rules

The differentiation rules from Lesson 2PM and Recitation 2 — power, sum, constant multiple, general power — handle every polynomial we wrote down in Lesson 3. They run out the moment two functions are multiplied together, divided by one another, or composed in a way not already covered by the general power rule. Three expressions flag the gap:

$$F(x) = (x^2 - 1)^4 (x^2 + 1)^5, \qquad G(x) = \frac{x^3}{(x^2 + 1)^4}, \qquad H(x) = \sqrt{4 x^2 + 1}.$$

The first is a product, the second a ratio, the third a composition. Brute expansion of $F$ has many terms, and the answer we want, $F'$, factors back into a clean form anyway. Rewriting $G$ as $x^3 (x^2 + 1)^{-4}$ turns the ratio into a product but does not remove the need for a product rule. The expression $H$ can be handled by the general power rule because the outside function is a half-power; the full chain rule explains why that same outside-inside pattern works beyond powers.

This lesson develops three rules in turn: the product rule for expressions like $F$, the quotient rule for expressions like $G$, and the chain rule for expressions like $H$. The chain rule turns out to be the umbrella under which the general power rule and the quotient-rule proof both sit; once it is in hand, every expression built from finitely many sums, differences, products, quotients, and compositions of differentiable elementary pieces is itself differentiable.

When the Naive Guess Fails

The sum rule stated that the derivative of a sum is the sum of the derivatives. A natural reflex is to try the same for products: guess that $\frac{d}{dx}[f(x) g(x)] = f'(x) g'(x)$. A single counterexample buries the guess.

Take $f(x) = x^2$ and $g(x) = x^3$. Their product is $x^5$, and the power rule gives

$$\frac{d}{dx}(x^5) = 5 x^4.$$

The naive guess gives $f'(x) g'(x) = (2 x)(3 x^2) = 6 x^3$. The two answers do not even have the same degree. Whatever the correct rule is, it must mix $f$, $g$, $f'$, and $g'$ in a more substantial way.

The rectangle picture below explains why. If $f(x)$ and $g(x)$ are the side lengths of a rectangle, then the product $f(x) g(x)$ is its area. Increase $x$ slightly, so that $f$ grows by $\Delta f$ and $g$ grows by $\Delta g$. The new area decomposes into four rectangles: the original $fg$, a thin top strip of area $f \, \Delta g$, a thin right strip of area $g \, \Delta f$, and a tiny corner of area $\Delta f \, \Delta g$.

Rectangle of sides f(x) and g(x) enlarged by Δf horizontally and Δg vertically. The change in area decomposes into a top strip f·Δg, a right strip g·Δf, and a small corner Δf·Δg.

The change in area is $f \, \Delta g + g \, \Delta f + \Delta f \, \Delta g$. Divide by the change in input $h$ and let $h \to 0$: the corner term, written as $\frac{\Delta f}{h} \cdot \frac{\Delta g}{h} \cdot h$, carries an extra factor of $h$ and vanishes, leaving exactly $f \, g' + g \, f'$. The naive guess corresponds to the corner alone, which is precisely the term that drops out.
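A quick numerical check can make the counterexample concrete. The sketch below is illustrative only: the helper name `central_difference` and the sample point `x0 = 1.5` are assumptions chosen for this example, not part of the lesson. It compares a difference-quotient estimate of the derivative of $x^2 \cdot x^3$ with the naive guess $f' g'$ and with the two-strip expression $f g' + g f'$.

```python
def central_difference(func, x, h=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)


def f(x):
    return x ** 2


def g(x):
    return x ** 3


def f_prime(x):
    return 2 * x


def g_prime(x):
    return 3 * x ** 2


x0 = 1.5  # arbitrary sample point (an assumption for illustration)

# Derivative of the product, estimated straight from the limit definition.
numeric = central_difference(lambda x: f(x) * g(x), x0)

# The naive guess multiplies the two derivatives.
naive = f_prime(x0) * g_prime(x0)

# The two strips of the rectangle picture: f * g' + g * f'.
two_strips = f(x0) * g_prime(x0) + g(x0) * f_prime(x0)

print(f"difference quotient: {numeric:.6f}")
print(f"naive guess f'g':    {naive:.6f}")
print(f"f g' + g f':         {two_strips:.6f}")
```

At this sample point the difference quotient agrees with $f g' + g f'$ to many decimal places, while the naive guess lands on a visibly different number.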

The Product Rule

Theorem 13 (Product Rule)

If $f$ and $g$ are differentiable at $x$, then so is the product $fg$, and

$$\frac{d}{dx}\bigl[f(x) \, g(x)\bigr] = f(x) \, g'(x) + g(x) \, f'(x).$$

In words: the derivative of a product is the first function times the derivative of the second, plus the second function times the derivative of the first.

Proof

Let $P(x) = f(x) \, g(x)$. By the limit definition of the derivative,

$$P'(x) = \lim_{h \to 0} \frac{f(x+h) \, g(x+h) - f(x) \, g(x)}{h}.$$

Add and subtract $f(x+h) \, g(x)$ in the numerator:

$$\begin{aligned} P'(x) &= \lim_{h \to 0} \frac{f(x+h)\,g(x+h) - f(x+h)\,g(x) + f(x+h)\,g(x) - f(x)\,g(x)}{h}\\ &= \lim_{h \to 0} \left[ f(x+h) \cdot \frac{g(x+h) - g(x)}{h} + g(x) \cdot \frac{f(x+h) - f(x)}{h} \right]. \end{aligned}$$

The two difference quotients tend to $g'(x)$ and $f'(x)$. Because $f$ is differentiable at $x$, it is continuous there (Lesson 2PM), so $\lim_{h \to 0} f(x+h) = f(x)$. Splitting the limit of the sum into a sum of limits and the limit of the product into a product of limits (both moves justified because every individual limit exists),

$$P'(x) = f(x) \, g'(x) + g(x) \, f'(x).$$

The trick of adding and subtracting $f(x+h) \, g(x)$ is what isolates the two strips in the rectangle picture. Without that intermediate step, $f$ and $g$ change together inside a single difference quotient, and the rules from Lesson 2 do not separate them.

Example 102 (Verifying the product rule)

For $f(x) = x^2$ and $g(x) = x^3$, the product rule should reproduce $\frac{d}{dx}(x^5) = 5 x^4$. Compute

$$\begin{aligned} \frac{d}{dx}(x^2 \cdot x^3) &= x^2 \cdot \frac{d}{dx}(x^3) + x^3 \cdot \frac{d}{dx}(x^2)\\ &= x^2 (3 x^2) + x^3 (2 x)\\ &= 3 x^4 + 2 x^4 = 5 x^4. \checkmark \end{aligned}$$

The rule recovers the answer the power rule already supplied. The genuine work begins where the power rule alone cannot reach.

Example 103 (A polynomial product, kept factored)

Differentiate $y = (2 x^3 - 5 x)(3 x + 1)$.

Set $f(x) = 2 x^3 - 5 x$ and $g(x) = 3 x + 1$. Then $f'(x) = 6 x^2 - 5$ and $g'(x) = 3$, so

$$\begin{aligned} \frac{dy}{dx} &= (2 x^3 - 5 x)(3) + (3 x + 1)(6 x^2 - 5)\\ &= 6 x^3 - 15 x + 18 x^3 - 15 x + 6 x^2 - 5\\ &= 24 x^3 + 6 x^2 - 30 x - 5. \end{aligned}$$

Multiplying out the original product first and then differentiating term by term gives the same answer. The product rule simply skips the algebra.
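The claim that both routes agree can be checked mechanically. The following is a minimal sketch, assuming polynomials are stored as coefficient lists with the constant term first; the helpers `poly_mul`, `poly_add`, and `poly_diff` are hypothetical names written for this illustration.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_add(a, b):
    """Add two polynomials, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def poly_diff(a):
    """Differentiate term by term with the power rule."""
    return [k * c for k, c in enumerate(a)][1:] or [0]


f = [0, -5, 0, 2]  # 2x^3 - 5x
g = [1, 3]         # 3x + 1

# Route 1: product rule, f * g' + g * f'.
via_product_rule = poly_add(poly_mul(f, poly_diff(g)), poly_mul(g, poly_diff(f)))

# Route 2: multiply out first, then differentiate term by term.
via_expansion = poly_diff(poly_mul(f, g))

print(via_product_rule)
print(via_expansion)
```

Both lists hold the coefficients of $24 x^3 + 6 x^2 - 30 x - 5$ read from the constant term upward, so the two routes match coefficient by coefficient.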

Remark

The product rule is symmetric: $f \, g' + g \, f' = g \, f' + f \, g'$. Some texts state it the other way round. Pick one ordering and keep it: the first function times the derivative of the second, plus the second function times the derivative of the first. Mixing orderings inside one calculation is where signs get lost.

Problem 102

Differentiate $y = (4 x^3 + x)(7 - x^2)$ using the product rule. Leave the answer in the unsimplified form first, then expand and collect like terms. Verify by multiplying the two factors out before differentiating.

Problem 103

A rectangular sheet of metal expands under heat. After $t$ seconds, its length is $L(t) = 12 + 0{.}03 \, t$ centimeters and its width is $W(t) = 8 + 0{.}02 \, t$ centimeters. The area is $A(t) = L(t) \, W(t)$.

  1. Without expanding, write $A'(t)$ using the product rule.
  2. Compute $A'(0)$. Which side contributes more to the initial rate of growth, length or width? Tie the answer to the rectangle picture in When the Naive Guess Fails.

Recovering the General Power Rule

Apply the product rule to a function multiplied by itself.

Example 104 (A function squared)

Apply the product rule to $y = g(x) \cdot g(x)$:

$$\frac{d}{dx}\bigl[g(x) \cdot g(x)\bigr] = g(x) \cdot g'(x) + g(x) \cdot g'(x) = 2 \, g(x) \, g'(x).$$

The general power rule, applied to the same expression $[g(x)]^2$, gives $2 \, g(x) \, g'(x)$. The two rules agree.

For positive integer powers, the general power rule agrees with what the product rule would give if we applied it repeatedly to $g \cdot g \cdot \cdots \cdot g$. Recitation 2 stated the general power rule more broadly, including fractional and negative rational powers, so the product rule does not replace it. The point is narrower but useful: when a product contains powers of differentiable expressions, the two rules are designed to fit together.

Combining the Product Rule with the General Power Rule

The product rule rarely arrives alone. Most product expressions involve composite pieces, and the general power rule supplies $f'$ and $g'$ for those pieces.

Example 105 (Differentiate and simplify)

Find $\dfrac{dy}{dx}$ where $y = (x^2 - 1)^4 (x^2 + 1)^5$.

Set $f(x) = (x^2 - 1)^4$ and $g(x) = (x^2 + 1)^5$. Apply the general power rule to each:

$$f'(x) = 4 (x^2 - 1)^3 (2 x), \qquad g'(x) = 5 (x^2 + 1)^4 (2 x).$$

By the product rule,

$$\frac{dy}{dx} = (x^2 - 1)^4 \cdot 5 (x^2 + 1)^4 (2 x) + (x^2 + 1)^5 \cdot 4 (x^2 - 1)^3 (2 x). \tag{1}$$

Equation $(1)$ is sufficient if the goal is a numerical value at a single point, since substituting $x = 2$ is straightforward. The work begins when the goal is to find where the derivative vanishes, because $(1)$ as written is a sum of two products, not one expression set equal to zero.

Both terms share $2 x$, a power of $(x^2 - 1)$, and a power of $(x^2 + 1)$. The largest factor common to both is $2 x (x^2 - 1)^3 (x^2 + 1)^4$. Pulling it out,

$$\frac{dy}{dx} = 2 x (x^2 - 1)^3 (x^2 + 1)^4 \bigl[5 (x^2 - 1) + 4 (x^2 + 1)\bigr].$$

The bracketed factor simplifies: $5 x^2 - 5 + 4 x^2 + 4 = 9 x^2 - 1$. Hence

$$\frac{dy}{dx} = 2 x (x^2 - 1)^3 (x^2 + 1)^4 (9 x^2 - 1). \tag{2}$$

Form $(2)$ exposes every critical number at a glance: $\frac{dy}{dx} = 0$ when $x = 0$, $x = \pm 1$, or $9 x^2 = 1$ (giving $x = \pm 1/3$). The same information was buried inside $(1)$ but not visible.
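Factoring by hand is a common place for slips, so it can be worth confirming that forms (1) and (2) describe the same function. The sketch below is one possible check, using exact rational arithmetic from Python's `fractions` module; the function names and the choice of sample points are assumptions made for illustration. Both forms are polynomials of degree 17, so agreement at 29 distinct points is enough to show they are identical.

```python
from fractions import Fraction


def unsimplified(x):
    """Form (1): the raw product-rule output."""
    return ((x ** 2 - 1) ** 4 * 5 * (x ** 2 + 1) ** 4 * (2 * x)
            + (x ** 2 + 1) ** 5 * 4 * (x ** 2 - 1) ** 3 * (2 * x))


def factored(x):
    """Form (2): the fully factored derivative."""
    return 2 * x * (x ** 2 - 1) ** 3 * (x ** 2 + 1) ** 4 * (9 * x ** 2 - 1)


# 29 exact sample points between -2 and 2 (more than the degree, 17).
samples = [Fraction(n, 7) for n in range(-14, 15)]
forms_agree = all(unsimplified(x) == factored(x) for x in samples)

# The critical numbers read off from form (2).
critical_numbers = [Fraction(0), Fraction(1), Fraction(-1),
                    Fraction(1, 3), Fraction(-1, 3)]
all_vanish = all(unsimplified(c) == 0 for c in critical_numbers)

print("forms agree:", forms_agree)
print("form (1) vanishes at every listed critical number:", all_vanish)
```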

Note (When to simplify)

The unsimplified form $(1)$ is right for substitution; the factored form $(2)$ is right for solving $f'(x) = 0$. Each new product-rule answer should usually be left in the form that matches the next step. Force-simplifying every derivative expands harmless factored expressions back into long polynomials and then needs them factored again to find critical numbers.

Problem 104

Find $\dfrac{dy}{dx}$ for $y = (x^2 + 3)^3 (2 x - 1)^4$ and factor the result so that all critical numbers are visible without further work. Identify the values of $x$ at which $\dfrac{dy}{dx} = 0$.

Problem 105

Let $h(x) = (x^2 - 4)^2 (x + 1)^3$.

  1. Compute $h'(x)$ using the product rule and the general power rule.
  2. Factor $h'(x)$ over the integers as far as possible, then use the quadratic formula for any remaining zeros.
  3. List the critical numbers and classify each as a relative maximum, relative minimum, or neither using the first derivative test.

Two Applied Settings

The product rule answers questions that Lesson 3 raised but could not finish: tracking how a product of two changing quantities evolves when neither factor is fixed.

Example 106 (Marginal revenue when both price and output drift)

A small workshop’s listed inventory after $t$ hours of a six-hour shift is $q(t) = 100 + 4 t$ units. The current unit price falls during the shift as competing suppliers post lower offers, modeled by $p(t) = 12 - 0{.}05 \, t^2$ pounds per unit. The marked value of the inventory at time $t$ is $R(t) = p(t) \, q(t)$ in pounds. Find $R'(2)$ and interpret it.

By the product rule,

$$R'(t) = p'(t) \, q(t) + p(t) \, q'(t).$$

Differentiating each factor,

$$p'(t) = -0{.}1 \, t, \qquad q'(t) = 4.$$

At $t = 2$:

$$\begin{aligned} p(2) &= 12 - 0{.}2 = 11{.}8, & q(2) &= 108,\\ p'(2) &= -0{.}2, & q'(2) &= 4. \end{aligned}$$

Therefore

$$R'(2) = (-0{.}2)(108) + (11{.}8)(4) = -21{.}6 + 47{.}2 = 25{.}6.$$

Two hours into the shift, the marked value of the inventory is rising at £25.60 per hour. The price slip contributes -£21.60 per hour to the value rate, while inventory growth contributes £47.20 per hour; the second effect dominates, at more than twice the size of the first.

Three curves over the interval 0 ≤ t ≤ 6: the unit price p(t), the scaled inventory q(t)/10, and the scaled marked value R(t)/100 = p(t) q(t)/100. A vertical dotted line marks t = 2, where the slope of R(t)/100 is positive even though p(t) is decreasing.

The figure makes a structural point. The price curve $p(t)$ slopes down throughout the shift, yet the value curve $R(t)$ slopes up at $t = 2$. Looking at $p$ alone or $q$ alone misses the answer. The product rule is what fuses the two contributions into a single rate.
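The two contributions in this example can be separated in a few lines of code. This is a sketch under the model stated in the example; the names `price_effect` and `quantity_effect` are labels chosen here for the two product-rule terms, not terminology from the lesson.

```python
def p(t):
    """Unit price in pounds after t hours."""
    return 12 - 0.05 * t ** 2


def q(t):
    """Inventory in units after t hours."""
    return 100 + 4 * t


def p_prime(t):
    return -0.1 * t


def q_prime(t):
    return 4.0


t0 = 2

# First product-rule term: falling price acting on the current stock.
price_effect = p_prime(t0) * q(t0)

# Second product-rule term: growing stock valued at the current price.
quantity_effect = p(t0) * q_prime(t0)

value_rate = price_effect + quantity_effect

print(f"price effect:    {price_effect:+.2f} pounds per hour")
print(f"quantity effect: {quantity_effect:+.2f} pounds per hour")
print(f"R'(2):           {value_rate:+.2f} pounds per hour")
```

Keeping the two terms in separate variables mirrors the rectangle picture: each strip is computed on its own before the sum is taken.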

Problem 106

A retailer’s sales volume $t$ weeks after launching a campaign is $V(t) = 200 + 30 t - t^2$ units per week, while the unit margin (profit per unit) is $m(t) = 5 + 0{.}1 \, t$ pounds. The total weekly profit is $\Pi(t) = m(t) \, V(t)$.

  1. Use the product rule to find $\Pi'(t)$.
  2. Compute $\Pi'(4)$ and explain in plain language whether weekly profit is rising or falling four weeks in, and at what rate, in pounds per week per week.
  3. Find the time at which weekly profit is maximized on the interval $0 \leq t \leq 30$.

Example 107 (Peak drug concentration)

A bolus injection produces a bloodstream concentration of

$$C(t) = (3 t + 1)^2 (2 - t), \qquad 0 \leq t \leq 2,$$

with $t$ measured in hours. Find the time at which $C$ peaks.

Set $f(t) = (3 t + 1)^2$ and $g(t) = 2 - t$. Then $g'(t) = -1$, and the general power rule gives $f'(t) = 2 (3 t + 1)(3) = 6 (3 t + 1)$. By the product rule,

$$C'(t) = (3 t + 1)^2 (-1) + (2 - t) \cdot 6 (3 t + 1) = (3 t + 1)\bigl[-(3 t + 1) + 6 (2 - t)\bigr].$$

The bracket simplifies to $-3 t - 1 + 12 - 6 t = 11 - 9 t$, so

$$C'(t) = (3 t + 1)(11 - 9 t).$$

Critical numbers come from $3 t + 1 = 0$, giving $t = -1/3$ (outside the domain), and $11 - 9 t = 0$, giving $t = 11/9 \approx 1{.}22$. The endpoints contribute $C(0) = 2$ and $C(2) = 0$, while

$$C\!\left(\tfrac{11}{9}\right) = \left(\tfrac{11}{3} + 1\right)^{\!2}\!\left(2 - \tfrac{11}{9}\right) = \left(\tfrac{14}{3}\right)^{\!2}\!\left(\tfrac{7}{9}\right) = \frac{196}{9} \cdot \frac{7}{9} = \frac{1372}{81} \approx 16{.}94.$$

The concentration peaks at $t = 11/9$ hours, well above either endpoint.

Concentration curve C(t) = (3t+1)² (2 - t) on [0, 2]. The curve starts at (0, 2), peaks near (11/9, 16.94) with dotted reference lines, and falls to (2, 0).

The factored form of $C'(t)$ also encodes the sign of the derivative directly: on $0 \leq t < 11/9$ both factors are positive so $C$ rises; past $11/9$ the second factor flips sign and $C$ falls. The first derivative test confirms the maximum without a separate sign chart.
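The peak can be cross-checked with exact fractions, which avoids any rounding question about $1372/81$. The sketch below assumes the model from the example; the grid spacing of one thousandth of an hour is an arbitrary choice made for illustration.

```python
from fractions import Fraction


def concentration(t):
    """C(t) = (3t + 1)^2 (2 - t)."""
    return (3 * t + 1) ** 2 * (2 - t)


def concentration_rate(t):
    """C'(t) in the factored form (3t + 1)(11 - 9t)."""
    return (3 * t + 1) * (11 - 9 * t)


t_peak = Fraction(11, 9)
peak = concentration(t_peak)

# Exact values on a fine grid over [0, 2]; none should exceed the peak.
grid = [Fraction(k, 1000) for k in range(0, 2001)]
grid_max = max(concentration(t) for t in grid)

print("C'(11/9) =", concentration_rate(t_peak))
print("C(11/9)  =", peak, "which is about", float(peak))
print("largest grid value stays at or below the peak:", grid_max <= peak)
```

A grid search is not a proof, but it is a cheap sanity check that the critical number found by factoring really is the highest point on the interval.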

Problem 107

A particle moves so that its position at time $t$ is $s(t) = (t^2 + 1)(t - 3)^2$ meters for $t \geq 0$. The velocity is $v(t) = s'(t)$.

  1. Find $v(t)$ using the product rule and the general power rule, factoring fully.
  2. Determine when the particle is momentarily at rest.
  3. On which interval is the particle moving in the positive direction?

A Three-Factor Product

Applying the product rule twice handles three-factor products. Write $f \, g \, h = (f g) h$ and differentiate:

$$\frac{d}{dx}(f g h) = (f g) h' + h \cdot \frac{d}{dx}(f g) = f g \, h' + h (f g' + g f') = f' g h + f g' h + f g h'.$$

Each factor, in turn, gets differentiated while the other two are held; the three contributions are summed. The same pattern extends to any number of factors.

Example 108 (A three-factor product)

Differentiate $y = x \, (x - 2)(x^2 + 1)$.

By the three-factor pattern with $f = x$, $g = x - 2$, $h = x^2 + 1$:

$$\frac{dy}{dx} = (1)(x - 2)(x^2 + 1) + x(1)(x^2 + 1) + x(x - 2)(2 x).$$

Each term keeps two factors intact and replaces the third with its derivative.

For solving $\frac{dy}{dx} = 0$, multiply out:

$$\frac{dy}{dx} = (x - 2)(x^2 + 1) + x(x^2 + 1) + 2 x^2 (x - 2) = (2 x - 2)(x^2 + 1) + 2 x^2 (x - 2).$$

Expanding gives $4 x^3 - 6 x^2 + 2 x - 2$. The grouped form is more compact than expanding $y$ first into a degree-four polynomial and applying the power rule term by term, especially when one factor is itself a composite that the general power rule can absorb.
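The "differentiate one factor, hold the others" pattern translates directly into a loop. The function below is a hypothetical helper written for this illustration (the name `product_derivative` and its list-based interface are assumptions); it evaluates the derivative of a product at a single point from the values and derivatives of the factors at that point, and is checked here against the expanded polynomial from this example at $x = 3$.

```python
from math import prod


def product_derivative(values, derivatives):
    """Derivative of f_1 * ... * f_n at one point.

    values[i] is f_i(x) and derivatives[i] is f_i'(x), all at the same x.
    Each term differentiates one factor and keeps the others as they are.
    """
    total = 0
    for i, d in enumerate(derivatives):
        others = values[:i] + values[i + 1:]
        total += d * prod(others)
    return total


x = 3                                  # sample point chosen for illustration
values = [x, x - 2, x ** 2 + 1]        # f, g, h from the example
derivatives = [1, 1, 2 * x]            # f', g', h'

slope = product_derivative(values, derivatives)
expanded = 4 * x ** 3 - 6 * x ** 2 + 2 * x - 2

print("three-factor pattern:", slope)
print("expanded polynomial: ", expanded)
```

Because the loop does not care how many factors there are, the same helper applies unchanged to four or more factors.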

Problem 108

Differentiate $y = x^2 (x - 1)(x + 3)$ as a three-factor product. Then simplify, factor $y'$ over the integers as far as possible, and use the quadratic formula to list every $x$ for which $\dfrac{dy}{dx} = 0$.

Problem 109

Let $y = f(x) \, g(x) \, h(x)$ where $f(2) = 3$, $g(2) = -1$, $h(2) = 4$, $f'(2) = 5$, $g'(2) = 2$, $h'(2) = -3$. Compute $\dfrac{dy}{dx}\bigg|_{x = 2}$ using the three-factor product rule.

Problem 110

Let $f_{1}, f_{2}, f_{3}, f_{4}$ be differentiable. Show that

$$\frac{d}{dx}\bigl[f_{1}(x) f_{2}(x) f_{3}(x) f_{4}(x)\bigr] = f_{1}' f_{2} f_{3} f_{4} + f_{1} f_{2}' f_{3} f_{4} + f_{1} f_{2} f_{3}' f_{4} + f_{1} f_{2} f_{3} f_{4}',$$

by applying the three-factor pattern from A Three-Factor Product to $(f_{1} f_{2} f_{3}) \cdot f_{4}$.

Then use the four-factor formula to compute $y'(0)$ for $y = (x - 1)(x - 2)(x - 3)(x - 4)$ (without expanding the polynomial) and verify directly by expanding $y$ and using the power rule.

Products are now handled. Ratios remain. The denominator in $G(x) = \dfrac{x^3}{(x^2 + 1)^4}$ from the opening can be moved upstairs as $(x^2 + 1)^{-4}$ and absorbed into a product, but the bookkeeping is fragile and the simplification messy. A direct rule for $f/g$ is cleaner and handles every applied ratio (cost per unit, density per area, return per pound invested, concentration per volume) without rewriting. The deeper payoff is structural: with a direct rule, the rate of change of an average has a clean relationship to the marginal, and that relationship falls out of the algebra automatically.

The Quotient Rule

Theorem 14 (Quotient Rule)

If $f$ and $g$ are differentiable at $x$ and $g(x) \neq 0$, then $f/g$ is differentiable at $x$, and

$$\frac{d}{dx} \!\left[ \frac{f(x)}{g(x)} \right] = \frac{g(x) \, f'(x) - f(x) \, g'(x)}{[g(x)]^2}.$$

The order of the two terms in the numerator matters: there is a minus sign between them. A standard mnemonic is “low d-high minus high d-low, square the bottom and away we go”: the denominator $g$ (“low”) times the derivative of the numerator $f$ (“d-high”), minus the numerator $f$ (“high”) times the derivative of the denominator $g$ (“d-low”), all over the denominator squared. Reversing the two products in the numerator changes the sign of the answer.

We postpone the proof until the rule has earned its keep through examples; the derivation appears later in this lesson, built directly on the product rule above together with the general power rule from Recitation 2.

Worked Quotient Examples

Example 109 (A first quotient)

Differentiate $y = \dfrac{x}{2 x + 3}$.

Set $f(x) = x$ and $g(x) = 2 x + 3$. Then $f'(x) = 1$ and $g'(x) = 2$. By the quotient rule,

$$\frac{dy}{dx} = \frac{(2 x + 3)(1) - (x)(2)}{(2 x + 3)^2} = \frac{2 x + 3 - 2 x}{(2 x + 3)^2} = \frac{3}{(2 x + 3)^2}.$$

The cancellation in the numerator is the feature: even though both $f$ and $g$ contribute, their linear pieces meet so cleanly that the derivative reduces to a single constant over the squared denominator. That is the rule behaving well.

Problem 111

Differentiate each function using the quotient rule. Where simplification reveals critical numbers, leave the answer in fully factored form.

  1. $y = \dfrac{2 x - 1}{x + 4}$.
  2. $y = \dfrac{x^2}{x^2 + 1}$.
  3. $y = \dfrac{x^2 + 1}{x^2 - 1}$.
  4. $y = \dfrac{(x + 2)^3}{x - 1}$.

Example 110 (Simplifying after the quotient rule)

Find $\dfrac{dy}{dx}$ where $y = \dfrac{x^3}{(x^2 + 1)^4}$.

Set $f(x) = x^3$ and $g(x) = (x^2 + 1)^4$. Then $f'(x) = 3 x^2$, and the general power rule gives $g'(x) = 4 (x^2 + 1)^3 (2 x) = 8 x (x^2 + 1)^3$. The quotient rule supplies

$$\frac{dy}{dx} = \frac{(x^2 + 1)^4 \cdot 3 x^2 - x^3 \cdot 8 x (x^2 + 1)^3}{[(x^2 + 1)^4]^2} = \frac{(x^2 + 1)^4 \cdot 3 x^2 - 8 x^4 (x^2 + 1)^3}{(x^2 + 1)^8}.$$

Numerator and denominator share the factor $(x^2 + 1)^3$. Cancelling it once,

$$\frac{dy}{dx} = \frac{(x^2 + 1)(3 x^2) - 8 x^4}{(x^2 + 1)^5} = \frac{3 x^4 + 3 x^2 - 8 x^4}{(x^2 + 1)^5} = \frac{x^2 (3 - 5 x^2)}{(x^2 + 1)^5}.$$

Critical numbers come from the numerator: $x = 0$ (a double root) or $5 x^2 = 3$, giving $x = \pm \sqrt{3/5}$. None of these would have been visible without the simplification.
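The cancellation step is easy to get wrong by one power, so a direct comparison of the raw quotient-rule output with the simplified form can be reassuring. This is a minimal sketch using exact rational arithmetic; the generic helper `quotient_rule` and the sample points are assumptions introduced for illustration.

```python
from fractions import Fraction


def quotient_rule(f, f_prime, g, g_prime, x):
    """Evaluate (g f' - f g') / g^2 at the point x."""
    return (g(x) * f_prime(x) - f(x) * g_prime(x)) / g(x) ** 2


def f(x):
    return x ** 3


def f_prime(x):
    return 3 * x ** 2


def g(x):
    return (x ** 2 + 1) ** 4


def g_prime(x):
    return 8 * x * (x ** 2 + 1) ** 3


def simplified(x):
    """The derivative after cancelling (x^2 + 1)^3."""
    return x ** 2 * (3 - 5 * x ** 2) / (x ** 2 + 1) ** 5


# Exact sample points between -3 and 3; the denominator never vanishes.
samples = [Fraction(n, 5) for n in range(-15, 16)]
forms_agree = all(
    quotient_rule(f, f_prime, g, g_prime, x) == simplified(x) for x in samples
)

print("raw quotient rule and simplified form agree:", forms_agree)
```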

Problem 112

Find $\dfrac{dy}{dx}$ where $y = \dfrac{x^2}{(x^2 + 4)^3}$. Simplify the answer by cancelling the largest factor common to the entire numerator and denominator, and identify every $x$ at which $\dfrac{dy}{dx} = 0$.

Remark

Cancelling in Example 110 works because $(x^2 + 1)^3$ is a factor of the whole numerator and the whole denominator. A common term that appears as a summand, not as a factor, never cancels. The next example exists to make this crystal clear.

Example 111 (When nothing cancels)

Differentiate $y = \dfrac{x}{x + (x + 1)^3}$.

Set $f(x) = x$ and $g(x) = x + (x + 1)^3$. Then $f'(x) = 1$, and by the sum rule and general power rule $g'(x) = 1 + 3 (x + 1)^2$. The quotient rule gives

$$\frac{dy}{dx} = \frac{\bigl[x + (x + 1)^3\bigr](1) - x \bigl[1 + 3 (x + 1)^2\bigr]}{\bigl[x + (x + 1)^3\bigr]^2}.$$

Expanding the numerator:

$$\frac{dy}{dx} = \frac{x + (x + 1)^3 - x - 3 x (x + 1)^2}{\bigl[x + (x + 1)^3\bigr]^2} = \frac{(x + 1)^3 - 3 x (x + 1)^2}{\bigl[x + (x + 1)^3\bigr]^2}.$$

The numerator does have a clean factor of $(x + 1)^2$:

$$\frac{dy}{dx} = \frac{(x + 1)^2 \bigl[(x + 1) - 3 x\bigr]}{\bigl[x + (x + 1)^3\bigr]^2} = \frac{(x + 1)^2 (1 - 2 x)}{\bigl[x + (x + 1)^3\bigr]^2}.$$

The denominator does not factor as $(x+1)^2 \cdot (\text{something})$, even though $(x+1)^2$ shows up inside it after expansion: $x + (x+1)^3$ is a sum, not a product. A trainee instinct to “cancel the $(x+1)^2$ from top and bottom” would silently invent a wrong derivative. Cancellation is a property of factorizations, not of shared substrings.

Problem 113

Differentiate $y = \dfrac{(x + 1)^2}{x + (x - 1)^2}$. Identify which factors do and do not cancel (recall the warning in the When nothing cancels example above) and simplify only as far as is justified.

Layered Rules

The quotient rule almost always cooperates with the product and general power rules. Decomposing the work cleanly is the only difficulty; the individual derivatives are routine.

Example 112 (A square root of a quotient)

Differentiate $f(x) = \sqrt{\dfrac{x^2 + 7}{x + 1}}$.

On the real differentiable domain $x > -1$, write $f$ as a power: $f(x) = \!\left(\dfrac{x^2 + 7}{x + 1}\right)^{\!1/2}$. The general power rule with $r = 1/2$ gives

$$f'(x) = \frac{1}{2} \!\left(\frac{x^2 + 7}{x + 1}\right)^{\!-1/2} \frac{d}{dx} \!\left[\frac{x^2 + 7}{x + 1}\right]. \tag{3}$$

The inner derivative is the genuine quotient computation. Set $u(x) = x^2 + 7$ and $v(x) = x + 1$. Then $u' = 2 x$ and $v' = 1$, so

$$\frac{d}{dx} \!\left[\frac{x^2 + 7}{x + 1}\right] = \frac{(x + 1)(2 x) - (x^2 + 7)(1)}{(x + 1)^2} = \frac{x^2 + 2 x - 7}{(x + 1)^2}.$$

Substituting back into $(3)$:

$$f'(x) = \frac{1}{2} \!\left(\frac{x^2 + 7}{x + 1}\right)^{\!-1/2} \frac{x^2 + 2 x - 7}{(x + 1)^2}.$$

To simplify, use $\!\left(\dfrac{a}{b}\right)^{\!-1/2} = \!\left(\dfrac{b}{a}\right)^{\!1/2}$:

$$f'(x) = \frac{1}{2} \!\left(\frac{x + 1}{x^2 + 7}\right)^{\!1/2} \frac{x^2 + 2 x - 7}{(x + 1)^2} = \frac{1}{2} \cdot \frac{(x + 1)^{1/2}}{(x^2 + 7)^{1/2}} \cdot \frac{x^2 + 2 x - 7}{(x + 1)^2}.$$

Combining the factors of $(x + 1)$:

$$f'(x) = \frac{x^2 + 2 x - 7}{2 (x^2 + 7)^{1/2} (x + 1)^{3/2}}.$$

The full computation took three rules layered: the general power rule on the outside, the quotient rule on the inside, and elementary exponent identities to put the answer in standard form. The architecture — outer rule first, inner pieces afterwards — is the same for every nested differentiation problem in this course.
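With three rules stacked, a spot check against the limit definition is a reasonable safeguard. The sketch below evaluates the final formula at $x = 3$, a point chosen here only because the square roots come out as whole numbers, and compares it with a symmetric difference quotient; the helper `central_difference` is an illustrative assumption.

```python
import math


def f(x):
    """f(x) = sqrt((x^2 + 7) / (x + 1)), defined for x > -1."""
    return math.sqrt((x ** 2 + 7) / (x + 1))


def f_prime(x):
    """The simplified derivative from the layered computation."""
    return (x ** 2 + 2 * x - 7) / (2 * math.sqrt(x ** 2 + 7) * (x + 1) ** 1.5)


def central_difference(func, x, h=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)


x0 = 3.0
formula_value = f_prime(x0)                # 8 / (2 * 4 * 8)
numeric_value = central_difference(f, x0)

print(f"formula:             {formula_value:.8f}")
print(f"difference quotient: {numeric_value:.8f}")
```

At $x = 3$ the formula reduces by hand to $8 / 64 = 1/8$, which gives a third, independent confirmation.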

Problem 114

Differentiate $f(x) = \sqrt{\dfrac{x + 1}{x^2 + 1}}$ by writing $f$ as a power and combining the general power rule with the quotient rule. State $f'(x)$ with no negative exponents, writing fractional powers as square roots.

Problem 115

The tangent line to the curve $y = \sqrt{x^{2} + 9}$ at the point with $x = 4$ is the line $y = mx + b$ for unique constants $m$ and $b$.

  1. Use the general power rule and the chain rule to compute $m$, then find $b$ from the point-slope form.
  2. Show that the equation $\sqrt{x^{2} + 9} = mx + b$, which equates the curve to its tangent line, has $x = 4$ as a double root by squaring both sides and reducing to a polynomial. (A simple root would mean the tangent crosses the curve transversally; a double root reflects the tangent’s first-order contact with the curve.)
  3. State, in one sentence, why a double root at $x = 4$ is the algebraic version of the geometric fact that the tangent line touches the curve there.

At a Minimum, Average Cost Equals Marginal Cost

The quotient rule’s most repeated economic application is to averages. Average cost is, by definition, the ratio of total cost to output. Marginal cost is the derivative of total cost. The quotient rule produces a precise relationship between the two at the optimum.

Example 113 (Average cost meets marginal cost)

Suppose the total cost of producing $x$ units is $C(x)$. Define the average cost per unit $AC(x) = C(x)/x$ and the marginal cost $MC(x) = C'(x)$. Show that, at any positive output level where $AC$ has a local minimum, the equality $AC = MC$ holds.

Differentiate $AC$ using the quotient rule with $f(x) = C(x)$ and $g(x) = x$:

$$\frac{d}{dx} (AC) = \frac{x \cdot C'(x) - C(x) \cdot 1}{x^2} = \frac{x \, C'(x) - C(x)}{x^2}.$$

At such a minimum of $AC$, we must have $\dfrac{d}{dx}(AC) = 0$. With $x > 0$ the denominator $x^2$ is positive, so the equation reduces to

$$x \, C'(x) - C(x) = 0.$$

Adding $C(x)$ to both sides and then dividing by $x$ gives

$$C'(x) = \frac{C(x)}{x},$$

that is, $MC(x) = AC(x)$.

The two cost curves cross at any positive output where the average cost is minimized.

Average cost AC(x) = x + 16 + 100/x and marginal cost MC(x) = 2x + 16 plotted on the same axes. AC is U-shaped with minimum at x = 10; MC is a rising line that crosses AC exactly at the minimum, with both curves equal to 36 there.

The figure makes the geometry concrete. With $C(x) = x^2 + 16 x + 100$, the average cost $AC(x) = x + 16 + 100/x$ is the structurally familiar curve of Lesson 3PM (a linear term plus a reciprocal term, single minimum at $x = 10$, where $AC = 36$). The marginal cost $MC(x) = 2 x + 16$ is a straight line with slope $2$, hitting $36$ exactly at $x = 10$. The intersection point and the minimum point coincide.

The economic reading is immediate: while the cost of producing the next unit is below the average, additional production pulls the average down; when the next unit costs more than the average, additional production pushes the average up. Equality is the fixed point.
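The pull-down and push-up behavior can be tabulated for the cost function used in the figure, $C(x) = x^2 + 16x + 100$. The sketch below searches whole-number output levels only, which is an assumption made to keep the illustration short; the function names are likewise chosen here.

```python
def total_cost(x):
    return x ** 2 + 16 * x + 100


def average_cost(x):
    return total_cost(x) / x


def marginal_cost(x):
    """Derivative of total_cost, by the power rule."""
    return 2 * x + 16


# Search integer output levels 1..30 for the lowest average cost.
outputs = range(1, 31)
best_output = min(outputs, key=average_cost)

for x in (5, 10, 20):
    ac, mc = average_cost(x), marginal_cost(x)
    if mc < ac:
        trend = "MC below AC: the average is being pulled down"
    elif mc > ac:
        trend = "MC above AC: the average is being pushed up"
    else:
        trend = "MC equals AC: the average is at its minimum"
    print(f"x = {x:2d}  AC = {ac:6.2f}  MC = {mc:6.2f}  {trend}")

print("lowest average cost at x =", best_output)
```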

Problem 116

A factory’s daily total cost in pounds, when producing $x$ units per day, is

$$C(x) = 0{.}5 \, x^2 + 30 \, x + 200.$$

  1. Write $AC(x)$ and $MC(x)$.
  2. Use the quotient rule to find $AC'(x)$ and the production level minimizing $AC$.
  3. Verify directly that $AC = MC$ at the optimum, in line with the Average cost meets marginal cost result.

Problem 117

The result $AC = MC$ at a positive-output minimum of $AC$ used only that $AC = C/x$ and that $x > 0$. Adapt the argument to show that, for any differentiable function $h(x) = u(x)/x$ with $x > 0$, every relative extremum of $h$ occurs where $u'(x) = u(x)/x$. State, in plain English, what this says about averages of any non-negative quantity that varies with $x$.

A Quotient Optimization

Example 114 (Maximizing return on investment)

A small firm’s monthly return on each pound invested, $t$ months after the initial outlay, is modeled by

$$R(t) = \frac{50 \, t}{t^2 + 25}, \qquad t \geq 0,$$

in pence per pound (i.e. units of $1/100$ pound). Find the time at which the return is highest.

Set $f(t) = 50 \, t$ and $g(t) = t^2 + 25$, so $f'(t) = 50$ and $g'(t) = 2 t$. By the quotient rule,

$$R'(t) = \frac{(t^2 + 25)(50) - (50 \, t)(2 t)}{(t^2 + 25)^2} = \frac{50 \, t^2 + 1250 - 100 \, t^2}{(t^2 + 25)^2} = \frac{50 \, (25 - t^2)}{(t^2 + 25)^2}.$$

The denominator is positive for every $t$, so the sign of $R'(t)$ is determined by $25 - t^2$. Setting $R'(t) = 0$ gives $t^2 = 25$, so $t = 5$ (the negative root is outside the domain). For $0 \leq t < 5$, $25 - t^2 > 0$ and $R$ is increasing; for $t > 5$, $R$ is decreasing. By the first derivative test, $R$ attains its absolute maximum on $[0, \infty)$ at $t = 5$.

The peak return is

$$R(5) = \frac{50 \cdot 5}{25 + 25} = \frac{250}{50} = 5.$$

The investment yields its highest monthly return — 5 pence per pound — exactly five months after the outlay.

Curve R(t) = 50t/(t² + 25) on the displayed interval. The curve rises sharply from the origin, peaks at (5, 5) with dotted reference lines, and decays slowly afterwards as the denominator grows faster than the numerator.

The shape has a structural lesson. For small positive $t$, the linear numerator $50 \, t$ outpaces the denominator $t^2 + 25$ (which is dominated by its constant $25$). For large $t$, the denominator’s quadratic term takes over and the ratio decays like $50/t$. In this model, those two effects balance at one peak, and the quotient rule locates it.
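The rise-then-decay shape can be confirmed by evaluating the model on a grid. In the sketch below the 20-month window and the step of one hundredth of a month are arbitrary choices made for illustration; the function names are also assumptions.

```python
def monthly_return(t):
    """R(t) = 50 t / (t^2 + 25), in pence per pound."""
    return 50 * t / (t ** 2 + 25)


def return_rate(t):
    """R'(t) from the quotient rule, in factored form."""
    return 50 * (25 - t ** 2) / (t ** 2 + 25) ** 2


# Sample the first 20 months in steps of 0.01.
grid = [k / 100 for k in range(0, 2001)]
peak_time = max(grid, key=monthly_return)

print("grid peak at t =", peak_time, "with R =", monthly_return(peak_time))
print("sign of R' just before and after:",
      return_rate(4.0) > 0, return_rate(6.0) < 0)
```

The grid search and the sign change of the factored derivative point to the same month, which is the agreement the first derivative test predicts.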

Problem 118

Pollutant concentration in a tank, $t$ minutes after a spill, is modeled by

$$P(t) = \frac{60 \, t}{t^2 + 9} \text{ ppm}, \qquad t \geq 0.$$

  1. Find $P'(t)$ using the quotient rule and factor it.
  2. Determine the time at which concentration peaks and the peak value.
  3. The clean-up team must intervene before $P$ exceeds $9$ ppm. Has the peak been reached at the moment $P$ first hits $9$ ppm, or is it still rising?

Problem 119

A daily delivery service models its profit per kilometer driven by

$$\Pi(d) = \frac{40 \, d - d^2}{d + 5}, \qquad 0 \leq d \leq 40,$$

in pounds per kilometer.

  1. Use the quotient rule to find $\Pi'(d)$ and simplify.
  2. Find every critical number on the open interval and classify each.
  3. Compare each critical value with the endpoint values and identify the absolute maximum.

Proof of the Quotient Rule

The quotient rule does not need its own limit-definition argument. It follows from the product rule above and the general power rule applied to $[g(x)]^{-1}$.

Proof

Step 1: a reciprocal derivative. The general power rule from Recitation 2 extends to negative integer exponents. Since $g(x) \neq 0$ and $g$ is continuous at $x$, the reciprocal is defined for inputs near $x$. With $r = -1$,

$$\frac{d}{dx} \!\left[\frac{1}{g(x)}\right] = \frac{d}{dx} [g(x)]^{-1} = (-1) [g(x)]^{-2} \cdot g'(x) = -\frac{g'(x)}{[g(x)]^2}.$$

Step 2: write the quotient as a product and apply the product rule. For $g(x) \neq 0$,

$$\frac{f(x)}{g(x)} = f(x) \cdot \frac{1}{g(x)},$$

so by the product rule,

$$\frac{d}{dx} \!\left[\frac{f(x)}{g(x)}\right] = \frac{1}{g(x)} \cdot f'(x) + f(x) \cdot \frac{d}{dx} \!\left[\frac{1}{g(x)}\right].$$

Substituting Step 1:

$$\frac{d}{dx} \!\left[\frac{f(x)}{g(x)}\right] = \frac{f'(x)}{g(x)} - \frac{f(x) \, g'(x)}{[g(x)]^2} = \frac{g(x) \, f'(x)}{[g(x)]^2} - \frac{f(x) \, g'(x)}{[g(x)]^2} = \frac{g(x) \, f'(x) - f(x) \, g'(x)}{[g(x)]^2}.$$

The minus sign in the numerator now has a clear origin: it tracks back to the $(-1)$ exponent in $[g(x)]^{-1}$, which is the only minus sign in the entire chain of substitutions. Forgetting the minus sign in the rule is forgetting that one is differentiating the reciprocal, not multiplying by it.
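The two routes in the proof can be compared numerically. The sketch below is a sanity check, not part of the proof: the sample functions (numerator x^3, denominator x^2 + 1), the test point 1.5, and all function names are choices made here for illustration. It evaluates the final quotient-rule formula, the Step 1 plus Step 2 route, and a central-difference estimate.

```python
def f(x):
    return x**3           # sample numerator (an assumption for this check)


def f_prime(x):
    return 3 * x**2


def g(x):
    return x**2 + 1       # sample denominator, never zero


def g_prime(x):
    return 2 * x


def quotient_rule(x):
    """(g f' - f g') / g^2, the final formula of the proof."""
    return (g(x) * f_prime(x) - f(x) * g_prime(x)) / g(x) ** 2


def via_reciprocal(x):
    """Step 1 (reciprocal derivative) followed by Step 2 (product rule)."""
    reciprocal_prime = -g_prime(x) / g(x) ** 2
    return f_prime(x) / g(x) + f(x) * reciprocal_prime


def quotient(x):
    return f(x) / g(x)


def central_difference(func, x, step=1e-6):
    """Symmetric difference quotient, a numerical stand-in for the limit."""
    return (func(x + step) - func(x - step)) / (2 * step)


x0 = 1.5
print(quotient_rule(x0), via_reciprocal(x0), central_difference(quotient, x0))
```

The first two numbers agree up to rounding because they are the same algebra in a different order; the third agrees only approximately, since it replaces the limit by a small but nonzero step.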

Problem 120

Use the reciprocal-derivative formula from Step 1 of the proof above to differentiate

$$y = \frac{1}{(x^3 - 2 x + 5)^2}$$

in two ways, with $g(x) = x^3 - 2 x + 5$: first by writing $y$ as $[g(x)]^{-2}$ and applying the general power rule directly, second by writing $y$ as $1 / [g(x)]^2$ and applying the quotient rule. Confirm that the two answers agree.

Quotients are now handled. The remaining structural gap from the opening is composition: an outer function $f$ applied to an inner function $g$, written $f(g(x))$. The general power rule above is the special case where the outer function is a power, $f(x) = x^r$; for example, $H(x) = \sqrt{4 x^2 + 1}$ is already covered by taking $r = 1/2$. The chain rule is the same idea generalized: any differentiable outer function applied to any differentiable inner function gets a clean derivative.

Composition

Two functions $f$ and $g$ compose into a third by feeding the output of $g$ into the input of $f$.

Definition 34 (Composition)

Given functions $f$ and $g$, the composition $f \circ g$ is the function defined by

$$(f \circ g)(x) = f(g(x)),$$

on the set of $x$ in the domain of $g$ for which $g(x)$ lies in the domain of $f$. We call $f$ the outside function and $g$ the inside function.

Recognizing a function as a composite is half the work; the chain rule then differentiates it mechanically.

Example 115 (Building a composition)

Let $f(x) = \dfrac{x - 1}{x + 1}$ and $g(x) = x^3$. Then

$$f(g(x)) = \frac{g(x) - 1}{g(x) + 1} = \frac{x^3 - 1}{x^3 + 1}.$$

Each occurrence of $x$ in $f(x)$ is replaced by $g(x)$.

Example 116 (Decomposing into outer and inner pieces)

Identify each function as $f(g(x))$ and name the outer and inner pieces.

  1. $h(x) = (x^5 + 9 x + 3)^8$. Outside: $f(x) = x^8$. Inside: $g(x) = x^5 + 9 x + 3$.

  2. $k(x) = \sqrt{4 x^2 + 1}$. Outside: $f(x) = \sqrt{x}$. Inside: $g(x) = 4 x^2 + 1$.

The decomposition is rarely unique: $h$ above can also be read as $(x^5 + 9 x + 3)^4$ to the power $2$, with $f(x) = x^2$ and $g(x) = (x^5 + 9 x + 3)^4$. Every decomposition leads to the same derivative when the chain rule is applied.

A function of the form $[g(x)]^r$ is the composite $f(g(x))$ with outside $f(x) = x^r$. The general power rule already gave its derivative,

$$\frac{d}{dx} [g(x)]^r = r [g(x)]^{r-1} \, g'(x),$$

and the chain rule has the same shape, with $f$ free to be anything differentiable.

The Chain Rule

Theorem 15 (Chain Rule)

If $g$ is differentiable at $x$ and $f$ is differentiable at $g(x)$, then $f \circ g$ is differentiable at $x$, and

$$\frac{d}{dx} f(g(x)) = f'(g(x)) \cdot g'(x).$$

In words: differentiate the outside, leave the inside alone, then multiply by the derivative of the inside.

The two factors play different roles. $f'(g(x))$ asks how fast the outside function is changing at the value the inside has just produced; $g'(x)$ asks how fast the inside function is changing at $x$. The product is the rate of change of the composition.

Example 117 (The general power rule recovered)

Use the chain rule on $f(g(x))$ with $f(x) = x^8$ and $g(x) = x^5 + 9 x + 3$.

We have $f'(x) = 8 x^7$ and $g'(x) = 5 x^4 + 9$, so

$$f'(g(x)) = 8 (x^5 + 9 x + 3)^7,$$

and the chain rule gives

$$\frac{d}{dx} f(g(x)) = 8 (x^5 + 9 x + 3)^7 (5 x^4 + 9).$$

The outside is a power, so the answer matches what the general power rule produces directly. The chain rule does not change the answer; it labels the structure of the calculation in a way that survives when the outside is no longer a power.
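The "label the structure" idea can be sketched in code. In the illustration below, `chain_rule` is a hypothetical helper invented here (not a library function) that takes the three pieces the theorem names and returns the derivative of the composite. The sample point 0 is chosen only because the arithmetic is easy to follow by hand, and the central difference is a rough numerical cross-check.

```python
def chain_rule(outer_prime, inner, inner_prime):
    """Return the function x -> f'(g(x)) * g'(x) built from the given pieces."""
    def derivative(x):
        return outer_prime(inner(x)) * inner_prime(x)
    return derivative


# Pieces from Example 117.
def f(u):
    return u**8


def f_prime(u):
    return 8 * u**7


def g(x):
    return x**5 + 9 * x + 3


def g_prime(x):
    return 5 * x**4 + 9


def composite(x):
    return f(g(x))


def central_difference(func, x, step=1e-6):
    """Symmetric difference quotient used only as a numerical cross-check."""
    return (func(x + step) - func(x - step)) / (2 * step)


h_prime = chain_rule(f_prime, g, g_prime)

x0 = 0                        # sample point: g(0) = 3 and g'(0) = 9
exact = h_prime(x0)           # 8 * 3**7 * 9
approx = central_difference(composite, x0)
print(exact, approx)
```

Nothing in `chain_rule` depends on the outside being a power; swapping in a different `outer_prime` reuses the same two-factor structure.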

Example 118 (Differentiating an unspecified outer function)

Let $h(x) = f(\sqrt{x})$ for some differentiable $f$ (no formula given). Find $h'(x)$ in terms of $f'$.

Set $g(x) = \sqrt{x}$, so $h = f \circ g$. The general power rule gives $g'(x) = \dfrac{1}{2 \sqrt{x}}$, and the chain rule supplies

$$h'(x) = f'(g(x)) \cdot g'(x) = f'(\sqrt{x}) \cdot \frac{1}{2 \sqrt{x}} = \frac{f'(\sqrt{x})}{2 \sqrt{x}}.$$

The outside function is opaque, yet the chain rule still tells us exactly how the rate of change of $h$ depends on the rate of change of $f$. This pattern, passing through an unknown outer function, is the central maneuver when implicit and inverse functions enter the course.

Problem 121

Let $h(x) = f(x^2 + 1)$ where $f$ is differentiable.

  1. Express $h'(x)$ in terms of $f'$.
  2. Given $f'(5) = 7$, compute $h'(2)$.

Problem 122

Let $F(x) = \bigl[ g(x) \bigr]^3$ where $g(2) = -1$ and $g'(2) = 4$. Use the chain rule to compute $F'(2)$.

Problem 123

For each function, identify the outside $f$ and the inside $g$, then differentiate using the chain rule.

  1. $h(x) = (3 x^2 - 5)^7$.
  2. $h(x) = \sqrt{x^3 + x + 1}$.
  3. $h(x) = \dfrac{1}{(x^2 + 4)^3}$.
  4. $h(x) = \bigl( (2 x + 1)^4 + 7 \bigr)^{1/2}$ (a chain of two compositions; treat the inner composite as $g$).

Leibniz Form

Setting $u = g(x)$ and $y = f(u)$ converts the chain rule into a notation that mirrors how rates compose physically. Then $\dfrac{du}{dx} = g'(x)$ and $\dfrac{dy}{du} = f'(u) = f'(g(x))$, and the chain rule becomes

$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}. \tag{4}$$

The derivative symbols are not literal fractions, but the apparent cancellation of the $du$ symbols is a faithful mnemonic.

The interpretation is direct. If $y$ varies three times as fast as $u$ and $u$ varies twice as fast as $x$, then $y$ varies six times as fast as $x$:

$$\frac{dy}{dx} = 3 \cdot 2 = 6.$$

Rates compose by multiplication, just as scaling factors do.

Two graphs side by side. Left: u = x² + 1 with the tangent at x = 1 marked, slope du/dx = 2. Right: y = u³ with the tangent at u = 2 marked, slope dy/du = 12. The composition gives dy/dx = 12 × 2 = 24 at x = 1.

The figure makes the rule visible. With $u = x^2 + 1$, at $x = 1$ we have $u = 2$ and $du/dx = 2$. With $y = u^3$, at $u = 2$ we have $y = 8$ and $dy/du = 12$. The chain rule says $dy/dx = 12 \cdot 2 = 24$ at $x = 1$, and substituting the composition $y = (x^2 + 1)^3$ confirms it: $\dfrac{dy}{dx} = 3 (x^2 + 1)^2 (2 x) = 6 x (x^2 + 1)^2 = 6 \cdot 1 \cdot 4 = 24$ at $x = 1$.

Example 119 (The chain rule via $u$)

Find $\dfrac{dy}{dx}$ if $y = u^5 - 2 u^3 + 8$ and $u = x^2 + 1$.

The relation $y(x)$ is not given directly. Differentiate $y$ with respect to $u$ and $u$ with respect to $x$:

$$\frac{dy}{du} = 5 u^4 - 6 u^2, \qquad \frac{du}{dx} = 2 x.$$

By $(4)$,

$$\frac{dy}{dx} = (5 u^4 - 6 u^2) \cdot 2 x.$$

Expressing in $x$ alone via $u = x^2 + 1$:

$$\frac{dy}{dx} = \bigl[ 5 (x^2 + 1)^4 - 6 (x^2 + 1)^2 \bigr] \cdot 2 x.$$

Substituting $u$ into $y$ first and then differentiating using only the rules above gives the same answer; it is just a longer calculation. The Leibniz form is preferred because the next two examples, and most of the applied work in later lessons, present the variables in exactly the layered way that $(4)$ exploits.
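The claim that both routes agree in Example 119 can be checked exactly with integer arithmetic. The sketch below uses a representation chosen here for illustration: polynomials as coefficient lists, lowest degree first, with all helper names invented for the purpose. It expands the composite, differentiates it term by term with the power rule, and compares the result with the Leibniz-form answer at a few integer inputs.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_add(p, q):
    """Add two coefficient lists, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def poly_compose(outer, inner):
    """Coefficients of outer(inner(x)), accumulated by Horner's scheme."""
    result = [outer[-1]]
    for c in reversed(outer[:-1]):
        result = poly_add(poly_mul(result, inner), [c])
    return result


def poly_diff(p):
    """Power rule applied to each term."""
    return [k * c for k, c in enumerate(p)][1:]


def poly_eval(p, x):
    return sum(c * x**k for k, c in enumerate(p))


y_of_u = [8, 0, 0, -2, 0, 1]   # y = u^5 - 2u^3 + 8
u_of_x = [1, 0, 1]             # u = x^2 + 1

# Route 1: substitute first, then differentiate the expanded polynomial in x.
direct = poly_diff(poly_compose(y_of_u, u_of_x))


# Route 2: Leibniz form, dy/dx = (5u^4 - 6u^2) * 2x with u = x^2 + 1.
def leibniz(x):
    u = x**2 + 1
    return (5 * u**4 - 6 * u**2) * 2 * x


agree = all(poly_eval(direct, x) == leibniz(x) for x in range(-5, 6))
print(len(direct), agree)
```

Agreement at eleven points is enough here: both sides are polynomials of degree 9, and two such polynomials that match at more than nine points are identical.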

Problem 124

Find $\dfrac{dy}{dx}$ in two ways: first by substituting and differentiating directly, second by the chain rule via $u$. Confirm the answers agree.

  1. $y = u^4 + u^2$, $u = 2 x + 3$.
  2. $y = \sqrt{u}$, $u = x^2 + 7$.
  3. $y = \dfrac{1}{u}$, $u = x^3 + x$.

Problem 125

Suppose $y = f(u)$, $u = g(v)$, $v = h(x)$, so that $y$ depends on $x$ through two compositions. Show that

$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dv} \cdot \frac{dv}{dx},$$

by applying the two-link chain rule twice. Use this three-link form to differentiate $y = \sqrt{(3 x - 2)^6 + 5}$.

Time Rates of Change

The chain rule’s most repeated applied use is to convert a rate of change with respect to one variable into a rate of change with respect to time. If $x$ is a function of $t$ and $R$ is a function of $x$, then $R$ is a function of $t$, and

$$\frac{dR}{dt} = \frac{dR}{dx} \cdot \frac{dx}{dt}.$$

The factor $\dfrac{dR}{dx}$ is the marginal quantity already studied in Lessons 2 and 3; the factor $\dfrac{dx}{dt}$ is how fast the input is moving in real time. Their product is the real-time rate of change of $R$.

Example 120 (Marginal revenue and time)

A shop sells ties for £12 each. Let $x$ be the cumulative number of ties sold by time $t$, and let $R = 12 x$ be the corresponding revenue. If sales are rising at four ties per day, how fast is revenue rising?

Each extra tie brings in £12, and four extras a day brings in £48 a day; the answer is intuitively £48 per day. The chain rule reproduces it:

$$\frac{dR}{dt} = \frac{dR}{dx} \cdot \frac{dx}{dt} = 12 \cdot 4 = 48 \text{ pounds per day}.$$

The factor $\dfrac{dR}{dx} = 12$ is the marginal revenue per tie; $\dfrac{dx}{dt} = 4$ is the sales rate in ties per day. The chain rule asserts that the time rate of revenue is always the product of these two: the marginal multiplied by the time rate of the input.

Example 121 (Time rate of revenue at a production target)

The demand equation for a brand of graphing calculator is $p = 86 - 0{.}002 \, x$ pounds per unit, where $x$ is the cumulative number of calculators produced and sold during the current run. The firm is currently at $x = 6000$ and is increasing production by $200$ calculators per day. Find the time rate of change of total revenue at this production level.

Total revenue is

$$R(x) = x \cdot p = x (86 - 0{.}002 \, x) = -0{.}002 \, x^2 + 86 \, x.$$

The marginal revenue is

$$\frac{dR}{dx} = -0{.}004 \, x + 86.$$

By the chain rule,

$$\frac{dR}{dt} = \frac{dR}{dx} \cdot \frac{dx}{dt} = (-0{.}004 \, x + 86) \cdot \frac{dx}{dt}.$$

At $x = 6000$ and $\dfrac{dx}{dt} = 200$:

$$\frac{dR}{dt} = \bigl(-0{.}004 (6000) + 86\bigr) \cdot 200 = (-24 + 86) \cdot 200 = 62 \cdot 200 = 12\,400.$$

Revenue is rising at £12,400 per day at this production level.

Revenue parabola R(x) = -0.002x² + 86x for x in [0, 43000], with R rescaled by 1/1000. The current production point at x = 6000 is marked, with R = £444,000 and a tangent line of slope dR/dx = 62 (£ per unit).

The figure isolates the two factors visually. The slope of the tangent line at $x = 6000$ is $\dfrac{dR}{dx} = 62$ pounds per unit. That number is fixed by the demand model and the current production level; it has nothing to do with time. Plugging in $\dfrac{dx}{dt} = 200$ (the production schedule) converts that marginal into a daily rate. Different schedules yield different time rates while leaving the marginal unchanged: doubling the production speed to $400$ per day would double the time rate of revenue to £24,800 per day.
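The separation of the two factors maps naturally onto two small functions. The sketch below is an illustration with names invented here; it uses exact fractions rather than floating point so that the arithmetic of Example 121 is reproduced without rounding.

```python
from fractions import Fraction


def marginal_revenue(x):
    """dR/dx = -0.004 x + 86, in pounds per unit (Example 121)."""
    return 86 - Fraction(4, 1000) * x


def revenue_time_rate(x, units_per_day):
    """dR/dt = (dR/dx) * (dx/dt), in pounds per day."""
    return marginal_revenue(x) * units_per_day


# Same production level, two different schedules.
for schedule in (200, 400):
    print(schedule, revenue_time_rate(6000, schedule))
```

Only the second argument changes between the two calls, which mirrors the point of the figure: the marginal is a property of the production level, and the schedule merely scales it.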

Example 122 (A balloon inflating)

Air is being pumped into a spherical balloon, and the radius is observed to grow at $\dfrac{dr}{dt} = 0{.}5$ centimeters per second when $r = 4$ centimeters. How fast is the volume changing at that instant?

The volume of a sphere is $V = \dfrac{4}{3} \pi r^3$. Differentiating with respect to $r$,

$$\frac{dV}{dr} = 4 \pi r^2.$$

By the chain rule,

$$\frac{dV}{dt} = \frac{dV}{dr} \cdot \frac{dr}{dt} = 4 \pi r^2 \cdot \frac{dr}{dt}.$$

At $r = 4$ and $\dfrac{dr}{dt} = 0{.}5$:

$$\frac{dV}{dt} = 4 \pi (16)(0{.}5) = 32 \pi \approx 100{.}5 \text{ cm}^3/\text{s}.$$

The factor $4 \pi r^2$ is the surface area of the sphere: geometrically, the rate at which volume changes per unit increase in radius is the area of the boundary. The chain rule turns that geometric fact into a time rate.
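A quick numerical cross-check of Example 122 is sketched below. The example only gives the radius and its rate at one instant, so the check assumes, purely for illustration, that the radius is linear in time near that instant (r = 4 + 0.5 t with t = 0 at the instant in question). The function names are likewise invented here.

```python
import math


def volume(r):
    """Volume of a sphere of radius r."""
    return 4 / 3 * math.pi * r**3


def dV_dt(r, dr_dt):
    """Chain rule: dV/dt = (dV/dr) * (dr/dt) = 4 pi r^2 * dr/dt."""
    return 4 * math.pi * r**2 * dr_dt


def radius(t):
    """Assumed local model of the radius, used only for this check."""
    return 4 + 0.5 * t


rate = dV_dt(4, 0.5)

# Central difference of V(r(t)) at t = 0 as an independent estimate.
step = 1e-6
estimate = (volume(radius(step)) - volume(radius(-step))) / (2 * step)

print(rate, estimate)
```

Any other radius model with the same value and slope at the instant would give the same limit, which is why the example needs only the two numbers it states.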

Problem 126

A circular oil slick is spreading. Its radius grows at $\dfrac{dr}{dt} = 2$ meters per minute when $r = 50$ meters. The slick’s area is $A = \pi r^2$.

  1. Write $\dfrac{dA}{dt}$ using the chain rule.
  2. Compute $\dfrac{dA}{dt}$ at the moment $r = 50$.
  3. The slick’s circumference is $C = 2 \pi r$. Find $\dfrac{dC}{dt}$ at the same instant. Why does the circumference rate not depend on the current value of $r$?

Problem 127

A factory’s daily revenue at production level $x$ is $R(x) = 200 x - 0{.}01 \, x^2$ pounds per day. The production schedule is $x(t) = 500 + 30 t$ units per day, where $t$ is in days from the start of the schedule.

  1. Compute $\dfrac{dR}{dx}$ as a function of $x$, and state the marginal revenue at $x = 500$ and at $x = 1000$.
  2. Use the chain rule to find $\dfrac{dR}{dt}$.
  3. At what value of $t$ does the time rate of daily revenue equal zero? Interpret what this means in plain English.

Problem 128

A spherical raindrop evaporates so that its radius decreases at $\dfrac{dr}{dt} = -0{.}02$ millimeters per second when $r = 1$ millimeter. Find $\dfrac{dV}{dt}$ at that instant, with $V = \dfrac{4}{3} \pi r^3$. Why is the rate negative?

Problem 129

Two ships leave the same harbor at $t = 0$. Ship $A$ travels due east at $30$ km/h; ship $B$ travels due north at $40$ km/h. Let $a(t)$ and $b(t)$ denote their distances from the harbor at time $t$, and let $D(t) = \sqrt{a(t)^{2} + b(t)^{2}}$ denote the distance between the two ships.

  1. Use the chain rule on $D = \sqrt{a^{2} + b^{2}}$ to express $\dfrac{dD}{dt}$ in terms of $a$, $b$, $\dfrac{da}{dt}$, and $\dfrac{db}{dt}$.
  2. Compute $\dfrac{dD}{dt}$ at $t = 1$ hour. Then compute it again at $t = 2$ hours. Show the answers are equal.
  3. Prove that $\dfrac{dD}{dt}$ is constant for every $t > 0$, and explain in one sentence why the constancy is forced by the geometry of two perpendicular constant-speed motions.

Proof of the Chain Rule

The chain rule does not follow from earlier rules the way the quotient rule did. It needs its own limit argument, built around the trick of relating the difference quotient of $f \circ g$ at $x = a$ to the difference quotient of $f$ at $g(a)$.

Proof

Suppose $g$ is differentiable at $a$ and $f$ is differentiable at $g(a)$. Set $b = g(a)$. Since $f$ is differentiable at $b$, its change near $b$ can be written as

$$f(b+k)-f(b)=\bigl(f'(b)+E(k)\bigr)k,$$

where $E(k)$ tends to $0$ as $k$ tends to $0$. For $k \neq 0$, this is just the difference quotient rewritten:

$$E(k)=\frac{f(b+k)-f(b)}{k}-f'(b),$$

and at $k=0$ we may set $E(0)=0$, so the displayed change formula remains true.

Now put $k = g(a+h)-g(a)$. Since $g$ is differentiable at $a$, it is continuous at $a$, so this $k$ tends to $0$ as $h$ tends to $0$. Also $g(a+h)=b+k$. Therefore

$$f(g(a+h))-f(g(a)) = \bigl(f'(b)+E(g(a+h)-g(a))\bigr)\bigl(g(a+h)-g(a)\bigr).$$

Divide by $h \neq 0$:

$$\frac{f(g(a+h))-f(g(a))}{h} = \bigl(f'(b)+E(g(a+h)-g(a))\bigr)\frac{g(a+h)-g(a)}{h}.$$

Take the limit as $h$ tends to $0$. Because $g(a+h)-g(a)$ tends to $0$ and $E$ tends to $0 = E(0)$ at $0$, the first factor tends to $f'(b)$; the second factor is the difference quotient of $g$ and tends to $g'(a)$. Hence

$$\left. \frac{d}{dx} f(g(x)) \right|_{x=a} = f'(b)g'(a) = f'(g(a))g'(a),$$

which is the chain rule at $x=a$.

The proof’s central trick is the same one that powered the product rule: a single difference quotient is too tangled to evaluate, so we rewrite it using a bridge term whose limit we understand. For the product rule, the bridge term was $f(x + h) g(x)$; for the chain rule, the bridge is the inner change $g(a + h) - g(a)$.
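The error term $E(k)$ can be watched shrinking on a concrete case. The sketch below borrows the pair from the Leibniz-form figure (outer function u^3, inner function x^2 + 1, at a = 1) purely as an illustration; the function names and the list of step sizes are choices made here. For each step it reports the error term and the factored difference quotient from the proof.

```python
def f(u):
    return u**3            # outer function from the Leibniz-form figure


def g(x):
    return x**2 + 1        # inner function from the same figure


a = 1.0
b = g(a)                   # b = g(a) = 2
f_prime_b = 3 * b**2       # f'(b) = 12
g_prime_a = 2 * a          # g'(a) = 2


def E(k):
    """Error term of the proof: difference quotient of f at b minus f'(b)."""
    if k == 0:
        return 0.0         # the value assigned at k = 0 in the proof
    return (f(b + k) - f(b)) / k - f_prime_b


def rows(steps):
    """For each h, return (h, E(k), factored difference quotient of f o g)."""
    out = []
    for h in steps:
        k = g(a + h) - g(a)
        out.append((h, E(k), (f_prime_b + E(k)) * (k / h)))
    return out


table = rows([0.1, 0.01, 0.001, 0.0001])
for h, error_term, quotient in table:
    print(h, error_term, quotient)
```

As the step shrinks, the middle column heads to zero and the last column approaches the product of `f_prime_b` and `g_prime_a`, which is what the limit step of the proof asserts.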

Note (Differentiation toolkit so far)

Combining the rules from Lesson 2, Recitation 2, and this lesson, every differentiable expression in this course can be differentiated by reducing it, one rule at a time, to the entries below. Apply the rules from the outside in.

| Rule | Form |
| --- | --- |
| Constant multiple | $\frac{d}{dx}[c \, f] = c \, f'$ |
| Sum | $\frac{d}{dx}[f \pm g] = f' \pm g'$ |
| Power | $\frac{d}{dx} x^n = n \, x^{n-1}$ |
| Product | $\frac{d}{dx}[f \, g] = f \, g' + g \, f'$ |
| Quotient | $\frac{d}{dx}\!\left[\frac{f}{g}\right] = \frac{g \, f' - f \, g'}{g^2}$ |
| Chain | $\frac{d}{dx} f(g(x)) = f'(g(x)) \, g'(x)$ |
| General power (chain ∘ power) | $\frac{d}{dx} [g(x)]^r = r \, [g(x)]^{r-1} g'(x)$ |

The general power rule is now formally what it always was structurally: the chain rule with $f(x) = x^r$.
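"Outside in" can be made concrete with a small recursive differentiator. The sketch below is a teaching toy under stated assumptions, not a computer algebra system: expressions are nested tuples in a format invented here, only the sum, product, quotient, and general power rules are implemented, and results are evaluated rather than simplified. It is applied to $G(x) = \dfrac{x^3}{(x^2 + 1)^4}$ from the opening of the lesson.

```python
from fractions import Fraction

# Expression trees as nested tuples (a representation invented for this sketch):
#   ("x",)  ("const", c)  ("add", a, b)  ("mul", a, b)  ("div", a, b)  ("pow", a, r)
X = ("x",)


def const(c):
    return ("const", c)


def evaluate(e, x):
    """Evaluate an expression tree at the number x."""
    tag = e[0]
    if tag == "x":
        return x
    if tag == "const":
        return e[1]
    if tag == "add":
        return evaluate(e[1], x) + evaluate(e[2], x)
    if tag == "mul":
        return evaluate(e[1], x) * evaluate(e[2], x)
    if tag == "div":
        return evaluate(e[1], x) / evaluate(e[2], x)
    if tag == "pow":
        return evaluate(e[1], x) ** e[2]
    raise ValueError(tag)


def diff(e):
    """Differentiate by dispatching on the outermost operation first."""
    tag = e[0]
    if tag == "x":
        return const(1)
    if tag == "const":
        return const(0)
    if tag == "add":    # sum rule
        return ("add", diff(e[1]), diff(e[2]))
    if tag == "mul":    # product rule: f g' + g f'
        f, g = e[1], e[2]
        return ("add", ("mul", f, diff(g)), ("mul", g, diff(f)))
    if tag == "div":    # quotient rule: (g f' - f g') / g^2
        f, g = e[1], e[2]
        minus_f_g_prime = ("mul", const(-1), ("mul", f, diff(g)))
        top = ("add", ("mul", g, diff(f)), minus_f_g_prime)
        return ("div", top, ("pow", g, 2))
    if tag == "pow":    # general power rule: r g^(r-1) g'
        g, r = e[1], e[2]
        return ("mul", ("mul", const(r), ("pow", g, r - 1)), diff(g))
    raise ValueError(tag)


# G(x) = x^3 / (x^2 + 1)^4
G = ("div", ("pow", X, 3), ("pow", ("add", ("pow", X, 2), const(1)), 4))

slope = evaluate(diff(G), Fraction(1))
print(slope)   # -1/16, matching the hand computation (48 - 64) / 256 at x = 1
```

The recursion in `diff` follows the table's advice literally: it looks only at the outermost operation, applies that rule, and hands the inner pieces back to itself.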