Rules for Differentiation

Lesson 2PM verified the power rule for several special exponents and stated it for every rational $r$, but that rule reaches only the bare power function $x^{r}$. Most functions of interest in this course are built from such powers by two further operations: multiplication by a constant, as in $7 x^{4}$, and addition, as in $x^{3} + 5 x - 2$. Each operation interacts with differentiation in a way fixed by the limit definition itself, and the resulting two rules are enough to differentiate every polynomial by inspection. A third rule, generalising the power rule from $x$ to an arbitrary differentiable expression $g(x)$, is recorded at the end of the section for use later in the course.

The Constant-Multiple Rule

Theorem 1 (Constant-Multiple Rule)

Let $f$ be differentiable at $x$ in the sense of Lesson 2PM, and let $k$ be a real number independent of $x$. Then $kf$ is differentiable at $x$ and

$$\frac{d}{dx}\bigl(k\,f(x)\bigr) = k \cdot \frac{d}{dx}\,f(x).$$

A constant multiplier rides along through differentiation, untouched. The factor $k$ scales the rise of every secant of $f$ by exactly the same amount $k$ without altering its run, so the slope of every secant of $kf$ is $k$ times the slope of the corresponding secant of $f$, and the same factor survives the limit. The verification at the end of the section makes the remark precise.

Example 1 (A constant multiplier on a single power)

For $f(x) = 7 x^{4}$ take $k = 7$ and the inner function $x^{4}$, whose derivative is $4 x^{3}$ by the power rule of Lesson 2PM. The Constant-Multiple Rule then gives

$$f'(x) = 7 \cdot \frac{d}{dx}\bigl(x^{4}\bigr) = 7 \cdot 4 x^{3} = 28 x^{3}.$$

The same step handles a negative scalar without ceremony: $\frac{d}{dx}\bigl(-\tfrac{1}{3} x^{5}\bigr) = -\tfrac{5}{3} x^{4}$.

Figure: The cubic curve y equals x cubed drawn alongside its vertical scaling y equals 2 x cubed in a contrasting colour. At the point with x equal to 1 each curve carries a short dashed tangent segment, the lower one labelled with slope 3 and the upper one with slope 6, illustrating that vertical scaling by 2 doubles the slope of the tangent at every point.

Example 2 (Carrying a fractional multiplier through a root)

For $f(x) = \tfrac{2}{5} \sqrt{x}$ on $x > 0$, write $\sqrt{x} = x^{1/2}$ as in Recitation 1 and apply the power rule with $r = \tfrac{1}{2}$,

$$f'(x) = \tfrac{2}{5} \cdot \frac{d}{dx}\bigl(x^{1/2}\bigr) = \tfrac{2}{5} \cdot \tfrac{1}{2} x^{-1/2} = \frac{1}{5\sqrt{x}}.$$

The constant $\tfrac{2}{5}$ rides along to the end and never participates in the differentiation itself.

Problem 1

Compute the derivative of each function below by the Constant-Multiple Rule and the power rule, stating the inputs at which the derivative is defined.

  1. $f(x) = 9 x^{6}$.
  2. $f(x) = -\tfrac{1}{2} x^{-3}$.
  3. $f(x) = 4 / x^{2}$.
  4. $f(x) = -\tfrac{3}{4} x^{2/3}$.

The Sum Rule

Theorem 2 (Sum Rule)

Let $f$ and $g$ be differentiable at $x$ in the sense of Lesson 2PM. Then $f + g$ is differentiable at $x$ and

$$\frac{d}{dx}\bigl(f(x) + g(x)\bigr) = \frac{d}{dx}\,f(x) + \frac{d}{dx}\,g(x).$$

The same identity, with the sign of the second term reversed throughout, gives the difference rule

$$\frac{d}{dx}\bigl(f(x) - g(x)\bigr) = \frac{d}{dx}\,f(x) - \frac{d}{dx}\,g(x).$$

The derivative of a sum is the sum of the derivatives, and the same is true for differences. Applied to a polynomial term by term, the Sum Rule together with the Constant-Multiple Rule and the power rule reduces every differentiation to a mechanical procedure: multiply each coefficient by its exponent, then lower that exponent by one.

Example 3 (Differentiating a polynomial term by term)

For $f(x) = 2 x^{4} - 5 x^{3} + x - 7$ the Sum and Difference Rules apply across the four terms, the Constant-Multiple Rule strips each leading coefficient, and the power rule handles each remaining power. The constant term contributes $0$ by the constant case from Lesson 2PM, so

$$f'(x) = 2 \cdot 4 x^{3} - 5 \cdot 3 x^{2} + 1 \cdot 1 - 0 = 8 x^{3} - 15 x^{2} + 1.$$

With practice the intermediate steps fold into a single line of writing, and the derivative of any polynomial is read off by inspection.
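
The term-by-term procedure can be sketched in a few lines of Python. The coefficient-list representation and the helper name `poly_derivative` are assumptions made for illustration, not part of the lesson; the list stores the coefficient of $x^{n}$ at index $n$.

```python
def poly_derivative(coeffs):
    """Differentiate a polynomial given as [a0, a1, a2, ...] (a_n multiplies x**n).

    Each term a_n * x**n becomes n * a_n * x**(n - 1); the constant term drops out.
    """
    return [n * a for n, a in enumerate(coeffs)][1:]


# Example 3: f(x) = 2x^4 - 5x^3 + x - 7, stored lowest degree first.
f = [-7, 1, 0, -5, 2]
f_prime = poly_derivative(f)
print(f_prime)  # [1, 0, -15, 8], i.e. 8x^3 - 15x^2 + 1
```

The list comprehension applies the Constant-Multiple Rule and the power rule to each term, and building the result as a list of terms is the Sum Rule in disguise.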

Figure: Three curves drawn on a single set of axes: the cubic y equals x cubed, the line y equals 3 x, and the sum y equals x cubed plus 3 x. The sum curve sits above the cubic for positive x by exactly the height of the line, and below it for negative x by the same amount, showing that the graph of f plus g is obtained by stacking the heights of f and g vertically.

Example 4 (A sum of non-integer powers)

For $f(x) = 6 \sqrt{x} + 3 / x$ on $x > 0$, rewrite the function in power notation as $6 x^{1/2} + 3 x^{-1}$ and apply the same machinery,

$$f'(x) = 6 \cdot \tfrac{1}{2} x^{-1/2} + 3 \cdot (-1) x^{-2} = \frac{3}{\sqrt{x}} - \frac{3}{x^{2}}.$$

Care with the natural domain matches the power rule itself: the first term restricts to $x > 0$, the second to $x \neq 0$, and the intersection $(0, \infty)$ is the domain of $f'$.

Remark (A sum is not a product)

The Sum Rule splits a derivative across $+$ and $-$, and across nothing else. The corresponding statement for a product is false: writing $h(x) = x \cdot x = x^{2}$ and treating each factor separately would give $h'(x) = 1 \cdot 1 = 1$, while the power rule supplies the correct $h'(x) = 2 x$. Products are governed by a separate rule developed in a later MA0A lesson, and until that rule is in place every product must be either expanded into a sum first, as in $h$ above, or left alone.

Problem 2

Compute the derivative of each function by combining the Sum, Constant-Multiple, and power rules.

  1. $f(x) = x^{5} + 4 x^{3} - 6 x + 2$.
  2. $f(x) = \tfrac{1}{3} x^{6} - \tfrac{1}{2} x^{4} + 9$.
  3. $f(x) = 2 x^{1/2} - 5 x^{-1/2}$.
  4. $f(x) = (x^{2} + 1)(x - 3)$, by expanding before differentiating.

The General Power Rule

The power rule of Lesson 2PM differentiates $x^{r}$, the $r$th power of the bare identity $x$. When the inner expression is replaced by a more complicated differentiable function $g(x)$, the form of the answer survives essentially unchanged, with one extra factor.

Theorem 3 (General Power Rule)

Let $g$ be differentiable at $x$ and let $r$ be a rational number for which the power $g(x)^{r}$ is defined on inputs near $x$ and at $x$ itself. Assume also that the outer power function is differentiable at $g(x)$; this is automatic when $r$ is a positive integer or when $g(x) > 0$, but zeros of $g$ require separate checking. Then $g(x)^{r}$ is differentiable at $x$ and

$$\frac{d}{dx}\bigl(g(x)^{r}\bigr) = r\,g(x)^{r - 1} \cdot \frac{d}{dx}\,g(x).$$

The proof is deferred to a later MA0A lesson, where the chain-like manipulation needed to obtain it is treated systematically.

The factor $r\,g(x)^{r - 1}$ is the answer the power rule would supply if $g(x)$ were itself the variable; the additional $g'(x)$ corrects for the fact that the inner expression is itself changing with $x$ at its own rate. With $g(x) = x$ the correction collapses to $1$ and the General Rule reduces to the power rule with no information gained; the gain comes when $g$ is genuinely non-trivial.

Example 5 (Verification of the rule on a tractable case)

The General Rule was stated without proof, but a small case can be checked directly. Take $f(x) = (x^{2} + 3)^{2}$, with $g(x) = x^{2} + 3$ and $r = 2$. The General Rule supplies

$$f'(x) = 2\,(x^{2} + 3) \cdot 2 x = 4 x\,(x^{2} + 3) = 4 x^{3} + 12 x.$$

On the other hand, expanding $f$ first as $(x^{2} + 3)^{2} = x^{4} + 6 x^{2} + 9$ and differentiating term by term by the rules of the previous two sections,

$$f'(x) = 4 x^{3} + 12 x + 0 = 4 x^{3} + 12 x.$$

The two procedures agree. The expansion method always works for a positive integer exponent and a polynomial inner function but becomes prohibitive even at modest powers; the General Rule is the closed-form answer to which the binomial expansion would, after pages of arithmetic, eventually collapse.
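
The comparison in this example can be replayed mechanically. The sketch below is illustrative only: the helper names `poly_mul` and `poly_derivative` and the lowest-degree-first coefficient lists are assumptions, not notation from the lesson.

```python
def poly_mul(p, q):
    """Multiply two polynomials stored lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_derivative(coeffs):
    """Term-by-term derivative: a_n * x**n -> n * a_n * x**(n - 1)."""
    return [n * a for n, a in enumerate(coeffs)][1:]


g = [3, 0, 1]                      # g(x) = x^2 + 3
expanded = poly_mul(g, g)          # (x^2 + 3)^2 = x^4 + 6x^2 + 9
by_expansion = poly_derivative(expanded)

# General Power Rule with r = 2: f'(x) = 2 * g(x)^1 * g'(x).
by_rule = [2 * c for c in poly_mul(g, poly_derivative(g))]

print(expanded)       # [9, 0, 6, 0, 1]
print(by_expansion)   # [0, 12, 0, 4]
print(by_rule)        # [0, 12, 0, 4]
```

Both routes end at the coefficient list of $4 x^{3} + 12 x$. The check covers this one case only and is no substitute for the deferred proof.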

Example 6 (A power of a binomial)

For $f(x) = (3 x + 5)^{4}$ take $g(x) = 3 x + 5$ and $r = 4$. The inner derivative is $g'(x) = 3$ by the linear case from Lesson 2PM, and the General Rule gives

$$f'(x) = 4\,(3 x + 5)^{3} \cdot 3 = 12\,(3 x + 5)^{3}.$$

Expanding $(3 x + 5)^{4}$ first by the binomial theorem and differentiating term by term reproduces the same answer at greater length; the General Power Rule packages the entire expansion into a single line.

Example 7 (A fractional power of a quadratic)

For $f(x) = (x^{2} + 1)^{3/2}$ on the whole real line take $g(x) = x^{2} + 1$ and $r = \tfrac{3}{2}$. The Sum and power rules give $g'(x) = 2 x$, and the General Rule supplies

$$f'(x) = \tfrac{3}{2}\,(x^{2} + 1)^{1/2} \cdot 2 x = 3 x \sqrt{x^{2} + 1}.$$

The base $x^{2} + 1$ is positive at every real $x$, so the half-power and the derivative are defined throughout.

Example 8 (A negative power of a polynomial)

For $f(x) = 1 / (x^{2} - 4)$ on $|x| \neq 2$, write $f(x) = (x^{2} - 4)^{-1}$ and apply the General Rule with $r = -1$, $g(x) = x^{2} - 4$, and $g'(x) = 2 x$:

$$f'(x) = (-1)(x^{2} - 4)^{-2} \cdot 2 x = -\frac{2 x}{(x^{2} - 4)^{2}}.$$

The natural domain of $f'$ is the same $|x| \neq 2$ that $f$ itself carried, the squaring of the denominator preserving the exclusion.

Example 9 (Combining the Constant-Multiple Rule with the General Power Rule)

For $f(x) = 4\,(2 x^{3} + 1)^{1/2}$ on the inputs at which $2 x^{3} + 1 \geq 0$, the outermost operation is multiplication by the constant $4$, and the Constant-Multiple Rule pulls it through differentiation untouched. The remaining derivative is supplied by the General Rule with $g(x) = 2 x^{3} + 1$, $g'(x) = 6 x^{2}$, and $r = \tfrac{1}{2}$,

$$f'(x) = 4 \cdot \frac{d}{dx}\bigl((2 x^{3} + 1)^{1/2}\bigr) = 4 \cdot \tfrac{1}{2}\,(2 x^{3} + 1)^{-1/2} \cdot 6 x^{2} = \frac{12 x^{2}}{\sqrt{2 x^{3} + 1}}.$$

The derivative formula is defined only where $2 x^{3} + 1 > 0$; at the boundary $2 x^{3} + 1 = 0$ the original function is defined but the derivative is not. The cancellation of $4 \cdot \tfrac{1}{2}$ to $2$ is part of the calculation; the rules themselves do nothing more than authorise the chain of equalities.

Note (Constant times a power, in a single line)

The combination just performed arises often enough to be worth recording as a single identity. For any constant $k$, any function $g$ differentiable at $x$, and any rational $r$ for which $g(x)^{r}$ satisfies the hypotheses of Theorem 3 at $x$,

$$\frac{d}{dx}\bigl(k\,g(x)^{r}\bigr) = k\,r\,g(x)^{r - 1} \cdot g'(x),$$

the right-hand side reading exactly as the General Rule output with the constant $k$ retained at the front. The identity is a corollary of the Constant-Multiple and General Power Rules and adds no new content beyond their conjunction; the cost-function example below is one further application of the same line.
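
As a numerical sanity check, the identity can be compared with a symmetric difference quotient. This is a minimal sketch under stated assumptions: the names `constant_times_power_derivative` and `central_difference`, the step `h = 1e-6`, and the test point $x = 1$ are all choices made here for illustration.

```python
import math


def constant_times_power_derivative(k, g, g_prime, r):
    """Return x -> k * r * g(x)**(r - 1) * g'(x), the identity in the Note."""
    return lambda x: k * r * g(x) ** (r - 1) * g_prime(x)


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Example 9: f(x) = 4 (2x^3 + 1)^(1/2), checked at x = 1 where 2x^3 + 1 = 3 > 0.
f = lambda x: 4 * math.sqrt(2 * x**3 + 1)
f_prime = constant_times_power_derivative(
    4, lambda x: 2 * x**3 + 1, lambda x: 6 * x**2, 0.5
)

exact = f_prime(1.0)                  # equals 12 / sqrt(3)
estimate = central_difference(f, 1.0)
print(exact, estimate)                # the two values agree to several decimal places
```

The difference quotient only approximates the derivative, so agreement to several decimal places is the most that should be expected from it.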

Problem 3

Differentiate each function using the General Power Rule, stating the natural domain of the result.

  1. $f(x) = (x^{3} - 2 x)^{5}$.
  2. $f(x) = \sqrt{4 x + 1}$.
  3. $f(x) = (1 - x^{2})^{-1/2}$.
  4. $f(x) = (x^{2} + x + 1)^{4}$.

Verifications

The Sum and Constant-Multiple Rules follow directly from the limit definition together with two of the limit theorems of Lesson 2PM. The General Power Rule needs heavier machinery and is left to a later MA0A lesson.

Proof

Constant-Multiple Rule. Let $f$ be differentiable at $x$, with $f'(x) = \lim_{h \to 0} (f(x + h) - f(x))/h$, and let $k$ be a real number independent of $h$. The difference quotient of $kf$ at $x$ is

$$\frac{k\,f(x + h) - k\,f(x)}{h} = k \cdot \frac{f(x + h) - f(x)}{h}$$

for every $h \neq 0$, the factor $k$ pulled out by ordinary arithmetic. Limit Theorem I of Lesson 2PM, applied with the difference quotient of $f$ in place of the inner function, gives

$$\lim_{h \to 0} \frac{k\,f(x + h) - k\,f(x)}{h} = k \cdot \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = k\,f'(x),$$

which is the claim.

Proof

Sum Rule. Let $f$ and $g$ be differentiable at $x$. The difference quotient of $f + g$ at $x$ separates into two pieces by ordinary arithmetic,

$$\frac{(f + g)(x + h) - (f + g)(x)}{h} = \frac{f(x + h) - f(x)}{h} + \frac{g(x + h) - g(x)}{h},$$

valid for every $h \neq 0$. Both summands have a limit as $h \to 0$, namely $f'(x)$ and $g'(x)$, and Limit Theorem III of Lesson 2PM gives

$$\lim_{h \to 0} \frac{(f + g)(x + h) - (f + g)(x)}{h} = f'(x) + g'(x),$$

which is the claim. The difference rule follows on writing $f - g = f + (-1)\,g$ and applying the Sum Rule together with the Constant-Multiple Rule with $k = -1$.
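
The two algebraic identities on which the proofs rest hold for every non-zero $h$, before any limit is taken, and this can be observed in exact rational arithmetic. The functions $f(x) = x^{3}$ and $g(x) = x^{2}$, the constant $k = 7$, and the point $x = 2$ are arbitrary choices made for this sketch.

```python
from fractions import Fraction


def difference_quotient(f, x, h):
    """(f(x + h) - f(x)) / h for a non-zero step h."""
    return (f(x + h) - f(x)) / h


f = lambda x: x**3          # f'(2) = 12
g = lambda x: x**2          # g'(2) = 4
k = Fraction(7)
x = Fraction(2)

for h in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 100000)):
    q_f = difference_quotient(f, x, h)
    q_g = difference_quotient(g, x, h)
    q_kf = difference_quotient(lambda t: k * f(t), x, h)
    q_sum = difference_quotient(lambda t: f(t) + g(t), x, h)
    assert q_kf == k * q_f          # the factor k pulls out for every h != 0
    assert q_sum == q_f + q_g       # the quotient of a sum splits for every h != 0
    print(float(h), float(q_kf), float(q_sum))   # tends to 7 * 12 = 84 and 12 + 4 = 16
```

Using `Fraction` keeps both sides exact, so the equalities are genuine identities rather than floating-point coincidences; only the passage to the limit needs the limit theorems.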

Applications

With the three rules in hand, every problem from Lesson 2PM that called for a derivative through the limit definition is settled by inspection, and the time freed is spent on the geometry the derivative is meant to capture.

Example 10 (Tangent line to a cubic)

Find the tangent line to $y = x^{3} - 6 x + 4$ at the point with $x = 2$.

The Sum, Constant-Multiple, and power rules give the derivative in a single line,

$$f'(x) = 3 x^{2} - 6,$$

so the slope at $a = 2$ is $f'(2) = 12 - 6 = 6$. The height there is $f(2) = 8 - 12 + 4 = 0$, locating the point of contact at $(2, 0)$. By the equation of the tangent line at $x = a$ from Lesson 2PM, the tangent line is

$$y - 0 = 6\,(x - 2), \qquad \text{equivalently,} \qquad y = 6 x - 12.$$

Example 11 (Horizontal tangents on a quartic)

Locate every point at which the curve $y = x^{4} - 4 x^{3}$ has a horizontal tangent.

The derivative is $f'(x) = 4 x^{3} - 12 x^{2}$ by the rules of this section, and a horizontal tangent occurs exactly when $f'(x) = 0$. Factoring,

$$4 x^{3} - 12 x^{2} = 4 x^{2}\,(x - 3),$$

the product-zero principle of Recitation 1 gives $x = 0$ or $x = 3$. The corresponding heights are $f(0) = 0$ and $f(3) = 81 - 108 = -27$, so the curve has horizontal tangents at $(0, 0)$ and $(3, -27)$ and nowhere else.
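
The two candidates can be confirmed by direct substitution. The sketch below also scans the integers from $-10$ to $10$; that scan is only a spot check chosen for illustration, and the factorisation remains the actual argument that no other solutions exist.

```python
def f(x):
    return x**4 - 4 * x**3


def f_prime(x):
    return 4 * x**3 - 12 * x**2


# Candidates from the factorisation 4x^2 (x - 3) = 0.
for x in (0, 3):
    assert f_prime(x) == 0
    print((x, f(x)))          # (0, 0) then (3, -27)

# Spot check: no other integer in [-10, 10] gives slope zero.
others = [x for x in range(-10, 11) if f_prime(x) == 0]
print(others)                 # [0, 3]
```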

Figure: The curve y equals x to the fourth minus four x cubed plotted from x slightly less than zero to x just past four, with a dashed horizontal tangent line drawn at the origin and another dashed horizontal tangent drawn at the point with coordinates three and minus twenty-seven. The two tangent points are marked with filled circles, showing the only two locations on this curve where the tangent line is horizontal.

Example 12 (A point with prescribed slope)

Find every point on $y = \sqrt{x}$ at which the tangent line is parallel to $y = \tfrac{1}{6} x + 1$.

The given line has slope $\tfrac{1}{6}$, and parallelism by slope property 4 of Lesson 2AM forces the curve to have the same slope at the point of contact. The power rule with $r = \tfrac{1}{2}$ gives

$$f'(x) = \frac{1}{2 \sqrt{x}}, \qquad x > 0,$$

and the condition $1/(2 \sqrt{x}) = 1/6$ rearranges to $\sqrt{x} = 3$, that is $x = 9$. The single point of contact is $(9, 3)$, and no other point on the curve has the required slope.
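
The same rearrangement works for any positive target slope $m$: from $1/(2\sqrt{x}) = m$ one gets $\sqrt{x} = 1/(2m)$. The function name `contact_point_on_sqrt` below is hypothetical, introduced only for this sketch.

```python
from fractions import Fraction


def contact_point_on_sqrt(m):
    """Point on y = sqrt(x) where the tangent slope 1 / (2 sqrt(x)) equals m > 0."""
    if m <= 0:
        raise ValueError("y = sqrt(x) has positive slope at every x > 0")
    sqrt_x = 1 / (2 * m)          # from 1 / (2 sqrt(x)) = m
    return sqrt_x**2, sqrt_x      # (x, y) with y = sqrt(x)


print(contact_point_on_sqrt(Fraction(1, 6)))   # (Fraction(9, 1), Fraction(3, 1))
```

Because $\sqrt{x}$ is determined uniquely by $m$, the function returns a single point, matching the uniqueness claim in the example.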

Example 13 (A non-linear refinement of a cost function)

The publisher of Lesson 2AM treated a linear total cost $C(x) = 10{,}000 + 25 x$, for which the marginal cost is the constant slope $25$ at every production level. A non-linear refinement of the same model replaces $C$ by

$$C(x) = 100\,(2 x + 25)^{3/2}, \qquad x \geq 0,$$

where additional production wears the equipment progressively faster. By the Constant-Multiple Rule and the General Power Rule with $g(x) = 2 x + 25$, $g'(x) = 2$, and $r = \tfrac{3}{2}$,

$$C'(x) = 100 \cdot \tfrac{3}{2}\,(2 x + 25)^{1/2} \cdot 2 = 300\,\sqrt{2 x + 25}.$$

The marginal cost is now an increasing function of $x$, in line with the wear interpretation: the cost of the next copy at production level $x$ exceeds the cost of the next copy at any lower level. At $x = 0$ the marginal cost is $300 \cdot 5 = 1500$, and at $x = 12$ it has risen to $300 \cdot 7 = 2100$, so the single constant $25$ of the linear model gives way to a value that depends on the production level.

Problem 4

For each curve below, compute $f'(x)$ by the rules of this section, then locate every point at which the tangent line is horizontal.

  1. $f(x) = x^{3} - 3 x^{2}$.
  2. $f(x) = x^{4} - 8 x^{2}$.
  3. $f(x) = (x^{2} - 1)^{3}$.

Other Variables and the Second Derivative

The rules above were written in $x$ and $y$, but the slope formula does not depend on the letters chosen for the input and output. Two minor extensions of the apparatus cover the situations that arise in practice: the input may carry a different name, and the derivative is itself a function and so may be differentiated again.

Independent variables other than $x$

When the input is called $t$ rather than $x$, the operator $\frac{d}{dx}$ is replaced by $\frac{d}{dt}$ throughout, and the prime notation $f'(t)$ is read as the derivative of $f$ with respect to $t$. Transferring the slope formula for $x^{2}$ gives

$$\frac{d}{dt}\bigl(t^{2}\bigr) = 2 t,$$

the same arithmetic as before with $t$ in place of $x$. The name of the input alters only the labelling of the axes; the geometry of the slope at a point is unchanged. The slope formula for the cubic, written in any letter,

$$\frac{d}{du}(u^{3}) = 3 u^{2}, \qquad \frac{d}{ds}(s^{3}) = 3 s^{2}, \qquad \frac{d}{d\theta}(\theta^{3}) = 3 \theta^{2},$$

records exactly the same statement three times over.

Example 14 (Differentiating with respect to a specific variable)

Compute $\dfrac{d}{dq}\bigl((q^{3} - 2 q + 4)^{7}\bigr)$.

The General Power Rule with $g(q) = q^{3} - 2 q + 4$ and $r = 7$ gives

$$\frac{d}{dq}\bigl((q^{3} - 2 q + 4)^{7}\bigr) = 7\,(q^{3} - 2 q + 4)^{6} \cdot \frac{d}{dq}\bigl(q^{3} - 2 q + 4\bigr) = 7\,(q^{3} - 2 q + 4)^{6}(3 q^{2} - 2),$$

the inner derivative supplied by the Sum, Constant-Multiple, and power rules in $q$.

When several letters appear in one expression, the operator $\frac{d}{dt}$ singles out $t$ as the variable and treats every other letter as a constant. The Constant-Multiple Rule then carries those letters through differentiation untouched, and the constant case from Lesson 2PM annihilates any term that contains no $t$ at all.

Example 15 (Several letters, one variable)

Compute $\dfrac{d}{dt}\bigl(b\,t^{4} + c\,t^{-2} + d^{3}\bigr)$, where $b$, $c$, $d$ are real numbers independent of $t$.

Treating $b$, $c$, $d$ as constants, the Sum, Constant-Multiple, and power rules apply in turn,

$$\frac{d}{dt}\bigl(b\,t^{4} + c\,t^{-2} + d^{3}\bigr) = b \cdot 4 t^{3} + c \cdot (-2)\,t^{-3} + 0 = 4 b\,t^{3} - \frac{2 c}{t^{3}},$$

the term $d^{3}$ vanishing because it contains no $t$ and so is constant from the standpoint of $\frac{d}{dt}$.

Problem 5

Compute each derivative, treating every letter other than the variable indicated by the operator as a constant.

  1. $\dfrac{d}{dp}\bigl((2 p^{2} - 5)^{4}\bigr)$.
  2. $\dfrac{d}{ds}\bigl(\alpha\, s^{3} - \beta\, s^{-1} + \gamma\bigr)$ for constants $\alpha$, $\beta$, $\gamma$.
  3. $\dfrac{d}{du}\bigl(k\,(1 + u^{2})^{3/2}\bigr)$ for a constant $k$.

The Second Derivative

The derivative $f'$ produced by differentiating $f$ is itself a function, and may therefore be differentiated again. The result is the second derivative.

Definition 1 (Second Derivative)

Let $f$ be a function such that $f'$ is itself differentiable on a set of inputs containing $x$. The second derivative of $f$ at $x$ is the derivative of $f'$ at $x$, written $f''(x)$:

$$f''(x) = \frac{d}{dx}\bigl(f'(x)\bigr).$$

The function $f''$ whose value at each such input is $f''(x)$ is itself called the second derivative of $f$.

The first derivative records the slope of the graph of $f$ at each input; the second derivative records the rate at which that slope is itself changing, and so reads off the bending of the curve. The geometric significance is taken up systematically in the lesson on concavity to come.

Example 16 (Second derivatives by repeated differentiation)

Compute $f''(x)$ for each of the following.

  1. $f(x) = 5 x - 3$. Lesson 2PM gives $f'(x) = 5$, a constant function, and the constant case gives $f''(x) = 0$ in turn.
  2. $f(x) = x^{4} - 2 x^{2}$. Two applications of the Sum, Constant-Multiple, and power rules give $f'(x) = 4 x^{3} - 4 x$ and then $f''(x) = 12 x^{2} - 4$.
  3. $f(x) = \sqrt{x}$ on $x > 0$. Writing $f(x) = x^{1/2}$, the power rule with $r = \tfrac{1}{2}$ gives $f'(x) = \tfrac{1}{2}\,x^{-1/2}$, and a second application with $r = -\tfrac{1}{2}$ gives $f''(x) = \tfrac{1}{2} \cdot \bigl(-\tfrac{1}{2}\bigr) x^{-3/2} = -\tfrac{1}{4}\,x^{-3/2}$.
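
Repeated differentiation is the same step applied twice, which a short sketch makes explicit. The representation of a function as a list of pairs `(c, r)` standing for terms $c\,x^{r}$, and the helper name `differentiate_terms`, are assumptions made for this illustration.

```python
from fractions import Fraction


def differentiate_terms(terms):
    """Differentiate a sum of terms c * x**r, each stored as a pair (c, r).

    Power rule on each term: (c, r) -> (c * r, r - 1); terms with r = 0 vanish.
    """
    return [(c * r, r - 1) for c, r in terms if r != 0]


half = Fraction(1, 2)
examples = {
    "5x - 3": [(5, 1), (-3, 0)],
    "x^4 - 2x^2": [(1, 4), (-2, 2)],
    "x^(1/2)": [(1, half)],
}
for name, terms in examples.items():
    second = differentiate_terms(differentiate_terms(terms))
    print(name, second)
```

The three results are an empty list (the zero function), the pairs for $12 x^{2} - 4$, and the single pair for $-\tfrac{1}{4}\,x^{-3/2}$, in agreement with the three parts above. The sketch does not track domains, so the restriction $x > 0$ in the third part still has to be supplied by hand.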

Other notation

Differentiation does not enjoy a single standard notation. The two systems below denote the same objects throughout, and one should expect to read both fluently.

Note (Equivalent notations for the first and second derivatives)

For $y = f(x)$, the first derivative may be written

$$f'(x) = \frac{d}{dx}\,f(x) = \frac{dy}{dx},$$

and the second derivative

$$f''(x) = \frac{d^{2}}{dx^{2}}\,f(x) = \frac{d^{2} y}{dx^{2}}.$$

The placement of the exponent $2$ on top of the $d$ in the numerator and on the $x$ in the denominator is purely symbolic, recording that the operator $\frac{d}{dx}$ has been applied twice; it is not a square in the algebraic sense.

The two systems coexist because each is shorter than the other in different circumstances. Prime notation is convenient when the function carries a name and the variable is implicit; the operator notation $\frac{d}{dx}$ is convenient when no name has been given, or when several letters are in play and the variable of differentiation needs to be made explicit, as in the multi-letter example above.

Evaluating a derivative at a specific input

Two notations for the same number are in use. The first writes the value of the derivative at $x = a$ as $f'(a)$, the slope of the curve $y = f(x)$ at $(a, f(a))$ in the sense of Lesson 2PM. The second uses a vertical bar,

$$f'(a) = \left.\frac{dy}{dx}\right|_{x = a}, \qquad f''(a) = \left.\frac{d^{2} y}{dx^{2}}\right|_{x = a}.$$

The bar carries no operational content; it is shorthand for "first compute the derivative, then substitute $x = a$".

Example 17 (A second derivative at a specific input)

For $y = 2 x^{4} - 3 x^{2} + 5$, compute $\left.\dfrac{d^{2} y}{dx^{2}}\right|_{x = 2}$.

Differentiating once,

$$\frac{dy}{dx} = 8 x^{3} - 6 x.$$

Differentiating again,

$$\frac{d^{2} y}{dx^{2}} = 24 x^{2} - 6.$$

Substituting $x = 2$,

$$\left.\frac{d^{2} y}{dx^{2}}\right|_{x = 2} = 24 \cdot 4 - 6 = 90.$$

Example 18 (First and second derivatives at a specific input)

For $s = t^{3} + t^{2} - 4 t$, compute $\left.\dfrac{ds}{dt}\right|_{t = -1}$ and $\left.\dfrac{d^{2} s}{dt^{2}}\right|_{t = -1}$.

The first derivative is

$$\frac{ds}{dt} = 3 t^{2} + 2 t - 4,$$

and substitution gives $\left.\frac{ds}{dt}\right|_{t = -1} = 3 - 2 - 4 = -3$. Differentiating again,

$$\frac{d^{2} s}{dt^{2}} = 6 t + 2,$$

and substitution gives $\left.\frac{d^{2} s}{dt^{2}}\right|_{t = -1} = -6 + 2 = -4$.
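
The bar notation describes a two-step recipe, differentiate and then substitute, which the sketch below follows literally for the two polynomial examples. The helper names `poly_derivative`, `poly_eval`, and `derivatives_at`, and the lowest-degree-first coefficient lists, are assumptions made for illustration.

```python
def poly_derivative(coeffs):
    """Term-by-term derivative of a polynomial stored lowest degree first."""
    return [n * a for n, a in enumerate(coeffs)][1:]


def poly_eval(coeffs, x):
    """Horner evaluation of a polynomial stored lowest degree first."""
    value = 0
    for a in reversed(coeffs):
        value = value * x + a
    return value


def derivatives_at(coeffs, a):
    """Return (first derivative at a, second derivative at a)."""
    first = poly_derivative(coeffs)
    second = poly_derivative(first)
    return poly_eval(first, a), poly_eval(second, a)


print(derivatives_at([5, 0, -3, 0, 2], 2))    # y = 2x^4 - 3x^2 + 5 at x = 2: (52, 90)
print(derivatives_at([0, -4, 1, 1], -1))      # s = t^3 + t^2 - 4t at t = -1: (-3, -4)
```

Substituting before differentiating would turn the function into a constant and give zero, which is why the order of the two steps matters.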

Problem 6

For each pair below, compute the first and second derivatives in operator notation, then evaluate each at the given input.

  1. $y = 2 x^{3} - x^{2} + 4$ at $x = -1$.
  2. $u = r^{4} + 5 r$ at $r = 2$.
  3. $w = 1/v$ at $v = 1$.

The Derivative as a Rate of Change

Lesson 2AM read the slope of a linear function as the rate of change of its output per unit step of its input, a single number valid at every point. Lesson 2PM defined $f'(a)$ for a non-linear $f$ as the slope of the tangent line at $(a, f(a))$, a number that varies with $a$. Combining the two readings,

$$f'(a) = \text{the rate of change of } f(x) \text{ at } x = a,$$

the qualifier "at $x = a$" now genuinely needed because the rate is no longer constant. As the graph of $f$ passes through $P = (a, f(a))$ it changes at a rate of $f'(a)$ units in the $y$ direction for every one unit step in $x$, the same reading the slope already supplied for a line.

Linear approximation by the tangent line

The tangent line at $P$ is the straight-line approximation of the graph near $P$, in the sense of Lesson 2AM. For one unit step from $a$ to $a + 1$ the tangent rises by exactly $f'(a)$, by slope property 1 of Lesson 2AM. The graph rises by approximately the same amount, the approximation being closer the less the curve bends over $[a, a + 1]$.

Note (Linear approximation at a point)

For $f$ differentiable at $x = a$,

$$f(a + 1) - f(a) \approx f'(a), \qquad \text{equivalently,} \qquad f(a + 1) \approx f(a) + f'(a).$$

The right-hand side replaces the curve over the interval $[a, a + 1]$ by its tangent line at $a$, supplying an approximate value of $f$ one unit ahead from values already in hand at $a$. The approximation is exact when $f$ is itself linear and progressively worse the more $f$ bends over the interval.

A later MA0A lesson generalises the formula to displacements other than one and quantifies the error.

Example 19 (Approximating $\sqrt{10}$ from $\sqrt{9}$)

Estimate $\sqrt{10}$ using the tangent line to $y = \sqrt{x}$ at $x = 9$, and compare with the exact value.

By the power rule, $f'(x) = \tfrac{1}{2}\,x^{-1/2} = 1/(2 \sqrt{x})$, so at $a = 9$,

$$f(9) = 3, \qquad f'(9) = \tfrac{1}{6}.$$

The linear approximation supplies

$$\sqrt{10} = f(10) \approx f(9) + f'(9) = 3 + \tfrac{1}{6} = 3.1\overline{6}.$$

The exact value is $\sqrt{10} \approx 3.1623$ to four decimal places, so the approximation overshoots by about $0.004$. The tangent line, drawn at $(9, 3)$ with slope $\tfrac{1}{6}$, sits a little above the curve over $[9, 10]$ because $\sqrt{x}$ bends downwards there.
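
The size of the overshoot is easy to reproduce. In the sketch below the helper name `one_step_estimate` is hypothetical, and `math.sqrt` stands in for the exact value.

```python
import math


def one_step_estimate(f_a, f_prime_a):
    """Tangent-line estimate f(a + 1) ~ f(a) + f'(a)."""
    return f_a + f_prime_a


a = 9
estimate = one_step_estimate(math.sqrt(a), 1 / (2 * math.sqrt(a)))
exact = math.sqrt(a + 1)
print(round(estimate, 4), round(exact, 4), round(estimate - exact, 4))
# 3.1667 3.1623 0.0044
```

The positive sign of the difference confirms that the tangent line lies above the curve at $x = 10$.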

Figure: The curve y equals square root of x drawn from x near zero to x equals sixteen, with a tangent line of slope one sixth drawn at the marked point nine comma three. A vertical dotted segment at x equals ten meets two close points, one on the tangent line at height three plus one sixth and one slightly lower on the curve at the height square root of ten, illustrating that the tangent value approximates the curve value with a small overshoot.

Example 20 (Falling sign-ups for a launched product)

A small platform records the number of new sign-ups per day after launch. The empirical fit, valid for the first fortnight, is

$$N(t) = 4 + \frac{16}{(t + 1)^{2}}, \qquad 0 \leq t \leq 14,$$

with $t$ measured in days since launch and treated as a continuous variable, and $N$ in thousands of sign-ups per day. Compute $N(3)$ and $N'(3)$, interpret each, and use the linear approximation to estimate the daily sign-ups on day $4$.

The height supplies the actual count on day $3$,

$$N(3) = 4 + \frac{16}{16} = 5 \text{ thousand sign-ups.}$$

The slope is computed by writing the second term as $16\,(t + 1)^{-2}$ and applying the Constant-Multiple Rule together with the General Power Rule with $g(t) = t + 1$, $g'(t) = 1$, and $r = -2$; the constant term $4$ contributes $0$,

$$N'(t) = 16 \cdot (-2)(t + 1)^{-3} = -\frac{32}{(t + 1)^{3}}.$$

At $t = 3$ this gives $N'(3) = -32/64 = -\tfrac{1}{2}$ thousand sign-ups per day, per day: at that instant the daily sign-up count is falling at a rate of $500$ for each day that passes. The linear approximation then estimates the day-$4$ count as

$$N(4) \approx N(3) + N'(3) = 5 - \tfrac{1}{2} = 4.5 \text{ thousand.}$$

The exact value is $N(4) = 4 + 16/25 = 4.64$ thousand, so the approximation falls short by $0.14$ thousand, or $140$ sign-ups, the discrepancy explained by the bending of $N$ over $[3, 4]$.
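
The figures in this example can be reproduced in exact rational arithmetic. The sketch below hard-codes the fitted model and its derivative as computed above; the function names `N` and `N_prime` simply mirror the notation of the example.

```python
from fractions import Fraction


def N(t):
    """Daily sign-ups in thousands, t days after launch."""
    return 4 + Fraction(16) / (t + 1) ** 2


def N_prime(t):
    """Derivative of N, from the Constant-Multiple and General Power Rules."""
    return Fraction(-32) / (t + 1) ** 3


estimate = N(3) + N_prime(3)        # tangent-line estimate of N(4)
exact = N(4)
print(N(3), N_prime(3), estimate, exact, exact - estimate)
# 5 -1/2 9/2 116/25 7/50
```

The final value $7/50 = 0.14$ thousand is the shortfall of $140$ sign-ups; it is positive because the curve bends upwards over $[3, 4]$, so the tangent line runs below it.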

Figure: The decreasing curve representing N of t equals 4 plus 16 over the square of t plus 1, drawn from t equals zero to t equals fourteen days. A filled marker sits at the point with coordinates three and five, with a dashed tangent line of slope minus one half drawn at that point, the tangent visibly capturing the local rate at which the daily sign-up count is dropping.

Marginal cost

The publisher of Lesson 2AM had a linear total cost $C(x) = 10{,}000 + 25 x$, and the marginal cost defined there was the slope $25$, the additional cost of the next copy at every production level. For a non-linear cost the same reading carries through, but only at one production level at a time, and only as an approximation to the actual additional cost.

Definition 2 (Marginal Cost)

Let $C(x)$ be the total cost of producing $x$ units of a commodity. The marginal cost function is the derivative $C'(x)$. The marginal cost of producing $a$ units, $C'(a)$, is by the linear approximation above approximately equal to $C(a + 1) - C(a)$, the actual additional cost incurred when production is raised by one unit from $a$ to $a + 1$.

The units of $C'(x)$ follow from the rate-of-change reading: when $C$ is measured in pounds and $x$ is a number of items, $C'(x)$ is measured in pounds per item, the same units the slope of a linear cost function carried in Lesson 2AM. The non-linear definition specialises to the linear one when $C$ is itself linear with slope $m$, in which case $C'(x) = m$ is constant and the approximation becomes exact.

Example 21 (Marginal cost on a non-linear cost function)

A small ceramics studio has total cost

$$C(x) = 0.004\,x^{3} - 0.6\,x^{2} + 35\,x + 240 \text{ pounds}$$

for a daily production of $x$ pieces. Compare the actual additional cost of raising production from $40$ to $41$ pieces with the marginal cost at $x = 40$.

The actual additional cost is $C(41) - C(40)$, computed directly,

$$C(40) = 256 - 960 + 1400 + 240 = 936, \qquad C(41) = 275.684 - 1008.6 + 1435 + 240 = 942.084,$$

so $C(41) - C(40) = 6.084$ pounds. The marginal cost at $x = 40$ is the value of the derivative there, computed by the rules of the previous sections,

$$C'(x) = 0.012\,x^{2} - 1.2\,x + 35,$$

giving $C'(40) = 19.2 - 48 + 35 = 6.2$ pounds per piece. The marginal cost $6.2$ approximates the actual increment $6.084$ to within about $0.12$ pounds, the residual reflecting the small but non-zero bending of $C$ over $[40, 41]$.
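
The comparison between the actual increment and the marginal cost can be checked without rounding error. The sketch below writes the decimal coefficients as exact fractions; the alias `F` and the function names `C` and `C_prime` are conveniences assumed here.

```python
from fractions import Fraction as F


def C(x):
    """Total cost in pounds for x pieces, with 0.004 and 0.6 written as fractions."""
    return F(4, 1000) * x**3 - F(6, 10) * x**2 + 35 * x + 240


def C_prime(x):
    """Marginal cost, from the Sum, Constant-Multiple, and power rules."""
    return F(12, 1000) * x**2 - F(12, 10) * x + 35


actual = C(41) - C(40)
marginal = C_prime(40)
print(float(actual), float(marginal), float(marginal - actual))
# 6.084 6.2 0.116
```

The gap of $0.116$ pounds is the error of the one-unit linear approximation at this production level.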

Marginal revenue and marginal profit

The same construction applies to revenue and profit. If $R(x)$ is the revenue from the sale of $x$ units and $C(x)$ the cost of producing them, the profit is $P(x) = R(x) - C(x)$. The Sum and Difference Rules give

$$P'(x) = R'(x) - C'(x),$$

so marginal profit is marginal revenue minus marginal cost without further work.

Definition 3 (Marginal Revenue, Marginal Profit)

For a revenue function $R(x)$ and a profit function $P(x)$, the marginal revenue function is $R'(x)$ and the marginal profit function is $P'(x)$. The marginal revenue of producing $a$ units, $R'(a)$, approximates $R(a + 1) - R(a)$, and the marginal profit $P'(a)$ approximates $P(a + 1) - P(a)$.

The decision whether to raise production by one unit reduces, in linear-approximation terms, to the sign of the marginal profit at the current level: a positive $P'(a)$ predicts that the next unit increases profit, a negative one that it decreases it.

Example 22 (Deciding on a production increase)

A workshop’s revenue from the sale of $x$ tables per week is $R(x)$ thousand pounds, and its cost is

$$C(x) = 2 + 0.1\,x^{2} \text{ thousand pounds}.$$

Direct measurement at the current production level $x = 4$ gives $R(4) = 9$ and $R'(4) = -0.6$ thousand pounds per table; the revenue is falling because the workshop is saturating its small local market. Estimate the change in revenue, the change in cost, and the change in profit on raising production to $x = 5$, and decide whether the increase is worthwhile.

The estimated additional revenue, by the linear approximation, is $R(5) - R(4) \approx R'(4) = -0.6$ thousand pounds, so revenue is predicted to fall by about £600. The marginal cost is $C'(x) = 0.2\,x$ by the rules of the previous sections, giving $C'(4) = 0.8$ thousand pounds per table, and the additional cost is approximately £800. The marginal profit is therefore

$$P'(4) = R'(4) - C'(4) = -0.6 - 0.8 = -1.4 \text{ thousand pounds per table},$$

predicting a profit drop of about £1400 if production is raised to $x = 5$. Although the workshop currently earns a profit of $P(4) = R(4) - C(4) = 9 - 3.6 = 5.4$ thousand pounds, the increase is not worthwhile: the next table is forecast to remove £1400 from the weekly profit. Level and rate of change tell different stories, and the marginal calculation reads only the second.
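The decision in Example 22 reduces to a few lines of arithmetic, sketched below. The variable and function names are illustrative assumptions; the revenue figures are the measured values quoted in the example, since no formula for $R$ is given.

```python
def cost(x):
    """Cost in thousand pounds for x tables per week (Example 22)."""
    return 2 + 0.1 * x**2


def marginal_cost(x):
    """Derivative of cost by the Constant-Multiple and power rules."""
    return 0.2 * x


# Measured values from the example; R itself has no formula here.
revenue_at_4 = 9.0             # thousand pounds
marginal_revenue_at_4 = -0.6   # thousand pounds per table

profit_at_4 = revenue_at_4 - cost(4)
marginal_profit_at_4 = marginal_revenue_at_4 - marginal_cost(4)

# The linear approximation recommends the extra table only if P'(4) > 0.
worthwhile = marginal_profit_at_4 > 0

print(f"P(4)  = {profit_at_4:.1f} thousand pounds")
print(f"P'(4) = {marginal_profit_at_4:.1f} thousand pounds per table")
print("raise production" if worthwhile else "do not raise production")
```

The level of profit and the marginal profit are kept in separate variables on purpose: only the sign of the second drives the decision.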

Problem 7

A small bakery’s daily total cost is

$$C(x) = 0.01\,x^{3} - 0.9\,x^{2} + 30\,x + 80 \text{ pounds}$$

for a daily production of $x$ loaves. Compute $C'(x)$ by the rules of the previous sections, evaluate the marginal cost at $x = 30$, and compare with the exact additional cost $C(31) - C(30)$.

Problem 8

A streaming service’s monthly revenue from the sale of $x$ thousand subscriptions is $R(x)$ million pounds, with $R(20) = 12$ and $R'(20) = 0.4$ million pounds per thousand subscriptions. The corresponding cost is $C(x) = 6 + \tfrac{1}{2} \sqrt{x}$ million pounds.

  1. Estimate $R(21)$ by the linear approximation.
  2. Compute $C(20)$ and $C(21) - C(20)$ exactly, and compare the second with the marginal cost $C'(20)$.
  3. Compute the marginal profit $P'(20)$ and decide whether raising production to $21$ thousand subscriptions is worthwhile.

Average Rates of Change

The previous section read $f'(a)$ as the rate of change of $f$ at $x = a$, the slope of the tangent line at $(a, f(a))$. A second rate of change is sometimes more natural: the average rate of change of $f$ over a whole interval, computed by dividing the total change in the output by the length of the interval. The two readings are linked by the secant construction of Lesson 2PM.

Definition 4 (Average Rate of Change)

The average rate of change of $f(x)$ over an interval $a \leq x \leq b$ with $a < b$ is the ratio

$$\frac{f(b) - f(a)}{b - a},$$

the change in $f$ divided by the length of the interval. Geometrically, it is the slope of the secant line through $(a, f(a))$ and $(b, f(b))$.

When $b = a + h$ the length of the interval is $h$, the change in $f$ is $f(a + h) - f(a)$, and the ratio collapses to the difference quotient of Lesson 2PM,

$$\frac{f(a + h) - f(a)}{h},$$

the same expression whose limit as $h \to 0$ is $f'(a)$. Letting the interval shrink to a single point therefore turns the average rate into the instantaneous rate, and the two are one construction read at its two extremes. From this point onwards, unless the qualifier average is used explicitly, the phrase rate of change will mean the instantaneous rate $f'(a)$.

Example 23 (Average rates approaching the instantaneous rate)

For $f(x) = x^{2}$, compute the average rate of change of $f$ over each of the intervals $[1, 2]$, $[1, 1.1]$, and $[1, 1.01]$, and compare with the instantaneous rate $f'(1)$.

The power rule gives $f'(x) = 2 x$ and $f'(1) = 2$. The three average rates are

$$\frac{f(2) - f(1)}{2 - 1} = \frac{4 - 1}{1} = 3,$$

$$\frac{f(1.1) - f(1)}{1.1 - 1} = \frac{1.21 - 1}{0.1} = 2.1,$$

$$\frac{f(1.01) - f(1)}{1.01 - 1} = \frac{1.0201 - 1}{0.01} = 2.01.$$

The averages drop from $3$ to $2.1$ to $2.01$ as the right endpoint approaches $1$, in agreement with $f'(1) = 2$ in the limit. The pattern is the secant slopes of Lesson 2PM tending to the tangent slope as the second point slides towards the first.
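The shrinking-interval computation of Example 23 can be repeated for as many right endpoints as desired. The sketch below assumes a helper named `average_rate` (an illustrative name, not one used in the lesson) that implements Definition 4 directly; the extra endpoint $1.001$ is an addition for illustration.

```python
def average_rate(f, a, b):
    """Average rate of change of f over [a, b]: the secant slope."""
    return (f(b) - f(a)) / (b - a)


def square(x):
    return x**2


instantaneous = 2 * 1   # f'(1) = 2x at x = 1, by the power rule

right_endpoints = (2, 1.1, 1.01, 1.001)
rates = [average_rate(square, 1, b) for b in right_endpoints]

for b, rate in zip(right_endpoints, rates):
    # The distance from the tangent slope shrinks with the interval.
    print(f"[1, {b}]: average rate {rate:.4f}, "
          f"distance from f'(1) {abs(rate - instantaneous):.4f}")
```

For this particular $f$ the average rate over $[1, 1 + h]$ works out to $2 + h$, so the printed distance equals the interval length up to rounding.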

The parabola y equals x squared drawn near the point one comma one. Three dashed secant lines are drawn from one comma one to the points two comma four, one point one comma one point two one, and one point zero one comma one point zero two zero one, with the second points of each secant marked. A solid black tangent line of slope two passes through one comma one as well, with the secant slopes labelled three, two point one, and two point zero one approaching the tangent slope of two as the second point slides towards the first.

Problem 9

For $f(x) = x^{3}$, compute the average rate of change over each of the intervals $[2, 3]$, $[2, 2.1]$, and $[2, 2.01]$, then compute $f'(2)$ by the power rule and verify that the averages approach it.

Reading rates from a graph

When $f$ is supplied by a graph rather than a formula, both rates can be read off without any algebra. The average rate over an interval is the slope of the secant line connecting the two endpoints, supplied by slope property 2 of Lesson 2AM. The instantaneous rate at a point is the slope of the tangent line at that point, read either by slope property 1 or, when a second point on the tangent is available, by slope property 2 again.

Example 24 (A population model read off a graph)

The function $f(t)$ records the population of a coastal city, in thousands of inhabitants, $t$ years after $1900$. The graph of $f$ shows

$$f(20) = 25, \qquad f(60) = 73,$$

and a tangent line drawn at $(20, 25)$ passes through the further point $(70, 50)$.

(a) Average rate of growth from $1920$ to $1960$. By the definition,

$$\frac{f(60) - f(20)}{60 - 20} = \frac{73 - 25}{40} = 1.2 \text{ thousand inhabitants per year},$$

so over those forty years the city’s population grew on average at $1{,}200$ inhabitants per year.

(b) Rate of growth in $1920$. The tangent line at $(20, 25)$ passes through $(70, 50)$, so by slope property 2 of Lesson 2AM its slope is

$$f'(20) = \frac{50 - 25}{70 - 20} = 0.5 \text{ thousand inhabitants per year},$$

that is, in $1920$ the population was growing at $500$ inhabitants per year.

(c) Comparison. The average over $[20, 60]$, $1{,}200$ per year, exceeds the instantaneous rate at the left endpoint, $500$ per year, indicating that the rate of growth was higher later in the interval than at the start: the curve is steepening as $t$ increases. The average rate is the slope of the secant from $(20, 25)$ to $(60, 73)$; the instantaneous rate at $1920$ is the slope of the tangent at $(20, 25)$ alone. The two agree at every point when $f$ is linear over the interval; on a curved graph they generally differ.
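Both readings in Example 24 are two-point slope computations, which the following sketch makes explicit. The helper name `slope` and the variable names are illustrative assumptions; the coordinates are the ones read off the graph in the example.

```python
def slope(p, q):
    """Slope of the straight line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    return (y2 - y1) / (x2 - x1)


# Secant through the two graph points: average rate over [20, 60].
average_rate = slope((20, 25), (60, 73))

# Tangent at (20, 25), identified by a second point on the tangent line.
instantaneous_rate = slope((20, 25), (70, 50))

# Units: thousand inhabitants per year, so multiply by 1000 for inhabitants.
print(f"average 1920-1960: {average_rate * 1000:.0f} inhabitants per year")
print(f"rate in 1920:      {instantaneous_rate * 1000:.0f} inhabitants per year")
```

The same function serves both parts because the only difference between them is which line the second point lies on: the secant in part (a), the tangent in part (b).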

An increasing population curve drawn against years after 1900, with the two endpoint values twenty comma twenty-five and sixty comma seventy-three marked. A dashed secant line connects the two endpoints with slope one point two thousand inhabitants per year, and a solid tangent line drawn at the leftmost point twenty comma twenty-five has the smaller slope zero point five thousand inhabitants per year, the disparity between the two readings recording that the curve is steepening as t increases.

Problem 10

A graph of fuel remaining against time over a four-hour generator run shows the values

$$F(0) = 200, \qquad F(2) = 168, \qquad F(4) = 144,$$

in litres after $t$ hours, and a tangent line drawn at $(2, 168)$ passes through $(4, 142)$.

  1. Compute the average rate of change of $F$ over the first two hours.
  2. Compute the average rate over the second two hours.
  3. Compute the instantaneous rate of change of $F$ at $t = 2$.
  4. Order the three numbers and explain what the ordering says about the bending of $F$.

Velocity, Acceleration, and Estimating Changes

The rate-of-change reading of the derivative covers a family of physical settings in which one quantity is a function of another. Two recurring instances, important enough to fix names for, are the position of a moving object as a function of time, whose rate of change is velocity, and the velocity itself, whose rate of change is acceleration. A second extension, parallel to the marginal-cost discussion two sections back, generalises the one-unit linear approximation to a step of arbitrary length $h$.

Velocity and acceleration

Suppose an object moves along a straight line, and let $s(t)$ denote its directed position from a fixed reference point at time $t$, with the convention that one direction along the line is positive and the other negative. Over a short time interval from $t$ to $t + h$ the object’s average velocity is

$$\frac{s(t + h) - s(t)}{h},$$

the average rate of change of position over $[t, t + h]$ in the sense of the previous section. Letting $h$ shrink to zero turns the average into an instantaneous velocity at the instant $t$. Its absolute value is the speed.

Definition 5 (Velocity and Acceleration)

Let $s(t)$ be the position of an object moving along a straight line at time $t$. The velocity of the object at time $t$ is the derivative $v(t) = s'(t)$. The acceleration at time $t$ is the derivative of the velocity, $a(t) = v'(t)$, equivalently the second derivative of the position,

$$a(t) = s''(t).$$

A negative velocity records motion in the negative direction along the line, and a negative acceleration records that the velocity is itself decreasing in the chosen sign convention.

The acceleration uses the second derivative of two sections back: differentiating the position once gives the rate at which the position is changing, and differentiating again gives the rate at which that rate is itself changing.

Example 25 (A stone thrown vertically upwards)

A stone is launched vertically upwards from a height of $1$ metre above the ground with an initial velocity of $40$ metres per second. Taking the upward direction as positive and ignoring air resistance, the height after $t$ seconds is

$$s(t) = -5\,t^{2} + 40\,t + 1 \text{ metres},$$

the coefficient $-5$ being half the gravitational acceleration of about $10$ metres per second per second.

(a) Velocity at $t = 2$. The Sum and power rules give $v(t) = s'(t) = -10\,t + 40$, so $v(2) = 20$ metres per second; the stone is rising at $20$ metres per second two seconds into its flight.

(b) Acceleration at $t = 2$. Differentiating again, $a(t) = v'(t) = -10$ metres per second per second for every $t$. The acceleration is negative because gravity acts downwards, the convention having taken upward as positive, and is the same number at every $t$ because the gravitational pull does not vary over the flight.

(c) When is the velocity $-20$ metres per second? Setting $v(t) = -20$ gives $-10\,t + 40 = -20$, so $t = 6$ seconds. The negative sign records that the stone is now falling at $20$ metres per second.

(d) When is the stone at a height of $76$ metres? Setting $s(t) = 76$ gives

$$-5\,t^{2} + 40\,t + 1 = 76, \qquad t^{2} - 8\,t + 15 = 0,$$

which factors as $(t - 3)(t - 5) = 0$ by the inspection of Recitation 1, so $t = 3$ or $t = 5$. The stone passes through the height $76$ metres twice: once on the way up at $t = 3$ seconds and once on the way down at $t = 5$ seconds.
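The four parts of Example 25 can be reproduced in a few lines. This is a sketch under the example's own modelling assumptions (no air resistance, gravitational acceleration taken as $10$ metres per second per second); the function names are illustrative, and part (d) uses the quadratic formula in place of the factorisation by inspection.

```python
import math


def position(t):
    """Height in metres after t seconds (Example 25)."""
    return -5 * t**2 + 40 * t + 1


def velocity(t):
    """s'(t), by the Sum, Constant-Multiple, and power rules."""
    return -10 * t + 40


def acceleration(t):
    """v'(t); constant because v is linear in t."""
    return -10


def times_at_height(height):
    """Solve -5 t^2 + 40 t + (1 - height) = 0 by the quadratic formula."""
    a, b, c = -5, 40, 1 - height
    disc = b * b - 4 * a * c
    if disc < 0:
        return []   # the stone never reaches this height
    root = math.sqrt(disc)
    return sorted({(-b + root) / (2 * a), (-b - root) / (2 * a)})


# Part (c): solve -10 t + 40 = -20 for t.
time_falling_at_20 = (40 - (-20)) / 10

print(velocity(2), acceleration(2))   # parts (a) and (b)
print(time_falling_at_20)             # part (c)
print(times_at_height(76))            # part (d)
```

Calling `times_at_height` with a value above the peak of the flight returns an empty list, which corresponds to a negative discriminant.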

The downward-opening parabolic position curve s of t equals minus five t squared plus forty t plus one drawn from t equals zero to t equals about eight seconds. A dashed tangent line of slope twenty drawn at the marked point two comma sixty-one captures the velocity at that instant, and a horizontal dotted line at height seventy-six crosses the curve at the marked points three comma seventy-six and five comma seventy-six, recording the two times at which the stone passes through that height.

Problem 11

A bead slides along a straight wire with directed position $s(t) = t^{3} - 6\,t^{2} + 9\,t$ metres at time $t$ seconds, $t \geq 0$.

  1. Compute the velocity and the acceleration as functions of $t$.
  2. Find every time at which the bead is momentarily at rest, $v(t) = 0$.
  3. Find the time at which the acceleration is zero, and state the velocity at that instant.

The change in $f$ over a step of length $h$

The marginal-cost reading two sections back approximated the change in cost across a step of one unit by $C'(a)$. The same construction extends to a step of arbitrary length, the only modification a multiplication of the rate by the length of the step.

Note (Linear approximation over a step of length $h$)

For $f$ differentiable at $x = a$ and a small displacement $h$, positive or negative,

$$f(a + h) - f(a) \approx f'(a) \cdot h, \qquad \text{equivalently,} \qquad f(a + h) \approx f(a) + f'(a) \cdot h.$$

Setting $h = 1$ recovers the formula of two sections back. The right-hand side is the change in $y$ along the tangent line at $(a, f(a))$ over a horizontal step of $h$, by slope property 1 of Lesson 2AM scaled by $h$; for small $h$ the curve $y = f(x)$ stays close to the tangent line, and the change along the curve is well approximated by the change along the line.

A smooth curve y equals f of x drawn alongside its tangent line at the marked point a comma f of a. A horizontal step of length h is shown along the x-axis from a to a plus h. Two vertical double-headed arrows on the right show the actual change f of a plus h minus f of a along the curve in one colour, and the linear approximation f prime of a times h along the tangent line in a contrasting colour, with the two endpoints marked separately to make the small discrepancy between the curve value and the tangent value visible.

The same identity may be derived from the equation of the tangent line at $x = a$ written out in Lesson 2PM,

$$y - f(a) = f'(a)\,(x - a).$$

Replacing the tangent value $y$ by the curve value $f(x)$, valid only as an approximation, and substituting $x = a + h$ produces $f(a + h) - f(a) \approx f'(a) \cdot h$, the same formula obtained directly.

Example 26 (A textile mill's production function)

A textile mill’s daily output is $q(L)$ garments when $L$ person-hours of labour are employed. Direct measurement at the current level $L = 800$ gives

$$q(800) = 240, \qquad q'(800) = 0.6,$$

the slope $0.6$ measured in garments per person-hour. Interpret each value, and use the linear approximation to estimate the daily output at $L = 801$, at $L = 800.25$, and at $L = 799$.

The height $q(800) = 240$ records that $800$ person-hours of labour currently produce $240$ garments per day. The slope $q'(800) = 0.6$ records that, at the current level, output rises at the rate of $0.6$ garments for each additional person-hour of labour.

For $L = 801$ the displacement is $h = 1$, and the linear approximation gives $q(801) \approx 240 + 0.6 = 240.6$ garments. For $L = 800.25$ the displacement is $h = 0.25$, and

$$q(800.25) \approx q(800) + q'(800) \cdot 0.25 = 240 + 0.15 = 240.15 \text{ garments}.$$

For $L = 799$ the displacement is $h = -1$, and the same formula gives

$$q(799) \approx q(800) + q'(800) \cdot (-1) = 240 - 0.6 = 239.4 \text{ garments},$$

so reducing labour by one person-hour drops daily output by about $0.6$ garments. The negative displacement is handled by the formula without further work.
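The three estimates of Example 26 all apply one formula with different displacements, as the sketch below shows. The helper name `linear_estimate` is an illustrative assumption; only the measured value and slope at $L = 800$ are used, since the example gives no formula for $q$.

```python
def linear_estimate(value_at_a, slope_at_a, h):
    """Tangent-line estimate f(a + h) ~ f(a) + f'(a) * h."""
    return value_at_a + slope_at_a * h


q_at_800 = 240.0        # garments per day at L = 800
q_slope_at_800 = 0.6    # garments per person-hour at L = 800

# The displacement h is the new labour level minus the current one.
estimates = {
    labour: linear_estimate(q_at_800, q_slope_at_800, labour - 800)
    for labour in (801, 800.25, 799)
}

for labour, garments in estimates.items():
    print(f"q({labour}) is approximately {garments:.2f} garments")
```

Positive, fractional, and negative displacements pass through the same line of arithmetic, which is the point the example makes about the sign of $h$.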

Marginal cost over a non-unit step

Specialising the new approximation to a cost function $C(x)$ with displacement $h$,

$$C(a + h) - C(a) \approx C'(a) \cdot h.$$

For $h = 1$ this collapses to the marginal-cost formula of two sections back; for $h \neq 1$ the marginal cost is scaled in proportion to the length of the step.

Example 27 (Marginal cost on a quadratic cost function)

A workshop’s total cost of producing $x$ units of a commodity is

$$C(x) = 4\,x^{2} + 5\,x + 20 \text{ thousand pounds}.$$

Find the marginal cost function, evaluate the cost and the marginal cost at the production level $x = 8$, estimate the cost of the ninth unit, and estimate the additional cost of raising production from $8$ to $8.5$ units.

The Sum, Constant-Multiple, and power rules give $C'(x) = 8\,x + 5$, the marginal cost function in thousand pounds per unit. At $x = 8$,

$$C(8) = 256 + 40 + 20 = 316, \qquad C'(8) = 64 + 5 = 69.$$

The cost of the ninth unit is $C(9) - C(8)$, and by the marginal approximation with $h = 1$ this is approximately $C'(8) = 69$ thousand pounds. The additional cost of raising production from $8$ to $8.5$ has step $h = 0.5$, so the same formula gives

$$C(8.5) - C(8) \approx C'(8) \cdot 0.5 = 69 \cdot 0.5 = 34.5 \text{ thousand pounds}.$$

Halving the step size halves the predicted change in cost, a feature the unit-step formula of two sections back was unable to express.
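Because the cost function of Example 27 is given by a formula, the two estimates can be set beside the exact changes. The sketch below is an illustration added for comparison, with hypothetical names `cost` and `marginal_cost`; the exact changes are not computed in the example itself.

```python
def cost(x):
    """Total cost in thousand pounds for x units (Example 27)."""
    return 4 * x**2 + 5 * x + 20


def marginal_cost(x):
    """Derivative of cost by the Sum, Constant-Multiple, and power rules."""
    return 8 * x + 5


a = 8
comparison = {}
for h in (1, 0.5):
    estimate = marginal_cost(a) * h    # C'(a) * h
    exact = cost(a + h) - cost(a)      # C(a + h) - C(a)
    comparison[h] = (estimate, exact, exact - estimate)

for h, (estimate, exact, error) in comparison.items():
    print(f"h = {h}: estimate {estimate}, exact {exact}, error {error}")
```

For this quadratic the algebra gives $C(8 + h) - C(8) = 69\,h + 4\,h^{2}$, so the error of the estimate is $4\,h^{2}$: halving the step halves the estimate but quarters the error.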

Units of a rate of change

The rate-of-change reading fixes the units of $f'(x)$ from those of $f(x)$ and $x$. Because $f'(x)$ is computed as a change in $f$ divided by a change in $x$, the units of $f'(x)$ are

$$\text{units of } f'(x) = \frac{\text{units of } f(x)}{\text{units of } x}.$$

The examples in this lesson supply several instances at once: position in metres against time in seconds gives velocity in metres per second; velocity in metres per second against time in seconds gives acceleration in metres per second per second; cost in pounds against units of production gives marginal cost in pounds per unit; output in garments against person-hours of labour gives marginal product in garments per person-hour. Stating the units alongside the numerical answer is part of the answer in any rate-of-change calculation.

Exercises

Exercise 1

Differentiate $f(x) = 5 x^{4} - 2 x^{3} + 7 x - 11$ by the rules of this section, and evaluate $f'(1)$.

Exercise 2

Find the equation of the tangent line to $y = 2 x^{3} - x + 4$ at $x = -1$, in slope-intercept form.

Exercise 3

Show that the curve $y = x^{4} - 2 x^{2} + 3$ has exactly three horizontal tangents, and locate each of them.

Exercise 4

Differentiate $f(x) = (5 x - 2)^{6}$ by the General power rule, and confirm the answer at $x = 1$ by expanding $(5 x - 2)^{6}$ to a polynomial in $x$ and differentiating term by term.

Exercise 5

Differentiate $f(x) = \sqrt{x^{2} + 9}$, state the natural domain of $f'$, and write the equation of the tangent line at the point $(4, 5)$.

Exercise 6

For $f(x) = (1 + 2 x)^{-2}$ on $x \neq -\tfrac{1}{2}$, compute $f'(x)$ by the General power rule, then verify by expanding the base to write $f(x) = (1 + 4 x + 4 x^{2})^{-1}$ and applying the General power rule with $r = -1$ instead.

Exercise 7

Differentiate $f(r) = \pi r^{2}$ with respect to $r$, treating $\pi$ as a constant by the constant multiple rule. Recognising $f(r)$ as the area of a disc of radius $r$, identify the geometric quantity the derivative equals.

Exercise 8

Compute $\dfrac{d}{dz}\bigl((z^{2} - 6 z + 9)^{5}\bigr)$ by the General power rule, and verify the answer by first writing $z^{2} - 6 z + 9 = (z - 3)^{2}$ and so $(z^{2} - 6 z + 9)^{5} = (z - 3)^{10}$, then differentiating the rewritten form.

Exercise 9

For $y = x^{4} - 3 x^{2} + 7$, compute

$$\left.\frac{d}{dx}\!\left(\frac{dy}{dx}\right)\right|_{x = 2}$$

by differentiating twice and substituting.

Exercise 10

After a publicity drive ends, the daily downloads of a mobile app, in thousands, are modelled by

$$D(t) = -2\,t^{2} + 24\,t + 80, \qquad 0 \leq t \leq 12,$$

where $t$ is the number of days from the end of the drive. Compute the average rate of change of $D$ over the interval $[3, 4]$, and the instantaneous rate of change at $t = 2$.

Exercise 11

Let $f(p)$ denote the number, in thousands, of headphones sold per month when the price is set at £$p$ per pair. Suppose

$$f(80) = 12 \qquad \text{and} \qquad f'(80) = -0.5.$$

Interpret each value, then estimate the monthly sales when the price is raised to £85 per pair.

Exercise 12

Let $P(x)$ denote the profit, in pounds, from manufacturing and selling $x$ specialty bicycles per month. Suppose

$$P(50) = 45{,}000 \qquad \text{and} \qquad P'(50) = 800.$$

Interpret each value, and use the linear approximation to estimate the profit from manufacturing and selling $49$ bicycles per month.