The chain rule is one of the essential differentiation rules. It lets us calculate the derivative of composite functions, which covers most of the interesting functions you'll meet. On this page we'll first learn the intuition for the chain rule. This intuition is rarely presented in textbooks or calculus courses, where the rule usually appears as an algebraic formula that you have to memorize.
There is, though, a physical intuition behind this rule that we'll explore here.
After we've satisfied our intuition, we'll get to the "dirty work": the step-by-step technique for applying the chain rule to derivative problems. Our goal is to enable you to solve any problem that requires the chain rule. With that goal in mind, we'll work through plenty of examples on this page.
Suppose that a car is driving up a mountain. Using the car's speedometer, we can calculate the rate at which our height changes. Let's say our height changes 1 km per hour. And let's suppose that we know temperature drops 5 degrees Celsius per kilometer ascended.
So, we know the rate at which the height changes with respect to time, and we know the rate at which temperature changes with respect to height.
Using this information, we can deduce the rate at which the temperature we feel in the car will decrease with time. That is simply the product of the rates: if height increases 1 km each hour, and temperature drops 5 degrees for each km, then temperature drops 5 degrees each hour.
This fact holds in general. To show that, let's first formalize this example.
Let's say that h(t) represents height as a function of time, and T(h) represents temperature as a function of height.
We know the derivative equals the rate of change of a function, so, what we concluded in this example is that if we consider the temperature as a function of time, T(t), its derivative with respect to time equals:
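In symbols, reusing the loose T(t), T(h), h(t) notation of the text, the product of rates can be written as follows. Counting a temperature drop as a negative rate is an assumption about the sign convention:

```latex
\[
T'(t) = T'(h)\cdot h'(t)
      = \left(-5\ \frac{^\circ\mathrm{C}}{\mathrm{km}}\right)
        \left(1\ \frac{\mathrm{km}}{\mathrm{h}}\right)
      = -5\ \frac{^\circ\mathrm{C}}{\mathrm{h}}
\]
```

The kilometers cancel, leaving degrees per hour, which is the unit we expect for a rate of change of temperature with respect to time.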
In the previous example the derivatives were constants. We set a fixed velocity and a fixed rate of change of temperature with respect to height. But this doesn't need to be the case.
If, for example, the speed of the car driving up the mountain changes with time, h'(t) changes with time. And if the rate at which temperature drops with height changes with the height you're at (if you're higher the drop rate is faster), T'(h) changes with the height h.
In this case, the question that remains is: where should we evaluate the derivatives?
In the previous example it was easy because the rates were fixed. If at a fixed instant t the height equals h(t)=10 km, what is the rate of change of temperature with respect to time at that instant? It would be the rate at which temperature changes with height at that specific height, times the rate of change of height with respect to time at that instant. That is:
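Written out with the same T and h, and taking h(t) = 10 km at the chosen instant as in the text, a sketch of this statement is:

```latex
\[
\frac{dT}{dt}\bigg|_{t} = T'\big(h(t)\big)\cdot h'(t)
                        = T'(10\ \mathrm{km})\cdot h'(t)
\]
```

The first factor is evaluated at a height, the second at a time. That is the only difference from the constant-rate case.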
This makes perfect intuitive sense: the rates we should consider are the rates at the specified instant. Now, let's put this conclusion into more familiar notation.
Let's use the standard letters for functions, f and g. In our example, let's say f is temperature as a function of height (T(h)), g is height as a function of time (h(t)), and F is temperature as a function of time (T(t)). In formal terms, T(t) is the composition of T(h) and h(t). That is:
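In the notation of the mountain example, the composition can be sketched as follows. Using the same letter T for both temperature functions follows the text's own loose convention:

```latex
\[
T(t) = T\big(h(t)\big)
\]
```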
Or using the new notation F(t) = T(t), h(t) = g(t), T(h) = f(h):
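With those substitutions, the composition reads:

```latex
\[
F(t) = f\big(g(t)\big)
\]
```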
This is a composite function. The chain rule tells us what the derivative of the composite function F at a point t is: it equals the derivative of the "outer function" f evaluated at the point g(t), times the derivative of the "inner function" g at the point t:
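For F(t) = f(g(t)), with f the outer function and g the inner function, that statement in symbols is:

```latex
\[
F'(t) = f'\big(g(t)\big)\cdot g'(t)
\]
```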
Notice that, in our example, F'(t) is the rate of change of temperature as a function of time. The result in our concrete example coincides with this differentiation rule: the rate of change of temperature with respect to time equals the rate of temperature vs. height, times the rate of height vs. time.
Another way of understanding the chain rule is using Leibniz notation. In our example we have temperature as a function of both time and height. We know the derivative of temperature with respect to height, and we want to know its derivative with respect to time. So, what we want is:
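In Leibniz notation, with T for temperature and t for time, the quantity we are after is:

```latex
\[
\frac{dT}{dt}
\]
```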
That is, the derivative of T with respect to time. And what we know is:
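The known quantity, the derivative of temperature T with respect to height h, is written in the same notation as:

```latex
\[
\frac{dT}{dh}
\]
```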
So, to find the derivative with respect to time we can use the following "algebraic" trick:
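Written out for temperature T, height h and time t, the trick amounts to multiplying and dividing by dh. Treating the differentials like ordinary fractions is a memory aid here, not a proof:

```latex
\[
\frac{dT}{dt} = \frac{dT}{dh}\cdot\frac{dh}{dt}
\]
```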
because the dh terms "cancel out" on the right side of the equation. Notice that the second factor on the right side is the rate of change of height with respect to time.
Let's rewrite the chain rule using another notation. This rule says that for a composite function:
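One common way to write such a composite function uses x as the variable. The names h, f and g below are an assumption, since the exact notation intended here is not shown:

```latex
\[
h(x) = f\big(g(x)\big)
\]
```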
Its derivative is:
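Assuming the composite function is written as h(x) = f(g(x)), with f the outer function and g the inner function (these names are illustrative), the rule takes this form:

```latex
\[
h'(x) = f'\big(g(x)\big)\cdot g'(x)
\]
```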
Let's see some examples where we need to apply this rule.
Let's find the derivative of this function:
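Judging by the outer function (sin) and inner function (1 over x) used in this example, the function in question is presumably the following. The name f is an assumption:

```latex
\[
f(x) = \sin\!\left(\frac{1}{x}\right)
\]
```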
For this type of composite function it is useful to think of an outer function and an inner function. In this example, the outer function is sin and the inner function is 1/x.
According to the rule, if:
Then, its derivative:
That is, to derive a composite function:
In this case we have:
Applying the rule:
So, our derivative is:
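As a sketch of the whole computation for the sine of 1 over x, take the outer function to be f(u) = sin(u) and the inner function to be g(x) = 1/x. The names h, f, g and u are chosen here for illustration:

```latex
\begin{align*}
h(x)  &= f\big(g(x)\big) = \sin\!\left(\frac{1}{x}\right) \\
f'(u) &= \cos(u), \qquad g'(x) = -\frac{1}{x^{2}} \\
h'(x) &= f'\big(g(x)\big)\cdot g'(x)
       = \cos\!\left(\frac{1}{x}\right)\cdot\left(-\frac{1}{x^{2}}\right)
       = -\frac{\cos(1/x)}{x^{2}}
\end{align*}
```

The outer derivative cos is evaluated at the inner function 1/x, not at x itself. Forgetting this is the most common slip when applying the rule.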
What happens if we have:
We can write this function as:
Let's use a special notation for the "squaring" function:
This composite function can be written in a convoluted way as:
So, we can see that this function is the composition of three functions. To find its derivative we can still apply the chain rule. We can give a name to the inner function, for example g(x):
So, our f(x) is:
And here we can apply what we already know about composite functions to derive:
The derivative of the outer function is:
Evaluated at g(x):
But g(x) is:
And we can apply the rule again to find g'(x):
And that is:
So, as you can see, the chain rule can be used even when we have the composition of more than two functions.
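To make the three-function case concrete, here is a sketch with a hypothetical function: the square of sin(1/x). This choice is an assumption made for illustration and is not necessarily the function of the example above. The inner function is named g(x), as in the text:

```latex
\begin{align*}
f(x)  &= \big[g(x)\big]^{2}, \qquad g(x) = \sin\!\left(\frac{1}{x}\right)
         \quad\text{(hypothetical choice)} \\
f'(x) &= 2\,g(x)\cdot g'(x) \\
g'(x) &= \cos\!\left(\frac{1}{x}\right)\cdot\left(-\frac{1}{x^{2}}\right) \\
f'(x) &= -\frac{2\,\sin(1/x)\,\cos(1/x)}{x^{2}}
\end{align*}
```

The rule is applied twice: once for the squaring function around g, and once more for g itself, which is a composite function in its own right.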
In the previous examples we solved the derivatives in a rigorous manner. We applied the formula directly.
Solving derivatives like this you'll rarely make a mistake. But there is a faster way. In fact, this faster method is how the chain rule is usually applied.
Let's try it with example 2. We had:
First of all, let's derive the outermost function: the "squaring" function outside the brackets.
To do this, we imagine that the function inside the brackets is just a variable y:
And I say imagine because you don't need to write it like this! If it were just a "y" we'd have:
But "y" is really a function. Inside the empty parentheses, according to the chain rule, we must put the derivative of "y".
Replacing y with its real "value":
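For the squaring function, this step can be sketched as follows, where y stands for whatever function sits inside the brackets and y' is its derivative with respect to x:

```latex
\[
\frac{d}{dx}\big(y^{2}\big) = 2y\cdot\big(y'\big)
\]
```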
With practice, you'll be able to do all this in your head. Now, we only need to derive the inside function:
We already know how to do this using the chain rule:
The more examples you see, the better. Let's derive:
Let's use the same method we used in the previous example. First, we write the derivative of the outer function. With what argument? The argument of the original function:
Now, in the parentheses we put the derivative of the inner function:
And that was easy.
Let's calculate the derivative of:
We can write it like this:
First, we take out the constant and derive the outer function:
And now we derive the inner function:
And the derivative of cos is −sin:
Now, we shouldn't forget that cos(2x) is a composite function. So, we must derive the "innermost" function 2x also:
And the derivative of x is 1:
So, finally, we can write the derivative as:
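The same sequence of steps (take out the constant, derive the outer function, then cos, then the innermost 2x) can be sketched on a hypothetical function of this shape. The function 3cos²(2x) below is an assumption chosen for illustration, not necessarily the one in this example:

```latex
\begin{align*}
f(x)  &= 3\cos^{2}(2x) = 3\,\big[\cos(2x)\big]^{2}
         \quad\text{(hypothetical choice)} \\
f'(x) &= 3\cdot 2\cos(2x)\cdot\big(\cos(2x)\big)' \\
      &= 6\cos(2x)\cdot\big(-\sin(2x)\big)\cdot(2x)' \\
      &= 6\cos(2x)\cdot\big(-\sin(2x)\big)\cdot 2 \\
      &= -12\,\cos(2x)\,\sin(2x)
\end{align*}
```

Each layer contributes one factor, and the minus sign comes from the derivative of cos.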
That is enough examples for now. You'll be applying the chain rule all the time even when learning other rules, so you'll get much more practice. Your next step is to learn the product rule.