On this page I'll present the definition of the derivative using the intuitive notions that naturally lead to it. The derivative is one of the central concepts in calculus, and achieving an intuitive grasp of it is important.
I'll go through two different routes: first using the geometric idea of slope, and then using the physical idea of speed or velocity. We'll check that we arrive at the same definition of the derivative either way.
It is good to have both perspectives on the derivative, the geometric and the physical. One is more useful than the other in some situations.
We'll begin with the geometric idea. For that, we need to review the concept of slope as it is studied in algebra.
To understand the geometric intuition for the derivative, we need to review a concept from algebra: slope. In normal calculus courses it is assumed that you have a solid grasp of this concept.
In my experience, though, most students who arrive at the calculus level don't have the needed acquaintance with it. It is so important for a true understanding of the derivative, and for applying it to solve problems, that I consider it worth spending at least two minutes reviewing it.
To begin with, let's consider a straight line, like the one shown on the graph below.
This line is the graph of a function y=f(x). We'll give a mathematical definition of slope. We take two points on the line, corresponding to the values x=a and x=b.
Now, let's denote the horizontal distance between a and b by Δx. This is read "delta x". And let's denote the vertical distance between f(a) and f(b) by Δy ("delta y"). These distances are shown below:
In calculus we like to use the letter Δ to denote the "increment", "change" or "variation" of something. So, for example, Δx is the change in x between x=a and x=b. In this case, Δx=b-a.
Similarly, Δy is the change in y along the line, also between x=a and x=b. In this case, Δy = f(b) - f(a).
Now we are ready to define the slope of the line. We define slope as the following quotient:
m = Δy/Δx
This number is independent of the points a and b that we choose. It is a ratio that is constant along the line. And this definition of slope makes intuitive sense. Let's see why.
A line that has a larger slope (steeper line) will correspond to a larger quotient between the deltas, because in this case Δy will be much bigger than Δx. If the quotient is small, it means that Δy is much smaller than Δx.
For example, in the figure below, we show two lines. In the two, we take a Δx of the same length. You can see that the first line has a smaller corresponding Δy. So, the quotient m will be much smaller for the first line. This corresponds to our intuition of steepness: clearly the second line is steeper than the first, so its slope is larger.
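Both claims, that the quotient does not depend on the points chosen and that a steeper line gives a larger quotient, can be checked numerically. Here is a minimal Python sketch. The two lines (`gentle` and `steep`), the helper `slope`, and the sample points are assumptions made up for illustration; they are not the lines drawn in the figures.

```python
def slope(f, a, b):
    """Slope between x=a and x=b: the quotient delta_y / delta_x."""
    delta_x = b - a          # horizontal change
    delta_y = f(b) - f(a)    # vertical change
    return delta_y / delta_x

def gentle(x):
    # Hypothetical line that rises slowly: y = 0.5x + 1
    return 0.5 * x + 1

def steep(x):
    # Hypothetical line that rises quickly: y = 3x + 1
    return 3 * x + 1

# Three different choices of the points a and b
point_pairs = [(0, 2), (-4, 4), (1, 9)]

gentle_slopes = [slope(gentle, a, b) for a, b in point_pairs]
steep_slopes = [slope(steep, a, b) for a, b in point_pairs]

print(gentle_slopes)  # [0.5, 0.5, 0.5]
print(steep_slopes)   # [3.0, 3.0, 3.0]
```

Each line gives the same quotient no matter which pair of points we pick, and the steeper line gives the larger number.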
As we've seen, the change in y along a line given as a graph of a function y=f(x), from x=a to x=b, is
Δy = f(b) - f(a)
So, since Δx = b - a, we can write the definition of slope as:
m = (f(b) - f(a))/(b - a)
The geometric idea for the derivative is just a generalization of the concept of slope. We arrive at the concept we call the derivative when we try to calculate the slope of a curve that is not a straight line.
The geometric definition of the derivative that we'll arrive at will look very similar to the above definition of slope. The biggest difference will be that it will involve taking a limit.
With this we can end our algebra crash course. I would like you to keep in mind this definition of slope and the intuitive geometric idea behind it. This will guide you and really help in your learning of derivatives.
We'll now take the big step and try to generalize the concept of slope to curves that aren't necessarily straight lines. We'll use the curve in green shown below.
If we take two points on the curve, at x = x0 and x = x0 + Δx, the straight line through them is called a secant line. We know how to calculate the slope of the secant line. So we can take that specific value as an approximation to the slope of the curve. But there is a little problem here. The slope of the secant line that we calculate depends on the Δx that we choose.
If we change the Δx, the line will change, and hence the slope will change. The question is, when we make Δx approach some value, what does the slope of the secant line approach?
We want the slope of the curve to not depend on the Δx, so we want it to disappear. The easiest way to do that is to make it approach zero. What happens when Δx approaches zero? We can see what happens on the short movie below:
We're now ready to present the geometric definition of derivative. First, let's write in equations what we've done so far. What we said is that we'll consider approximations to the slope of the curve as the slopes of the secant lines for different values of Δx. So we need to calculate these slopes.
As we've just seen in the review above, the slope of the secant line through the points x0 and x0 + Δx, for a specific Δx, is:
m = (f(x0 + Δx) - f(x0))/Δx
We said that we want our definition of slope of a curve to not depend on Δx. And the easiest way to make it disappear is to make it approach zero. And that is exactly what we do. We define the derivative of the function f at the point x0 as the limit of the above slopes as Δx approaches zero:
f'(x0) = lim (Δx→0) (f(x0 + Δx) - f(x0))/Δx
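To watch the limit at work, here is a minimal Python sketch that computes secant slopes for smaller and smaller Δx. The function f(x) = x², the point x0 = 1, and the helper name `secant_slope` are assumptions chosen for illustration; they do not come from the figures on this page.

```python
def secant_slope(f, x0, delta_x):
    """Slope of the secant line through x0 and x0 + delta_x."""
    return (f(x0 + delta_x) - f(x0)) / delta_x

def f(x):
    # Hypothetical example curve: a parabola
    return x ** 2

x0 = 1.0
delta_xs = [1.0, 0.1, 0.01, 0.001, 0.0001]

# One secant slope per value of delta_x
secant_slopes = [secant_slope(f, x0, dx) for dx in delta_xs]

for dx, m in zip(delta_xs, secant_slopes):
    print(f"delta_x = {dx:<7} secant slope = {m:.6f}")
```

For this parabola the secant slopes settle toward 2 as Δx shrinks. That limiting value is what the definition calls the derivative of f at x0 = 1.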
Let's summarize the intuitive meaning of f'(x). The limit we calculated equals the slope of the tangent line at the point x. Strictly speaking, curves don't have a slope, so we use the slope of the tangent line as a replacement.
Remember what I said earlier: the essential idea in differential calculus is to approximate non-linear objects by linear ones. It happens that the tangent line at a point is the best linear approximation to the curve near that point. We'll study that in more depth in other lessons.
But, intuitively, we can think of f'(x) as the slope of the graph of f(x) at the point x. From now on, think of f'(x) as the analogue of slope, but for any curve. This intuitive idea will really help you. In the case of a straight line, the derivative f'(x) reduces to the slope of the line.
An important difference between the derivative and the slope of a straight line is that, generally, the derivative depends on the point x. For a straight line, the slope is the same at all points. For the graph of a function that is not a straight line, f'(x) depends on the point x. That is, the slope of a curve is variable.
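The contrast between a line and a curve can be seen numerically. The sketch below estimates the slope at several points by using a very small Δx in place of the limit. The functions `curve` and `line`, the helper `approx_derivative`, and the step size 1e-6 are all assumptions for illustration, and the results are approximations, not exact derivatives.

```python
def approx_derivative(f, x, delta_x=1e-6):
    """Secant slope over a tiny delta_x, used as an estimate of f'(x)."""
    return (f(x + delta_x) - f(x)) / delta_x

def curve(x):
    # Hypothetical curve: y = x^2
    return x ** 2

def line(x):
    # Hypothetical straight line: y = 3x + 1
    return 3 * x + 1

points = [-1.0, 0.0, 2.0]

curve_slopes = [approx_derivative(curve, x) for x in points]
line_slopes = [approx_derivative(line, x) for x in points]

for x, mc, ml in zip(points, curve_slopes, line_slopes):
    print(f"x = {x:5.1f}   curve slope ~ {mc:8.4f}   line slope ~ {ml:8.4f}")
```

The estimates for the line stay near 3 at every point, while the estimates for the curve change from point to point (roughly -2, 0 and 4 here).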
Let's introduce some notation for derivatives. A big part of learning calculus is getting used to notation.
There are several ways to write derivatives. They all have their advantages in some situations. I will introduce you to the most common.
The first one is "prime notation". This is the notation that I used in the definition above. When you have a function f(x), its derivative at point x is written as f'(x) (f prime of x). Remember that the derivative is a new function.
The other common notation is Leibniz notation. Suppose you have a function defined by the equation y = f(x). Then the derivative of this function is written as:
dy/dx
This is read "dee why dee x". Another way to write the same thing is:
d/dx f(x)
This is read "dee dee x of f of x".
It is interesting that it was probably physics that first inspired the idea of the derivative. In this section we'll start with a simple physical problem that will lead us to the concept of instantaneous speed or velocity. We'll see that the concepts of velocity and slope of a curve are connected by the same mathematical definition.
Consider a train travelling at constant velocity, say, 80 km per hour (replace km with miles if you are more comfortable with that unit).
If at specific time intervals we record what distance it has covered so far, we'd end up with a table like the one shown below:
We can see that distance is a function of time. So, we can draw a graph, using the horizontal axis for time and the vertical axis for distance.
As a first step, we plot on the graph the points in the table:
We can easily connect these dots with a straight line:
Now, we know how to find the slope of this line. We take any two points on the line and measure the increments Δy and Δx:
Inspecting the graph and using the values on the table we see that Δx = 4 and Δy = 320.
The slope is defined as:
m = Δy/Δx
So, we get that m = Δy/Δx = 320/4 = 80. This equals the speed of the train. Is that a coincidence? Let's find out.
In the horizontal axis we represent time. That means that the increment Δx is in fact measured in units of time. Specifically, it represents the time passed between the two points we chose on the line.
We can write it as Δt instead of Δx to make this point clear.
Similarly, in the vertical axis we represent distance. We can write Δs instead of Δy to make it clear that this increment represents a distance.
So we can write the slope as:
m = Δs/Δt
The slope is the change in distance over the change in time. You may recall from physics that this is how we measure velocity. This means the geometric slope of the line we plotted is the physical velocity of the train.
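Here is a minimal Python sketch of that calculation. The table values are an assumption: the original table is not reproduced here, so the sketch uses readings taken once an hour for a train moving at 80 km per hour, which agrees with the increments Δx = 4 and Δy = 320 quoted above. The names `times_h`, `distances_km` and `average_velocity` are made up for the example.

```python
from itertools import combinations

# Assumed table: hours elapsed and kilometres covered at 80 km/h
times_h = [0, 1, 2, 3, 4]
distances_km = [0, 80, 160, 240, 320]
table = list(zip(times_h, distances_km))

def average_velocity(p, q):
    """Slope between two table rows: change in distance over change in time."""
    (t1, s1), (t2, s2) = p, q
    return (s2 - s1) / (t2 - t1)

# Compute the slope for every possible pair of rows
velocities = [average_velocity(p, q) for p, q in combinations(table, 2)]

print(set(velocities))  # {80.0}
```

Every pair of rows gives the same quotient, 80 km per hour, which is what we expect from a straight-line graph.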
This basic fact shows that the geometric concept of slope has a critical importance in physics. Let's consider now a more interesting example.
Because the train was traveling at constant velocity, we got a line as the graph for distance as a function of time. In reality, most things don't travel at constant velocities. When you drive a car you accelerate and decelerate constantly, for example.
Let's say that the distance a car travels over time is represented by this graph:
As we did in the previous section, we can take two points on the curve and draw a secant line.
The fraction Δy/Δx represents a velocity, because it is measured in units of distance over time. But this is just an average velocity between two points. This is because we consider the distances and times of two points.
What if we wanted to know the velocity exactly at point A, for example? That is, we want to know the velocity at a specific instant. This is the type of velocity we usually talk about in our everyday lives. And here is where the derivative comes into play.
We want the velocity at a specific instant. So, what we can do is approximate it by taking average velocities over smaller and smaller time intervals. A good approximation will be the average velocity over a very small time interval around the specific instant we are interested in.
The length of the time intervals we consider is Δx. To make better approximations, we can take smaller and smaller time intervals. In the language of calculus, we take the limit of the slope as Δx approaches 0.
And this is the natural way of defining the "instantaneous" velocity at an instant A, that is, as the limit:
v = lim (Δx→0) Δy/Δx
This is exactly the same definition of the derivative we arrived at using the geometric intuition. Remember, the derivative is the limit of the slopes of the secant lines.
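The same idea can be tried with numbers. In the sketch below, the position function s(t) = 20t² (kilometres, with t in hours), the instant t = 1 and the helper names are all assumptions invented for illustration; they do not describe the car in the graph.

```python
def position_km(t_hours):
    # Hypothetical car: distance covered after t hours, in km
    return 20 * t_hours ** 2

def average_velocity(s, t, delta_t):
    """Average velocity over the time interval from t to t + delta_t."""
    return (s(t + delta_t) - s(t)) / delta_t

t_a = 1.0                              # the instant we care about
delta_ts = [1.0, 0.1, 0.01, 0.001]     # shrinking time intervals

averages = [average_velocity(position_km, t_a, dt) for dt in delta_ts]

for dt, v in zip(delta_ts, averages):
    print(f"interval = {dt:<6} h   average velocity = {v:.4f} km/h")
```

As the interval shrinks, the average velocities settle toward 40 km/h. That limiting value is the instantaneous velocity of this hypothetical car at t = 1.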
Now let's make a few remarks just to understand this concept of instantaneous speed a little better.
When we measure velocities in reality we're just measuring average velocities over very small time intervals. This is because our physical instruments can't act on infinitely small time intervals. But what we actually measure are really good approximations to the actual velocity.
Thus, just like most concepts in physics, velocity is a theoretical construct that cannot be measured with absolute precision. Calculus allows us to talk precisely about the velocity at a specific point or the velocity at a specific instant.
Now, is the derivative only useful for finding velocities? No. There are many quantities defined in physics that are "rates of change".
A rate of change is a more general concept than velocity. Velocity is the rate of change of distance over time, for example. As a rule of thumb, a rate of change over time is the change in the quantity over the change in time.
Another simple example in physics is electric current. Current is defined as the rate of change of electric charge over time. If we represent the electric charge as a function of time, the electric current is the derivative of charge with respect to time.
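In practice we often have only a list of measurements rather than a formula. The sketch below estimates current from sampled charge readings by taking the change in charge over the change in time between consecutive samples. The readings are invented for illustration (they follow q = t², in coulombs and seconds), and each result is an average current over one sampling interval, not the exact instantaneous current.

```python
# Hypothetical measurements: time in seconds, charge in coulombs
times_s = [0.0, 0.5, 1.0, 1.5, 2.0]
charges_c = [t ** 2 for t in times_s]   # 0.0, 0.25, 1.0, 2.25, 4.0

def average_rates(xs, ys):
    """Change in y over change in x between each pair of consecutive samples."""
    return [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in zip(zip(xs, ys), zip(xs[1:], ys[1:]))
    ]

# Average current (in amperes) over each sampling interval
average_currents = average_rates(times_s, charges_c)

print(average_currents)  # [0.5, 1.5, 2.5, 3.5]
```

Sampling more often makes each interval shorter, so each average moves closer to the instantaneous current, which is the derivative of charge with respect to time.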
There are many other examples of rates of change in the sciences and in everyday life.
The moral of the story is that when you need to think about the derivative in physical terms, you should think of it as a rate of change. To be more precise, an instantaneous rate of change, not an average rate of change.
The derivative tells you how much a quantity, represented by a function, is changing with respect to the independent variable. The independent variable can be time, for example, but it is not restricted to that.
With this we can conclude our initial contact with the definition of the derivative. You'll use these intuitive interpretations, both of them, time and time again. So you won't forget them.
The next step is learning how to use the definition of derivative to actually calculate derivatives of functions. We do that in Calculating the Derivative by Definition.
If you have doubts or questions about what has been said on this page, please leave me a comment below. I, or other knowledgeable readers, will be happy to help you.
If you have just a general doubt about a concept, I'll try to help you. If you have a problem, or set of problems you can't solve, please send me your attempt of a solution along with your question. These will appear on a new page on the site, along with my answer, so everyone can benefit from it.