This is a gentle introduction to how the Discrete Fourier transform is computed. See the tutorial on the Fourier transform for a basic introduction to its use. After doing this tutorial you will be able to:

Compute the amplitude and phase of each component frequency of a signal. Explain how the Fourier transform is a projection on to cosine and sine axis of different frequencies. Reason about how this is a rotation of the time based axes. See how the complex exponential is used to compute cosine and sin in one equation. Know the three tricks that are used in computing the Fourier transform

Okay, so now we know how to get the amplitude and phase of the frequency in the signal that we create using the Fourier transform. Let's dig in just a little bit to some of the peculiarities of how it is computed that we alluded to in the Fourier transform tutorial.

First, say you want to know what the amplitude and phase is of a signal at a certain frequency, say 1 cycle/second. You can use the dot product to calculate this. The dot product of two vectors x and y results in a scalar value and is computed as follows:

$$ \sum_{i=1}^{n} x_i y_i $$

That is, you multiple each element of the two vectors of length n together and add that all up into a single number. Easy peasy. Turns out this simple operation is very, very useful. You may have seen it when you compute the correlation between two variables that are expressed as vectors:

$$ r = \frac{\sum_{i=1}^{n} (x_i-\bar{x}) (y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i-\bar{x})^2}\sqrt{\sum_{i=1}^{n} (y_i-\bar{y})^2}} $$

Where $\bar{x}$ and $\bar{y}$ are the means of the two vectors. Let's make all this easier by starting with two vectors for which we have already subtracted the mean. That would give us:

$$ r = \frac{\sum_{i=1}^{n} x_i y_i }{\sqrt{\sum_{i=1}^{n} x_i^2}\sqrt{\sum_{i=1}^{n} y_i^2}} $$

Now, stare at the equation above. The bottom of the equation is just computing the Euclidean length (L2 norm) of each vector. That is, $\sqrt{\sum_{i=1}^{n} x_i^2}$ is another way to just say the **length** of the vector because you are taking the sum of squares along each dimension of the vector, adding them up and taking the square root. If you were doing this with a vector with just two dimensions, then the geometry you should be thinking about is like this:

and to compute the length of the red vector which has components $x_1$ and $x_2$ can be done using the Pythagorean theorem, right? So, would be equal to $\sqrt{x_1^2 + x_2^2}$. Thus the bottom of the correlation equation is just the product of the two lengths (I'll use the notation $||x||$ to denote the Euclidean length or L2 norm of a vector).

$$ r = \frac{\sum_{i=1}^{n} x_i y_i }{||x|| \times ||y||} $$

So, rearranging the terms, we can see that the dot product is equal to

$$ \sum_{i=1}^{n} x_i y_i = r \times ||x|| \times ||y|| $$

Ok. Cool, but, how does all that help us? Well, if a signal is exactly the same shape as the cosine function then its correlation with a cosine function should be 1, right? So, if you take the dot product of the signal with a cosine, you should get the product of their two lengths.

Let's try that out.

% first make a unit length cosine function t = 0:deltaT:1-deltaT; cosBasis = cos(2*pi*t); % now make a cosine signal of magnitude of 5 sig = 5*cos(2*pi*t); % now take the dot product and see what you get sum(cosBasis .* sig) % You can check it against the product of the lengths, by using the norm function in matlab which computes the 2-norm which is another way of saying Euclidean length. norm(cosBasis) * norm(sig)

Still, how does that help us? Well if you want to figure out how many times bigger the signal is in the part of it that correlates with the cosine function, you could normalize the dot product like this:

$$ \frac{\sum_{i=1}^{n} x_i y_i}{||y|| \times ||y||} $$

This would make it so that the value will turn out to one if you are putting in the vector y. Going back to our example, it should give us a value of 5 since we have made our signal 5 times larger than the cosine function.

sum(cosBasis .* sig) / norm(cosBasis)^2

Got it? So this computation can give you back how much the amplitude of the signal is that is correlated with the cosine function which we use as a “basis” function.

Ok. so that's the way we can compute the amplitude. Let's say we don't know the amplitude or frequency of a sindsoid in a signal. How can we figure that out? Easy, we just make basis sinusoids of every possible frequency and compute the dot product with the signal, only the one at the right frequency will have a non-zero value. That will be the amplitude of the signal at that frequency. Challenge yourself and see if you can write a code snippet to do that. First, here's the signal:

% make a signal that has a frequency of 3 cycles/sec with an amplitude of 3.4 deltaT = 0.1; t = 0:deltaT:1-deltaT; sig = 3.4*cos(2*pi*3*t);

Now see if you can recover what frequency and what amplitude that signal is.

Sometimes people use the geometric definition of the dot product to see what is happening.

$$ \sum_{i=1}^{n} { x_i y_i } = ||x|| \times ||y|| \times cos(\theta) $$

This says that the dot product is equal to the products of the lengths of each vectors times the cosine of the angle between them. It might not be intuitive why this identity is true, but it can be proven using some relatively simple geometry and algebra, check here under “Proof of Equivalence” where it shows how to use “The law of cosines” to prove it. Here is a diagram of what is going on.

We say that the red vector is being projected on to the blue vector. If you think back to geometry, you will remember the rule “SOH-CAH-TOA”, and note that if the length of y is set to be one, then the dot product is equal to $cos(\theta)$ times the hypotenuse of the triangle (red line) which is equal to the adjacent segment (labelled P). So, what it tells you is that the dot product is a projection of the red X vector on to the blue y vector. Another way to say this is that it finds how far along the direction of the blue vector the red vector lies.

In this geometric interpretation the signal is thought of as a multidimensional vector with each dimension being a sample in time. So, for our signal, we have ten dimensions because we computed cosine for 10 time points. You can think of that vector as pointing in space (of course we can't really visualize 10 dimensions, so we typically have in our head a picture in two or three dimensions and hope it generalizes). Then the dot product can be thought of as how far along the cosine vector the signal lies. If it's projection is the same length then it is a cosine of the same length. If it is shorter or longer then it is amplitude that is smaller or larger. In this picture, the signal is a cosine (red) and the cosine we are projecting on to is blue. If we normalize the length of the blue vector, then we can find out how many times greater the red vector is. Remember that in this special case, the red vector and blue vector point in the same direction since they are both cosines. It's just that one (the red one), points farther than the blue one because it has a higher amplitude.

Note that if the signal is not exactly correlated with the cosine function it is like the first drawing above in which the signal vector (red) is not in the same direction as the other vector (blue). You can also think of this as the red vector not having perfect correlation with the blue vector. In this case the dot product is finding how much of the signal vector is in the same direction as the cosine vector - this will become very useful as it will allow us to compute phase below.

Ok. So we have a way to compute the amplitude and frequency of a signal. How can we figure out the phase. Well, it turns out that cosine and sine are very special functions relative to each other. They are orthorgonal to each other. What that means is that if you take the dot product of cosine and sine, you will get 0.

sum(cos(2*pi*t) .* sin(2*pi*t))

You should get a value that is near zero (there may be a small roundoff error making it not exactly equal to zero. Another way to think about orthogonality is that the vectors (10 dimensional in our case) are at 90 degree angles to each other.

Another very cool property of cosine and sine is that if you have a sinusoid of the same frequency but different phase, it can always be composed of the weighted sum of a cosine and sine function. Another way to say this is that cosine and sine form a basis set of all sinusoids with the same frequency but arbitrary phase. Sometimes people call this property “shiftability”, meaning that you can shift the location (phase) of a sinusoid, by linearly combining the two basis functions of cosine and sine.

Let's try that out to see if it is really true. First, how do you find what combination of cosine and sine make up another signal with the same frequency but different phase? Well, the dot product again. You take the dot product of the signal with a cosine and sine and that becomes the coefficients for how you would add up a cosine and sine to make that signal. This is really important, so let's work through that with some code.

% First let's make a signal with some phase that we want to try to recover deltaT = 0.1; t = 0:deltaT:1-deltaT; phase = 0.37; sig = cos(2*pi*t-phase); % now, let's make cosine and sine basis functions. cosBasis = cos(2*pi*t); sinBasis = sin(2*pi*t); % Let's project the signal on to these basis functions to find the coefficients cosCoefficient = sum(sig .* cosBasis); sinCoefficient = sum(sig .* sinBasis); % Now, let's plot the original signal clf; plot(t,sig,'k-'); xlabel('Time (sec)'); ylabel('Amplitude'); % Now let's see if we can reconstruct the signal using the weighted sum of the cosine and sine basis functions. reconstructedSig = cosCoefficient .* cosBasis + sinCoefficient .* sinBasis; % And plot it plot(t,reconstructedSig,'g-');

If that worked correctly, the original signal in black, should have been covered up by the reconstruction in green. Pretty cool, right? Now we have a way to make any phase of a sinusoid as a weighted sum of a sine and cosine, where we find the proper weights through the dot product.

So, what does this all mean? Going back to our geometric view of the dot product, it means that a cosine vector (remember for our case since we have 10 time points we are thinking of a vector that points in a 10 dimensional space) and our sine vector are orthogonal to each other, and the dot product is being used to find out how much the signal lies in the direction of cosine and how much lies in the direction of sine:

In this diagram, the green arrow represents the sine vector (remember it is 10 dimensional with the coordinates as shown in the diagram) and the blue arrow represents the cosine vector. Our signal is in black, and it has a phase that is between cosine and sine since it does not point in the direction of either directly. By taking the projection of our signal (black) on to cosine (blue) and sine (green), we can find the weights (P1 and P2, respectively) that tell us what combination of cosine and sine are needed to reproduce a signal of that phase.

Even more importantly, the $\theta$ value is equal to the phase! Imagine if $\theta$ were 0. Then our signal would be along the direction of the blue arrow and be pure cosine. If $\theta$ were 90, then the signal would be pure sine. In between it is whatever phase is that we find for $\theta$. So if we know the two weights (P1 and P2) for cosine and sine, then we can reconstruct their phase by computing the angle $\theta$. How do we do that? With SOH-CAH-TOA of course. That is $tan(\theta) = \frac{P2}{P1}$ so, we can compute using arctan (note that arctan is badly behaved when P1 is 0, but should work for other angles.

Let's do that for our example above. Should be able to get back the phase in radians that we set in the code above.

atan(sinCoefficient/cosCoefficient)

Ok. So, we now can state a pretty complete version of what the Fourier transform does. It projects the signal on to a set of basis functions which are cosine and sine functions at different frequencies. Cosine and sine are orthogonal to each other at any given frequency. Each cosine and sine at a different frequency (e.g. 1 cycle/sec vs 2 cycle/sec) are also orthogonal to each other. So, how many possible frequencies can there be? It depends on how many dimensions. If we have 10 dimensions (i.e. 10 time points), then we will have two basis functions for each frequency and we can therefore make 5 total orthogonal frequencies (from 1 cycle/sec to 2cycles/sec etc). Remember that if we have 10 dimensions, there are only 10 directions that can be made orthogonal (at 90 degrees) to each other.

So, the picture you might have in your head should look something like this:

Where the black vector is our signal and each of the colored vectors is a sinusoid of specified frequency and phase. Of course, we can't really imagine 10 dimensions, so I'm just trying to draw what you should be thinking about. That our signal projects down on to all 10 of these new axis which you should think of as being orthogonal (90 degrees) away from every other axis. Another way to think about this, is that all one is doing is rotating the axis (one dimension for each time point) to align with the cosine / sines of different frequencies and phases, and then finding out where the signal lies in this new axis. The signal will project down on to these axis and give you its coordinate on each of the axis.

A couple last points here. I have labelled the axis as cycle/period. By period, I mean the length of time for which you have sampled the signal. In our case, this was one second, so each of the frequencies is cycles/second. But, if we had sampled for 20 seconds, then the frequencies would be cylces/20 seconds, and you might convert them to the appropriate cycles/second.

Also, if you have more samples than 10, you would be able to project down on to more frequencies. For example, if you had 30 samples, then you would have 15 frequencies components, with one cosine and one sine phase each. Another way to say this is you need to have at least two samples for each frequency that you want to resolve. If you have a period of one second and you want to resolve 2 cycles / sec, then you would need to have at least 4 samples. This is an important fundamental limit of sampling called the Nyquist frequency.

Ok, so now we are ready to understand the tricks that are used for computing the Fourier transform.

First is to change the projection axis from cos and sin to cos and $\sqrt{-1}$ sin. We will use the electrical engineering convention of calling $\sqrt{-1}$ = j instead of i (which is used for current):

So, huh? Why do that? Well, it means that you can do the projection onto sin and cos with a single operation, instead of two, because of Euler's formula

$$e^{j2\pi t} = cos{(2\pi t)} + j sin{(2\pi t)}$$

The $e^{j2\pi t}$ is called a complex exponential and the formula above can be proven by taking Taylor expansions of the respective functions. It's pretty amazing (and somewhat strange, I think) formula - and very useful. As an aside, you can use it to relate five fundamental constants in math $e$, $pi$, $i$, $0$ and $1$, which is pretty darn cool.

In the Fourier transform it is being used to simultaneously find the amplitude and phase of a sinusoidal component by projecting on to the complex exponential instead of separately on to cosine and sine as we did above.

So, how does this work, in practice? Well, if you take a signal with cosine phase and project it on to the complex exponential of the same frequency, you would get back a real number that represent the amplitude of the signal, since the projection on to j $\times$ sin (which is orthogonal) will be 0. Similarly, if you take a signal with sine phase and project it on to the complex exponential of the same frequency, you would get back a purely imaginary number whose amplitude is the amplitude of the signal. If you take a signal with some arbitrary phase, you will in general get a complex number, the amplitude of which gives you the amplitude of the frequency component and whose phase (or angle) is equal to the phase of the signal. Make sense?

Now, there's only one little problem with this scheme. If you try to replicate a signal from sets of complex exponentials (like what we did for cosines and sines of different frequencies), you will get back signals that have imaginary components, which is problematic. For example, if you have a signal with sine of one cycle/sec, then your Fourier transform would have amplitude j and your reconstruction would look like the following:

$$ j \times e^{j2\pi t} = j \times (cos{(2\pi t)} + j sin{(2\pi t)}) = j cos{(2\pi t)} - sin{(2\pi\theta)} $$

Uh, oh. So, how to fix that? Well, the second clever (diabolical) trick is to project your signal on to two sets of complex exponentials for each frequency. One with a positive and one with a “negative” frequency. Huh? There is no such thing as a negative frequency, you say? You are right, but, mathematically, we can just put a little negative sign in to the complex exponential and we get something that looks like this:

$$ e^{-j2\pi t} = cos{(-2\pi t)} + j sin{(-2\pi t)} $$

How does that help? Well, let's take our sine signal example again, and project down on to this “negative” frequency complex exponential. What happens? Well, first thing to note is that by putting the negative into the cosine, nothing happens, but for sine it shifts phase by 180 degrees and makes it equal to negative sine.

So, what does that do? Well, it means that we can rewrite the negative frequencies as follows:

$$ e^{-j2\pi t} = cos{(2\pi t)} - j sin{(2\pi t)} $$

Now if we project our sine signal down on this negative frequency complex exponential, we get -j, right? So, putting the positive and negative frequencies together, and trying to reconstruct we get:

$$ j \times e^{j2\pi t} + -j \times e^{-j2\pi t} $$ $$ j \times (cos{(2\pi t)} + j sin{(2\pi t)}) - j \times (cos{(2\pi t)} - j sin{(2\pi t)}) $$

$$ j \times cos{(2\pi t)} - sin{(2\pi t)} - j \times cos{(2\pi t)} - sin{(2\pi t)} $$

$$ - 2 sin{(2\pi t)} $$

Neato, right? By projecting your signal on to positive and “negative” complex exponentials, everything magically cancels out when you reconstruct the signal to give you a real valued function. And this is in general true, we can do it with a signal of any phase and we will get back a real valued function and that is the trick that is used for computing the Fourier transform!

So, now we can finally write down what the Discrete Fourier transform looks like as an equation. Let's call our signal s and the Discrete Fourier transform is S. We have N samples, so we can compute N/2 frequencies (remember the Nyquist limit). For the positive complex exponentials, we have:

$$ S[f] = \sum_{n=1}^{N}{s[n] \times e^{\frac{fnj2\pi}{N}}} $$

This means, that for each frequency f, from 1 to N/2, we compute the above projection. Note that the complex exponential is being evaluated at 1/N, 2/N,.. 1, and we have added in the f so that it is computed for different frequencies. This projection gives us the Fourier coefficient as a complex number and plop it into the vector S. We also need the negative frequencies:

$$ S[-f] = \sum_{n=1}^{N}{s[n] \times e^{\frac{-fnj2\pi}{N}}} $$

Because math people are still unsatisfied with the horrible inelegance of having two sets of equations, they make them all into one equation, with a third final trick:

$$ S[f] = \sum_{n=1}^{N}{s[n] \times e^{\frac{-fnj2\pi}{N}}} $$

Where f goes from 1 all the way to N. So, how does that work? Where did the positive frequencies go? They magically come from when f > N/2. Which you can see by rewriting those frequencies as f = N-$\hat{f}$. Substituting into the equation:

$$ e^{\frac{-fnj2\pi}{N}} = e^{\frac{-(N-\hat{f})nj2\pi}{N}} = e^{\frac{-Nnj2\pi + \hat{f}nj2\pi}{N}} = e^{-nj2\pi + \frac{\hat{f}nj2\pi}{N}} = e^{\frac{\hat{f}nj2\pi}{N}} $$

In the last step, we can get rid of $-nj2\pi$, by noting that for whatever integer n, this will just be a multiple of $2\pi$, thus adding $2\pi$ of phase which does nothing as it is just wrapping around the circle. So, there you have it then. The frequencies greater than N/2 are the positive frequencies in reverse order (remember we set f = N-$\hat{f}$).

So, to sum up the mysteries from when we were working with the Fourier transform. There are three tricks. One is to combine sin and cos with the complex exponential. The second is to use positive and negative frequencies and the third is to pack away the positive frequencies at the top end. From these ticks, we get both a positive and negative frequency component, which are symmetric because we are using the positive and negative complex exponentials to cancel out each other's complex parts. In the computation, the positive frequencies are actually counted down form the highest component (N) towards the middle component (N/2) and so that is why we saw two peaks, one at the “correct” frequency, and one at the other end of the Fourier transform. We get a factor of two in the amplitude because of the way the positive and negative frequency components cancel out. DC is calculated separately (or you can think of it as the 0 frequency component), so doesn't have a postive / negative pair and so doesn't have the factor of two. Phew. All make sense now?