Special relativity is famous for drawing shocking conclusions from relatively (ha) straightforward math. Yet the derivation of E = Mc² is often left out of these simple derivations. Science writers need a good way of answering the question “Why does E = Mc²?” and so I’m writing this down, inspired by a discussion in The Universe in the Rearview Mirror by Dave Goldberg.

First comes a qualitative description, which becomes beautifully quantitative with just one additional step. It also mirrors Einstein’s actual derivation of the formula. Ready? Here we go . . .

Begin with two bedrock principles. One, the conservation of energy, which will play a small part in the following, and only toward the end. Two, the conservation of momentum, which will play a larger part right from the beginning.

OK, imagine if you will an atom sitting perfectly still. That atom emits a pair of photons of exactly equal frequency (f) in exactly opposite directions.

(“Ah, ha!” you say, “Already you’re violating one of your bedrock principles, or else assuming what you’re setting out to prove, for where did the atom get the energy to emit two photons?” Don’t worry about that for now. Just assume that the atom was “energized” somehow. We know that atoms emit photons like this often, so it’s no stretch to begin the thought experiment with an observed phenomenon.)

Since a photon’s momentum is determined by its frequency (see below), and the frequencies are equal, we know that these two photons have identical momentum. Since they also are emitted in exactly opposite directions, their momenta exactly cancel. As predicted by the conservation of momentum, the atom does not move. It is in the same position before and after the emission.

Now imagine that same situation, but think of the atom moving slowly across your field of view (maybe the atom is moving, maybe you are moving; according to the Principle of Relativity, it doesn’t matter). We specify slow motion to show that this is not an effect of high speed, but rather an effect of any motion at all.

Photons always move at the speed of light (c), so the emitted photons (whether against the direction of the atom’s motion or with the direction of the atom’s motion) will just move at the speed of light, no faster and no slower. Instead of changing speed, they change frequency (f). The photon emitted in the direction of motion is “squashed” into a higher frequency f’, while the photon emitted opposite the direction of motion will be “stretched” into a lower frequency f’’. This is just the Doppler effect for light.

Photons have momentum. We know this because light shining on a surface actually pushes on that surface. However, the momentum of a photon can’t be given by mass times speed, because photons have no rest mass. Instead, we know that the photon’s momentum is given by p = hf/c, where p is the momentum, h is Planck’s constant, and c is the speed of light.
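To get a feel for how small p = hf/c is, here is a minimal Python sketch. The function name photon_momentum and the example frequency (green light, roughly 5.6 × 10¹⁴ Hz) are my own illustrative choices, not part of the derivation.

```python
# Photon momentum p = h*f/c, in SI units.
H = 6.62607015e-34   # Planck's constant, J*s (exact SI value)
C = 299_792_458.0    # speed of light, m/s (exact SI value)

def photon_momentum(frequency_hz):
    """Return the momentum (kg*m/s) of a photon with the given frequency."""
    return H * frequency_hz / C

# Assumed example value: green light, roughly 5.6e14 Hz.
p = photon_momentum(5.6e14)
print(f"p = {p:.3e} kg*m/s")
```

The result is of order 10⁻²⁷ kg·m/s, which is why the push of light on a surface is so hard to notice in everyday life.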

Now we notice something strange. In the non-moving frame, we see that momentum is conserved quite naturally: hf/c – hf/c = 0, as the left and right photon momenta cancel. In the moving frame, though, the atom keeps moving at the same speed before and after the emission (this is required by the Principle of Relativity: we can’t tell if the atom is moving and we’re sitting still, or if we’re moving and the atom is sitting still, or even if we’re both moving at the same rate when we think we’re both at rest). This means that in the moving frame the atom’s momentum is unchanged, provided the atom itself is the same before and after the emission. This is crucial, so hold it in your mind.

In the moving frame the photons are either squashed or stretched. This changes their momentum. In particular, there is suddenly more momentum in the forward direction and less in the backward direction. Uh oh! We said that one of our bedrock principles is that momentum is conserved. How can that be? It appears that in this example momentum is created from nothing.

There’s only one way to save our momentum conservation bedrock. The atom itself has to change. In particular, some of the atom’s mass must have disappeared. Where did it go? The only sensible place is into the photons. But since photons have no mass, that mass must have become energy. Wow!

Now for a more quantitative approach.

In our naïve assumption that the atom doesn’t change in the emission of photons, we ended up with something that made no sense. Here it is.

The nonsense “equation”:

momentum before  =  momentum after

mv = hf’/c – hf’’/c  + mv

where m is the mass of the atom and v is the atom’s speed. This “equation” tells us that hf’/c – hf’’/c must equal zero. We know that can’t be true, because when the atom is in motion the photons are no longer identical – one is squashed and the other is stretched; this changes their frequencies and therefore their momenta.

We know by the Principle of Relativity that the v’s have to be the same on both sides of the equal sign. This means the two m’s in this equation are not the same.

Let’s call the m before the emission m0. Let’s call the mass after the emission m’.

OK, here’s the equation again; this time it really is an equation:

momentum before = momentum after

m0v = hf’/c – hf’’/c + m’v

Now we rearrange a little:

(h/c)(f’ – f’’) = (m0 – m’)v                             (1)

This equation (we’ll call it equation 1 because we’ll need to call on it later) tells us that the difference in momentum between the two photons must be equal to the difference in momentum (which comes down to the difference in mass) between the atom before emission and the atom after emission.

So what is the difference in momentum of the two photons? This is something we can know experimentally, or we can use the formula for the Doppler effect for light, f’ = f√((1 + v/c)/(1 – v/c)) for the approaching photon (with the signs swapped for the receding one). We’ll do the latter here, writing the formula out as a series in v/c.

For the photon emitted in the direction of motion:

f’ = f(1 + v/c + ½(v/c)² + ½(v/c)³ + . . . )    NOTE: frequency gets bigger for the squashed photon

For the photon emitted opposite the direction of motion:

f’’ = f(1 – v/c + ½(v/c)² – ½(v/c)³ + . . . )    NOTE: frequency gets smaller for the stretched photon

Here’s the nice thing about using a very slow atom. v is much, much smaller than c, so any power of v/c higher than the second is tiny, tiny, tiny. We can ignore such tiny numbers, hooray!

Now let’s go back to equation (1), putting in our new values for f’ and f’’.

Equation 1:

(h/c){ f(1 + v/c + ½(v/c)² + ½(v/c)³ + . . . ) – f(1 – v/c + ½(v/c)² – ½(v/c)³ + . . . ) } = (m0 – m’)v

and simplify:

(h/c){f(2v/c)}  = (m0 – m’)v    NOTE: all the other terms either cancel {f – f and ½(v/c)² – ½(v/c)²} or else are so small that we can ignore them.
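As a sanity check on throwing away the higher-order terms, the sketch below compares the exact frequency difference f’ – f’’ with the first-order result 2f(v/c). It assumes the standard relativistic Doppler factor √((1 + v/c)/(1 – v/c)) for the approaching photon and its reciprocal for the receding one. The function names and the sample speeds are illustrative choices, not anything from the derivation itself.

```python
import math

def doppler_frequencies(f, beta):
    """Exact relativistic Doppler shift for a source moving at beta = v/c.

    Returns (f_forward, f_backward): the frequencies of the photons emitted
    along and against the direction of motion.
    """
    factor = math.sqrt((1.0 + beta) / (1.0 - beta))
    return f * factor, f / factor

def first_order_difference(f, beta):
    """First-order approximation to f_forward - f_backward: 2*f*(v/c)."""
    return 2.0 * f * beta

f = 1.0  # work in units of the rest-frame frequency
for beta in (1e-2, 1e-3, 1e-4):
    f_fwd, f_bwd = doppler_frequencies(f, beta)
    exact = f_fwd - f_bwd
    approx = first_order_difference(f, beta)
    rel_err = (exact - approx) / exact
    print(f"v/c = {beta:g}: relative error of 2f(v/c) = {rel_err:.2e}")
```

The relative error shrinks roughly like (v/c)²/2, so for a genuinely slow atom the simplification costs essentially nothing.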

And simplify some more (the v terms cancel, the c terms are combined):

2hf/c² = (m0 – m’)

We’re almost there!

We know that the energy of the two photons is just 2hf, as measured in the frame where the atom is at rest. (For our slow atom, the total in the moving frame, hf’ + hf’’, differs from 2hf only by terms of order (v/c)², which are negligible.) We also know that the term m0 – m’ is just the change in mass. Let’s call that change in mass M.

Then 2hf = E

(m0 – m’)  = M

E/c² = M

And finally,

E = Mc²

Wow indeed. The energy that came out as two photons is just the mass lost by the atom, multiplied by the speed of light squared.
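To put a number on it, here is a small Python sketch of the final result, M = 2hf/c². The function name mass_lost and the choice of two green photons (about 5.6 × 10¹⁴ Hz each) are assumptions made purely for illustration.

```python
# Mass lost by an atom that emits two photons of equal frequency.
H = 6.62607015e-34   # Planck's constant, J*s (exact SI value)
C = 299_792_458.0    # speed of light, m/s (exact SI value)

def mass_lost(frequency_hz):
    """Return the mass (kg) carried away by two photons: M = 2hf/c^2."""
    energy = 2.0 * H * frequency_hz   # E = 2hf, total photon energy in joules
    return energy / C**2              # M = E/c^2

# Assumed example: two green photons, roughly 5.6e14 Hz each.
M = mass_lost(5.6e14)
print(f"M = {M:.3e} kg")
```

That comes out to roughly 8 × 10⁻³⁶ kg. For comparison, a hydrogen atom has a mass of about 1.67 × 10⁻²⁷ kg, so the loss is only a few parts in a billion of even the lightest atom.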
