Welcome to Spacetime

John Denker

1 Executive Summary

Special relativity is the geometry and trigonometry of spacetime. Spacetime is four-dimensional.
The fourth dimension is in some ways very similar to the other three, but in some ways different. A great deal of what you already know about the xy plane can be applied to the xt plane directly. Even more can be applied with minor modifications.
We trust special relativity because it is connected to – and consistent with – a huge number of things that are already well understood. First, in many cases it gives a unified explanation of things that would otherwise have to be explained separately: unification of space and time, unification of energy and momentum, unification of electricity and magnetism, et cetera. Secondly, it explains things that would otherwise be hard to explain at all, such as the constancy of the speed of light. Thirdly, it provides a firm foundation for further developments, including general relativity and relativistic quantum mechanics.

* Contents

1 Executive Summary

2 Introduction

3 Principles and Applications

3.1 Straight-Line Motion in Spacetime

3.2 Relativity

3.3 Spacetime

3.4 Coordinate Systems and Rotations

3.5 Application : Kinetic Energy

3.6 Better and Better Approximations

3.7 Remark : The Kinematic Significance of the Rest Energy

3.8 Application : How to Make Antimatter (Vector Analysis)

3.9 Components versus Invariants

3.10 Application : How to Make Antimatter (Graphical Analysis)

3.11 Application : Muon Lifetime

3.12 Some Trigonometry

3.13 Orthogonality in Spacetime

3.14 Fast-Moving Particles : Speed, Momentum, and Energy

3.15 Application: Relativistic Doppler Shift and Aberration

3.15.1 Low-Speed Case

3.15.2 High-Speed Case

3.15.3 General Case

3.15.4 Transverse Components

3.16 Long, Steady Acceleration

3.17 Steady Acceleration : Additional Discussion

3.18 Breakdown of Simultaneity at a Distance

3.19 Application: GPS

3.20 Arc Length, Proper Time, and Proper Length

3.21 Various Ways to Compute the 4-velocity

3.22 Classical Velocity, 4-velocity, 4-momentum, et cetera

3.23 Invariance ± Conservation

3.24 Photons at Rest, Or Not

4 Great Quotes

4.1 Galileo : Relativity (1632)

4.2 Minkowski : Spacetime (1908)

5 Tactics

6 Some Trigonometric Identities – Applied to Relativity

7 Dirty Laundry

8 References

2 Introduction

You presumably were taught that velocity is proportional to the first power of momentum: «v = p/m».
Except when it isn’t: If the momentum is very large, v = c. The velocity is independent of momentum.
You presumably were taught that kinetic energy is proportional to the second power of momentum: «E_K = ½p²/m».
Except when it isn’t: When the momentum is very large, «E_K = |p|c», which is only first order in «|p|».
You presumably learned that in nuclear reactions, we sometimes find that part of the energy is equal to mc².

You probably learned all five of those things separately.

Now suppose that you could learn one simple theory that explains all five of those things together. It shows that those five things are mutually consistent and not exceptional ... including the low-speed limit, the high-speed limit, and everything in between. It explains all that and lots more besides.

Well, we have such a theory. It’s called special relativity. It gives a unified understanding of many things that would otherwise have to be learned separately.

It unifies space and time.
It unifies momentum and energy.
It unifies low-speed kinematics with high-speed kinematics.
It unifies rest-energy with mass.
It unifies electricity and magnetism.
It explains why the speed of light is the same in all reference frames.
It lays the groundwork for further developments, including general relativity and relativistic quantum mechanics.
et cetera.

Most remarkably, it does all of this using only one tool: non-Euclidean geometry and trigonometry.

Note: Many of the expressions in this section have been written in scare quotes «...», because they are valid only in the non-relativistic approximation. They should not be taken as gospel. In particular, the non-relativistic «p» used here must not be confused with the 4-vector p used in the rest of this document. The latter is much more useful.

3 Principles and Applications

Applying the ideas of special relativity is more interesting than deriving them. The goal is to get to some applications as soon as possible, but first we need to mention a couple of fundamental principles.

If you are interested in a more deductive approach, see reference 1 and reference 2.

3.1 Straight-Line Motion in Spacetime

Let’s start with a super-simple example. The laws of physics say that a free particle moves in a straight line at uniform velocity. This is called Newton’s first law, although the idea itself was clearly stated and used by Galileo several decades earlier.

Figure 1 shows the motion of a particle, plotting Y versus X. Obviously this is not a free particle. The fact that the motion is non-straight tells us the particle must be subject to some external force.

Figure 2 is harder to interpret. We can see that the particle is moving in a straight line, but we cannot determine from this figure whether it is moving with a uniform velocity.


Figure 1: Curved Motion in Space		Figure 2: Seemingly Straight Motion in Space

Figure 3 makes things more explicit. The magenta curve shows Y versus X, while the red curve shows X versus T and the blue curve shows Y versus T. We can see that X is a non-straight function of T, and also Y is a non-straight function of T.

Similarly, figure 4 is unambiguous. The particle is evidently accelerating in a straight line through space. When we look at it in spacetime, we see that X is a non-straight function of T, and also Y is a non-straight function of T. The dots in all these curves are equally spaced in time, which is another way of visualizing the time-dependence.


Figure 3: Curved Motion in Spacetime		Figure 4: Another Curved Motion in Spacetime

We can visualize things even more clearly using computer graphics. However, the javascript animation is not included in this version of the document.

The spacetime viewpoint gives us a very simple, very elegant statement of the first law of motion: A free particle moves in a straight line through spacetime.

We see that in figure ??, the magenta particle’s true motion – the spacetime motion – is curved, even though the two-dimensional shadow is straight. We conclude that this is not a free particle, because its motion through spacetime is not straight.

We should take the hint: All physics is spacetime physics.

Tangential remark: We use straight-line motion to recognize free particles. We do not need free particles to define what we mean by straight. There is a perfectly good, fundamental geometrical definition of straight, as explained in reference 3.

3.2 Relativity

Here is Big Idea #1(a): Galileo’s principle of relativity: The laws of physics are invariant with respect to motion. That is, if you shut yourself up in a room in a ship, you cannot tell the difference between a stationary ship and a ship that is undergoing uniform straight-line motion ... assuming you are truly isolated from any outside influences. For a fuller statement of this principle, see section 4.1.

Here is Big Idea #1(b): Rotational invariance: The laws of physics are invariant with respect to rotation. That is, if you shut yourself up in a room in a ship, you cannot tell which direction is which ... assuming you are truly isolated from any outside influences.

Here is Big Idea #1(c): Locality: The laws of physics depend only on what is happening in the immediate neighborhood of here and now. That is to say, they do not depend on far-distant places or far-distant times.

3.3 Spacetime

Here is Big Idea #2(a): Spacetime is four-dimensional. Many a thing you previously thought of as being a scalar or vector living in three dimensions is really just a part of a vector or bivector living in four dimensions.

This starts with the idea of a position vector, but it does not end there. Given that the position is four dimensional, it should come as no surprise that the velocity and acceleration are also four dimensional. Similarly, given that the velocity is four dimensional, it should come as no surprise that the momentum is four dimensional.

You don’t need to visualize all four dimensions at once. At the level of introductory special relativity, it suffices to visualize the dimensions two at a time. For relativistic motion in a straight line, it suffices to understand the tx plane. We can draw nice two-dimensional pictures of that.

This is important, because most people – even professional physicists – have a hard time visualizing rotations in three dimensions, let alone four.

People like to say that time is the fourth dimension, but that’s only true for position vectors. Whereas the fourth component of the position vector is called the time, the fourth component of the momentum vector is called the energy. We can summarize this as follows:

D=3				D=4
t	and	[x, y, z]	→	[t, x, y, z]	= unified time and space
E	and	[p_x, p_y, p_z]	→	[E, p_x, p_y, p_z]	= unified energy and momentum

(1)

In the previous equation, we have chosen to measure things in units such that the speed of light comes out to be c=1. More generally, we can stick in the factors of c explicitly:

D=3				D=4
t	and	[x, y, z]	→	[ct, x, y, z]	= unified time and space
E	and	[p_x, p_y, p_z]	→	[E/c, p_x, p_y, p_z]	= unified energy and momentum

(2)

If you are wondering why the timelike component of the position involves a factor of c, while the timelike component of the momentum involves a factor of 1/c, don’t worry about it too much. There are no fundamental issues here.

Partly it’s a historical accident. People first learned about space and time and momentum and energy separately, long before special relativity gave a unified view of such things. Sometimes the names are not as logical as they could be, even though the physics is perfectly logical. You can verify that the factors of c in equation 2 make sense from a dimensional-analysis point of view.
The definitions of E and t are partly motivated by convenience. At slow speeds (slow compared to the speed of light), it is easier to measure t (with a clock) than to measure ct (with a ruler). Similarly, at slow speeds it is easier to measure E than E/c. (By way of analogy, the international system of units defines the liter, which is a convenient unit of volume, even though the “natural” SI unit of volume would be the cubic meter.)

Here is Big Idea #2(b): The geometry of spacetime is non-Euclidean.

In three dimensions, in any particular reference frame, we can always construct three basis vectors x̂, ŷ, and ẑ.

In four dimensions, in any particular reference frame, we can always construct four basis vectors t̂, x̂, ŷ, and ẑ.

These three vectors are normalized as follows:

x̂·x̂	=	ŷ·ŷ	=	ẑ·ẑ	=	1

(3)

These four vectors are normalized as follows:


t̂·t̂	=	−1					(4a)

x̂·x̂	=	ŷ·ŷ	=	ẑ·ẑ	=	1	(4b)

The minus sign that appears in equation 4a is the only thing that makes the fourth dimension different from the other three. You could deduce all of special relativity just using the two big ideas presented here. We’re not going to take a purely deductive approach, but we could if we wanted to.

Surely you already knew that the fourth dimension is not exactly the same as the other three. Now you know exactly how different it is ... and also how similar it is.

In three dimensions, the basis vectors are of course mutually orthogonal:

		x̂·ŷ	=	x̂·ẑ	=	0
				ŷ·ẑ	=	0

(5)

In four dimensions, the basis vectors are likewise mutually orthogonal:

t̂·x̂	=	t̂·ŷ	=	t̂·ẑ	=	0
		x̂·ŷ	=	x̂·ẑ	=	0
				ŷ·ẑ	=	0

(6)
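As a concrete illustration, the four-dimensional dot product defined by equation 4 and equation 6 can be written out in a few lines of Python. This is only a sketch: the names dot and BASIS are ours, the components are ordered [t, x, y, z], and we assume units where c=1.

```python
# Illustrative sketch; the names dot and BASIS are not from the text.
# Components are ordered [t, x, y, z], in units where c = 1.

def dot(a, b):
    """Spacetime dot product, with the minus sign on the timelike term (equation 4a)."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

BASIS = {
    "t": (1, 0, 0, 0),
    "x": (0, 1, 0, 0),
    "y": (0, 0, 1, 0),
    "z": (0, 0, 0, 1),
}

# Print the full table of dot products between basis vectors.
for name_a, a in BASIS.items():
    for name_b, b in BASIS.items():
        print(f"{name_a}.{name_b} = {dot(a, b):+d}")
```

The printed table has −1 for t·t, +1 for the other three diagonal entries, and 0 everywhere else. Once these sixteen numbers are fixed, the dot product of any two 4-vectors follows by linearity.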

3.4 Coordinate Systems and Rotations

Let’s review some things we know about coordinate systems, and the effect of rotating a coordinate system.

Consider an object such as the ruler shown in figure 5. It exists as a physical object. Its existence is independent of whatever coordinate systems, if any, we choose to use. It is what it is.

Figure 5: Just a Ruler

Without changing any properties of the object, we can always impose a coordinate system. This allows us to assign x and y coordinates to various points in space. Indeed we have a choice of different coordinate systems, as you can see by contrasting figure 6 with figure 7.

Figure 6: Ruler Plus Red Coordinate System

The ruler is aligned with the contours of constant y in the red coordinate system, so we say it has zero slope relative to the red coordinate system. Meanwhile, the same ruler has a slight slope relative to the blue coordinate system.

Figure 7: Ruler Plus Blue Coordinate System

We can play the same game in spacetime. Rather than plotting y versus x, let’s plot x versus t. In figure 8, the green object should not be thought of as a ruler, because it does not measure x, y, or z. Instead let’s call it a log. The log is a chronological record of the ticks of the clock.

Figure 8: Clock Plus Red Coordinate System

In figure 8, the motion of the clock is aligned with a contour of constant x_@R, so we say the clock is stationary relative to the red coordinate system. It just sits in one location and gets later. Meanwhile, figure 9 shows the same clock relative to a different coordinate system. We can see that as time progresses, the clock’s x_@B coordinate increases, so we say that the clock is moving relative to the blue coordinate system.

Figure 9: Clock Plus Blue Coordinate System

So far, the diagrams in this section haven’t told us much we didn’t already know. The idea of plotting x versus t as we have done in figure 9 is completely standard. It’s something you should have seen in 8th grade (if not before), and seen many times since then. The only thing that is even slightly special is that rather than showing the x-axis we have shown the contours of constant x. Similarly, rather than showing the t-axis we have shown the contours of constant t. This is a useful tactic. It doesn’t make much difference in figure 9, but it makes figure 8 considerably easier to interpret. For a fuller explanation of why contours are better than axes, see reference 4.

Tactics aside, the main strategic reason for showing these plots is to apply some new language to a familiar situation. We are taking seriously Big Idea #2(a), the idea that spacetime is four dimensional. This leads us to the realization that velocities are intimately related to rotations in four dimensions.¹

This realization is important, because there is something special about rotations: The dot product between two vectors is unchanged by a rotation. Indeed, once you have a well-defined dot product, you can use that to define what you mean by rotation, and the resulting rotation is guaranteed to leave dot products invariant. In this document, rather than deriving the rotation law, we will simply assert it and then explain why it makes sense. If you are interested in the derivation, see reference 1.

In D=3, you can specify three rotation angles. These correspond to rotation in the xy plane, the zx plane, and the yz plane ... also known as yaw, pitch, and roll (respectively).

In D=4, you can specify six rotation angles. This includes three spacelike rotations, namely rotations in the xy plane, the zx plane, and the yz plane. It also includes three timelike rotations, namely rotations in the tx plane, the ty plane, and the tz plane.

A rotation in three dimensions is sometimes called a twist.

A timelike rotation (i.e. a change in velocity) is sometimes called a boost. The angle in the tx plane is sometimes called the rapidity. The rapidity is sometimes denoted by ρ, but in this document we use plain old θ, just like any other angle.

Our notion of velocity is made more precise in section 3.12, section 3.21, and section 3.22 ... but for now all we need are:

an intuitive notion of velocity, namely change of position per unit time;
the idea that velocity in the x direction is intimately related to a rotation in the tx plane, as illustrated in figure 9; and
the idea that any rotation leaves invariant the dot product as defined by equation 4 and equation 6.
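The third point can be checked numerically. The sketch below assumes the standard hyperbolic form for a rotation in the tx plane (cosh and sinh taking the place of cos and sin); this document has not derived that form at this point, so treat it as an assumption. The function names and the sample vectors are ours.

```python
import math

def dot(a, b):
    """Spacetime dot product for components ordered [t, x, y, z], with c = 1."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

def boost_tx(vec, theta):
    """Rotate a 4-vector by the angle (rapidity) theta in the tx plane.

    Assumption: the standard hyperbolic form of the rotation.
    """
    t, x, y, z = vec
    ch, sh = math.cosh(theta), math.sinh(theta)
    return (ch * t + sh * x, sh * t + ch * x, y, z)

# Two arbitrary sample vectors and an arbitrary rotation angle.
a = (2.0, 0.5, -1.0, 3.0)
b = (1.0, 4.0, 0.25, -2.0)
theta = 0.25

before = dot(a, b)
after = dot(boost_tx(a, theta), boost_tx(b, theta))
print(before, after)
```

The two printed numbers agree to within rounding error. The components of each vector change under the rotation, but the dot product does not.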

3.5 Application : Kinetic Energy

At this point, we already know enough special relativity to do some interesting things. As a first application, let’s see what happens to the momentum and energy of a moving object.

Suppose we have a particle moving through space. We arrange the red reference frame so that the particle has no x-, y-, or z-velocity as measured in this frame. This is the situation shown in figure 8. The particle just sits in one place and gets later.

As always, any vector can be expanded in terms of its components. In the red reference frame, we expand the particle’s position vector as [t, x, y, z]_@R. The 4-velocity u is the rate-of-change of position with respect to proper time, τ, which is defined to be the time as measured by a clock comoving with the particle. In this situation, τ is identical to the t_@R component. When calculated using the red reference system, the 4-velocity components are particularly simple:

u		=		[1, 0, 0, 0]_@R
		=		four-velocity in its own rest frame

(7)

Compare equation 63.

It must be emphasized that the 4-velocity is never zero. This may seem odd, but it turns out to be useful. For one thing, it makes the statement of various conservation laws much more elegant; see reference 5. For another, it permits a consistent view of velocity, momentum, and energy (including rest energy) as discussed in section 3.7.

Let’s be clear: When we say a particle is “at rest” in a given coordinate system, it means the x, y, and z components of its velocity are zero. The 4-velocity as a whole is never zero. When “at rest”, the particle is moving toward the future at a rate of 60 minutes per hour.

If we stick in the explicit factors of c, we find u = [c, 0, 0, 0]_@R.

The four-vector momentum could hardly be simpler. It is just the mass times the four-velocity:

p	=	m u

(8)

In all cases, we define the gorm of a vector to be the dot product of the vector with itself.

As always, if we know how to calculate dot products involving the basis vectors, as in equation 4 and equation 6, we can calculate any dot product whatsoever. Just expand each vector as a linear combination of basis vectors, take the dot product, and turn the crank. As a consequence:

For a three-dimensional vector with components x, y, and z, the gorm is equal to x² + y² + z².

For a four-dimensional vector with components t, x, y, and z, the gorm is equal to −t² + x² + y² + z², with an important minus sign in front of the t² term. This minus sign is necessary. It is an inescapable consequence of the minus sign in equation 4a.

In D=3, we recognize the gorm as the square of the length.

In D=4, the gorm is in some cases related to the square of a length, and in other cases related to the square of an elapsed time, but let’s not worry about that just yet.

In D=3, the gorm is unchanged by rotations.

In D=4, the gorm is unchanged by rotations. It is unchanged by all six rotations, including the three timelike rotations as well as the three spacelike rotations.

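Evaluating the dot products with the red-frame components from equation 7, and using the fact that the 4-momentum is the mass times the 4-velocity, we find that the gorms of the 4-velocity (u) and the 4-momentum (p) are always: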

u·u		=		−1		(i.e. −c²)
p·p		=		−m²		(i.e. −m²c²)

(9)

Let’s calculate the same thing again using a different coordinate system, such as the blue coordinate system shown in figure 9.

We know from equation 1 that p = [E, p_x, p_y, p_z]_@B. When we calculate the gorm in terms of these components, we find it is equal to −E² + p_x² + p_y² + p_z². Meanwhile, the gorm is still equal to −m². We know this because we calculated it using the red coordinate system, and we know that the gorm is invariant with respect to rotations. Combining these two expressions for the gorm, we obtain:

−m²	=	−E² + p_x² + p_y² + p_z²

(10)

E²		=		m² + p_x² + p_y² + p_z²
(E/c)²		=		(mc)² + p_x² + p_y² + p_z²

(11)

On the last line we have stuck in the explicit factors of c.
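Here is a numerical version of the same argument, as a sketch under stated assumptions: units where c=1, an arbitrary mass and rotation angle, and the standard hyperbolic form for a rotation in the tx plane. The names gorm, boost_tx, p_red, and p_blue are ours.

```python
import math

def gorm(p):
    """Dot product of a 4-vector [E, px, py, pz] with itself (c = 1)."""
    E, px, py, pz = p
    return -E * E + px * px + py * py + pz * pz

def boost_tx(vec, theta):
    """Rotate a 4-vector by theta in the tx plane (assumed hyperbolic form)."""
    t, x, y, z = vec
    ch, sh = math.cosh(theta), math.sinh(theta)
    return (ch * t + sh * x, sh * t + ch * x, y, z)

m = 1.5                          # arbitrary mass
p_red = (m, 0.0, 0.0, 0.0)       # 4-momentum components in the rest (red) frame
p_blue = boost_tx(p_red, 0.8)    # components in a frame where the particle moves

E, px, py, pz = p_blue
print("gorm, red frame: ", gorm(p_red))
print("gorm, blue frame:", gorm(p_blue))
print("E^2:             ", E ** 2)
print("m^2 + p_xyz^2:   ", m ** 2 + px ** 2 + py ** 2 + pz ** 2)
```

Both gorms come out equal to −m², and the last two lines agree with each other, which is equation 11 with c=1.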

We can simplify the equations by introducing the 3-momentum, p_xyz. In any particular reference frame, it is just the spatial part of the 4-momentum. That is:

If
	p	=	[E, p_x, p_y, p_z]	(4 dimensions)
then
	p_xyz	:=	[p_x, p_y, p_z]	(3 dimensions)

(12)

Combining equation 12 with equation 11, we can write:

E²		=		m² + p_xyz²
(E/c)²		=		(mc)² + p_xyz²

(13)

On the last line, we have stuck in the explicit factors of c. As always, p_xyz² is shorthand for the dot product p_xyz·p_xyz.

Let’s examine equation 13 more closely. The first thing to do is to draw the graph of E versus p_xyz. It’s a hyperbola. Figure 10 shows the case where m=1 and c=1. For simplicity, the graph assumes the 3-momentum has no y or z components. (We can always rotate our point of view to make this happen.) The small black circle in figure 10 represents 1 radian. Note that 1 radian corresponds to a reduced velocity (v) equal to 76% of the speed of light.

Figure 10: Dispersion Relation
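The 76% figure can be checked directly. The sketch below assumes the standard hyperbolic parametrization of the curve in figure 10, namely E = m cosh θ and p_x = m sinh θ, and takes the velocity to be p_x/E in units where c=1. Those relations are not stated in this section, so treat them as assumptions.

```python
import math

theta = 1.0                  # rotation angle in the tx plane, in radians
m = 1.0                      # same mass as in figure 10

E = m * math.cosh(theta)     # energy at this point on the hyperbola
p_x = m * math.sinh(theta)   # momentum at this point on the hyperbola
v = p_x / E                  # velocity as a fraction of c; equals tanh(theta)

print(f"{v:.4f}")            # prints 0.7616
```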

One thing that we notice immediately is that the energy is equal to mc² when the particle is at rest and not otherwise. Let’s be clear: The famous equation E=mc² is very widely misunderstood. It would be better to rewrite it to emphasize that mc² corresponds to only part of the energy, namely:

E₀	:=	mc²
	=	rest energy	(when m≠0)
	=	E_rest	(when m≠0)

(14)

This E₀ is more-or-less universally called the rest energy.

This makes perfect sense for particles that have nonzero mass. When the particle is at rest, its total energy E is equal to the rest energy E₀.

For a massless particle such as a photon, calling E₀ the “rest energy” is a bit of a misnomer. A running-wave photon has a well defined mass, namely m=0 which means E₀=0. However, strictly speaking, we ought not call this the rest energy because the photon is never at rest. Its total energy E is never equal to E₀.

On the scale of things, this is not a serious problem. See section 3.23 and section 3.24 for some related discussion.

Actually, we hardly need a name for E₀ at all. Since we are using equation 14 to define E₀, the equation is automatically and tautologically true. Therefore E₀ is the mass. Specifically, it is the mass measured in energy units. If we used sensible units, measuring distance in the spacelike directions using the same units we use for the timelike direction, then c would be equal to 1, and equation 14 would even more clearly tell us that E₀ is the mass.

Because it is a tautology, equation 14 is not terribly interesting. We are far more interested in equation 13, which tells us how the rest energy (aka mass) is related to the plain old total energy, E.

One should never say that mass is “equivalent” to energy, because “equivalent” is much too strong a word. An equivalence relation is reflexive, symmetric, and transitive; for details see reference 6. One would not say that Lake Baikal is equivalent to water, because some of the world’s water is in Lake Baikal but some is not. By the same token, one should never say that mass is equivalent to energy, because some of the world’s energy is in the form of mass but some is not.

If you want to say mass corresponds to a subset of the energy, that’s fine, in accordance with equation 14. Just don’t leave out the word “subset”. For any single particle, the total energy E (in your chosen frame) is equal to the rest energy mc² if and only if the particle is at rest (in that frame). The case of multiple particles is discussed in section 3.23 and section 3.24.

When the particle is moving slowly, we can learn some amusing things by expanding equation 11 to lowest order.

So, in our chosen frame, we have:

E²	=	m² + p_x² + p_y² + p_z²
	=	m² + p_xyz·p_xyz
	=	m²[1 + (p_xyz/m)²]

(15)

E	=	m √(1 + (p_xyz/m)²)		(16a)
E_slow	≈	m [1 + ½(p_xyz/m)²]	(Taylor expansion)	(16b)
	≈	m + ½ p_xyz²/m		(16c)
i.e.		mc² + ½ p_xyz²/m		(16d)
i.e.		(rest energy) + (classical kinetic energy)		(16e)

In equation 16d, we have stuck in the explicit factors of c.

It is well known in classical physics that the kinetic energy is ½p_xyz²/m. Special relativity is telling us that classical physics can be considered a lowest-order approximation to the true spacetime physics.

Spacetime physics is the true physics.
It contains classical physics as a special case.
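To get a feel for how good the lowest-order approximation is, we can compare the exact energy from equation 13 against the Taylor-expanded form m + ½ p_xyz²/m. This is a minimal sketch in units where c=1; the function names and the sample momenta are ours.

```python
import math

def exact_energy(m, p):
    """Exact energy from E^2 = m^2 + p^2, with c = 1."""
    return math.sqrt(m * m + p * p)

def slow_energy(m, p):
    """Lowest-order approximation: rest energy plus classical kinetic energy."""
    return m + 0.5 * p * p / m

m = 1.0
for p in (0.001, 0.01, 0.1, 0.5):
    exact = exact_energy(m, p)
    approx = slow_energy(m, p)
    print(f"p = {p:<6} exact = {exact:.8f}  approx = {approx:.8f}  "
          f"error = {approx - exact:.2e}")
```

The error shrinks roughly as the fourth power of the momentum, which is what we expect from the first term neglected in the Taylor expansion.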

There are other ways of expressing this result, some of which will turn out to be useful later. To proceed, we need to introduce the classical velocity v. For present purposes, it suffices to note that p_xyz is the classical momentum, and the classical velocity v is approximately equal to p_xyz/m for a slow-moving particle. As discussed in section 3.6, this is not the official definition of v, but it is a good approximation at low speeds, which is the regime we are considering.

E_slow	≈	mc² + ½ p_xyz²/m	passable approximation
	≈	mc² + ½ p_xyz·v	recommended approximation
	≈	mc² + ½ m v²	passable approximation

(17)

The relation p_xyz ≈ mv is an excellent approximation for a slowly moving particle. It is correct to first order, and indeed exact to second order, as discussed in section 3.6.

We can interpret equation 17 as saying that the particle has a kinetic energy of ½ p_xyz·v, plus a non-kinetic energy (rest energy) of mc². The kinetic energy depends on the 3-momentum, while the rest energy does not.

Now let’s consider the opposite extreme, namely photons or other particles that have little or no mass and/or very large momentum, such that the momentum terms dominate on the RHS of equation 11. We see immediately that in this limit

E_fast		≈		p_xyz·v
		≈		|p_xyz| c

(18)

For a fast-moving massive particle, these expressions are true to a good approximation. We have used the fact that the particle’s speed is very nearly the speed of light.

For a massless particle such as a photon, these expressions are exact. The particle’s speed is equal to the speed of light.

Comparing equation 18 with equation 17 is interesting. We see that the slow-moving particle has a kinetic energy equal to a half p_xyz·v, whereas the fast-moving particle has a kinetic energy equal to a whole p_xyz·v. This may seem peculiar, but it is in fact correct.

In ordinary situations, where the speeds are very small compared to the speed of light, the momentum and kinetic energy are easy to measure. The results agree with equation 17. We recognize ½p_xyz·v as the classical low-speed result, which is very well attested.
Photon momentum can be measured using a Nichols radiometer (not to be confused with a Crookes radiometer) and the energy is even easier to measure, using a bolometer or whatever. The results agree with equation 18.

The nice thing about special relativity is that it allows us to simultaneously understand the slow-moving particles and the fast-moving particles and everything in between. In particular:

In the low-speed limit, special relativity doesn’t tell us anything about the kinetic energy that we didn’t already know. It firmly predicts that the kinetic energy is ½ p_xyz·v, in agreement with the classical low-speed result. This is an example of the correspondence principle. It serves as a check on the theory. If special relativity did not agree with classical physics in this regime, we would know something was broken.
Similarly, in the high-speed limit, special relativity doesn’t tell us much beyond what we already knew. This is another check on the theory, another application of the correspondence principle.
The fact that special relativity gives us a unified view of both limits and everything in between is a nifty result that we get from special relativity, and could not have gotten otherwise. You can see immediately from figure 10 and/or figure 11 that at low speeds, the kinetic energy is quadratic in the momentum (½p_xyz²/m) while at high speeds the kinetic energy is linear in the momentum (p_xyz·v). These are just two parts of the same curve.

Figure 11 is similar to figure 10, but with some additional detail. The dark green curve, as before, represents the case where m=1, while the red curve represents a less-massive particle, m=0.2.

Figure 11: Dispersion Relations

You can see that:

At any particular mass, as |p_xyz| increases, the energy comes closer to the asymptote, E=|p_xyz|c.
At any particular kinetic energy, as the mass decreases, the total energy comes closer to the asymptote, E=|p_xyz|c.
At any particular momentum, as the mass decreases, the total energy comes closer to the asymptote, E=|p_xyz|c.
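These three trends can be checked numerically. The sketch below works in units where c=1 and measures the gap between the total energy and the asymptote, E − |p_xyz|. For the middle item it assumes that "kinetic energy" means the total energy minus the rest energy. The function names and sample values are ours.

```python
import math

def gap_at_fixed_momentum(m, p):
    """E - |p| for a particle of mass m and momentum p (c = 1)."""
    return math.sqrt(m * m + p * p) - abs(p)

def gap_at_fixed_kinetic(m, kinetic):
    """E - |p| for mass m and kinetic energy E - m (assumed definition)."""
    E = kinetic + m
    p = math.sqrt(E * E - m * m)
    return E - p

print("fixed mass m = 1, increasing momentum:")
for p in (1.0, 10.0, 100.0):
    print(f"  p = {p:<6} gap = {gap_at_fixed_momentum(1.0, p):.5f}")

print("fixed kinetic energy = 1, decreasing mass:")
for m in (1.0, 0.2, 0.04):
    print(f"  m = {m:<6} gap = {gap_at_fixed_kinetic(m, 1.0):.5f}")

print("fixed momentum p = 1, decreasing mass:")
for m in (1.0, 0.2, 0.04):
    print(f"  m = {m:<6} gap = {gap_at_fixed_momentum(m, 1.0):.5f}")
```

In each of the three groups, the gap shrinks from one line to the next.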

The small black circles in figure 11 indicate different rotation angles in the tx plane, from 0 to 1 radian in steps of 1/4 radian.

The dashed magenta curve in figure 11 represents the recommended approximation presented in equation 17, namely E≈mc²+½p_xyz·v. You can see that it is a very good approximation at moderate speeds, and even at the highest speeds it is never off by more than a factor of 2.

The other approximations presented in equation 17 are just as good when the speed is small, but not otherwise. At high speeds, E≈mc²+½p_xyz²/m is a woeful overestimate, while E≈mc²+½mv² is a woeful underestimate.
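The comparison among the three approximations in equation 17 can be tabulated as follows. This is a sketch in units where c=1. It assumes the standard hyperbolic relations E = m cosh θ, |p_xyz| = m sinh θ, and |v| = tanh θ, which this section has not yet derived. The function name, dictionary labels, and sample angles are ours.

```python
import math

def energies(theta, m=1.0):
    """Exact energy and the three approximations of equation 17 (c = 1)."""
    E = m * math.cosh(theta)    # exact total energy (assumed relation)
    p = m * math.sinh(theta)    # magnitude of the 3-momentum (assumed relation)
    v = math.tanh(theta)        # classical speed (assumed relation)
    return {
        "exact": E,
        "m + p^2/(2m)": m + 0.5 * p * p / m,
        "m + p.v/2": m + 0.5 * p * v,
        "m + m v^2/2": m + 0.5 * m * v * v,
    }

for theta in (0.1, 1.0, 3.0):
    row = energies(theta)
    print(f"theta = {theta}")
    for label, value in row.items():
        ratio = value / row["exact"]
        print(f"  {label:<14} {value:10.4f}   ratio to exact {ratio:.3f}")
```

At θ=0.1 all three approximations are nearly indistinguishable from the exact value. At θ=3 the p²/(2m) form is several times too large, the ½mv² form is several times too small, and the ½p·v form stays within a factor of 2.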

Figure 11 is in some ways related to figure 9. The relationship becomes more clear if we transpose figure 9, so that x increases horizontally and time increases vertically, as shown in figure 12.

Figure 12: Clock Plus Blue Coordinate System

In figure 12 as in figure 9, the rotation angle is 1/4 of a radian.

3.6 Better and Better Approximations

Let’s consider various approximations to the 4-momentum, in the case where the speed |v| is not too large.

To zeroth order in |v|, the 4-momentum is just [E,0,0,0] = [mc²,0,0,0].
To first order in |v|, we need to include the classical momentum, p_xyz = mv = mu_xyz, so that the 4-momentum is [mc², p_x, p_y, p_z].
To second order in |v|, we need to include the kinetic energy, ½p_xyz·v, so that the 4-momentum is [mc² + ½p_xyz·v, p_x, p_y, p_z].
Section 3.5 does not account for terms that are third order (or higher) in |v|. If you want the mathematical details on this, see equation 73.

In particular, the approximation v ≈ u_xyz is correct to first order, and indeed is exact to second order. The lowest-order contribution to the difference (v−u_xyz) is third order in |v|.
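A quick numerical check of the third-order claim, as a sketch only: we assume the standard relations u_x = sinh θ and v = tanh θ in units where c=1, which are developed later in the document rather than in this section. The function name is ours.

```python
import math

def third_order_ratio(theta):
    """(u_x - v) / v^3, assuming u_x = sinh(theta) and v = tanh(theta), c = 1."""
    u_x = math.sinh(theta)
    v = math.tanh(theta)
    return (u_x - v) / v ** 3

for theta in (0.2, 0.1, 0.05, 0.025):
    v = math.tanh(theta)
    print(f"v = {v:.5f}   (u_x - v) / v^3 = {third_order_ratio(theta):.5f}")
```

The ratio settles toward a constant (about one half) as the speed decreases. That is the signature of a difference that is third order in |v|: it has no first-order or second-order piece.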

3.7 Remark : The Kinematic Significance of the Rest Energy

You may have heard about the importance of the rest energy E₀ = mc² in situations where the mass is changing, such as in nuclear reactions. We will discuss an example of this in section 3.8.

However, before we delve into that, let’s consider the significance of mc² in situations where the mass is not changing, such as the kinetic-energy calculation in section 3.5. In such a situation, you might ask why we don’t simply ignore the rest energy. The answer is that we need it for consistency.

The existence of the rest energy mc² makes the kinetic energy ½p_xyz·v consistent with our interpretation of velocity as a rotation in the tx plane. Specifically:

As for your p_x direction: The momentum mv_x can be understood as a little bit of the rest energy mc² that is peeking around the corner, i.e. that is being projected onto the p_x direction. Ditto for the p_y and p_z directions.
As for your E direction: To first order, the projection of the 4-momentum onto your E direction is unchanged by the fact that the particle is moving. To second order, the energy you measure – the projection of the energy onto your reference frame – is increased, and this increase is just the classical kinetic energy, ½mv².

It is ironic that the rest energy is not directly observable when the particle is at rest, but becomes visible when the 4-momentum is slightly rotated.

This is related to the reason why we write the 4-velocity of a particle at rest as u = [1, 0, 0, 0] instead of [0, 0, 0, 0]. We want to be able to write p = m u as an equation between 4-vectors. Note the correspondence between the energy/momentum 4-vector and the 4-velocity, when we rotate things by an angle θ in the tx plane. To lowest order:

u = [1, 0, 0, 0]		→		[1 + ½θ², θ, 0, 0]
p = [m, 0, 0, 0]		→		[m + ½p²/m, p, 0, 0]

(19)

where (to lowest order) the x-component of the 4-velocity is u_x = θ, and (to all orders) the momentum is p = m u.
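As a quick numerical check on equation 19, here is a minimal sketch in plain Python, in units where c = 1. It compares the exact components [cosh(θ), sinh(θ)] (see section 3.12) against the lowest-order expressions [1 + ½θ², θ]. The function names are made up for this illustration.

```python
import math

def four_velocity_exact(theta):
    """Exact [t, x] components of the 4-velocity at rapidity theta (c = 1)."""
    return (math.cosh(theta), math.sinh(theta))

def four_velocity_low_order(theta):
    """Lowest-order approximation from equation 19: [1 + theta^2/2, theta]."""
    return (1.0 + 0.5 * theta**2, theta)

for theta in (0.01, 0.1, 0.25):
    ut, ux = four_velocity_exact(theta)
    at, ax = four_velocity_low_order(theta)
    print(f"theta={theta}: exact=({ut:.6f}, {ux:.6f})  approx=({at:.6f}, {ax:.6f})")
```

For small θ the two agree closely, and the discrepancy grows as θ increases, which is what "to lowest order" is meant to convey.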

3.8 Application : How to Make Antimatter (Vector Analysis)

There was a time, not so very long ago, when nobody had ever seen any antiprotons, and certain folks were highly motivated to build an accelerator that could make some. See reference 7 and reference 8.

The question for today is, how much energy must such an accelerator impart to the particles? For simplicity, assume we will accelerate a proton and smash it into a target containing a high density of stationary protons (e.g. liquid hydrogen).

There is an easy way to answer this question. This provides a wonderful illustration of the power of vectors in general, and four-vectors in particular. No math is required beyond high-school “Algebra I” plus the rule for taking dot products of 4-vectors. (See section 3.10 for another easy way of answering the same question.)

In order to get started, we need to understand what sort of reaction we are going to use. We have already decided on a proton/proton collision, so that tells us there will be two protons on the left-hand side of the reaction equation:

p + p ⇒ something (20)

There are all sorts of reactions that cannot possibly occur, because they would violate fundamental conservation laws such as conservation of charge, conservation of baryon number, or whatever. In particular, the following are ruled out:

p + p	⇒	p^–	(wrong)
p + p	⇒	p + p^–	(wrong)
p + p	⇒	p + p + p^–	(wrong)

(21)

where p stands for proton and p^– stands for antiproton.

The simplest possible reaction will be one that creates a proton/antiproton pair (and keeps the two protons we started with):

p + p ⇒ p + p + p^– + p (22)

Accelerators are hard to build, and we don’t want to make the accelerator much bigger than it has to be. Therefore, we don’t want to consider all possible versions of equation 22, but only the most energy-efficient versions. The minimum total energy will be achieved in the special case where the products of the reaction have the minimum kinetic energy. That means the products will not be moving relative to each other. This is fairly obvious when you think about it in the center-of-mass frame, as shown in figure 13.

Figure 13: Antiproton Production Reaction Sketch @ CM

Note that figure 13 is not intended to be quantitatively correct. We don’t know enough at this stage of the analysis to make a quantitatively correct diagram, but it is a good idea to make some sort of diagram anyway. Very often there is an iterative process:

Sketch some approximate diagrams.
The diagrams tell you what calculation to do.
The results of the calculation allow you to create diagrams that are more accurate. (See figure 23.)
And so on, iteratively.

In the lab frame, we will see the four product particles come flying out the backside of the target in a bundle, as shown in figure 14. In a later step, you can extract the antiproton from the bundle, perhaps by applying magnetic and/or electric fields.

Figure 14: Antiproton Production Reaction Sketch @ Lab

We can easily solve this problem by using 4-vectors. (See reference 9 for details on what we mean by “vector”.)

Let p_b be the energy/momentum four-vector for the incident beam particle. Similarly p_t for the target particle, and p_p for the bundle of products.

By conservation of energy and momentum (that is, conservation of 4-momentum), we have

p_b + p_t = p_p (23)

Squaring both sides we get

(p_b + p_t) · (p_b + p_t) = p_p · p_p (24)

Expanding we get

p_b² + p_t² + 2 p_b · p_t = p_p² (25)

We know many of the terms in this expression. For starters, we know that

p_b² = −m² (26)

where m is the mass of the incident particle, in accordance with equation 9. In this section, we have chosen to measure things in units such that c=1.

The correctness of equation 26 is obvious in the frame comoving with the incident particle. It must then be correct in all frames, since the gorm of any four-vector is invariant.

Similarly p_t² also equals −m².

Similarly p_p² must equal −(4m)². Don’t forget that the 4 gets squared.

Collecting results, we find

2 p_b · p_t		=		−(16−2) m²
p_b · p_t		=		−7 m²

(27)

All the equations to this point have been true in all frames. We now specialize to the lab frame. In the lab frame, the target is stationary, so its four-momentum has very simple components:

p_t = [m, 0, 0, 0]_{@ Lab} (28)

Combining the two previous equations and carrying out the dot product, we see that the timelike (energy) component of p_b must be 7m in the lab frame; that is:

p_b = [7m, ?, ?, ?]_{@ Lab} (29)

That tells us that in the lab frame, the incident particle must have a total energy of 7m. We can calculate the momentum, i.e. the spacelike components of equation 29 – see below – but we can answer the question without that.

Remember that the question asks how much energy must be supplied by the accelerator. The incident particle was born with 1m of energy, i.e. its rest energy, in accordance with equation 14 ... so the accelerator only needs to supply 6m.

E_K(required) = 6m (30)
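If you would like to check the bookkeeping numerically, here is a minimal sketch in plain Python. It assumes units where c = 1 and m = 1, uses the same sign convention as equation 26 (minus sign on the timelike term), and puts the beam along the x direction. The helper name minkowski_dot is made up for this illustration.

```python
import math

def minkowski_dot(a, b):
    """Dot product of 4-vectors [t, x, y, z] with a minus sign on the t term."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

m = 1.0                                   # proton mass; everything is in units of m
E_beam = 7.0 * m                          # total beam energy, from equation 29
p_beam = math.sqrt(E_beam**2 - m**2)      # spacelike part, from p_b . p_b = -m^2

p_b = [E_beam, p_beam, 0.0, 0.0]          # incident proton, lab frame
p_t = [m, 0.0, 0.0, 0.0]                  # stationary target proton (equation 28)
p_p = [b + t for b, t in zip(p_b, p_t)]   # bundle of products (equation 23)

print(minkowski_dot(p_b, p_t))            # compare with -7 m^2 in equation 27
print(minkowski_dot(p_p, p_p))            # compare with -(4m)^2
print(E_beam - m)                         # kinetic energy, compare with equation 30
```

The same few lines can be reused for less symmetrical reactions by changing the masses and the product bundle.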

Note: The Berkeley Bevatron was in fact designed to produce antiprotons. The design energy was very nearly equal to what we calculated in equation 30. Actually it was slightly less, because the designers were clever enough to not use a hydrogen target. They used copper. Protons in a non-hydrogenic nucleus are not stationary. Exclusion principle, orbitals, blah-de-blah. If you manage to hit a nucleon that is moving toward the incident beam, its kinetic energy contributes maybe 20% of the reaction energy.

3.9 Components versus Invariants

Let’s take a closer look at how the ruler lines up against the various coordinate systems.

It should be obvious from figure 15 that the ruler is 12 units long. It extends from x_@R = 2 to x_@R = 14, and it has no extent at all in the y_@R direction (since we are talking about the length, not the width).

It should be obvious from figure 16 that the ruler is 12 units long. It extends from x_@R = 2 to x_@R = 14, and it has no extent at all in the t_@R direction (since we have taken a snapshot at constant t_@R = 12).


Figure 15: Ruler x|y; Red Coordinate System		Figure 16: Ruler x|t; Red Coordinate System


Figure 17: Ruler x|y; Blue Coordinate System		Figure 18: Ruler x|t; Blue Coordinate System

It should be obvious on physical grounds that the ruler in figure 17 is 12 units long, since it’s the same ruler! Switching to a different coordinate system cannot possibly change the length of the ruler.

It should be obvious on physical grounds that the ruler in figure 18 is 12 units long, since it’s the same ruler! Switching to a different coordinate system cannot possibly change the length of the ruler.

We can also compute the length using figure 17, although this requires slightly more work. If you look closely at the figure, you can see that the ruler begins a little to the right of x_@B = 2 and ends a little to the left of x_@B = 14, so the x component is slightly less than 12 units. There is also a nonzero y component. Specifically, the components are:

Δx		=		12 cos(0.25)	=	11.6269
Δy		=		12 sin(0.25)	=	2.9688

(31)

We can also compute the length using figure 18, although this requires slightly more work. If you look closely at the figure, you can see that the ruler begins a little to the left of x_@B = 2 and ends a little to the right of x_@B = 14, so the x component is slightly greater than 12 units. There is also a nonzero t component. Specifically, the components are:

Δx		=		12 cosh(0.25)	=	12.377
Δt		=		12 sinh(0.25)	=	3.0313

(32)

When we account for both components we find that the length is indeed 12 units.

The relevant equation is:

(proper length)² = Δx² + Δy² (33)

The relevant equation is:

(proper length)² = Δx² − Δt² (34)

The minus sign that shows up in equation 34 is yet another manifestation of the minus sign that we first saw in equation 4.
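Here is a minimal sketch in plain Python that reproduces the numbers in equation 31 and equation 32, and then recombines the components using equation 33 and equation 34. The variable names are made up for this illustration.

```python
import math

angle = 0.25          # rotation angle (xy case) or rapidity (xt case), in radians
length = 12.0         # proper length of the ruler

# Rotation in the xy plane: blue-frame components (equation 31)
dx_xy = length * math.cos(angle)
dy_xy = length * math.sin(angle)
length_xy = math.sqrt(dx_xy**2 + dy_xy**2)     # equation 33, plus sign

# Rotation in the xt plane: blue-frame components (equation 32)
dx_xt = length * math.cosh(angle)
dt_xt = length * math.sinh(angle)
length_xt = math.sqrt(dx_xt**2 - dt_xt**2)     # equation 34, minus sign

print(f"xy: dx={dx_xy:.4f}  dy={dy_xy:.4f}  length={length_xy:.4f}")
print(f"xt: dx={dx_xt:.4f}  dt={dt_xt:.4f}  length={length_xt:.4f}")
```

In both cases the x component alone differs from 12, but the properly combined components give back the same ruler.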

When measuring the length of some object that is oriented at an arbitrary angle in the xy plane, you can’t just measure the x-component and call it quits. You have to account for the x and y components, both. The x_@B component is not the length.

When measuring the length of some object that is moving at an arbitrary rapidity in the x direction, you can’t just measure the x-component and call it quits. You have to account for the x and t components, both. The x_@B component is not the length.

This is a basic fact about the geometry of spacetime. We have already seen this in the context of momentum vectors. We used it to calculate the kinetic energy in section 3.5. The only thing that is new in this section is that we have emphasized the pictorial representation (not just the equations) and applied it to position vectors (not just momentum vectors).

A rotation in the xy plane guarantees that the x-component is less than or equal to the proper length. This has been understood in connection with perspective in painted artwork for many centuries. Artists call it foreshortening.

A rotation in the tx plane guarantees that the x component is greater than or equal to the proper length. Remember that the geometry in timelike directions is non-Euclidean. This could be called forelengthening ... but I’m not sure that term will ever catch on very widely.

For fast-moving objects, you really need to pay attention to Big Idea #2 if you want to get the right answers. Everybody learned in grade school that x, y, and z are “the” components, and everybody habitually takes them into account when calculating the length. Special relativity tells us that t is also a component, and must be taken into account when calculating the length.

Let’s turn our attention 90 degrees, and see what happens if we want to calculate elapsed time (rather than length). If you have accepted Big Idea #2, the results will be completely routine ... but if you have not yet fully accepted the idea that spacetime is four-dimensional, you are in for a surprise.

The following figures are the same as the preceding figures, except that we consider an object that extends in some non-x direction.


Figure 19: Ruler y|x; Red Coordinate System		Figure 20: Clock t|x; Red Coordinate System


Figure 21: Ruler y|x; Blue Coordinate System		Figure 22: Clock t|x; Blue Coordinate System

The ruler in figure 19 and figure 21 is 12 units long. It’s the same ruler!

The elapsed time in figure 20 and figure 22 is 12 units. It’s the same clock! The start-event is the same in both figures. The end-event is the same in both figures.

You can see in figure 21 that the y_@B component is slightly less than 12.

You can see in figure 22 that the t_@B component is slightly greater than 12.

The relevant equation is:

(proper length)² = Δy² + Δx² (35)

The relevant equation is:

(proper time)² = Δt² − Δx² (36)

Again: For fast-moving objects, you really need to pay attention to Big Idea #2 if you want to get the right answers.

When you measure the x component, it’s usually obvious that there are other components you need to worry about.

When you measure the t component, if you don’t understand special relativity, it won’t be the least bit obvious that there are other components you need to worry about.

You have to account for all the components. The y_@B component is not the proper length.

You have to account for all the components. The t_@B component is not the proper time.

3.10 Application : How to Make Antimatter (Graphical Analysis)

Whenever a calculation produces a result that is simpler than expected, it is a good practice to see if there is a simpler way of obtaining the same result.

The result obtained in section 3.8 falls into this category. It was not obvious a priori that the answer would be a round number, so we have to suspect there is a more elegant way to obtain this number, and a better way of understanding where it comes from. Indeed there is. With the aid of the spacetime diagrams, you can solve the whole problem in your head, using no mathematics beyond addition, subtraction, multiplication and division ... plus a qualitative notion of rotation in the xt plane. (This is even simpler than the method presented in section 3.8, which uses vectors and dot products.)

This method is easier and more elegant, but it is less powerful in the sense that it depends on the symmetry of the situation. In contrast, the 4-vector method would work even in less-symmetrical situations.

In the center-of-mass frame, as we can see in figure 23, the product particles have no kinetic energy, so their total energy is just their rest energy, for a total of 4m. By conservation of energy, that means the incident particle and the target particle have 4m of energy total, or 2m apiece. That means that for each of them, the energy is evenly split: 1m of rest energy and 1m of kinetic energy.

Figure 23: Antiproton Production Reaction @ CM

Similarly, we can create a spacetime diagram of the situation in the lab frame, simply by boosting the worldlines in figure 23, thereby producing figure 24. It takes only a few moments to do this using the transform dialog in the drawing program, as discussed in section 5.

Figure 24: Antiproton Production Reaction @ Lab

It is no accident that the angle θ₁ in figure 23 is the same as the angle θ₁ in figure 24. The two figures show the same physics, and differ only by a rotation of the reference frame. This fact – combined with the fact that the target particle’s energy in the CM frame was evenly split – tells us that the product particles’ energy in the lab frame is also evenly split: Each particle has 1m of rest energy and 1m of kinetic energy. The total energy for the four-particle bundle is 8m.

We have used the idea that each particle’s energy is determined by its mass and its rapidity, and the rapidity θ₁ is the same in both figures.

The target particle has 1m of energy in the lab frame, so conservation of energy tells us that the incident particle must have 7m of energy, of which 1m is rest energy and 6m is kinetic energy.

This is the answer to the question: The accelerator must impart 6m of kinetic energy to the incident particle. In engineering units, the mass of a proton is about 1 GeV (0.938 GeV), so we must design the accelerator to produce about 6 GeV.

We have solved the problem without worrying too much about the numerical value of θ, but we can quantify it if we wish, as follows: The short version is that the energy varies in proportion to cosh(θ).

The long version of the same story goes like this: The 4-momentum of any particle in its own rest frame has components [m, 0, 0, 0] in accordance with equation 19. In any other reference frame, the 4-momentum has components [m cosh(θ), m sinh(θ), 0, 0] as you can see by applying equation 45.

That tells us that in figure 23 and figure 24, the rapidity is θ₁ = arccosh(2). We know that arccosh(2) = 1.31696, but we didn’t really need to know that to solve the problem.

The figures in this section (figure 23 and figure 24) are drawn with the quantitatively-correct angle, θ₁ = 1.31696. This is in contrast to section 3.8, where the sketches (figure 13 and figure 14) used the artistically-licentious value of θ₁=0.5. It turns out that the diagrams with the quantitatively-correct angles don’t tell us much beyond what the non-quantitative sketches told us. In some ways the sketches are actually easier to interpret.

Sometimes you want a quantitatively correct blueprint, and sometimes you would rather have a sketch where some features have been exaggerated for clarity. When in doubt, make one of each. Keep in mind that the diagram cannot be expected to do the whole calculation for you; instead the diagram should guide the calculation. Then the calculation can guide the construction of a better diagram, and so on, iteratively.

Remark: If we turn our attention to the incident beam particle, and examine its energy-versus-rapidity relationship in the two coordinate systems, we discover that we have just proved that arccosh(7) = 2 arccosh(2). This can be understood as a special case of a trigonometric identity, namely the double-angle formula cosh(2θ) = 2cosh²(θ)−1.
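The numbers in this section are easy to check. The following is a minimal sketch in plain Python, in units where c = 1 and m = 1. It assumes that rapidities along a single axis simply add, so that the beam particle has rapidity 2θ₁ in the lab frame. The variable names are made up for this illustration.

```python
import math

m = 1.0                                  # proton mass (c = 1)
theta_1 = math.acosh(2.0)                # rapidity of each incoming proton in the CM frame

# In the lab frame the target is at rest. The beam then has rapidity 2 * theta_1,
# and the four-particle bundle has rapidity theta_1.
E_beam_lab = m * math.cosh(2.0 * theta_1)
E_bundle_lab = 4.0 * m * math.cosh(theta_1)

print(theta_1)                           # roughly 1.31696
print(E_beam_lab)                        # compare with 7m
print(E_bundle_lab)                      # compare with 8m
print(math.acosh(7.0) - 2.0 * theta_1)   # zero, up to rounding
```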

3.11 Application : Muon Lifetime

It is fairly easy to obtain muons. They are produced all the time by cosmic rays striking the upper atmosphere. They are also produced by particle accelerators.

It is known from experiments on stationary muons that they decay with a half-life of 1.56 microseconds. However, the available muons are not stationary. Let’s consider the case where they have a rapidity of θ = 3 radians, which means their classical velocity is v = dx/dt = tanh(3) = 99.5% of the speed of light.

Let’s calculate how far they will travel. A naïve rate × time calculation suggests that half of them will survive for 1,560 feet ... but this is quite wrong. It’s off by a factor of 10.

Here is the correct calculation: The thing that matters is the proper time. The experiment on stationary muons measures the proper time, τ_½= 1.56µs. In contrast, when the muon is not at rest with respect to the lab, the t_@lab component is not the proper time. That is, the time you naïvely measure with a stopwatch in the lab frame is not the proper time.

Geometry tells us that the t_@lab component will be longer than τ by a factor of dt/dτ = cosh(θ) = cosh(3) ≈ 10.

As another way of saying the same thing, the 4-velocity of the muon is:

u	=	[dt/dτ,	dx/dτ,	dy/dτ,	dz/dτ]
	=	[cosh(θ),	sinh(θ),	0,	0]_@lab
	=	[10.0677,	10.0179,	0,	0]_@lab

(37)

Let’s be clear:

The half-life of the muon is unchanged. The half-life is still 1.56µs. It’s the same muon! Boosting a reference frame cannot possibly change the way a muon keeps time, for the same reason that rotating a reference frame cannot possibly change the length of a ruler.
The time you measure with a stopwatch in the lab frame is not the muon lifetime. It is the projection of that lifetime onto the t_@lab direction.

Δt_@lab = (dt/dτ) τ_½

≈ 10 τ_½

(38)
The distance you measure with a ruler in the lab frame is not cτ_½. It is the projection of the muon’s worldline onto the x_@lab direction.

Δx_@lab = (dx/dτ) τ_½

= (dx/dt) (dt/dτ) τ_½

≈ 10 c τ_½

(39)
All of this is well explained by projective geometry. There’s nothing weird or tricky going on. No part of the explanation depends on any special properties of the muon; to explain the Δt_@lab and Δx_@lab components we need to know the half-life and the rapidity, nothing more.
It is nice that the explanation is independent of the internal details of the muon. This independence keeps things simple. More importantly, it increases our confidence in the principle of relativity. It guarantees that you can measure proper time using any method you choose: muon clocks, photon clocks, cuckoo clocks, biological aging processes, and/or whatever else you can think of. In every case, proper time gets projected onto the lab frame in the same way, because the projection has got nothing to do with how the clocks work; it is entirely explained by the geometry and trigonometry of spacetime.
To say the same thing the other way: Suppose the dt/dτ did depend on the internal workings of the clock.
- If different clocks were affected in different ways, it would violate the basic principle of relativity. By comparing different types of clock, you could tell whether or not you were moving.
- If all N types of clock were affected in the same way, you would have a bizarre N-way coincidence, which would be very hard to explain.
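The muon arithmetic can be sketched in a few lines of plain Python. This is only an illustration: it uses the half-life and rapidity quoted in the text, works in metres rather than feet, and the variable names are made up.

```python
import math

C_M_PER_US = 299.792458      # speed of light, metres per microsecond
tau_half = 1.56              # half-life in the muon's own frame, microseconds (from the text)
theta = 3.0                  # rapidity, radians

v_over_c = math.tanh(theta)  # reduced velocity dx/dt
dt_dtau = math.cosh(theta)   # timelike component of the 4-velocity
dx_dtau = math.sinh(theta)   # spacelike component of the 4-velocity, in units of c

naive_distance = v_over_c * C_M_PER_US * tau_half   # rate x time, ignoring dt/dtau
lab_time = dt_dtau * tau_half                       # equation 38
lab_distance = dx_dtau * C_M_PER_US * tau_half      # equation 39

print(f"v/c = {v_over_c:.4f}, dt/dtau = {dt_dtau:.4f}")
print(f"naive distance = {naive_distance:.0f} m")
print(f"lab-frame time = {lab_time:.2f} microseconds")
print(f"lab-frame distance = {lab_distance:.0f} m")
```

The ratio of the two distances is exactly cosh(θ), which is the factor of 10 mentioned above.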

The following diagrams may make the situation easier to visualize. Recall that most of the previous spacetime diagrams considered the situation where the rotation angle was 0.25 radians. Figure 25 shows the situation where the rotation angle (i.e. the rapidity) is a full radian. You can see that the red reference frame is rather seriously stretched in one direction and squashed in another direction. If we increase the angle to 2 radians, as in figure 26, things are so badly stretched and squashed that the diagram is hard to interpret. Three radians would be so bad that it’s not worth showing the diagram, even though that is the case that corresponds to our muon example. At some point you have to trust the equations ... and/or use your mind’s eye to extrapolate on the basis of figure 25 and figure 26.

Figure 25: Clock Plus Two Coordinate Systems : Rapidity=1

Figure 26: Clock Plus Two Coordinate Systems : Rapidity=2

3.12 Some Trigonometry

Figure 27 shows part of a circle, in green. This is what we get if we consider an ensemble of vectors, rotated in the xy plane by various amounts. The small black circles represent angles from 0 to 1 radian, in steps of 1/4 radian.

Figure 28 shows part of a hyperbola, in green. This is what we get if we consider an ensemble of vectors, rotated in the tx plane by various amounts. The small black circles represent angles from 0 to 1 radian, in steps of 1/4 radian.


Figure 27: Rotations in the xy Plane		Figure 28: Rotations in the tx Plane

The points in figure 27 satisfy equation 40, which in some sense defines what we mean by circle:

x² + y² = 1 (40)

The points in figure 28 satisfy equation 41, which in some sense defines what we mean by hyperbola:

t² − x² = 1 (41)

I did not, however, plot figure 27 by solving the equation x² + y² = 1. Instead I plotted x=cos(θ) and y=sin(θ) for various values of θ.

I did not, however, plot figure 28 by solving the equation t² − x² = 1. Instead I plotted t=cosh(θ) and x=sinh(θ) for various values of θ.
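The parametric approach can be sketched as follows, in plain Python. This is not the code that produced the figures; it merely generates points at the same angles (0 to 1 radian in steps of 1/4 radian) and confirms that they land on the unit circle and the unit hyperbola respectively.

```python
import math

angles = [k / 4 for k in range(5)]      # 0, 1/4, 1/2, 3/4, 1 radian

# Points on the unit circle (x, y) and on the unit hyperbola (t, x)
circle_points = [(math.cos(a), math.sin(a)) for a in angles]
hyperbola_points = [(math.cosh(a), math.sinh(a)) for a in angles]

for a, (x, y), (t, xh) in zip(angles, circle_points, hyperbola_points):
    print(f"theta={a:.2f}  x^2+y^2={x*x + y*y:.12f}  t^2-x^2={t*t - xh*xh:.12f}")
```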

The functions sin(), cos(), tan(), etc. are called circular trig functions.

The functions sinh(), cosh(), tanh(), etc. are called hyperbolic trig functions.

The trigonometric identity cos² + sin² = 1 guarantees that the dot product between any two vectors is invariant under rotations in the xy plane.

The trigonometric identity cosh² − sinh² = 1 guarantees that the dot product between any two vectors is invariant under rotations in the tx plane.

The minus sign that shows up in equation 41 is essentially the same as the minus sign that shows up in equation 4. It is the hallmark of non-Euclidean geometry.

Note that figure 28 conveys essentially the same information as figure 10. The main difference is that each is transposed relative to the other. That is, we plot t horizontally and x vertically in one figure, and vice versa in the other.

The choice of which variable to plot in which direction is a matter of taste. In figure 10 it looks better to plot the timelike variable (energy) vertically. Indeed there is a tradition in the relativity business, dating back to Minkowski, of plotting the timelike variable vertically. (This conflicts with the high-school physics tradition of plotting time horizontally.)

No matter what the tradition, we are allowed to make exceptions, as we did in figure 28, where the timelike variable runs horizontally. This is connected to figure 9, which also plots time horizontally, to facilitate comparison with figure 7 ... and thereby to help explain the idea of slope in spacetime.

Let’s revisit the idea of slope. Here are copies of figure 7 and figure 9.


Figure 29: Ruler Plus Blue Coordinate System		Figure 30: Clock Plus Blue Coordinate System

In figure 29, for small rotation angles, the slope is proportional to the angle θ. For larger angles, the relationship is nonlinear: the slope is given by:

dy/dx = tan(θ) (42)

In figure 30, for small rotation angles, the reduced velocity is proportional to the angle θ. For larger angles, the relationship is nonlinear: the reduced velocity is given by

dx/dt		=		tanh(θ)
		i.e.		c tanh(θ)

(43)

The rotation matrix for a rotation in the xy plane is:

R(θ) = ⎡ cos(θ)   −sin(θ) ⎤
       ⎣ sin(θ)    cos(θ) ⎦    (44)

This uses circular trig functions ... and one of the matrix elements has an important minus sign.

The rotation matrix for a rotation in the tx plane is:

R(θ) = ⎡ cosh(θ)   sinh(θ) ⎤
       ⎣ sinh(θ)   cosh(θ) ⎦    (45)

This uses hyperbolic trig functions ... and there are no minus signs.
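To see the two matrices in action, here is a minimal sketch in plain Python, with the matrices stored as nested lists. It checks that angles add when rotations are composed, and that each kind of rotation preserves its own quadratic form. The helper names are made up for this illustration, and the tx-plane vectors are stored in [t, x] order.

```python
import math

def rotation_xy(theta):
    """2x2 rotation matrix in the xy plane (equation 44)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def rotation_tx(theta):
    """2x2 rotation matrix in the tx plane (equation 45)."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return [[ch, sh], [sh, ch]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(mat, vec):
    """Matrix times a 2-component column vector."""
    return [mat[0][0] * vec[0] + mat[0][1] * vec[1],
            mat[1][0] * vec[0] + mat[1][1] * vec[1]]

# Composition: rapidities add, just as ordinary angles do
print(matmul(rotation_tx(0.25), rotation_tx(0.5)))
print(rotation_tx(0.75))

# Invariants: x^2 + y^2 for the circular case, x^2 - t^2 for the hyperbolic case
x, y = apply(rotation_xy(0.25), [12.0, 0.0])
t, xb = apply(rotation_tx(0.25), [0.0, 12.0])
print(x * x + y * y, xb * xb - t * t)       # both close to 144
```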

Here is equation 45 again, with more context, to provide a hint about what the matrix elements mean:

⎡   ⎤        ⎡   ⎤ ⎡ cosh(θ)   sinh(θ) ⎤
⎣   ⎦_@R     ⎣   ⎦ ⎣ sinh(θ)   cosh(θ) ⎦    (46)

Summary: If you’ve been paying any attention at all, you will have noticed that spacetime is not quite the same as ordinary Euclidean space, but there are profound similarities:

angle	↔	angle
sin	↔	sinh
cos	↔	cosh
slope = tan(θ)	↔	velocity = tanh(θ)
cos² + sin² = 1	↔	cosh² − sinh² = 1

(47)

We continue this line of thought in the next section.

3.13 Orthogonality in Spacetime

Let’s take another look at the red coordinate systems in figure 15 and figure 16.

The first thing we notice is that each of them is tilted relative to the corresponding blue coordinate system. (There is a vestige of the blue coordinate system in the middle of each diagram, to facilitate this comparison.) However, there are two different types of tilt:

In figure 15, both the contours of constant y and the contours of constant x are tilted counterclockwise (relative to the blue system). The whole system looks like it has been rotated. This is characteristic of conventional circular trigonometry.

In figure 16, the contours of constant x are tilted counterclockwise, while the contours of constant t are tilted clockwise. Superficially, the whole system looks like it has been skewed ... but really it has just been rotated in the tx plane. This is characteristic of hyperbolic trigonometry. This is yet another manifestation of the minus sign that we saw in equation 4. We have seen the same minus sign again and again.

In figure 15, the contours of constant x are orthogonal to the contours of constant y ... as is apparent from the diagram.

In figure 16, the contours of constant t are orthogonal to the contours of constant x ... even though this is not readily apparent from the diagram.

Here’s the deal: In figure 16, the lines on paper are merely symbols that represent the actual contours in spacetime. The lines on paper are obviously not orthogonal ... but the contours that they represent are orthogonal.

Never mistake a symbol
for the thing symbolized.

Let’s do an example. Let’s consider two basis vectors in the red frame:

t̂_@R		=		[1, 0, 0, 0]_@R
x̂_@R		=		[0, 1, 0, 0]_@R

(48)

It is obvious that these two vectors are orthogonal. If it’s not obvious, you can check it using equation 4 and especially equation 6.

Meanwhile, the same two vectors can be analyzed in the blue frame:

t̂_@R		=		[cosh(θ), sinh(θ), 0, 0]_@B = cosh(θ) t̂_@B + sinh(θ) x̂_@B
x̂_@R		=		[sinh(θ), cosh(θ), 0, 0]_@B = sinh(θ) t̂_@B + cosh(θ) x̂_@B

(49)

If we take the dot product between these two vectors, using the blue-frame components given in equation 49, we find it is equal to −cosh(θ)sinh(θ) + sinh(θ)cosh(θ), which is always zero, confirming that the vectors are orthogonal.

One way to explain this is to say that the minus sign that is present in the dot-product rule (equation 4) makes up for the minus sign that is missing from the rotation matrix (equation 45).
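Here is a minimal sketch in plain Python that carries out this dot product for a few rapidities. For contrast it also computes the ordinary Euclidean dot product of the same components, which is roughly what your eye computes when looking at the lines on paper. The function names are made up for this illustration.

```python
import math

def minkowski_dot(a, b):
    """Dot product of 4-vectors [t, x, y, z], with a minus sign on the t term."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

def red_basis_in_blue(theta):
    """Blue-frame components of the red-frame unit vectors (equation 49)."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    t_hat = [ch, sh, 0.0, 0.0]
    x_hat = [sh, ch, 0.0, 0.0]
    return t_hat, x_hat

for theta in (0.25, 1.0, 3.0):
    t_hat, x_hat = red_basis_in_blue(theta)
    euclid = sum(a * b for a, b in zip(t_hat, x_hat))   # not the physically relevant product
    print(f"theta={theta}: spacetime dot={minkowski_dot(t_hat, x_hat):.3f}  "
          f"paper dot={euclid:.3f}")
```

The spacetime dot product stays at zero no matter how large the rapidity, while the naive on-paper product grows without bound.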

This is one of the few truly tricky things about special relativity: Whereas a diagram such as figure 15 is a remarkably faithful representation of the actual rotated contours, a diagram such as figure 16 is not an entirely faithful representation. You need some skill to interpret it correctly.

In any case, the fact remains that spacetime diagrams are your friend. Having a spacetime diagram is always better than not having one. The main points of a spacetime diagram are easy to interpret, and if the fine points are somewhat hard to interpret, so be it.

3.14 Fast-Moving Particles : Speed, Momentum, and Energy

Let’s impose two coordinate systems (red and blue) on the same physics. Specifically, let’s superimpose figure 12 and the corresponding red coordinate system. The result is shown in figure 31.

Figure 31: Fast-Moving Particle

The black line in figure 31 represents the worldline of a fast-moving particle. It has a reduced velocity v = [c, 0, 0]. Remarkably, its reduced velocity is the same in either frame (and in any other rotated frame, for any rotation in the tx plane).

The other diagonal (not shown) has the same property: A particle with reduced velocity v = [−c, 0, 0] has the same reduced velocity in any frame. No other directions in the tx plane have this property.

This is very unlike ordinary spacelike rotations, where no vector in the plane of rotation is unaffected by rotations.

When you calculate the reduced velocity in the two different frames, the Δt and the Δx will be different. You can see this by looking at the starting-point and ending-point of the black line, and evaluating the coordinates of these points in the two different frames. However, the ratio Δx/Δt will be the same in both cases.
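This is easy to verify numerically. The following minimal sketch, in plain Python with c = 1, applies the matrix from equation 45 to a segment of the diagonal and, for comparison, to a segment of an ordinary worldline. The function names are made up for this illustration.

```python
import math

def boost_tx(theta, dt, dx):
    """Apply the tx-plane rotation of equation 45 to the components [dt, dx]."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return ch * dt + sh * dx, sh * dt + ch * dx

def reduced_velocity(dt, dx):
    """Ratio dx/dt for a worldline segment, in units where c = 1."""
    return dx / dt

for theta in (0.25, 1.0, 2.0):
    light = boost_tx(theta, 1.0, 1.0)     # segment along the diagonal
    slow = boost_tx(theta, 1.0, 0.5)      # ordinary particle with dx/dt = 0.5
    print(f"theta={theta}: diagonal components={light}, "
          f"diagonal dx/dt={reduced_velocity(*light)}, "
          f"ordinary dx/dt={reduced_velocity(*slow):.4f}")
```

The components of the diagonal segment change with θ, but their ratio does not. The ordinary particle enjoys no such protection.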

If you take an ordinary particle (such as an electron) and boost it to higher and higher rapidity, its worldline gets closer and closer to the black line in figure 31. So, loosely speaking, the black line corresponds to a worldline where the x-component of the 4-velocity is infinite.

For a massless particle (such as a photon) moving in the x direction, its worldline coincides with the black line. The 4-velocity of such a particle is undefined.

Interestingly enough, the 4-momentum is perfectly well defined for massless particles, even though the 4-velocity is not. Obviously you cannot compute the 4-velocity from the 4-momentum via the formula u=p/m, since the mass is zero. Still, you can measure the energy and the momentum directly.

For a massless particle, E² always equals p_xyz², in accordance with equation 11.

Important tangential remark: The speed “c” is conventionally called the speed of light. However, the phenomenon we are describing here is absolutely not restricted to light. The speed we are talking about throughout this document is a geometrical property of spacetime. Rather than calling it the speed of light, you could call it the speed of diagonals in spacetime.

Special relativity does not need light.
Special relativity does not need photons.
Special relativity does not need electromagnetism.
Special relativity is the geometry and trigonometry of spacetime.

For details on this, see reference 10.

3.15 Application: Relativistic Doppler Shift and Aberration

3.15.1 Low-Speed Case

We start by reviewing the familiar low-speed situation. The main purpose here is to establish the interpretation of the diagrams.

So ... Suppose we are having a slug race. We take 12 slugs and set them all at the same location. They immediately begin slithering away from each other in 12 different directions, all at the same speed |v|. The situation relative to the red reference frame is shown in the diagram on the left in figure 32. The green lines represent velocity vectors. Position is not represented in these diagrams, and is not relevant, since we are considering the initial situation, when all 12 slugs are at the same location.

Figure 32: Aberration : Low Speed

Now let’s look at the same situation in the blue reference frame, which is moving northward (relative to the red reference frame) at a rate equal to three quarters of the slug-speed |v|. This situation is shown in the middle diagram in figure 32. In this frame, the slugs that were moving northward now have a smaller speed (as seen near the 12:00 position in the diagram), while slugs that were moving southward have a greater speed (as seen near 6:00).

Continuing that line of thought, let’s look at the same situation in a frame that is moving northward even faster, at a speed 1.5 times the slug-speed |v|. Even the slugs that were moving northward in the red frame are moving southward in this frame.

There is nothing tricky going on here. These results should be familiar. They are well explained by classical physics. Of course special relativity agrees with classical physics in the low-speed regime.
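The classical calculation amounts to subtracting the frame velocity from each slug velocity. Here is a minimal sketch in plain Python. It takes the slug speed as the unit of velocity, and it assumes a clock-face convention in which slug 0 heads north (12:00) and the others follow clockwise. The function names are made up for this illustration.

```python
import math

def slug_velocities(speed, n=12):
    """n velocity vectors (east, north) of equal speed, evenly spaced in direction."""
    return [(speed * math.sin(2 * math.pi * k / n),
             speed * math.cos(2 * math.pi * k / n)) for k in range(n)]

def to_moving_frame(velocities, frame_north_speed):
    """Classical (low-speed) change of frame: subtract the frame velocity."""
    return [(vx, vy - frame_north_speed) for vx, vy in velocities]

red = slug_velocities(1.0)              # slug speed |v| taken as the unit
blue = to_moving_frame(red, 0.75)       # frame moving north at 0.75 |v|
faster = to_moving_frame(red, 1.5)      # frame moving north at 1.5 |v|

for k in (0, 3, 6):                     # the 12:00, 3:00, and 6:00 slugs
    print(k, red[k], blue[k], faster[k])
```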

3.15.2 High-Speed Case

Let’s do the same experiment again, except using photons instead of slugs. Photons are quite a bit faster than slugs.

That is, we set off a flash of light. Twelve photons fly outward, in 12 different directions, all with the same speed |v| = c. The situation as seen in the red reference frame is shown in the left diagram in figure 33.

Figure 33: Aberration : High Speed

Now let’s look at the same situation in the blue reference frame, which is moving northward (relative to the red frame) with a rapidity of 1/3rd of a radian. (That’s about 32% of the speed of light.) This is shown in the middle diagram in figure 33.

Continuing that line of thought, let’s look at the same situation in a frame that is moving northward with a rapidity of 2/3rds of a radian. (That’s about 58% of the speed of light.) This is shown in the right diagram in figure 33.

You can see that in all cases, in all frames, the photons travel with speed |v| = c.

Note that no matter how fast your frame is moving northward, it will never catch up with the northward-moving photon.

Here’s how to calculate such things. The executive summary is very simple and easy to understand: Promote the classical velocity from a 3-vector to a 4-vector, boost the 4-vector, and then convert it back to a 3-vector.

Here are the details. We assume the initial photon direction is known. Since the speed |v| is known to be c, we know the entire classical velocity vector v.

The classical velocity v is a 3-vector [v_x, v_y, v_z], but we can promote it to a 4-vector of the form

q = [1, v_x, v_y, v_z]
  i.e. [c, v_x, v_y, v_z]             (50)

If we know the energy E of the photon, we can multiply this q by E/c to obtain the 4-momentum, namely

p = [E/c, p_x, p_y, p_z]
  = [E/c, Ev_x/c, Ev_y/c, Ev_z/c]             (51)

If we don’t know the energy, or don’t care, we can set E=1 and forge ahead. It doesn’t matter, because the whole calculation is linear, and E is effectively just a scale factor.
As a check on our work, note that the gorm of p is zero, as it should be for a massless particle such as a photon.
As a further check, note that if we calculate v from p, by plugging equation 51 into equation 71, we get back the v we started with.
Now that we know the 4-momentum of the photon, we can rotate it using the usual boost matrix. Equation 45 is the relevant matrix when we only need to worry about one spatial dimension. Since we are dealing with more dimensions here, we might as well write out the full 4-dimensional matrix for a boost in the x direction:

R(θ) =
  ⎡ cosh(θ)   sinh(θ)   0   0 ⎤
  ⎢ sinh(θ)   cosh(θ)   0   0 ⎥
  ⎢ 0         0         1   0 ⎥
  ⎣ 0         0         0   1 ⎦             (52)

Beware that the boost angle (aka rapidity) θ will be negative in our example, since the red frame is moving in the −x direction relative to the blue frame.
This gives us the components of the 4-momentum relative to the blue frame.
- The E-component of the 4-momentum tells us how much the photon gets redshifted or blueshifted by the transfer from one reference frame to the other.
- The spacelike components of the 4-momentum tell us the direction of the photon (relative to the blue frame). Specifically: We calculate the classical velocity by applying equation 72; that is, we divide each of the spacelike components by the E-component.

This 3-step procedure can easily be reduced to a closed-form expression, but the resulting expression is much harder to remember, and not any easier to use in practice.
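The three steps above can be sketched in a few lines of Python. This is only an illustration, not the code that was used to draw the figures. The function names boost_x and aberrate are invented for this sketch, the units are chosen so that c = 1, and the boost is taken along x, as in equation 52.

```python
import math

def boost_x(p, theta):
    """Apply the matrix of equation 52 to a 4-vector p = [E, p_x, p_y, p_z]."""
    E, px, py, pz = p
    ch, sh = math.cosh(theta), math.sinh(theta)
    return [ch * E + sh * px, sh * E + ch * px, py, pz]

def aberrate(v, theta, E=1.0):
    """Photon with classical velocity v (|v| = 1), described after a boost by theta.

    Returns (E_new, v_new): the shifted energy and the new classical velocity.
    """
    p = [E, E * v[0], E * v[1], E * v[2]]         # step 1: promote (equation 51)
    p_new = boost_x(p, theta)                      # step 2: boost (equation 52)
    E_new = p_new[0]
    v_new = [comp / E_new for comp in p_new[1:]]   # step 3: back to a 3-vector
    return E_new, v_new

# Twelve photons in the xy plane. The new frame moves in the +x direction
# with rapidity 1/3, so the boost angle is negative (see the remark above).
theta = -1.0 / 3.0
for k in range(12):
    angle = 2 * math.pi * k / 12
    E_new, v_new = aberrate([math.cos(angle), math.sin(angle), 0.0], theta)
    speed = math.sqrt(sum(comp * comp for comp in v_new))
    print(f"{k:2d}  E'/E = {E_new:.4f}  "
          f"v' = ({v_new[0]:+.4f}, {v_new[1]:+.4f})  |v'| = {speed:.4f}")
```

The first column of output is the Doppler factor and the second is the aberrated direction. The last column should come out equal to 1 for every photon, which is the statement that the speed is c in all frames.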

3.15.3 General Case

In the general case, we have a particle that may or may not have any mass. If it has mass, it may be moving very slowly, very quickly, or anywhere in between. An intermediate case is shown in figure 34. In this figure (as in other figures in this section), the red ring represents the speed of light. The pink disk serves as a reminder of what the velocity vectors were doing originally, when the blue frame was not moving relative to the red frame.

Figure 34: Aberration : Fast but Not Massless

In all cases, we use the same 3-step procedure: Figure out the particle’s 4-momentum, boost the 4-momentum, and then (if necessary) convert that to a classical velocity. All the figures in this section are computed using the same code, just using different parameters. The parameters are given in the following table:

figure	m	|p_xyz|
figure 32	1	0.01
figure 33 and figure 35	0	1.5
figure 34	1	1.5

You can see that figure 34 is intermediate between figure 32 and figure 33. This demonstrates yet again the power and elegance of special relativity: It provides us a unified understanding of the low-speed limit, the high-speed limit, and everything in between.
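Here is a minimal sketch of how one routine can cover all three rows of the table. It is not the author's plotting code. The function name and the boost angle of 1/3 radian are assumptions made for illustration, the units have c = 1, and the energy is obtained from the standard relation E² = m² + |p_xyz|².

```python
import math

def velocity_in_boosted_frame(m, p_xyz, theta):
    """Classical velocity of a particle with mass m and 3-momentum p_xyz,
    after applying the x-direction boost of equation 52 (units with c = 1)."""
    px, py, pz = p_xyz
    E = math.sqrt(m * m + px * px + py * py + pz * pz)   # energy from m and |p_xyz|
    ch, sh = math.cosh(theta), math.sinh(theta)
    E_new = ch * E + sh * px
    px_new = sh * E + ch * px
    return [px_new / E_new, py / E_new, pz / E_new]       # divide by the E-component

# (m, |p_xyz|) pairs taken from the table above
cases = {"figure 32": (1.0, 0.01), "figure 33": (0.0, 1.5), "figure 34": (1.0, 1.5)}
theta = -1.0 / 3.0   # illustrative boost angle, not necessarily the one in the figures

for name, (m, p) in cases.items():
    speeds = []
    for k in range(12):
        a = 2 * math.pi * k / 12
        v = velocity_in_boosted_frame(m, [p * math.cos(a), p * math.sin(a), 0.0], theta)
        speeds.append(math.hypot(v[0], v[1]))
    print(f"{name}: min speed = {min(speeds):.4f}, max speed = {max(speeds):.4f}")
```

For the massless row the minimum and maximum speeds should both equal 1. For the massive rows they differ from each other but stay below 1.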

It must be emphasized that this approach is quite general. It treats massive particles and massless particles the same way. We have not made use of any detailed knowledge of the electromagnetic field, even during the discussion of photons in section 3.15.2; we merely assumed that the photon was a particle with some energy and momentum but no mass.

One famous application has to do with the so-called “aberration of starlight” which was first noticed experimentally hundreds of years ago. The earth in its orbit is moving at about 0.01% of the speed of light, and the direction of motion reverses every 6 months. This has a noticeable effect on the apparent direction from which light arrives from distant stars; that is, the stars appear to shift position.

For some purposes, 0.01% is small enough that a first-order semi-classical approximation is satisfactory, and you don’t need to understand special relativity to calculate the aberration. On the other hand:

It was important in the history of relativity to come up with a formula for the aberration that not only gives the right answer but also upholds the basic principle of relativity ... which the first-order approximation does not.
Modern high-accuracy astrometry using fast-moving satellites can measure the higher-order terms, for which special relativity is the only explanation.
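To get a feeling for the size of the first-order term and of the higher-order correction, here is a small sketch. It assumes starlight arriving exactly perpendicular to the earth's motion, takes the speed to be 0.01% of c as quoted above, and uses units where c = 1. The variable names are invented for this example.

```python
import math

beta = 1.0e-4                  # earth's orbital speed as a fraction of c (from the text)
theta = math.atanh(beta)       # corresponding rapidity

# Photon travelling in the -y direction in the star's frame, with E = 1.
E, px, py = 1.0, 0.0, -1.0
# Boost along x with a negative angle, as for an observer moving in +x.
E_new = math.cosh(-theta) * E + math.sinh(-theta) * px
px_new = math.sinh(-theta) * E + math.cosh(-theta) * px
vx, vy = px_new / E_new, py / E_new

exact = math.atan2(-vx, -vy)   # tilt of the ray away from the -y direction
first_order = beta             # small-angle classical estimate
arcsec_per_radian = math.degrees(1.0) * 3600

print(f"exact       = {exact * arcsec_per_radian:.6f} arcsec")
print(f"first order = {first_order * arcsec_per_radian:.6f} arcsec")
print(f"difference  = {exact - first_order:.3e} rad")
```

The two angles agree to many digits at this speed, which is why the first-order formula was good enough historically. The difference grows roughly as the cube of the speed.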

We also care about the Doppler part of the equation (not just the angular aberration). There are bench-top atom-trapping experiments where the frequencies are so finely tuned that the fully-relativistic Doppler formula is needed. There are also innumerable applications in elementary particle physics.

3.15.4 Transverse Components

Note that the transformation matrix equation 52 leaves unchanged the two components of the 4-velocity that are transverse to the boost, i.e. transverse to the relative velocity between the two frames. This is simple, and makes perfect sense in four dimensions. It agrees with your intuition at low speeds, where the classical velocity and the 4-velocity behave pretty much the same. You can see that each dot in figure 32 moves straight down the page as the velocity of the blue frame (relative to the red frame) increases.

This stands in contrast to the situation at higher speeds, where the transverse components of the classical velocity do change. You can see in figure 35 that the upper two dots initially move away from the midline, while the lower two dots move toward the midline.

The only reason for mentioning it is to warn you that it is not worth thinking very much about this phenomenon in three dimensions or in terms of the classical velocity. Far and away the simplest way to explain what is going on is the three-step procedure given above: promote the 3-vector to a 4-vector, boost the 4-vector, and then convert back to a 3-vector.

Figure 35: Aberration : Effect on Transverse Components

For a massive particle, we can understand this as follows: The boost does not affect the transverse components of the 4-velocity u = d(position)/dτ, but it does affect the transverse components of the classical velocity v = d(position)/dt, for the simple reason that it affects dt. Remember that dt/dτ = cosh(θ).

For a massless particle such as a photon, you can make almost the same argument, but you have to phrase it in terms of the 4-momentum rather than the 4-velocity. (A massless particle doesn’t have any proper time, and its 4-velocity components are either undefined or infinite ... but its 4-momentum is still perfectly well behaved.)

In any case, the point is that the physics is simple in four dimensions. The projection onto three dimensions is not simple.

3.16 Long, Steady Acceleration

Consider the following puzzle:

Suppose an interstellar rocket starts from rest and accelerates in a straight line such that the passengers feel one Gee for one year. How fast are they going at the end of the year?

This puzzle is quite easy to solve, if you think about it the right way.

First of all, we need to interpret the terminology used in the statement of the puzzle. We assume that “how fast” refers to the classical velocity (v = dx/dt). It is usually safe to assume that anybody who is interested in the 4-velocity (u = dx/dτ) is clever enough to ask for it explicitly.
Therefore the answer will take the form of v = tanh(θ), and all we need to do is find the value of the rapidity, θ.
Similarly, we assume that “one year” means one year of proper time, since that is what the passengers experience. (The projection of this time onto the lab frame will cover more than one year of lab-time.)
We are (once again) going to take seriously the idea that a boost is just a rotation in the tx plane.
Rotations have the nice property that if you rotate by an angle θ₁ and then rotate by an additional angle θ₂, the combined effect is the same as a single rotation by an angle (θ₁ + θ₂). That is, for compound rotations, the angles are additive.
In the first second of flight, the spaceship gains 9.8 m/s of velocity. That corresponds to 32.7 nanoradians of rapidity. This is obvious in the lab frame.
In the next second of flight, the spaceship gains another 32.7 nanoradians of rapidity. This is also obvious in the lab frame.
In the sixteen millionth second of proper time during flight, the spaceship gains yet another 32.7 nanoradians of rapidity. This is not necessarily obvious in the lab frame.
Therefore we introduce the idea of an instantaneously comoving reference frame, as shown in red in figure 36. In this frame, the ship has a small velocity and is undergoing a gentle acceleration, so we can use classical physics to understand what is happening in this frame. (For details on this, see section 3.17).

Figure 36: Steady Acceleration
Time in this frame is equal to the ship’s proper time. We conclude that the whole flight is described by saying that the rapidity is proportional to proper time. The constant of proportionality is 32.7 nanoradians per second. That’s the acceleration, in spacetime units.
There are 31556926 seconds in a year, so at the end of one year of proper time, the spaceship has accumulated 31556926×32.7e-9 = 1.03 radians of rapidity.
It is quite a remarkable coincidence that earth’s surface gravity times the earth’s year very nearly equals 1 radian.
The small black circles in figure 36 correspond to rapidities from 0 to 1 radian in steps of 0.25.
So ... The answer to the question is: At the end of the year, v = 77.5% of the speed of light.
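The arithmetic in the steps above is easy to reproduce. The following sketch uses the same values as the text for one Gee and for the length of the year, plus the defined value of the speed of light. The variable names are chosen for this example only.

```python
import math

c = 299_792_458.0       # speed of light, m/s
g = 9.8                 # one Gee, m/s^2 (value used in the text)
year = 31_556_926.0     # seconds of proper time in one year (value from the text)

rate = g / c            # rapidity gained per second of proper time
theta = rate * year     # rapidity accumulated over the whole year
v = math.tanh(theta)    # classical velocity, as a fraction of c

print(f"rate  = {rate * 1e9:.1f} nanoradians per second")
print(f"theta = {theta:.3f} radians")
print(f"v     = {100 * v:.1f}% of c")
```

Note that the only relativistic step is the last one, where the rapidity is converted to a classical velocity using tanh.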

Remarks: This is obviously a made-up puzzle, not a real-world application, but it is easy and fun, and illustrates some useful principles. Also, there are some real-world problems that are not too different from this, for instance having to do with particle accelerators.

3.17 Steady Acceleration : Additional Discussion

We have already answered the question that was posed in section 3.16, but this system has some additional interesting features that we can explore.

The instantaneously comoving reference frame in figure 36 is an unaccelerated reference frame. (You could use an accelerated frame, but that would be unnecessary extra work.) We emphasize that this frame is not attached to the spaceship. It is just something that happens to be in the neighborhood as the spaceship passes by.

Indeed, it does not even need to be exactly comoving; all we really need to do is choose a frame where the ship is moving slowly (relative to the chosen frame) ... sufficiently slowly that we can confidently apply the classical (non-relativistic) laws of physics.

Whenever you encounter a new idea, it is smart to turn it over in your mind, checking whether it is consistent with other things you know, and seeing how it fits in. It is smart to be skeptical.

The technique of using an instantaneously comoving reference frame fits in as follows: It is quite a direct application of the basic principle of relativity, as set forth in section 3.2: The spaceship does not care about the distant past or the distant future. It does not care how things look in any particular reference frame. In figure 36, we are free to ignore the blue coordinate system and use the red reference system. At times when the ship’s rapidity is approximately 0.5 radian, the ship is moving only slowly with respect to the red reference frame, and the situation is entirely classical. Assuming the ship is in empty space, unaffected by outside influences, there is no experiment anyone can do to demonstrate that the ship is moving relative to the blue reference system.

The skeptical reader may also be wondering about the assertion that for a compound rotation, the angles are additive. For a rotation in the tx plane, we know that the velocities are not additive. We know that any nonlinear function of the angle (such as angle cubed) is not additive. So what is special about the angle that makes it additive? Here are three answers:

It should be plausible that angles in the tx plane are additive, by analogy to your experience with angles in the xy plane.
If the angle were not additive, we would redefine our notion of angle so as to make it additive.
More formally: You can multiply the rotation matrices (as given in equation 45) and then use hyperbolic trigonometric identities to show that R(θ₁+θ₂) = R(θ₁)R(θ₂). Indeed, if one of the angles is small, it suffices to show that (d/dθ₁)R(θ₁+θ₂) = (d/dθ₁)R(θ₁)R(θ₂) [evaluated at θ₁=0], and you don’t need trig identities for that; I can do that one in my head.
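If you would rather check the third answer numerically than with trig identities, here is a minimal sketch. The two rapidities are arbitrary values picked for illustration, and the helper names R and matmul are invented for this example.

```python
import math

def R(theta):
    """2x2 boost matrix in the tx plane (equation 45)."""
    return [[math.cosh(theta), math.sinh(theta)],
            [math.sinh(theta), math.cosh(theta)]]

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

t1, t2 = 0.5, 0.7                       # two arbitrary rapidities
combined = matmul(R(t1), R(t2))         # one boost followed by another
direct = R(t1 + t2)                     # a single boost by the summed angle
max_err = max(abs(combined[i][j] - direct[i][j])
              for i in range(2) for j in range(2))
print("largest matrix mismatch:", max_err)

v1, v2 = math.tanh(t1), math.tanh(t2)
print("v1 + v2       =", v1 + v2)              # naive sum of the velocities
print("tanh(t1 + t2) =", math.tanh(t1 + t2))   # velocity for the combined boost
```

The mismatch should be at the level of floating-point roundoff. The naive sum of velocities exceeds 1 for these values, while the velocity corresponding to the summed rapidity does not.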

The whole flight is described by the equation:

dθ/dτ = a/c
      = 32.7 nanoradians per second             (53)

which we can immediately integrate to find that θ(τ) = (a/c)τ.

Therefore the 4-velocity is

u(τ) = [cosh(aτ), sinh(aτ), 0, 0]_@B             (54)

which is consistent with saying the classical velocity is tanh(aτ), as we did in section 3.16.

We can immediately integrate equation 54 to find the position:

X(τ) = [sinh(aτ)/a, cosh(aτ)/a, 0, 0]_@B             (55)

This tells us that the ship’s worldline (shown in dark green in figure 36) is a hyperbola. Indeed, steadily accelerated motion is sometimes referred to as hyperbolic motion in spacetime.
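As a sanity check on equation 54 and equation 55, the sketch below evaluates both at a few values of proper time. It assumes units where c = 1 and an illustrative acceleration a = 1, keeps only the t and x components, and uses a finite difference in place of the exact derivative.

```python
import math

a = 1.0   # acceleration in units where c = 1 (illustrative value)

def u(tau):
    """4-velocity of equation 54, components [t, x]."""
    return [math.cosh(a * tau), math.sinh(a * tau)]

def X(tau):
    """Position of equation 55, components [t, x]."""
    return [math.sinh(a * tau) / a, math.cosh(a * tau) / a]

h = 1e-6   # step for the centered finite difference
for tau in [0.0, 0.25, 0.5, 0.75, 1.0]:
    t, x = X(tau)
    dX = [(X(tau + h)[i] - X(tau - h)[i]) / (2 * h) for i in range(2)]
    print(f"tau = {tau:.2f}  x^2 - t^2 = {x * x - t * t:.6f}  "
          f"dX/dtau = ({dX[0]:.4f}, {dX[1]:.4f})  "
          f"u = ({u(tau)[0]:.4f}, {u(tau)[1]:.4f})")
```

The combination x² − t² stays fixed at 1/a² along the whole worldline, which is what it means for the worldline to be a hyperbola. The finite-difference derivative of X should agree with u to several decimal places.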

For yet more discussion of acceleration in spacetime, including sideways acceleration and circular motion, see reference 11.

3.18 Breakdown of Simultaneity at a Distance

Recall that figure 15 and figure 17 show a ruler that extends mostly in the x-direction in the two coordinate systems we have been considering. We now look at those figures again. In each case, we pair it with the analogous situation in the tx plane.


Figure 37: Ruler x|y; Red Coordinate System		Figure 38: Ruler x|t; Red Coordinate System


Figure 39: Ruler x|y; Blue Coordinate System		Figure 40: Ruler x|t; Blue Coordinate System

We contrast that with rulers and logs that extend mostly in the other (non-x) direction.


Figure 41: Ruler y|x; Red Coordinate System		Figure 42: Clock t|x; Red Coordinate System


Figure 43: Ruler y|x; Blue Coordinate System		Figure 44: Clock t|x; Blue Coordinate System

Note the contrast:

In figure 38 we see a ruler that is aligned with the red contours of constant time. The clocks at each end of the ruler agree. This is completely routine. We have colored the clocks red to emphasize that they were synchronized in the red system.
In figure 40, we see that according to the blue coordinate system, the red clocks are not synchronized. Look at what the dial is indicating on each clock, and then look at where the clocks sit relative to the blue contours of constant time. This is a firm prediction of special relativity, and it turns out to be true. It is called the breakdown of simultaneity at a distance. Things that are simultaneous according to one reference frame are not simultaneous according to another.

The breakdown of simultaneity at a distance is something we learn by taking seriously the idea that time is the fourth dimension, and taking seriously the correspondence between rotations in the xy plane and rotations in the tx plane. Let’s be clear: To first order, every small² rotation does two things:

For a small rotation in the xy plane, a vector that extends in the x-direction picks up a small y-component ... and ... a vector that extends in the y-direction picks up a small negative x-component.

For a small rotation in the xt plane, a vector that extends in the t-direction picks up a small x-component (which corresponds to the ordinary classical velocity) ... and ... a vector that extends in the x-direction picks up a small t-component (which corresponds to the breakdown in simultaneity at a distance).

In principle, it is straightforward to observe this breakdown. We can observe the time that the left clock strikes zero. This is an event in spacetime, by which we mean something that happens at a specific time and place. Similarly we can observe the time that the right clock strikes zero. This is another event. These are not simultaneous events according to the blue contours of constant time.

So another way of making the same point is to say that to first order, a small difference in velocity – i.e. a small rotation in the xt plane – has two consequences:

The red contours of constant x are tilted relative to the blue ones. Remember that this tilt corresponds to the ordinary reduced velocity ... as discussed in connection with equation 43.
The red contours of constant t are tilted with respect to the blue ones. This tilt corresponds to the breakdown of simultaneity at a distance.

We can understand these two things mathematically by looking at the rotation matrix, equation 45, which we reproduce here:

⎡ cosh(θ)   sinh(θ) ⎤
⎣ sinh(θ)   cosh(θ) ⎦             (56)

If we expand this to first order, we find

⎡ 1   θ ⎤
⎣ θ   1 ⎦     for small θ             (57)

The lower-left matrix element is quite prosaic: To first order, it tells us that distance = rate × time. More precisely, it tells us one component of the 4-velocity, namely dx/dτ = sinh(θ) ≈ θ. The upper-left matrix element tells us another component, namely dt/dτ = cosh(θ) ≈ 1. Dividing these, we find one component of the reduced velocity, namely dx/dt = tanh(θ) ≈ θ.
If we put in the explicit factors of c, we find that in our chosen reference frame (which is rotated by an angle θ relative to the rest frame of the particle), the equation of motion is:

Δx_motion = tanh(θ) cΔt
          ≈ θ cΔt             (58)
The upper-right matrix element is the mirror image of distance = rate × time. It tells us that time = rate × distance. The time in this case is the amount of non-simultaneity. If we put in the explicit factors of c, we get

cΔt_{non-simultaneity} = tanh(θ) Δx
                       ≈ θ Δx             (59)

The factors of c in these two equations conspire to make it relatively easy to observe distance = rate × time, even when θ is small, as it is for ordinary day-to-day situations. In contrast, the breakdown of simultaneity at a distance is a factor of c² harder to observe.

It can be observed directly in some situations. See section 3.19.
It can be observed indirectly when we consider more complex paths through spacetime (not just simple straight-line unaccelerated motion). A famous example concerns the notorious traveling twins, as discussed in reference 12. As a related point, anything involving a gravitational redshift can be considered another example.

We know indirectly that this matrix element must exist, because it is necessary to preserve the logical consistency of the theory. It is needed to make sure that the matrix we are talking about (equation 45) actually qualifies as a rotation matrix. In particular, let us now invoke the idea that a rotation of size θ can be built out of N smaller rotations, of size θ/N apiece. This tells us that if we fully understand small rotations, we can figure out everything else, including large rotations. For small angles, it makes sense to expand the rotation operator in a Taylor series:

⎡ cosh(θ)  sinh(θ) ⎤
⎣ sinh(θ)  cosh(θ) ⎦

  = ⎡ 0  1 ⎤⁰ + θ ⎡ 0  1 ⎤¹ + ½θ² ⎡ 0  1 ⎤² + ⋯
    ⎣ 1  0 ⎦      ⎣ 1  0 ⎦        ⎣ 1  0 ⎦

  = ⎡ 1  0 ⎤  + θ ⎡ 0  1 ⎤  + ½θ² ⎡ 1  0 ⎤  + ⋯
    ⎣ 0  1 ⎦      ⎣ 1  0 ⎦        ⎣ 0  1 ⎦
    rest          momentum        kinetic
    energy                        energy             (60)

We see that the Taylor series is an expansion in powers of the matrix

L = ⎡ 0  1 ⎤
    ⎣ 1  0 ⎦             (61)

(Tangential remark: This matrix L is the Lie derivative of the rotation operator. It appears three times on the RHS of the top line of equation 60, and functions as the generator of rotations. It is related to a Pauli spin matrix. If none of this means anything to you, don’t worry about it. I mention it in order to give you the idea that what we are doing here is on very firm mathematical foundations, and to give you a hint where to look for further details.)

Now – hypothetically – we try to preserve simultaneity at a distance by zeroing out the upper-right matrix element, so that the matrix becomes

L’ = ⎡ 0  0 ⎤
     ⎣ 1  0 ⎦             (62)

When we apply the modified rotation operator to a position vector, there would no longer be any breakdown of simultaneity.

When we apply the modified rotation operator to the 4-momentum, the story is slightly more interesting. The zeroth power of L’ is not well defined (in the same way that 0⁰ is not well defined), but if we semi-arbitrarily define it to be the identity, then switching from L to L’ makes no change to the rest energy (which is zeroth order in θ). There would also be no effect on the momentum (which is first order in θ, and perpendicular to the rest energy). However, when we get to the next term, the party’s over. The square of L’ is zero. There would be no kinetic energy.

We see that the same matrix element that is responsible for the breakdown in simultaneity at a distance (directly, to first order) is also in some sense responsible for the kinetic energy (indirectly, to second order).
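The contrast between L and L’ can be checked numerically by summing the Taylor series with each generator in turn. This is a minimal sketch: the helper names matmul and exp_series are invented for this example, the zeroth power is taken to be the identity (as in the semi-arbitrary choice discussed above), and the series is truncated after 30 terms.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_series(G, theta, terms=30):
    """Sum of theta^n G^n / n! for n = 0 .. terms-1, with G^0 = identity."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]          # G^0
    for n in range(terms):
        coeff = theta ** n / math.factorial(n)
        for i in range(2):
            for j in range(2):
                total[i][j] += coeff * power[i][j]
        power = matmul(power, G)               # next power of G
    return total

L = [[0.0, 1.0], [1.0, 0.0]]        # generator of equation 61
L_mod = [[0.0, 0.0], [1.0, 0.0]]    # modified generator of equation 62

theta = 0.5
print(exp_series(L, theta))          # compare with cosh(theta) and sinh(theta)
print(exp_series(L_mod, theta))      # nothing survives beyond first order
print(matmul(L_mod, L_mod))          # the square of the modified generator
```

With L, the diagonal elements pick up the ½θ² term and higher even terms. With L’, the diagonal stays at exactly 1, so a 4-momentum that starts out as pure rest energy would acquire momentum but no kinetic energy.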

The breakdown of simultaneity is not a new, fundamental, or separate idea. In fact it is a minor corollary of the main idea, namely the idea that a boost is a rotation in spacetime. Specifically:

A boost applied to a 4-dimensional position vector produces several effects, one of which can be described as breakdown of simultaneity at a distance. All the effects are best described as a rotation. In this case the rotation mixes some of the t component into the x component and vice versa.
A boost applied to the 4-momentum (or any other 4-vector other than the position) has fundamentally the same set of effects, but the notions of “simultaneity” and “distance” are completely inappropriate. As always, all the effects are best described as a rotation. When applied to the 4-momentum, it mixes some of the E component into the p_x component and vice versa.

3.19 Application: GPS

The GPS system provides a direct check on several aspects of relativity. This includes some general relativity, namely the gravitational redshift. It also includes relativistic foreshortening as well as the breakdown of simultaneity at a distance. For now, let’s focus on the simultaneity issue, since that is the one that people seem to have the most trouble with.

It turns out that:

The GPS satellites are moving reasonably fast. Orbital velocities are on the order of 14,000 km per hour. That is 13 millionths of the speed of light. That is to say, the rapidity is on the order of 13 microradians. That’s not a huge angle, but it’s not zero, either.
The satellites are reasonably far apart from each other, and from their ground stations. The orbital radius is about 26,600 km, and that sets the scale for the other distances. It is also relevant that the separations are continually changing, so the effect we are looking for is not a constant that can be swept under the rug.
The whole system depends on accurate timing, down to the nanosecond level. There is an atomic clock aboard each GPS satellite.

So this is the trifecta: this is exactly the sort of situation where you would expect to notice a breakdown of simultaneity. Indeed, if you crank through the numbers, you find the breakdown is on the order of hundreds of nanoseconds, which is quite huge on the scale of things. This is not some minor correction term, but rather a major contribution to the calibration procedure.
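Here is one way to crank through the numbers, as a rough sketch. The orbital speed is the figure quoted above. The separation of 10,000 km is an assumed round number chosen only for illustration (actual separations vary continually, as noted above), and equation 59 is applied as if the separation lay entirely along the direction of motion.

```python
import math

c = 299_792_458.0                  # speed of light, m/s
v = 14_000 * 1000 / 3600           # orbital speed from the text, converted to m/s
theta = math.atanh(v / c)          # rapidity, radians
separation = 10_000e3              # ASSUMED separation along the motion, in meters

dt = math.tanh(theta) * separation / c    # equation 59, solved for the time offset
print(f"rapidity         = {theta * 1e6:.1f} microradians")
print(f"non-simultaneity = {dt * 1e9:.0f} nanoseconds")
```

The point is the order of magnitude, not the exact value: the offset scales linearly with both the rapidity and the assumed separation.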

If the predictions of special relativity were not correct, the GPS operators definitely would have noticed. The GPS system can be considered a rather sensitive check on special relativity.

3.20 Arc Length, Proper Time, and Proper Length

Suppose we bend a wire into the shape shown in figure 45 and hang it so that the y direction is vertical and the x direction is horizontal. Imagine a small bug is crawling along the wire.

Figure 45: Y is Not a Function of X

Any attempt to describe this shape in terms of the slope dy/dx will end in disaster. Clearly y is not a function of x, let alone a differentiable function. The places where the wire is vertical could be loosely described as having infinite slope, but quantifying this would not be worth the trouble, because it is not relevant to the physics. In particular: As the bug crawls along the wire, at each point we can measure the slope of the wire (dy/dx). We can also measure dy/ds, where s is the arc length, measured along the wire.

Near location A, where the wire is horizontal, the slope is zero. Also dy/ds is zero. As the bug crawls along, it does zero work against the gravitational field.
Near location B, where the wire is vertical, the slope is infinite ... but that does not mean the bug must do infinite work against the gravitational field. The bug does not care about dy/dx. The bug is far more interested in dy/ds.

The lesson here is that at location A and location B and everywhere else, the gravitational physics depends more directly on dy/ds than on dy/dx.

The derivative dy/ds is quite well behaved. It is never less than −1 and never greater than +1, as you can infer from figure 47.

Also note that if we rotate the wire, the arc length is unchanged.

Figure 46: X is a Function of Arc Length

Figure 47: Y is a Function of Arc Length

So it is in spacetime. For a particle moving through spacetime, the relevant arc length is the proper time, denoted τ.

We define the 4-velocity as

u := dR/dτ             (63)

where R is the 4-vector position. In some chosen reference system B, we can expand u in terms of components:

u := [dt/dτ, dx/dτ, dy/dτ, dz/dτ]_@B             (64)

Note that dt/dτ will not be equal to 1 ... unless the particle is at rest in the chosen reference frame.

The 4-velocity u stands in contrast to the reduced velocity v, which can be expanded as:

v := [dx/dt, dy/dt, dz/dt]_@B             (65)

It must be emphasized that the reduced velocity is not the spatial part of the 4-velocity. Instead it is the spatial part of the 4-velocity divided by dt/dτ.

3.21 Various Ways to Compute the 4-velocity

There are multiple methods for computing the 4-velocity. Let’s start with the obvious, prosaic method. For any particle with nonzero mass, in some frame F we can write:

u = [Δt/Δτ, Δx/Δτ, Δy/Δτ, Δz/Δτ]_@F             (66)

The RHS of this expression is valid in the chosen frame (F) ... but the 4-velocity (u) is a full-fledged spacetime object that exists unto itself, independent of whatever frames, if any, we choose to use. It is like the ruler in figure 5.

The components of u are particularly simple in any frame that is comoving with the particle, since the coordinate time t is the same as the proper time τ in such a frame:

u = [1, 0, 0, 0]_@comoving             (67)

However, it is interesting and sometimes useful to define the 4-velocity much more abstractly, without mentioning components at all.

Suppose we have a particle moving through spacetime. We assume that the motion can be well approximated, at least locally, as uniform straight-line motion. Attached to the particle is a small light bulb. At point P_A the light bulb turns on, and at point P_B the light bulb turns off. These points in spacetime are called events. They are represented as black dots in figure 48.

Figure 48: Spacetime Events and Displacement Vector

These events are completely generic and abstract. We could, if we wished, choose an origin and draw vectors from the origin to each point, but we don’t need to do that, and if we don’t, the points don’t even qualify as vectors. They’re just generic abstract points.

Given two such points, we can draw the displacement vector D_AB that goes from P_A to P_B. This vector is a well-behaved physical object in spacetime. It is a 4-vector, with a tip P_B and a tail P_A. Just like the ruler in figure 5, this vector is independent of whatever coordinate systems, if any, we choose to use.

We can also talk about the proper time that elapses between the event where the light turns on (P_A) and the event where the light turns off (P_B).

proper time = √(− D_AB·D_AB)             (68)

This allows us to write the 4-velocity as:

u = D_AB / √(− D_AB·D_AB)             (69)

This equation is true no matter what coordinate frame, if any, we choose to use. Let’s be clear: We do not need any coordinate frame in order to evaluate equation 69. All we need is to identify the points P_A and P_B, draw the vector from one to the other, and take the dot product of this vector with itself. We don’t need a coordinate system to do any of those things.

Of course, if we do have a coordinate system, we can express the 4-velocity as

u = ΔP / Δτ
  = (P_B − P_A) / (τ_B − τ_A)             (70)

It is perfectly fine if you want to do it that way, but the point remains that we are not required to do it that way. The worldline of the particle, as it travels from P_A to P_B, is just as real as the ruler in figure 5. For any particle with nonzero mass, the 4-velocity is just as real. It exists as an object in spacetime, independent of whatever coordinate system, if any, we choose to use.
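A computer has to work with components, but we can still use it to illustrate that equation 69 does not care which frame supplies them. The sketch below is a minimal example under stated assumptions: the two events are made-up values, the units have c = 1, the dot product uses the sign convention implied by equation 68 (timelike vectors have a negative dot product with themselves), and the function names are invented for this example.

```python
import math

def dot(a, b):
    """Spacetime dot product, signature (-, +, +, +), components [t, x, y, z]."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

def four_velocity(P_A, P_B):
    """Equation 69: the displacement divided by the proper time of equation 68."""
    D = [b - a for a, b in zip(P_A, P_B)]
    proper_time = math.sqrt(-dot(D, D))    # requires a timelike displacement
    return [d / proper_time for d in D]

def boost_x(p, theta):
    """Boost along x (equation 52), used here only to change the description."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return [ch * p[0] + sh * p[1], sh * p[0] + ch * p[1], p[2], p[3]]

# Two illustrative events, written as [t, x, y, z] in some frame
P_A = [0.0, 0.0, 0.0, 0.0]
P_B = [5.0, 3.0, 0.0, 0.0]
u = four_velocity(P_A, P_B)
print("u     =", u)
print("u . u =", dot(u, u))

# Describe the same two events in a boosted frame and recompute.
theta = 0.4
u_other = four_velocity(boost_x(P_A, theta), boost_x(P_B, theta))
mismatch = max(abs(x - y) for x, y in zip(u_other, boost_x(u, theta)))
print("mismatch between the two routes:", mismatch)
```

Computing u from boosted events gives the same vector as boosting u itself, up to roundoff. That is what it means for u to be an object in spacetime rather than a list of frame-dependent numbers.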

3.22 Classical Velocity, 4-velocity, 4-momentum, et cetera

Recall that the 4-velocity and classical velocity are defined as follows:

u		:=		dR/dτ		(in all frames)
v_@F		:=		dR_xyz/dt		(in some frame F)

(71)

Note the contrast:

On the first line, u (the 4-velocity) is defined in terms of R (the 4-vector position) and τ (the proper time), as mentioned in section 3.21.

On the second line, v (the reduced velocity aka classical velocity) is defined in terms of R_xyz (the projection of R onto the spatial part of the chosen frame F), and t (the projection of R onto the time-axis of that frame), as mentioned in section 3.5.

The 4-velocity is well defined no matter what reference frame – if any – we are using. It is in the same category as the 4-momentum and the ruler shown in figure 5, which exist as physical objects in spacetime.

The classical velocity only makes sense in a particular, chosen reference frame. We cannot even begin to define it except in terms of some frame.

If we do choose a frame, we can expand u and v in terms of components:

u = [Δt/Δτ, Δx/Δτ, Δy/Δτ, Δz/Δτ]_@F     (in some frame F)
v_@F = [Δx/Δt, Δy/Δt, Δz/Δt]_@F     (in some frame F)             (72)

Beware of the following contrast, which can be a trap for the unwary, as discussed in reference 10:

The classical momentum (p_xyz), aka the 3-momentum, is just the spatial part of the 4-momentum (p).

The classical velocity (v) is not the same as the spatial part of the 4-velocity (u_xyz). It is less than that by a factor of Δt/Δτ, as we see in the following equation:

v_@F = u_xyz ÷ (Δt/Δτ)     (in some frame F, assuming m≠0)
     = u_xyz ÷ cosh(θ)
     = u_xyz ÷ γ             (73)

where θ is the rapidity with which the particle is moving relative to the frame F. This factor dt/dτ occurs so commonly in relativity that it has a standard symbol, namely γ (“gamma”). Obviously γ and θ implicitly depend on how fast the particle is moving relative to the chosen frame F.

Gamma is equal to cosh(θ) which is always greater than or equal to 1, which means that |v| is always less than or equal to |u_xyz|, which is why we call v the reduced velocity.
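As a rough numerical illustration of equation 73, the following sketch builds the components of u from a rapidity and recovers the reduced velocity by dividing by γ. It assumes motion along the x axis and units where c = 1; the function names are hypothetical, chosen only for this example.

```python
import math

def four_velocity(theta):
    """Components [dt/dτ, dx/dτ] of the 4-velocity for motion along x (assumes c = 1)."""
    return [math.cosh(theta), math.sinh(theta)]

def reduced_velocity(theta):
    """Classical velocity v = u_x ÷ γ, where γ = dt/dτ is the timelike component of u."""
    u_t, u_x = four_velocity(theta)
    gamma = u_t
    return u_x / gamma

theta = 1.0  # illustrative rapidity
u_t, u_x = four_velocity(theta)
v = reduced_velocity(theta)
print(f"gamma = {u_t:.4f}, u_x = {u_x:.4f}, v = {v:.4f}")
```

Since γ ≥ 1, the printed v never exceeds u_x in magnitude, and it agrees with tanh(θ).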

The status of some interesting velocity-related and momentum-related quantities is summarized in the following table:

		restrictions	spacetime object?	grade
proper time	τ	—	invariant	scalar
mass	m	—	invariant	scalar
4-momentum	p	—	covariant	vector
3-momentum	p_xyz	[#]	no	vector
4-velocity	u	[m]	covariant	vector
spatial part of 4-velocity	u_xyz	[#, m]	no	vector
classical velocity	v	[#]	no	vector

		[#] : requires a frame
		[m] : requires m≠0

Note the three-way contrast:

The classical velocity v requires you to choose a frame, but does not require nonzero mass.

The 4-velocity u requires the particle to have nonzero mass, but does not require you to choose a frame.

More importantly, the 4-momentum p exists always, whether or not you choose a frame, and whether or not the particle has nonzero mass. Therefore it is usually a good practice to think in terms of the 4-momentum (as opposed to 4-velocity or classical velocity).

Anything that can be expressed in terms of 4-momentum
probably should be expressed in terms of 4-momentum.

3.23 Invariance ± Conservation

Let’s consider the scenario shown in figure 49. There are two photons (namely G and H) in a box (B). For the moment, we use the word “photon” to refer to running wave packets; other uses of the word are discussed in section 3.24.

Figure 49: Two Photons in a Box

We use the following notation for the photon properties:

name	G	H
4-momentum	G_∘p	H_∘p
x-momentum	G_∘p_∘1	H_∘p_∘1	(in some specified frame)
	≡ G_∘p_∘x	≡ H_∘p_∘x	(..)
energy	G_∘p_∘0	H_∘p_∘0	(..)
	≡ G_∘E	≡ H_∘E	(..)

(74)

The notation can be read from right to left; for example G_∘p_∘1 can be read as the x-component of the momentum of photon G. This notation is analogous to the “dot qualifier” notation used to specify class membership in object-oriented programming languages such as C++. (If the previous sentence didn’t mean anything to you, don’t worry about it.) This notation gives us a systematic way to specify everything that needs to be specified. This stands in contrast to subscripts, which are often used in unsystematic ways. For example, p_A uses a subscript to denote the momentum of A, while p_x uses a seemingly-equivalent subscript to denote the x-component of the momentum.

In our scenario, the photons do not interact. They do not overlap. They are never at the same place at the same time, and even if they were, they would not interact, because the electromagnetic field is linear. Even if we account for the nonlinearities of quantum electrodynamics – pair production and all that – the interaction between two photons is negligible at ordinary intensities and garden-variety wavelengths. Our photons are constructed so that in the lab frame, they have the same color, and are moving in opposite directions. There is no component of motion in the y or z directions. In other words:

G_∘p	=	[q, +q, 0, 0]_@lab
H_∘p	=	[q, −q, 0, 0]_@lab
B_∘p	=	[2q, 0, 0, 0]_@lab

(75)

for some arbitrary q. We have calculated the total 4-momentum in the box B by simply summing over all the contents of the box. The box is just a box-shaped region of space, bounded by an imaginary dotted line, so its 4-momentum is just the 4-momentum of its contents, nothing more.

It is easy to calculate the mass of our various items, just by taking the dot product of the 4-momentum with itself, in accordance with equation 9.

G_∘m	=	0
H_∘m	=	0
B_∘m	=	2q

(76)

This may be somewhat counterintuitive, but it is the right answer. The mass of every individual item in the box is zero, but the mass of everything together is nonzero. Note that the results in equation 76 are correct in every frame (not just the lab frame).
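The arithmetic behind equations 75 and 76 can be checked with a short sketch. It assumes units where c = 1 and the sign convention p·p = −m² used in this document; the helper names and the value of q are made up for illustration.

```python
import math

def minkowski_dot(a, b):
    """Dot product with signature (-, +, +, +), so that p·p = -m² (assumes c = 1)."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

def mass(p):
    """Invariant mass of a 4-momentum p = [E, px, py, pz]."""
    m_squared = -minkowski_dot(p, p)
    return math.sqrt(max(0.0, m_squared))  # clamp tiny negative rounding errors

def boost_x(p, theta):
    """Components of p in a frame boosted along x by rapidity theta."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return [ch * p[0] - sh * p[1], ch * p[1] - sh * p[0], p[2], p[3]]

q = 1.5  # arbitrary illustrative value
G = [q, +q, 0.0, 0.0]
H = [q, -q, 0.0, 0.0]
B = [g + h for g, h in zip(G, H)]  # the box: sum over its contents

for name, p in (("G", G), ("H", H), ("B", B)):
    print(name, "lab:", mass(p), "boosted:", mass(boost_x(p, 0.7)))
```

The boosted components differ from the lab components, yet each mass comes out the same in both frames, up to rounding.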

Note the contrast:

Mass is invariant with respect to boosts.

Mass is not invariant with respect to lumping items together in groups.

Mass is a Lorentz scalar. That means you can evaluate it in the lab frame or in some other frame that is moving relative to the lab frame, and get the same mass every time.

Mass is not conserved. You may have heard in high-school chemistry class that mass is conserved, but that’s not exactly true. It is approximately conserved in the course of ordinary chemical reactions, but it is not exactly conserved even then.

The 4-momentum p is conserved. In any chosen frame, each and every component of p is separately conserved.

The dot product p·p is not conserved. Recall that p·p = −m².

The scenario shown in figure 49 leads to spectacular non-conservation of mass. At a certain time in the near future, photon G will leave the box, while photon H remains within the box. At this time, the box will become massless. The box will change from m=2q to m=0 ... even though no mass has crossed the boundary! In particular, the decrease in mass inside the box-region will not necessarily be accompanied by an increase in mass in any neighboring region, which would be required (by definition) for conservation. See reference 5 for more about the details of what we mean by conservation.

3.24 Photons at Rest, Or Not

Let’s consider another scenario. In this section we consider the electromagnetic field in a box. This is a real, tangible box with reflective walls (unlike the imaginary box in section 3.23).

The geometry of the box dictates that the EM field will have certain modes, certain standing-wave patterns. We can consider each mode separately. It turns out that the equation of motion for each mode is just the harmonic-oscillator equation.

The harmonic oscillator has a series of stationary states i.e. energy eigenstates. The energy of these stationary states is quantized. There are plenty of non-stationary states that are not quantized, as discussed in reference 13, but for the moment let’s focus attention on the stationary states. Subject to this restriction, the level of excitation of the harmonic oscillator can be expressed in terms of the number of photons. The fact that energy is quantized is synonymous with the fact that the photon number is an integer.

It must be emphasized that the definition of photon used in this section is incompatible with the definition of photon used in section 3.23.

Presently (section 3.24)	Previously (section 3.23)
Standing-wave photons	Running-wave-packet photons
Standing wave can be considered the sum of equal-and-opposite running waves.	
Each mode is monochromatic.	Any finite-sized packet necessarily contains a multitude of different wavelengths.
The standing wave is at rest in the frame of the box. It just stands there.	A running wave cannot be at rest in any frame.
The standing-wave electromagnetic field has nonzero mass, for reasons discussed in section 3.23.	The running-wave electromagnetic field has zero mass.
You can equate this mass to the rest energy, if you dare, in accordance with equation 14.	You cannot talk about rest energy, because the running wave cannot possibly be at rest.

For more discussion about what mass is, see reference 14. For a discussion of misconceptions related to special relativity, see reference 10.

4 Great Quotes

4.1 Galileo : Relativity (1632)

English translation, from reference 15:

Shut yourself up with some friend in the main cabin below decks on some large ship, and have with you there some flies, butterflies, and other small flying animals. Have a large bowl of water with some fish in it; hang up a bottle that empties drop by drop into a wide vessel beneath it. With the ship standing still, observe carefully how the little animals fly with equal speed to all sides of the cabin. The fish swim indifferently in all directions; the drops fall into the vessel beneath; and, in throwing something to your friend, you need throw it no more strongly in one direction than another, the distances being equal; jumping with your feet together, you pass equal spaces in every direction. When you have observed all these things carefully (though doubtless when the ship is standing still everything must happen in this way), have the ship proceed with any speed you like, so long as the motion is uniform and not fluctuating this way and that. You will discover not the least change in all the effects named, nor could you tell from any of them whether the ship was moving or standing still. In jumping, you will pass on the floor the same spaces as before, nor will you make larger jumps toward the stern than toward the prow even though the ship is moving quite rapidly, despite the fact that during the time that you are in the air the floor under you will be going in a direction opposite to your jump. In throwing something to your companion, you will need no more force to get it to him whether he is in the direction of the bow or the stern, with yourself situated opposite. The droplets will fall as before into the vessel beneath without dropping toward the stern, although while the drops are in the air the ship runs many spans. The fish in their water will swim toward the front of their bowl with no more effort than toward the back, and will go with equal ease to bait placed anywhere around the edges of the bowl. 
Finally the butterflies and flies will continue their flights indifferently toward every side, nor will it ever happen that they are concentrated toward the stern, as if tired out from keeping up with the course of the ship, from which they will have been separated during long intervals by keeping themselves in the air. And if smoke is made by burning some incense, it will be seen going up in the form of a little cloud, remaining still and moving no more toward one side than the other. The cause of all these correspondences of effects is the fact that the ship’s motion is common to all the things contained in it, and to the air also. That is why I said you should be below decks; for if this took place above in the open air, which would not follow the course of the ship, more or less noticeable differences would be seen in some of the effects noted.

In the original, from reference 16:

Risserratevi con qualche amico nella maggiore stanza, che sia sotto coverta di alcun gran navilio, e quivi fate d’ aver mosche, farfalle e simili animaletti volanti: siavi anco un gran vaso d’acqua, e dentrovi de’pescetti; sospendasi anco in alto qualche secchiello, che a goccia a goccia vada versando dell’ acqua in un altro vaso di angusta bocca che sia posto a basso; e stando ferma la nave, osservate diligentemente, come quelli animaletti volanti con pari velocità vanno verso tutte le parti della stanza; i pesci si vedranno andar notando inditferentemente per tutti i versi, le stille cadenti entreranno tutte nel vaso sottoposto; e voi gettando all’ amico alcuna cosa, non più gagliardamente la dovrete gettare verso quella parte che verso questa, quando le lontananze sieno eguali; e saltando voi, come si dice, a piè giunti, eguali spazj passerete verso tutte le parti. Osservate che avrete diligentemente tutte queste cose, benchè niun dubbio ci sia che mentre il vascello sta fermo non debbano succeder cosi; fate muover la nave con quanta si voglia velocità: chè (pur che il moto sia uniforme e non fluttuante in qua e in là) voi non riconoscerete una minima mutazione in tutti li nominati effetti; nè da alcuno di quelli potrete comprender se la nave cammina, o pure sta ferma. 
Voi saltando passerete nel tavolato i medesimi spazj che prima; nè perchè la nave si muova velocissimamente, farete maggior salti verso la poppa, che verso la prora, benchè nel tempo che voi state in aria il tavolato sottopostovi scorra verso la parte contraria al vostro salto; e gettando alcuna cosa al compagno, non con più forza bisognerà tirarla per arrivarlo, se egli sarà verso la prora e voi verso poppa, che se voi fuste situati per l’ opposito: le gocciole cadranno come prima nel vaso inferiore senza caderne pur una verso poppa, benchè, mentre la gocciola è per aria, la nave scorra molti palmi; ipesci nella lor acqua non con più fatica noteranno verso la precedente che verso la susseguente parte del vaso; ma con pari agevolezza verranno al cibo posto su qualsivoglia luogo dell’ orlo del vaso; e finalmente le farfalle e le mosche continueranno i lor voli indifferentemente verso tutte le parti; nè mai accederà che si riduchino verso la parete che riguarda la poppa, quasi che fussero stracche in tener dietro al veloce corso della nave, dalla quale per lungo tempo trattenendosi per aria saranno state separate: e se, abbruciando alcuna lagrima d’ incenso, si farà un poco di fumo, vedrassi ascender in alto, e a guisa di nugoletta trattenervisi, e indifferentemente muoversi non più verso questa che quella parte: e di tutta questa corrispondenza d’ efletti ne è cagione l’ esser il moto della nave comune a tutte le cose contenute in essa, e all’aria ancora; che perciò dissi io che si stesse sotto coverta, chè quando si stesse di sopra e nell’aria aperta e non seguace del corso della nave, differenze più e men notabili si vedrebbero in alcuni degli effetti nominati.

4.2 Minkowski : Spacetime (1908)

From now on, space of itself and time of itself

shall sink into mere shadows

and only a kind of union of the two

shall maintain its independence.

Or in the original:

Von Stund’ an sollen Raum für sich und Zeit für sich

völlig zu Schatten herabsinken

und nur noch eine Art Union der beiden

soll Selbständigkeit bewahren.

Hermann Minkowski (1908)

Reference 17

That must be one of the most profound sentences in human history. The notion of time as the fourth dimension is a serious, powerful, quantitative idea. It is not some loose, hand-wavy metaphor. It is not science fiction.

5 Tactics

When using special relativity, very often the first step is to draw the spacetime diagram.

You presumably find it easy to draw a rotated coordinate system, provided it has been rotated in the xy plane, such as we see in figure 41. You have seen thousands upon thousands of rotated objects in your lifetime.

When you get to the point where you have seen thousands of spacetime diagrams, including boosted coordinate systems, you will be able to draw them freehand ... but until then, it is probably easier and better to use prefabricated spacetime graph paper, or to create your own using a computer.

Some prefabricated spacetime graph paper is available online; see e.g. reference 18.

If you want to make your own, here are some suggestions:

Create an ingredients file containing unrotated versions of everything you need: coordinate grids, rulers, clocks, text, et cetera.
Keep a safe copy of this file. You will need it more than once.
For each diagram you wish to create, start by making a copy of the ingredients file.
The drawing program makes it easy to rotate things in the xy plane.
The drawing program makes it almost as easy to rotate things in the tx plane. Here’s one way it can be done. In inkscape, fire up the transform dialog. It can be reached via Menu -> Object -> Transform, or via the Shift+Ctrl+M shortcut. The dialog has tabs for Move, Scale, Rotate, Skew, and Matrix. Boosts can be implemented using the Matrix tab. Set the matrix elements {A, B, C, D} to {cosh(θ), sinh(θ), sinh(θ), cosh(θ)} and apply the transformation.
This results in a quantitatively-correct boost.
Since we have not set the E and F matrix elements, the boosted object will probably get moved to a strange place, so you will have to find it and move it back to wherever it belongs.
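To get numbers to type into the Matrix tab, a small helper can evaluate the four matrix elements for a given rapidity. This is only a sketch: the function name is hypothetical, and the example rapidity (corresponding to a reduced velocity of 0.6 c) is an arbitrary choice.

```python
import math

def boost_matrix(theta):
    """Return the matrix elements (A, B, C, D) for a boost of rapidity theta."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return (ch, sh, sh, ch)

theta = math.atanh(0.6)  # illustrative: rapidity for a reduced velocity of 0.6 c
A, B, C, D = boost_matrix(theta)
print(f"A={A:.6f} B={B:.6f} C={C:.6f} D={D:.6f}")
print("determinant:", A * D - B * C)  # a boost preserves area, so this should be close to 1
```

Because the matrix is symmetric, it does not matter whether the dialog interprets B and C in row-major or column-major order.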

Another suggestion: It is usually better to rotate text using a simple spacelike rotation, rather than a boost, because a boost would give the text a sheared look and make it hard to read. If a coordinate system has undergone a boost of angle θ, its labels should undergo a spacelike rotation of angle atan(tanh(θ)). Note that here we are using the hyperbolic tanh function and the circular atan function. We leave it as an exercise to prove that this is the correct angle.
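A numerical check of the label angle (not a proof) can be sketched as follows. It assumes the boost acts on a unit vector along one axis via the matrix {cosh(θ), sinh(θ), sinh(θ), cosh(θ)}, and compares the direction of the boosted axis with atan(tanh(θ)). The function names are hypothetical.

```python
import math

def label_angle(theta):
    """Spacelike rotation angle (radians) suggested for labels on a boosted axis."""
    return math.atan(math.tanh(theta))

def boosted_axis_angle(theta):
    """Angle of the image of the unit vector (1, 0) under the boost matrix."""
    x, y = math.cosh(theta), math.sinh(theta)  # first column of the matrix
    return math.atan2(y, x)

for theta in (0.0, 0.25, 0.5, 1.0):
    print(theta,
          round(math.degrees(label_angle(theta)), 3),
          round(math.degrees(boosted_axis_angle(theta)), 3))
```

The two columns of angles agree, and the angle stays below 45 degrees no matter how large the rapidity becomes.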

And another: If there is any chance that you will ever want a complex diagram such as figure 31, draw it first. Then if you want a simplified view of the same situation, you can prepare it by copying the complicated drawing and deleting everything you don’t need. The point here is that deleting stuff from a complicated drawing obviously preserves alignment, whereas every time you add stuff to a simple diagram you have to fuss with the alignment.

My diagrams gradually improve over time. I do all my editing on the complicated diagram, and use the makefile mechanism to derive the various simplified views automatically. This reduces my workload while guaranteeing that consistency will be maintained. Hint: You can assign names to graphical objects, which makes it easy for the makefile to select them for deletion.

6 Some Trigonometric Identities – Applied to Relativity

Knowing a few trig identities is useful when thinking about relativity. It is especially useful when reading the literature, because it helps you recognize and simplify some otherwise-scary-looking expressions. Let’s start with the basic Pythagorean identity:

b² + a²		=		c²

(77)

b² − a²		=		c²

(78)

In figure 50, the red bar represents the base b, the blue bar represents the altitude a, and the green curve is a circle representing the locus of constant b² + a².

In figure 51, the red bar represents the base b, the blue bar represents the altitude a, and the green curve is a hyperbola representing the locus of constant b² − a².

In both figures, the small black circles mark angles, from 0 to 1 radian inclusive, in steps of 1/4 radian.


Figure 50: Circular Trigonometry		Figure 51: Hyperbolic Trigonometry

The corresponding trig identity is:

cos² + sin²		=		1

(79)

The corresponding hyperbolic trig identity has an important minus sign:

cosh² − sinh²		=		1

(80)

Let’s be explicit about the correspondences:

[red	,	blue]
[cos	,	sin]
[run	,	rise]

slope = rise/run = sin/cos

(81)

[red	,	blue]
[cosh	,	sinh]
[timelike	,	spacelike]
[energy	,	momentum]
reduced velocity = sinh/cosh

(82)

It is also useful to be able to convert back and forth between trig functions and exponentials. These are particularly useful for deriving the double-angle identities:

e^(iθ)		=		cos(θ) + i sin(θ)
cos(θ)		=		(e^(iθ) + e^(−iθ)) / 2
sin(θ)		=		(e^(iθ) − e^(−iθ)) / (2i)

(83)

e^θ		=		cosh(θ) + sinh(θ)
cosh(θ)		=		(e^θ + e^(−θ)) / 2
sinh(θ)		=		(e^θ − e^(−θ)) / 2

(84)

From these, we can derive lots more identities. We can use these identities to simplify physics problems. For example:

Suppose an object is moving along an upward-sloping path. We are given the slope s. We want to calculate the ratio between the actual length of the path and the ground track, i.e. the projection of the path onto the laboratory x-axis. One reasonable approach is to do it in two steps: Take the arctangent of the slope to find the angle θ, and then take the cosine of θ in the usual way.

Let’s revisit the muon-lifetime experiment discussed in section 3.11. The muon is moving along at a certain velocity relative to the lab frame. We prefer to think of this in terms of its four-velocity u, but alas the Muggle we hired as a lab assistant only measured the reduced velocity v as seen in the lab frame. We want to calculate the ratio between the muon’s actual elapsed time (proper time!) and the projection of its time onto the laboratory t-axis. Recall that this projection factor dt/dτ is conventionally called γ. One reasonable approach is to do it in two steps, as we did in section 3.11: Take the hyperbolic arctangent of the reduced velocity to find the rapidity θ, and then take the hyperbolic cosine of θ in the usual way.

If we do this often enough, we might want a shortcut, i.e. a formula to go from slope to projection-factor in one step. Such a formula is provided by equation 87c. It is easy to derive this formula whenever you need it, as follows:

If we do this often enough, we might want a shortcut, i.e. a formula to go from reduced velocity to gamma-factor in one step. Such a formula is provided by equation 88c. It is easy to derive this formula whenever you need it, as follows:

Let’s recall some terminology. Using the same a, b, and c as in the Pythagorean equation 77, we can express the slope as:

Let’s recall some terminology: The reduced velocity is:

s		=		tan(θ)
		=		a/b

(85)

v		=		c tanh(θ)

(86)

We start with equation 79 and divide through by the first term.

We start with equation 80 and divide through by the first term.

1 + sin²(θ)/cos²(θ)		=		1/cos²(θ)		(87a)
cos(θ)		=		1/√(1 + tan²(θ))		(87b)
cos(atan(a/b))		=		1/√(1 + a²/b²)		(87c)
		=		b/c

1 − sinh²(θ)/cosh²(θ)		=		1/cosh²(θ)		(88a)
cosh(θ)		=		1/√(1 − tanh²(θ))		(88b)
cosh(atanh(v/c))		=		1/√(1 − v²/c²)		(88c)
		=		dt/dτ

We see that the projection factor cos(⋯) is always less than or equal to 1. When the slope is small, the projection factor is unity, and as the slope goes to infinity, the projection factor goes to zero, in accordance with equation 87c.

We see that the projection factor cosh(⋯) is always greater than or equal to 1. When the velocity is small, the projection factor is unity, and as the velocity approaches the speed of light, the projection factor diverges to infinity, in accordance with equation 88c.

Another way of writing the cosine can be obtained by re-arranging equation 77.

Another way of writing the hyperbolic cosine can be obtained by re-arranging equation 78.

cos(θ)		=		√(1 − sin²(θ))
cos(asin(a/c))		=		√(1 − a²/c²)
		=		b/c

(89)

cosh(θ)		=		√(1 + sinh²(θ))
cosh(asinh(|u_xyz|))		=		√(1 + u_xyz²)
		=		dt/dτ

(90)

There is not any deep physics in any of this. These are little more than trigonometric identities. Equation 87c tells us about the cosine of the arctangent, while equation 89 tells us about the cosine of the arcsine.
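A quick numerical comparison of the two-step and one-step routes may help make equations 87c and 88c concrete. This is a sketch with hypothetical function names; the velocity is passed as the dimensionless ratio beta = v/c.

```python
import math

def projection_factor_from_slope(s):
    """Return (two-step, one-step) values of cos(atan(s)), per equation 87c."""
    two_step = math.cos(math.atan(s))
    one_step = 1.0 / math.sqrt(1.0 + s * s)
    return two_step, one_step

def gamma_from_velocity(beta):
    """Return (two-step, one-step) values of cosh(atanh(beta)), per equation 88c; needs |beta| < 1."""
    two_step = math.cosh(math.atanh(beta))
    one_step = 1.0 / math.sqrt(1.0 - beta * beta)
    return two_step, one_step

print(projection_factor_from_slope(0.75))  # both entries close to 0.8, i.e. b/c for a 3-4-5 triangle
print(gamma_from_velocity(0.6))            # both entries close to 1.25
```

Note the sign under the square root: plus for the circular case, minus for the hyperbolic case.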

Beware: All too often, discussions of special relativity have a great many formulas that involve factors of 1/√(1−v²/c²). However, you should avoid this as much as possible. If you are ever tempted to write such a thing, you should consider writing something else instead, something more elegant, something with more direct physical significance, such as γ or cosh(θ) or dt/dτ. Expressing the factor in terms of v puts too much emphasis on v, which is an old-fashioned three-dimensional quantity. You will gain more insight if you express the factor in terms of spacetime quantities such as four-vectors or Lorentz scalars.

If we are interested in momentum, we should always start with the definition in equation 8. Here it is again:

p		=		m u

(91)

That is the best model we have for the physics of the universe we live in, namely the physics of spacetime. Starting from this simple, elegant, powerful formula, we can always make things more complicated and more restricted if necessary. For example, suppose we have a particle (such as a muon) moving through the laboratory. Before it decays, it gets absorbed by something. We know the mass, and our lab assistant has measured the reduced velocity v. We want a one-step formula that tells us how much momentum the particle imparts to the absorber. We can easily derive such a formula:

p		=		m u		(simple and fundamental)		(92a)
		=		m dR/dτ		(definition of velocity)		(92b)
p_xyz		=		m dR_xyz/dτ		(spatial part)		(92c)
		=		m (dt/dτ) (dR_xyz/dt)		(convert proper time to lab time)		(92d)
		=		γ m v		(definition of gamma)		(92e)
		=		cosh(θ) m v		(trig expression for gamma)		(92f)
		=		cosh(atanh(v/c)) m v		(rapidity in terms of velocity)		(92g)
		=		(1/√(1−(v/c)²)) m v		(algebraic form)		(92h)

Equation 92h is useful in specialized situations, but obviously it is messier, less fundamental, and more restricted than equation 91. Here’s the recommended strategy:

You should remember equation 91. It is so simple and so obviously consistent with the grade-school notion of “mass times velocity” that it is hard to forget.
In some situations, you may prefer to re-express things in terms of the reduced velocity v, instead of the four-velocity u. That’s easy to do. Just multiply by a factor of dt/dτ ≡ γ ≡ cosh(θ) i.e. the red bar in figure 51. At this point the formula will presumably look like equation 92e. One could make a good argument for stopping at this point. When you write γ m v, anybody who knows about relativity knows that γ implicitly depends on v, and knows how to calculate it. You can spell out this dependence if you want, as in equation 92g, but you aren’t obliged to.
If you want to express the gamma-factor in terms of velocity, that’s also easy to do. At this point the formula will presumably look like equation 92g. One could make a very good argument for stopping at this point! The equation is as simple and easy to interpret as it’s going to get.
If you want to convert the trigonometric expression in equation 92g to a purely algebraic expression, that’s allowed, although not particularly recommended. It’s easy to do, using trig identities. At this point the formula will presumably look like equation 92h.

For an example of what can go wrong if you skip the first steps in this process, and use equation 92h as your starting point, see reference 10.
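The recommended sequence of steps can be sketched in code, computing the 3-momentum via the rapidity (as in equation 92g) and comparing against the algebraic form (equation 92h). The function names and the 1 kg mass are hypothetical illustration values, and motion is assumed to be along a single axis.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def momentum_via_rapidity(m, v):
    """p_xyz = cosh(atanh(v/c)) m v, following equations 92e through 92g."""
    theta = math.atanh(v / C)   # rapidity from reduced velocity
    gamma = math.cosh(theta)    # gamma = dt/dτ
    return gamma * m * v

def momentum_algebraic(m, v):
    """p_xyz = m v / sqrt(1 - (v/c)²), the algebraic form in equation 92h."""
    return m * v / math.sqrt(1.0 - (v / C) ** 2)

m = 1.0       # kg, illustrative
v = 0.6 * C   # m/s, illustrative
print(momentum_via_rapidity(m, v), momentum_algebraic(m, v))
```

Both routes give the same number; the first keeps the intermediate quantities θ and γ visible, which is the point of the recommended strategy.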

Again: Beware that it is not a good idea to put too much emphasis on expressions involving v. It is better to focus attention on legitimate four-vectors and Lorentz scalars, because they communicate more about what is actually going on in spacetime. If you are given a 3-vector, usually the best strategy is to convert it to the corresponding 4-vector as quickly as possible. Learn to think in four dimensions.

Let’s do one more example: Suppose we know where the particle is initially, and we want to know where it will be a short time later. That’s simple:

ΔR		=		∫ u dτ
		≈		u Δτ		(for constant u, or small-enough Δτ)

(93)

Equation 93 is a clear expression of a simple concept. It is obviously correct, as a corollary of the definition of velocity, equation 63. Here is the definition again:

u		:=		dR/dτ

(94)

As always, the recommended strategy is to remember the simple formulas, namely equation 93 and equivalently equation 63. These are so simple and so obviously consistent with grade-school notions of “distance equals rate times time” that they are hard to forget. You can complexify things later, if the situation warrants.

For example, suppose we want to find where the particle will be a short time later, but for some reason we choose to express this in terms of “time” as measured by laboratory clocks ... not the particle’s proper time. We know the mass, and our lab assistant has measured the three spatial components of the momentum, p_xyz. Note that measuring the momentum is smarter than measuring the velocity, especially if the velocity is near the speed of light.

The physics here is simple, if we think about it in spacetime. We know the four-momentum of the particle in its own rest frame, namely p = [mc, 0, 0, 0]. The momentum is purely timelike in that frame. When the particle is moving relative to the lab frame, the four-momentum gets rotated. A piece of its four-momentum gets projected onto the spacelike directions in the lab frame. This projection is the blue bar in figure 51, as mentioned in equation 82. It is what we measure as the particle’s p_xyz in the lab frame. The relevant projection factor is sinh(θ), as we have seen in equation 49 and elsewhere.

We can use this, plus a trig identity, to obtain a useful expression for gamma:

cosh(θ)		=		dt/dτ
		=		√(1 + sinh²(θ))
		=		√(1 + (u_xyz)²)
		=		√(1 + (p_xyz/mc)²)

(95)

Equation 95 is sometimes useful, because it expresses γ in terms of the 3-momentum p_xyz, which can sometimes be relatively easy to measure. This equation is a cousin to equation 88c, which expresses γ in terms of the reduced velocity v; however, beware that equation 95 has a plus sign inside the square root, whereas equation 88c has a minus sign.
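Equation 95 can be checked numerically against the two-step route through the rapidity. The sketch below uses a hypothetical function name and defaults to units where c = 1; the sample momentum is arbitrary.

```python
import math

def gamma_from_momentum(p_xyz, m, c=1.0):
    """Return (two-step, one-step) values of gamma from the 3-momentum, per equation 95."""
    p_mag = math.sqrt(sum(component ** 2 for component in p_xyz))
    x = p_mag / (m * c)                  # this is sinh(θ)
    two_step = math.cosh(math.asinh(x))  # rapidity first, then cosh
    one_step = math.sqrt(1.0 + x * x)    # plus sign inside the square root
    return two_step, one_step

# A particle with v = 0.6 c has gamma = 1.25, so |p_xyz|/(m c) = 1.25 * 0.6 = 0.75.
print(gamma_from_momentum([0.75, 0.0, 0.0], m=1.0))
```

Unlike the velocity-based formula, this one has no singularity to worry about: any finite momentum gives a finite γ.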

We can apply this idea to the “distance equals rate times time” equation.

ΔR		=		∫ u dτ		(simple and fundamental)		(96a)
		≈		u Δτ		(for constant u, or small-enough Δτ)		(96b)
		=		u (dτ/dt) Δt		(convert from proper time to lab time)		(96c)
		=		(1/γ) u Δt		(definition of gamma)		(96d)
		=		[1/cosh(θ)] u Δt		(another expression for gamma)		(96e)
		=		[1/cosh(asinh(|p_xyz|/mc))] u Δt		(rapidity in terms of momentum)		(96f)
		=		[1/cosh(asinh(|p_xyz|/mc))] (p/m) Δt		(since p = m u)		(96g)
		=		[1/√(1 + (|p_xyz|/mc)²)] (p/m) Δt		(trig identity)		(96h)
ΔR_xyz		=		[1/√(1 + (|p_xyz|/mc)²)] (p_xyz/m) Δt		(spatial part)		(96i)

One could make a good argument for stopping at equation 96d. When you write (1/γ) u Δt, everybody knows that γ is implicitly dependent on the velocity, and knows how to calculate it. You can spell out the dependence if you want to, but you are not obliged to.
One could make an even better argument for stopping at equation 96g (if not earlier). The equation is as simple and as easy to interpret as it’s going to get.

Equation 96i is useful in special situations. Its advantage is that the RHS involves only things that Muggles can measure: three-dimensional momentum, wall-clock time, et cetera. Another alleged advantage is that it involves only algebraic math functions, not transcendental trig functions. The disadvantage is that it is ugly, messy, and hard to remember. This is the penalty you pay for thinking in terms of pre-1908 three-dimensional concepts.

In contrast, equation 93 is a clear expression of a simple concept. It is vastly clearer than equation 96i. It is also 33% more powerful, because it gives us all four spacetime components, not just the three spacelike components. It is the nice, simple, modern (post-1908) way to represent the physics. It is obviously correct, as a corollary of the definition of velocity, equation 63.

In practice, you do not need equation 96i. The recommended alternative is simple: Whenever you get a three-vector, convert it to the corresponding four-vector as soon as possible. Even if you wind up converting back to three-vectors at the end of the calculation, the extra work is negligible, and the advantage in terms of conceptual clarity is overwhelming. Along these lines, note that having an algebraic formula (as in equation 96h) offers no practical advantage over the transcendental trigonometric formula (as in equation 96f). Every “scientific” pocket calculator made in the last 20 or 30 years can do hyperbolic trig functions just as easily as it can do square roots.

In any case, comparing equation 93 to equation 96i tells us a lot about what’s going on. Both have the structure of “distance equals rate times time”. Equation 96i has a factor of 1/γ out front, because we decided to measure wall-clock time (Δt) rather than proper time (Δτ), but other than that, the formulas are the same. If somebody shows you equation 96i by surprise, the main barrier to understanding it is recognizing that the first factor is just a messy way of expressing 1/γ.
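To see the “rate times time” structure at work, here is a sketch that evaluates the spatial displacement from the measured 3-momentum and the lab-clock interval, in the manner of equation 96i. The function name and sample values are hypothetical, and units with c = 1 are assumed by default.

```python
import math

def displacement_from_momentum(p_xyz, m, dt, c=1.0):
    """ΔR_xyz = (1/γ) (p_xyz/m) Δt, with γ = sqrt(1 + (|p_xyz|/(m c))²)."""
    p_mag = math.sqrt(sum(component ** 2 for component in p_xyz))
    gamma = math.sqrt(1.0 + (p_mag / (m * c)) ** 2)
    return [(1.0 / gamma) * (component / m) * dt for component in p_xyz]

# Illustrative values: |p|/(m c) = 0.75 corresponds to gamma = 1.25 and v = 0.6 c.
delta_r = displacement_from_momentum([0.75, 0.0, 0.0], m=1.0, dt=2.0)
print(delta_r)  # the x entry should be close to v * dt = 0.6 * 2.0
```

The first factor is just 1/γ, the second is the spatial part of u, and the third is the lab time, so the result is the same as multiplying the reduced velocity by Δt.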

7 Dirty Laundry

This document takes a modern (post-1908) approach to the subject. Alas, there are a great many other documents in the world that seem to think that the development of relativity began and ended in 1905. This results in some exceedingly confusing concepts, as well as some needlessly ugly equations.

If at all possible, you should avoid exploring the unwise ways of doing things. It just pollutes your brain. You’ve been warned. However, if you dare to ignore this warning, and if you want to see how horrible the un-modern approach can be, see reference 10.

Remember, though: In most cases, the less said about such things, the better. For all practical purposes, there is nothing you need to know about pre-1908 relativity. The modern approach is easier and in every way better.

8 References

: 1.
John Denker,
“The Geometry and Trigonometry of Spacetime”
http://www.av8n.com/physics/spacetime-trig.pdf
: 2.
John Denker,
“Odometers and Clocks in Introductory Relativity”
http://www.av8n.com/physics/odometer.htm
: 3.
John Denker,
“Tabletop Geodesics, General Relativity, and Embedding Diagrams”
http://www.av8n.com/physics/geodesics.htm
: 4.
John Denker,
“Psychrometric Charts, and the Evil of Axes”
http://www.av8n.com/physics/axes.htm
: 5.
John Denker,
“Conservative Flow and the Continuity of World-Lines”
http://www.av8n.com/physics/conservative-flow.htm
: 6.
MathWorld entry: “Equivalence Relation”
http://mathworld.wolfram.com/EquivalenceRelation.html
: 7.
Nobelprize.org, “The Nobel Prize in Physics 1959”
http://nobelprize.org/nobel_prizes/physics/laureates/1959/index.html
: 8.
Wikipedia article, “Bevatron”
http://en.wikipedia.org/wiki/Bevatron
: 9.
John Denker,
“Two Types of Vector : Physics and/or Components”
http://www.av8n.com/physics/two-vector.pdf
: 10.
John Denker,
“Spacetime Dirty Laundry”
http://www.av8n.com/physics/spacetime-dirty-laundry.htm
: 11.
John Denker,
“Acceleration in Spacetime”
http://www.av8n.com/physics/spacetime-acceleration.htm
: 12.
John Denker,
“The Traveling Twins Puzzle”
http://www.av8n.com/physics/twins.htm
: 13.
John Denker,
“Coherent States”
http://www.av8n.com/physics/coherent-states.htm
: 14.
John Denker,
“How to Define Mass”
http://www.av8n.com/physics/mass.htm
: 15.
From reference 16, translated by Stillman Drake.
: 16.
Galileo Galilei,
Dialogo sopra i due massimi sistemi del mondo (1632).
: 17.
H. Minkowski,
“Raum und Zeit”
80. Versammlung Deutscher Naturforscher (Köln, 1908).
Published in Physikalische Zeitschrift 10 104-111 (1909)
and Jahresbericht der Deutschen Mathematiker-Vereinigung 18 75-88 (1909).
http://de.wikisource.org/wiki/Raum_und_Zeit_(Minkowski)
: 18.
John Denker,
“Spacetime Graph Paper”
./spacetime005blue.pdf
./spacetime005red.pdf
./spacetime005redblue.pdf

1: There are some details in figure 8 that depend on Big Idea #2(b), but the details don’t concern us at the moment.
2: More generally, every small rotation does four things, two of which are first order, and two of which are second order in the magnitude of the rotation.
