Copyright © 2003 jsd

1  Introduction; Puzzle

Consider the following puzzle:

Suppose an interstellar rocket starts from rest and accelerates in a straight line such that the passengers feel one Gee for one year. How fast are they going at the end of the year?

There are at least two ways to solve the problem.

The steady acceleration of the rocket is profoundly analogous to a steady rotation. We can use what we know about ordinary rotations to learn some things that are useful for solving this puzzle and many others.

There are some really deep ideas involved, but once you learn to trust the formalism it is easy to remember and easy to use. I don’t even bother to remember the Lorentz transformation in terms of square roots or anything like that – equation 2 is incomparably easier to remember.

This document is also available in PDF format. You may find this advantageous if your browser has trouble displaying standard HTML math symbols.

2  Rapidities and Boosts

We need to introduce two concepts: boosts and rapidities.

Executive summary: This section will demonstrate that rapidities are additive for compound boosts. If you already know what that means, you can skip ahead; go directly to section 3. Otherwise, read on. Consider the following comparison. In each pair of statements below, the statement about rotations comes first and the corresponding statement about boosts comes second:

Suppose Bob’s reference frame is rotated relative to Alice’s. In particular, let it differ by a rotation in the XY plane. This “mixes up” the X and Y coordinates of an object. That is, the X coordinate measured by Alice depends on the X coordinate and the Y coordinate measured by Bob. Similarly, the Y coordinate measured by Alice depends on the Y coordinate and the X coordinate measured by Bob.

Suppose Bob’s reference frame is moving relative to Alice’s. In particular, let it differ by a boost in the X direction. This mixes up the X and T coordinates of an object. The X coordinate measured by Alice will depend on the X coordinate and the T coordinate measured by Bob. Similarly, the T coordinate measured by Alice depends on the T coordinate and the X coordinate measured by Bob – which is what we call the breakdown of simultaneity at a distance.

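The exact expression for the rotation is:

$$
\begin{pmatrix} P_X^{(A)} \\ P_Y^{(A)} \end{pmatrix}
=
\begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix}
\begin{pmatrix} P_X^{(B)} \\ P_Y^{(B)} \end{pmatrix}
\tag{1}
$$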

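The exact expression for the boost is:

$$
\begin{pmatrix} P_T^{(A)} \\ P_X^{(A)} \end{pmatrix}
=
\begin{pmatrix} \cosh\rho & \sinh\rho \\ \sinh\rho & \cosh\rho \end{pmatrix}
\begin{pmatrix} P_T^{(B)} \\ P_X^{(B)} \end{pmatrix}
\tag{2}
$$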

In equation 1, $\phi$ is the angle of rotation. We recognize the rotation matrix for a rotation in the XY plane, namely

$$
R(\phi) := \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix}
\tag{3}
$$

In equation 2, $\rho$ is the rapidity of the boost. We recognize the rotation matrix for a rotation in the XT plane (i.e. a boost), namely

$$
B(\rho) := \begin{pmatrix} \cosh\rho & \sinh\rho \\ \sinh\rho & \cosh\rho \end{pmatrix}
\tag{4}
$$
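As a rough illustration of how the boost matrix of equation 4 mixes the T and X coordinates, here is a minimal Python sketch. The helper names (`boost_matrix`, `apply_matrix`), the choice of units with c = 1, and the sample rapidity of 0.5 are assumptions made for this example, not part of the original discussion.

```python
import math

def boost_matrix(rho):
    """Return the 2x2 boost matrix B(rho) of equation 4 as nested tuples."""
    ch, sh = math.cosh(rho), math.sinh(rho)
    return ((ch, sh),
            (sh, ch))

def apply_matrix(matrix, vec):
    """Multiply a 2x2 matrix by a 2-component column vector (T, X)."""
    (a, b), (c, d) = matrix
    t, x = vec
    return (a * t + b * x, c * t + d * x)

# Two events that Bob considers simultaneous (same T) but separated in X.
# Units are chosen so that c = 1 (an assumption of this sketch).
rho = 0.5
event_1_bob = (0.0, 0.0)   # (T, X) components in Bob's frame
event_2_bob = (0.0, 1.0)

# Equation 2: Alice's components are B(rho) times Bob's components.
event_1_alice = apply_matrix(boost_matrix(rho), event_1_bob)
event_2_alice = apply_matrix(boost_matrix(rho), event_2_bob)

# Alice assigns the two events different times; the gap is sinh(rho) * 1.0.
time_gap_alice = event_2_alice[0] - event_1_alice[0]
print("time gap seen by Alice:", time_gap_alice)
```

The nonzero time gap is the breakdown of simultaneity at a distance: events with the same T for Bob have different T for Alice.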

The slope of Bob’s X-axis relative to Alice’s is given by

$$
\frac{dy}{dx} = m = \tan(\phi)
\tag{5}
$$

For small angles, the slope is approximately equal to the angle (in radians). Equation 1 uses ordinary circular trig functions (sin and cos). The matrix element in the upper-right corner has a minus sign.

The X-component of Bob’s reduced velocity relative to Alice is given by

$$
\frac{dx}{c\,dt} = \frac{v_x}{c} = \tanh(\rho)
\tag{6}
$$

Unlike equation 3, there are no minus signs in equation 4. Equation 2 uses hyperbolic trig functions (sinh and cosh).

We see that a rotation in the XY plane changes the slope, whereas a rotation in the XT plane changes the velocity.

The slope is more directly analogous to the reduced velocity than to the 4-velocity.

The reduced velocity should not be confused with the 4-velocity. The X-component of the 4-velocity is given by:

$$
\frac{dx}{c\,d\tau} = \frac{u_x}{c} = \sinh(\rho)
\tag{7}
$$

For small rapidities, both $v_x$ and $u_x$ are approximately equal to the rapidity (in units where c = 1). It is not at all clear which should be considered “the” velocity.
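To make the comparison between equations 6 and 7 concrete, the following sketch tabulates both quantities for a few rapidities. The function names and the sample rapidities are hypothetical choices for illustration only.

```python
import math

def reduced_velocity(rho):
    """Ordinary (coordinate) velocity in units of c: v/c = tanh(rho), equation 6."""
    return math.tanh(rho)

def reduced_proper_velocity(rho):
    """X-component of the 4-velocity in units of c: u/c = sinh(rho), equation 7."""
    return math.sinh(rho)

# For small rho both agree with rho; for large rho they diverge:
# v/c saturates below 1, while u/c grows without bound.
for rho in (0.001, 0.1, 1.0, 3.0):
    v = reduced_velocity(rho)
    u = reduced_proper_velocity(rho)
    print(f"rho={rho:6.3f}  v/c={v:.6f}  u/c={u:.6f}")
```

The two are related by u = v cosh(ρ), which follows directly from tanh = sinh / cosh.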

If Bob is rotated relative to Alice, and Carol is rotated relative to Bob, the rotation matrix for the Alice → Carol transformation is just the product of the Alice → Bob and Bob → Carol matrices.

If Bob is moving relative to Alice, and Carol is moving relative to Bob, the boost matrix for the Alice → Carol transformation is just the product of the Alice → Bob and Bob → Carol matrices.

If Bob is rotated relative to Alice by a tiny amount, and Carol is rotated relative to Bob 99 times as much (which we construct by compounding lots of tiny rotations), the rotation matrix for the Alice → Carol transformation is just the 100th power of the Alice → Bob matrix. That is,

$$ R(N\epsilon) = [R(\epsilon)]^N \tag{8} $$

  If Bob is moving relative to Alice by a tiny amount, and Carol is moving relative to Bob 99 times as much (which we construct by compounding lots of tiny boosts), the boost matrix for the Alice → Carol transformation is just the 100th power of the Alice → Bob matrix. That is,

$$ B(N\epsilon) = [B(\epsilon)]^N \tag{9} $$

The rotation matrix for a tiny angle is:

$$
R(\epsilon) = \begin{pmatrix} 1 & -\epsilon \\ \epsilon & 1 \end{pmatrix}
\tag{10}
$$

You can compute this as a direct consequence of equation 1, or you can take this as a starting point – the next few paragraphs will re-derive equation 1 starting from equation 10 and equation 8.

The boost matrix for a tiny rapidity is:

$$
B(\epsilon) = \begin{pmatrix} 1 & \epsilon \\ \epsilon & 1 \end{pmatrix}
\tag{11}
$$

You can compute this as a direct consequence of equation 2, or you can take this as a starting point – the next few paragraphs will re-derive equation 2 starting from equation 11 and equation 9.

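The tiny rotation can be written in the suggestive form

$$ R(\epsilon) = \exp(\epsilon L_{xy}) = 1 + \epsilon L_{xy} \tag{12} $$

where

$$
L_{xy} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}
\tag{13}
$$

is called the generator of rotations in the XY plane.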

The tiny boost can be written in the suggestive form

$$ B(\epsilon) = \exp(\epsilon L_{tx}) = 1 + \epsilon L_{tx} \tag{14} $$

where

$$
L_{tx} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\tag{15}
$$

is called the generator of rotations in the XT plane (i.e. boosts in the X direction).

Note: If you are not familiar with taking e to the power of a matrix, just write out $e^x$ as a power series, and let $x$ be a matrix. Then you’re all set. It is straightforward to raise $x$ to integer powers, just by repeated multiplication. Powers of $L_{xy}$ and $L_{tx}$ are particularly easy to compute.

By compounding N tiny rotations, we obtain:

$$ R(N\epsilon) = \exp(N\epsilon L_{xy}) = (1 + \epsilon L_{xy})^N \tag{16} $$

where $N\epsilon$ is not necessarily small, even though $\epsilon$ is small.

By compounding N tiny boosts, we obtain:

$$ B(N\epsilon) = \exp(N\epsilon L_{tx}) = (1 + \epsilon L_{tx})^N \tag{17} $$

where $N\epsilon$ is not necessarily small, even though $\epsilon$ is small.
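The compounding of tiny boosts can be checked numerically. The sketch below raises the tiny-boost matrix of equation 11 to the Nth power and compares the result with cosh and sinh of the total rapidity. The helper names (`matmul`, `matpow`), the total rapidity of 1, and the step count are assumptions chosen for this example; agreement is only approximate because ε is small but not infinitesimal.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def matpow(m, n):
    """Raise a 2x2 matrix to a non-negative integer power by repeated squaring."""
    result = ((1.0, 0.0), (0.0, 1.0))   # identity matrix
    while n > 0:
        if n & 1:
            result = matmul(result, m)
        m = matmul(m, m)
        n >>= 1
    return result

rho = 1.0              # total rapidity, N * eps
n_steps = 2 ** 20      # number of tiny boosts (hypothetical choice)
eps = rho / n_steps

tiny_boost = ((1.0, eps), (eps, 1.0))     # equation 11
compound = matpow(tiny_boost, n_steps)    # equation 9

print("upper-left :", compound[0][0], " cosh(rho):", math.cosh(rho))
print("upper-right:", compound[0][1], " sinh(rho):", math.sinh(rho))
```

Replacing the off-diagonal entries by `-eps` and `+eps` (equation 10) gives the analogous check for rotations against cos and sin.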

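Since any angles $\phi_1$ and $\phi_2$ can be written as multiples of $\epsilon$, we can always write

$$ R(\phi_1) = \exp(\phi_1 L_{xy}) \tag{18} $$

Since any boosts $\rho_1$ and $\rho_2$ can be written as multiples of $\epsilon$, we can always write

$$ B(\rho_1) = \exp(\rho_1 L_{tx}) \tag{19} $$

and likewise for the second rotation

$$ R(\phi_2) = \exp(\phi_2 L_{xy}) \tag{20} $$

and for the second boost

$$ B(\rho_2) = \exp(\rho_2 L_{tx}) \tag{21} $$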

Exercise: calculate the Nth power of $L_{xy}$ for all N from 0 to 5. Then weight each one by $\phi^N/N!$ and add them up, matrix-element-by-matrix-element. It is interesting how the power series for sine and cosine emerge in the proper places, confirming that equation 18 is consistent with equation 3. We see that exponentials are intimately related to sines and cosines. This is reminiscent of Euler’s formula, $\exp(i\theta) = \cos(\theta) + i \sin(\theta)$, but using real-valued matrices rather than complex numbers.
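For readers who would like to check the exercise numerically, here is one possible sketch. It sums the terms N = 0 to 5 of the power series for the matrix exponential and compares the result with the circular and hyperbolic functions. The function name `exp_series` and the sample value 0.3 are assumptions of this example; with only six terms the agreement is approximate.

```python
import math

L_XY = ((0.0, -1.0),
        (1.0,  0.0))    # generator of rotations, equation 13
L_TX = ((0.0, 1.0),
        (1.0, 0.0))     # generator of boosts, equation 15

def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def exp_series(generator, param, n_terms):
    """Sum the first n_terms terms of the power series for exp(param * generator)."""
    total = ((0.0, 0.0), (0.0, 0.0))
    power = ((1.0, 0.0), (0.0, 1.0))    # generator to the 0th power
    weight = 1.0                         # param**0 / 0!
    for n in range(n_terms):
        total = tuple(
            tuple(total[i][j] + weight * power[i][j] for j in range(2))
            for i in range(2)
        )
        power = matmul(power, generator)  # next power of the generator
        weight *= param / (n + 1)         # next weight param**(n+1) / (n+1)!
    return total

phi = 0.3
rotation = exp_series(L_XY, phi, 6)   # terms N = 0 .. 5, as in the exercise
print(rotation[0][0], math.cos(phi))  # upper-left approximates cos(phi)
print(rotation[1][0], math.sin(phi))  # lower-left approximates sin(phi)

rho = 0.3
boost = exp_series(L_TX, rho, 6)
print(boost[0][0], math.cosh(rho))    # upper-left approximates cosh(rho)
print(boost[0][1], math.sinh(rho))    # upper-right approximates sinh(rho)
```

The only difference between the two cases is the sign in the generator: the square of $L_{xy}$ is minus the identity, which produces alternating signs, while the square of $L_{tx}$ is plus the identity.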

Since it is particularly easy to multiply exponentials (just add the exponents), we find that

$$
\begin{aligned}
R(\phi_1)\,R(\phi_2) &= \exp[(\phi_1 + \phi_2) L_{xy}] \\
                     &= R(\phi_1 + \phi_2)
\end{aligned}
\tag{22}
$$

and since it is particularly easy to multiply exponentials (just add the exponents), we find that

$$
\begin{aligned}
B(\rho_1)\,B(\rho_2) &= \exp[(\rho_1 + \rho_2) L_{tx}] \\
                     &= B(\rho_1 + \rho_2)
\end{aligned}
\tag{23}
$$

This proves something that everybody takes for granted, namely that angles are additive for compound rotations (in the same plane). This is exceedingly useful. This is why angles were invented.

This proves something that might not have been 100% obvious a few minutes ago, namely that the rapidities are additive for compound boosts (in the same direction). This is exceedingly useful. This is why rapidities were invented.

Angles are much better behaved than, say, slopes. If A is sloping relative to B, and B is sloping relative to C, the compound slope is not the sum of the contributions (except when all slopes are small).

Rapidities are much better behaved than, say, velocities. If A is moving relative to B, and B is moving relative to C, the compound velocity is not the sum of the contributions (except when all velocities are small).
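The contrast between additive rapidities and non-additive velocities can be illustrated with a short sketch. It composes two collinear velocities by converting each to a rapidity, adding, and converting back. For comparison it also evaluates the standard textbook velocity-addition formula, which is not stated in this document but follows from the identity for tanh of a sum. The function names and sample velocities are hypothetical.

```python
import math

def compose_velocities(v1, v2):
    """Compose two collinear reduced velocities (units of c) by adding rapidities."""
    return math.tanh(math.atanh(v1) + math.atanh(v2))

def textbook_addition(v1, v2):
    """Standard relativistic velocity-addition formula, shown for comparison."""
    return (v1 + v2) / (1.0 + v1 * v2)

# Large velocities: the compound velocity is well below the naive sum of 1.2.
print(compose_velocities(0.6, 0.6), textbook_addition(0.6, 0.6))

# Small velocities: the compound velocity is very nearly the naive sum.
print(compose_velocities(1e-4, 2e-4), 1e-4 + 2e-4)
```

Working in rapidities keeps the bookkeeping to a plain sum, and the conversion to velocity is needed only once at the end.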

3  Solution of the Puzzle

Returning to the puzzle posed in section 1: After one second, the passengers have picked up a velocity of 9.80 meters per second. That means they have a rapidity of 9.80 m s$^{-1}$ / c, since for small boosts the rapidity is equal to the velocity (in the appropriate units). Because rapidities are additive, every subsequent second of the passengers’ time adds the same increment of rapidity. There are just over $3\times10^7$ seconds in a year, so at the end of the year the rapidity is about $3\times10^8$ m s$^{-1}$ / c. But since c is about $3\times10^8$ m s$^{-1}$, the rapidity is ρ ≈ 1. That’s a remarkable coincidence: The earth’s surface gravity times the earth’s year is about 1, in the appropriate units.

To answer the puzzle: After one year, the ordinary speed is |v| = c tanh(ρ) = c tanh(1) ≈ 0.76 c.

[After two years, it would be c tanh(2) ≈ 0.96 c.]
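The arithmetic above can be repeated without rounding the rapidity to exactly 1. The sketch below is one way to do so; the constant names, the use of a Julian year of 365.25 days, and the function names are assumptions of this example.

```python
import math

G = 9.80                      # acceleration felt by the passengers, m/s^2
C = 299_792_458.0             # speed of light, m/s
YEAR = 365.25 * 24 * 3600     # seconds in a Julian year (assumed definition)

def rapidity_after(years):
    """Rapidity accumulated after the given proper time at one Gee."""
    return G * years * YEAR / C

def speed_after(years):
    """Ordinary speed, as a fraction of c, after the given proper time."""
    return math.tanh(rapidity_after(years))

for years in (1, 2):
    print(f"{years} yr: rapidity = {rapidity_after(years):.4f}, "
          f"|v|/c = {speed_after(years):.4f}")
```

With unrounded inputs the one-year rapidity comes out slightly above 1, so the computed speeds are slightly higher than the values obtained from tanh(1) and tanh(2).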

4  Accelerated Reference Frames

One thing about this puzzle that sometimes throws people for a loop is that it involves an accelerated reference frame. A key part of the statement is that the passengers feel an acceleration of one Gee. That is an observation made in an accelerated reference frame. Most people have relatively little experience working in accelerated reference frames, and they are justifiably uncertain as to which laws of physics can be relied on in such a frame. Some can, and some can’t.

Let me coin a scale of complexity or sophistication:

Level 1 := Newtonian mechanics
Level 2 := Special relativity, as it is usually presented
Level 3 := General relativity

Then I would say that accelerated reference frames are at level 2.1 or some such – more sophisticated than the usual introductory SR discussion, but certainly not requiring the heavy-duty machinery of GR (curved spacetime and all that).

I remember being bothered by accelerated frames for a day or so, back when I was a student. At first, nobody offered an explanation of why SR should work for accelerated reference frames. The discussion consisted of asking “why the heck shouldn’t it work” and asserting that there’s no reason why it shouldn’t. That was completely unconvincing, because there are some laws of physics that do not work in accelerated frames. In particular, the three laws of motion (in their usual basic form) don’t work in rotating reference frames; you need more complex laws that include centrifugal effects and Coriolis effects.

Eventually, the real explanation emerged: The trick is to arrange for a succession of instantaneously comoving unaccelerated observers. We know SR works for each observer separately. We then arrange to have enough observers at the right places at the right velocities, so that they can observe the action. Afterwards, we collect all their observations and integrate them.

By the correspondence principle, everything (including a modest acceleration) will look Newtonian to the instantaneously comoving observers, so life is easy for them.
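The succession of comoving observers can be imitated numerically. In the sketch below, each observer sees the rocket gain a small Newtonian velocity during a short interval of the passengers’ time, and successive observers are stitched together with the standard relativistic velocity-composition formula (equivalent to adding rapidities, though not written out in this document). The function name, the units with c = 1, and the step count are assumptions of this example.

```python
import math

def final_speed_by_comoving_steps(accel, proper_time, n_steps):
    """
    Step through a chain of instantaneously comoving inertial observers.

    In each observer's frame the rocket starts at rest and gains a small
    Newtonian velocity dv = accel * d_tau. The rocket's velocity relative
    to the starting frame is then updated by composing v with dv using
    (v + dv) / (1 + v * dv), in units where c = 1.
    """
    d_tau = proper_time / n_steps
    dv = accel * d_tau
    v = 0.0
    for _ in range(n_steps):
        v = (v + dv) / (1.0 + v * dv)
    return v

# Units with c = 1, and accel * proper_time = 1, i.e. a final rapidity of 1.
stepped = final_speed_by_comoving_steps(accel=1.0, proper_time=1.0, n_steps=100_000)
print(stepped, math.tanh(1.0))
```

As the number of steps grows, the stepped result approaches tanh of the total rapidity, in agreement with the solution in section 3.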


1
Throughout this document, the term “velocity” will refer to the ordinary velocity, also called the coordinate velocity, v = dx/dt, not to be confused with the proper velocity, u = dx/dτ.