In thermodynamics, it is common to have a large number of variables that are not all linearly independent. Such a situation is illustrated in figure 1.
The idea is that the thermodynamic state of the system is described by a point in some abstract D-dimensional space, but we have more than D variables that we are interested in. Figure 1 portrays a two-dimensional space (D=2), with three variables. You can usually choose D of them to form a linearly-independent basis set, but the rest of them will be linearly dependent, because of various constraints (the equation of state, conservation laws, boundary conditions, or whatever).
In such a situation, if we stay away from singularities, there is no important distinction between “independent” variables and “dependent” variables. Some people say you are free to choose any set of D nonsingular variables and designate them as your “independent” variables ... but usually that’s not worth the trouble, and – as we shall see shortly – it is more convenient and more logical to forget about “independent” versus “dependent” and treat all variables on the same footing.
Singularities can occur in various ways. A familiar example can be found in the middle of a phase transition, such as an ice/water mixture. In a diagram such as figure 1, a typical symptom would be contour lines running together, i.e. the spacing between lines going to zero somewhere.
See reference 1 for an overview of the laws of thermodynamics. Many of the key results in thermodynamics can be nicely formulated using expressions involving the d operator, such as:
| dE = T dS − P dV (1) |
In order to make sense of this, we need to know what kind of things are dE, dS, T dS, et cetera. We would like to be able to visualize them. It turns out that the best way to think about such things is in terms of differential forms in general and one-forms in particular. The details of how to deal with differential forms are explained in section 3.
But before we get into details, let’s look at some examples.
Consider some gas in a piston. The number of moles of gas remains fixed. We can use the variables T and V to specify where we are in the state space of the system. (Other variables work fine, too, but let’s use those for now.)
Figure 2 shows dV as a function of state. (See reference 1 for what we mean by “function of state”.) Obviously dV is a rather simple one-form. It is in fact a constant everywhere. It denotes a uniform slope up to the right of the diagram. Contours of constant V run vertically in the diagram.
Similarly, figure 3 shows dT as a function of state. This, too, is constant everywhere. It indicates a uniform slope up toward the top of the page. Contours of constant T run left-to-right in the diagram.
Note that the diagram of dT is also a diagram of dE, because for an ideal gas, E is just proportional to T.
Things get more interesting in figure 4, which shows dP as a function of state. (We temporarily assume we are dealing with an ideal gas.) Since dP is the gradient of something, we call it a grady one-form, in accordance with the definition given in item 17. We can see that dP is not a constant. It gets very steep when the temperature is high and/or the gas is squeezed into a small volume. For an ideal gas, the contours of constant P are rays through the origin. For a non-ideal gas, the figure would be qualitatively similar but would differ in details.
The one-forms dS, dT, dV, and dP are all grady one-forms, so you can integrate them globally, without specifying the path along which the integral is taken. When these variables take on the values implied by figure 4, if you integrate them “by eye” you can see that T is large along the top of the diagram, V is large along the right edge, and P is large when the temperature is high and/or the volume is small.
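As a rough numerical companion to figure 4, here is a minimal sketch that evaluates the components of dP in the (dV, dT) basis, assuming the ideal gas law P = nRT/V. The function names and the sample states are illustrative choices of ours, not anything taken from the figures.

```python
# Components of the one-form dP in the (dV, dT) basis.
# Assumption: ideal gas, P = n R T / V.
R_GAS = 8.314462618  # molar gas constant, J/(mol K)

def pressure(V, T, n=1.0):
    """Ideal-gas pressure for n moles at volume V and temperature T."""
    return n * R_GAS * T / V

def dP_components(V, T, n=1.0):
    """Return (dP/dV at constant T, dP/dT at constant V)."""
    dP_dV = -n * R_GAS * T / V**2
    dP_dT = n * R_GAS / V
    return (dP_dV, dP_dT)

# Same temperature, but the second state is squeezed into half the volume.
roomy = dP_components(V=1.0, T=300.0)
squeezed = dP_components(V=0.5, T=300.0)
print("dP at V=1.0:", roomy)
print("dP at V=0.5:", squeezed)

# Contours of constant P are rays through the origin of the (V, T) plane:
# scaling V and T by the same factor leaves P unchanged.
print(pressure(1.0, 300.0), pressure(2.0, 600.0))
```

Both components grow in size when the volume shrinks, which is the "steepness" visible in the figure. We compare components rather than a single length, because (V,T) space has no natural metric.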
Mathematicians have a name for this d operator, namely the exterior derivative. But if that doesn’t mean anything to you, don’t worry about it. For more information about such things, see reference 2 and reference 3.
Here’s a point that is just a technicality now, but will be important later: These diagrams are meant to portray the one-forms directly. They portray the corresponding scalars T, V, and P only indirectly.
Figure 5 shows the difference between a grady one-form and an ungrady one-form.
| As you can see on the left side of the figure, the quantity dS is grady. If you integrate clockwise around the loop as shown, the net number of upward steps is zero. This is related to the fact that we can assign an unambiguous height (S) to each point in (T,S) space. | In contrast, as you can see on the right side of the diagram, the quantity TdS is not grady. If you integrate clockwise around the loop as shown, there are considerably more upward steps than downward steps. There is no hope of assigning a height “Q” to points in (T,S) space. |
Be warned that in the mathematical literature, what we are calling ungrady one-forms are called “inexact” one-forms. The two terms are entirely synonymous. A one-form is called “exact” if and only if it is the gradient of something. We avoid the terms “exact” and “inexact” because they are too easily misunderstood. In particular, in this context,
- exact is not even remotely the same as accurate.
- inexact is not even remotely the same as inaccurate.
- inexact does not mean “plus or minus something”.
- exact just means grady. An exact one-form is the gradient of some potential.
Remark: The idea of representing one-forms in terms of overlapping “fish scales” is not restricted to drawings. It is possible to arrange napkins or playing-cards in a loop such that each one is tucked below the next in clockwise order. This provides literally a hands-on model of an inexact one-form. Counting “steps up” minus “steps down” along a path is a model of integrating along the path.
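The step-counting picture can be mimicked numerically. The following is only a sketch under our own assumptions: the rectangular loop, its corner values, and the function names are made up for illustration. It adds up "steps" of dS and of T dS around a closed loop in (T,S) space.

```python
def loop_integral(form, corners, steps_per_side=1000):
    """Integrate a one-form around a closed polygon in (T, S) space.

    form(T, S) returns (coefficient of dT, coefficient of dS).
    Each side is chopped into small steps; the form is evaluated at the
    midpoint of each step and contracted with the step.
    """
    total = 0.0
    n = len(corners)
    for k in range(n):
        (T0, S0), (T1, S1) = corners[k], corners[(k + 1) % n]
        dT = (T1 - T0) / steps_per_side
        dS = (S1 - S0) / steps_per_side
        for j in range(steps_per_side):
            T = T0 + (j + 0.5) * dT
            S = S0 + (j + 0.5) * dS
            a, b = form(T, S)
            total += a * dT + b * dS
    return total

def dS_form(T, S):
    return (0.0, 1.0)   # the grady one-form dS

def TdS_form(T, S):
    return (0.0, T)     # the ungrady one-form T dS

# Hypothetical loop: T between 300 and 400, S between 1 and 2.
loop = [(300.0, 1.0), (300.0, 2.0), (400.0, 2.0), (400.0, 1.0)]

print("loop integral of dS:  ", loop_integral(dS_form, loop))
print("loop integral of T dS:", loop_integral(TdS_form, loop))
```

The first integral vanishes (up to rounding), while the second comes out with magnitude equal to the area of the loop, 100 in these units. Its sign depends on which way around the loop you go.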
You may be wondering what is the relationship between the d operator as seen in equation 1 (the exterior derivative) and the plain old differential d that appears in the corresponding equation in your grandfather’s thermo book:
| dE = T dS − P dV (2) |
The answer goes like this: Traditionally, dE has been called a “differential” and interpreted as a small change in E resulting from some unspecified small step in state space. It’s hard to think of dE as being a function at all, let alone a function of state, because the step is arbitrary. The magnitude and direction of the step are unspecified.
In contrast, the one-form dE (written with the exterior derivative d) is to be interpreted as a machine that says: If you give me a vector that precisely specifies the direction and magnitude of a step in state space, I’ll give you the resulting change in E. If we apply this machine to an uncertain input we will get an uncertain output. But that doesn’t mean that the machine is arbitrary. The machine itself is completely non-arbitrary. The machine is a function of state.
By way of analogy: An ordinary matrix M is a machine that says: If you give me an input vector I, I will give you an output vector O, namely O=(M I). When talking about M, we have several choices:
This analogy is very tight. Indeed, at every point in state space, dE can be represented by a non-square matrix (one row and two columns, assuming our state space can be spanned by two variables such as V and T).
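To make the matrix picture concrete, here is a small sketch in which dE is a row of two numbers and a step in state space is a column of two numbers. We assume one mole of a monatomic ideal gas, so that E = (3/2) R T; that choice, the sample state, and the names below are ours, for illustration only.

```python
R_GAS = 8.314462618      # molar gas constant, J/(mol K)
C_V = 1.5 * R_GAS        # assumption: one mole of monatomic ideal gas

def dE(V, T):
    """The one-form dE at the state (V, T), as a row (dE/dV, dE/dT).

    For an ideal gas E depends on T only, so the dV entry is zero.
    """
    return (0.0, C_V)

def contract(one_form, step):
    """Row vector (1 x D) acting on a column vector (D x 1): a plain number."""
    return sum(w * v for w, v in zip(one_form, step))

# A definite step in state space: delta-V in cubic meters, delta-T in kelvin.
step = (0.002, 0.5)
change_in_E = contract(dE(0.024, 300.0), step)
print("change in E for this step:", change_in_E, "J")
```

The machine `dE` is fixed once the state is given; only the step fed into it is up to you.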
Operationally, you can (as far as I know) re-interpret every equation in thermodynamics, replacing the old-style differential d by the exterior derivative d. All we are doing is shifting attention away from the output of the machine (the differential) onto the machine itself (the one-form). This has several advantages and no known disadvantages. The main advantage is that we have replaced a vague thing with a non-vague thing. The machine dE is a function of state, as are the other machines dP, dS, et cetera. We can draw pictures of them.
Any legitimate equation involving the differential d has a corresponding legitimate equation involving the exterior derivative d. Of course, if you start with a bogus equation and replace one d with the other, it’s still bogus, as discussed in section 2. The formalism of differential forms may make the pre-existing errors more obvious, but you mustn’t blame it for causing the errors. Noticing an error is not the same as causing an error.
The notion of grady versus ungrady is not quite the same in the two formalisms: It makes perfect sense to talk about grady and ungrady one-forms. In contrast, as mentioned in section 1.2, it’s hard to talk about an ungrady differential, since if it’s ungrady, it’s not a differential at all, i.e. it’s not the gradient of anything.
Let’s forget about thermo for a moment, and let’s forget about one-forms. Let’s talk about plain old vector fields. In particular, imagine pressure as a function of position in (x,y,z) space. The pressure gradient is a vector field. I hope you agree that this vector field is perfectly well defined. There is a perfectly real vector at each (x,y,z) point.
A troublemaker might try to claim “the vector is merely a list of three numbers whose numerical values depend on the choice of basis, so the vector is really uncertain, not unique.” That’s a bogus argument. That’s not how we think of the physics. As explained in reference 4, we think of a physical vector as being more real than its components. The vector is a machine which, given a basis, will tell you the numerical values of its components. The components are non-unique, because they depend on the basis, but we attach physical reality to the vector, not the components.
The pressure gradient is a vector field. As we shall see in detail in section 3, there are two different kinds of vectors, leading to two perfectly good ways of representing the pressure gradient:
If you believe that the field of pointy vectors representing the pressure gradient is unique and well-defined, you ought to believe that the field of one-forms representing the same pressure gradient is equally unique and well-defined.
Given a nice Cartesian metric, in any basis the three numbers representing the pointy vector are numerically equal to the three numbers representing the one-form.
Returning to thermo: Let’s not leave behind all our physical and geometrical intuition when we start doing thermo. Thermo is weird, but it’s not so weird that we have to forget everything we know about vectors.
One-forms are vectors. They are as real as the more-familiar pointy vectors. To say the same thing another way, row vectors are just as real as column vectors.
If you think the pressure gradient dP is real and well-defined when P is a function of (x,y,z) you should think it is just as real and just as well-defined when P is a function of (V,T).
In addition to nice expressions such as equation 1, we all-too-often see dreadful expressions such as
| T dS = dQ (3) |
As will be explained below, T dS is a perfectly fine one-form, but it is not a grady one-form, and therefore it cannot possibly equal dQ or d(anything), except in very special circumstances, in which case one must not write anything like equation 3 without explicitly explaining the circumstances.
The same goes for P dV and many similar quantities that show up in thermodynamics. They cannot possibly equal d(anything) – except in very special circumstances.
Trying to find Q such that T dS would equal dQ is equivalent to trying to find the height of the water in an Escher waterfall, as shown in figure 6. It just can’t be done.
Of course, T dS does exist. You can call it almost anything you like, but you can’t call it dQ or d(anything). If you want to integrate T dS along some path, you must specify the precise path.
Again: P dV makes perfect sense as an ungrady one-form, but trying to write it as dW is tantamount to saying
There is no such thing as a W function, but if it did exist, and if it happened to be differentiable, then its derivative would equal P dV.
What a load of double-talk! Yuuuck!
We define differential forms to have the following properties:
| (4) |
for arbitrary scalar-valued functions f, g, et cetera. So we are using the set {[dxi]} as a basis.
| There exist pointy vectors, which are relatively familiar to most people. They can be represented by an arrow with a tip and a tail. In the language of linear algebra, these are column vectors. | There exist one-forms, which are less familiar to most people. They can be represented by contour-lines and/or fish-scales. In the language of linear algebra, these are row vectors. |
As we shall see, pointy vectors and one-forms have quite a few properties in common, but there are also some crucial differences, so be careful. In ordinary Cartesian (x,y,z) space, there is a one-to-one correspondence between pointy vectors and one-forms, but in thermodynamics (V,T) space we will not have any way of converting 1-forms to pointy vectors, nor any way of finding a 1-form that uniquely “corresponds” to a given pointy vector.
| $df(x_1, x_2, \cdots) = \sum_i \left.\frac{\partial f}{\partial x_i}\right|_{x_{j \ne i}} [dx_i]$ (5) |
where in the ith term of the sum, the partial derivative holds constant all the arguments to f() except for the xi argument. The notation for this is clumsy, but the idea is important. The partial derivative is really a directional derivative in a direction specified by holding constant an entire set of variables except for one … so it is crucial to know the entire set, not just the one variable that is nominally being differentiated with respect to. For details on this, including ways to visualize what it means, see reference 5.
An example is shown in figure 7. The intensity of the shading depicts the height of the function F := sin(x1)sin(x2) while the contour-lines depict the exterior derivative dF.
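As a sanity check on equation 5, the sketch below computes the components of dF for F = sin(x1) sin(x2) in two ways: from the analytic partial derivatives, and from finite differences in which one argument is varied while the other is held constant. The function names, step size, and sample point are our own choices.

```python
import math

def F(x1, x2):
    return math.sin(x1) * math.sin(x2)

def dF_exact(x1, x2):
    """Components of dF in the basis ([dx1], [dx2])."""
    return (math.cos(x1) * math.sin(x2), math.sin(x1) * math.cos(x2))

def dF_numeric(x1, x2, h=1e-6):
    """Same components by central differences.

    Each partial derivative varies exactly one argument and holds the
    other one constant, as equation 5 requires.
    """
    d1 = (F(x1 + h, x2) - F(x1 - h, x2)) / (2 * h)
    d2 = (F(x1, x2 + h) - F(x1, x2 - h)) / (2 * h)
    return (d1, d2)

point = (0.7, 1.9)
print("exact:  ", dF_exact(*point))
print("numeric:", dF_numeric(*point))
```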
Suppose we want to visualize the gradient of some landscape. If you visualize the gradient as a pointy vector, it points uphill. In many cases, though, you are better off visualizing the gradient as a one-form, corresponding to contour lines that run across the slope.
You can judge the magnitude of the 1-form according to how closely packed the contour lines are. Closely-packed contours represent a large-magnitude 1-form. To say the same thing the other way, the spacing between contours is inversely related to the magnitude of the one-form.
Contour lines have the wonderful property that they behave properly under a change of coordinates: if you take a landscape such as the one in figure 7 and stretch it horizontally (keeping the altitudes the same) as shown in figure 8, the slopes become less. The contour lines on the corresponding topographic map spread out by the same stretch factor, as they should, to represent the lesser slope. In contrast, if you try to represent the gradient by pointy vectors, the representation is completely broken by a change in coordinates. As you stretch the map, the pointy vector doesn’t stretch; it has to get shorter to represent the lesser slope. If you want to represent a gradient, pointy vectors aren’t nearly so well-behaved as 1-forms; they aren’t attached to the real landscape the way contour lines are.
Of course, pointy vectors are needed also; they are appropriate for representing the location of one point relative to another in this landscape. These location vectors do stretch as they should when we stretch the map.
| | pointy vector | one-form |
| Example: | distance | slope |
| Represented by: | column vector | row vector |
| When we stretch the map: | gets bigger | gets smaller |
| Adjective: | contravariant | covariant |
| Dirac notation: | ket |⋯⟩ | bra ⟨⋯| |
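The stretching behavior summarized in the table can be checked with a few lines of arithmetic. This is a sketch under simple assumptions: the map is stretched by a factor s along the first coordinate only, and the slope and step values are invented for the example.

```python
def contract(one_form, vector):
    """One-form (row) acting on a pointy vector (column)."""
    return sum(w * v for w, v in zip(one_form, vector))

def stretch_vector(vector, s):
    """Displacement components after stretching the first coordinate by s."""
    return (s * vector[0], vector[1])

def stretch_one_form(one_form, s):
    """Slope components after the same stretch: the first one shrinks by s."""
    return (one_form[0] / s, one_form[1])

slope = (0.3, -0.2)   # hypothetical gradient one-form (rise per unit distance)
step = (4.0, 1.0)     # hypothetical displacement vector
s = 2.0               # horizontal stretch factor

rise_before = contract(slope, step)
rise_after = contract(stretch_one_form(slope, s), stretch_vector(step, s))
print("rise before stretch:", rise_before)
print("rise after stretch: ", rise_after)
```

The displacement components get bigger, the slope components get smaller, and the altitude gained along the step comes out the same either way, as it must since stretching the map does not change the altitudes.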
| dxi ∧ dxj = −dxj ∧ dxi (6) |
for all (i, j).
Non-grady force fields are common in the real world. See reference 6 for more about how to visualize such things.
A conspicuously ungrady form w is shown in figure 9. You can imagine that w = PdV (“work”) in a slightly-idealized heat engine. The form points everywhere counterclockwise. This w is a perfectly fine 1-form, but you cannot write w = dW because w cannot be the slope of any potential W. The concept of slope is locally well-defined, and you can integrate the slope along a particular path from A to B, but you cannot use this integral to define a potential difference W(B) − W(A) because the result depends very much on which path you choose. This is like Escher’s famous “Waterfall” shown in figure 6.
To repeat: You are free to write w = PdV. That is a perfectly fine 1-form, well-defined at every point in the phase space of the system. In contrast, you should be leery of writing w = dW or PdV = dW, because that cannot be well-defined throughout the phase space. (You might be able to define something like that on a one-dimensional subspace, along a particular path through the system, but then you would need to decorate “W” with all sorts of subscripts to indicate exactly which subspace you are talking about.)
A more subtle example of ungrady form is discussed in item 21 below.
| $d(A \wedge B) = dA \wedge B + (-1)^k A \wedge dB$ (7) |
where A has grade=k.
| dd = 0; (8) |
This important result can be expressed in words: “the boundary of a boundary is zero”.
Even though d when applied to a scalar function produces the ordinary first derivative, you should not think of d as the general-purpose derivative operator when applied to non-scalars, and dd is certainly not the general-purpose second-derivative operator. In fact, according to equation 7, dd would be the antisymmetric piece of the second derivative – except that the second derivative can never have an antisymmetric piece, because of the mathematically-guaranteed symmetry of mixed partial derivatives:
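| $\frac{\partial^2 f}{\partial x_i\, \partial x_j} \equiv \frac{\partial^2 f}{\partial x_j\, \partial x_i}$ (9) |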
for all f, assuming the derivatives exist.
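Here is a rough numerical illustration of dd = 0 in two dimensions. For a one-form a dx1 + b dx2, the exterior derivative has a single component, the coefficient of dx1 ∧ dx2, equal to ∂b/∂x1 − ∂a/∂x2. The sketch estimates that coefficient by finite differences, first for dF with F = sin(x1) sin(x2), then for a made-up swirling form −x2 dx1 + x1 dx2 that is not the gradient of anything. All names here are hypothetical.

```python
import math

def wedge_coefficient(form, x1, x2, h=1e-5):
    """Coefficient of dx1^dx2 in d(form).

    form(x1, x2) returns (a, b), meaning the one-form a dx1 + b dx2.
    """
    db_dx1 = (form(x1 + h, x2)[1] - form(x1 - h, x2)[1]) / (2 * h)
    da_dx2 = (form(x1, x2 + h)[0] - form(x1, x2 - h)[0]) / (2 * h)
    return db_dx1 - da_dx2

def dF(x1, x2):
    """Exterior derivative of F = sin(x1) sin(x2): a grady one-form."""
    return (math.cos(x1) * math.sin(x2), math.sin(x1) * math.cos(x2))

def swirl(x1, x2):
    """Hypothetical ungrady one-form: -x2 dx1 + x1 dx2."""
    return (-x2, x1)

print("d(dF) coefficient:   ", wedge_coefficient(dF, 0.7, 1.9))
print("d(swirl) coefficient:", wedge_coefficient(swirl, 0.7, 1.9))
```

The first number is zero to within rounding, because the two mixed partial derivatives cancel. The second is 2 everywhere, so the swirling form is not closed.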
Forms that are closed, including figure 7 and figure 10, have the property that the “contour” lines in one region mesh nicely with the lines in adjacent regions. In a non-closed form such as figure 9, the meshing fails somewhere. (Commonly it fails everywhere.)
Beware that this notion of “closed one-form” is not equivalent to the notion of “closed set” (containing its limit points) nor to the notion of “closed manifold” (compact without boundary). See reference 7 and reference 8.
| $\int_A^B dF = F(B) - F(A)$ (10) |
The meaning is simple: the integral measures the number of contours that you cross in going from point A to point B. For a grady 1-form, this number is independent of the path you take along the way from A to B.
This integral is, of course, linear.
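Path independence is easy to test numerically. The sketch below integrates dF, with F = sin(x1) sin(x2) as in figure 7, from A to B along two different routes and compares both answers with F(B) − F(A). The endpoints, the two routes, and the helper names are arbitrary choices of ours.

```python
import math

def F(x1, x2):
    return math.sin(x1) * math.sin(x2)

def dF(x1, x2):
    return (math.cos(x1) * math.sin(x2), math.sin(x1) * math.cos(x2))

def integrate_along(form, path, n=20000):
    """Integrate a one-form along path(t), t in [0, 1], by small steps."""
    total = 0.0
    prev = path(0.0)
    for k in range(1, n + 1):
        p = path(k / n)
        mid = ((p[0] + prev[0]) / 2, (p[1] + prev[1]) / 2)
        a, b = form(*mid)
        total += a * (p[0] - prev[0]) + b * (p[1] - prev[1])
        prev = p
    return total

A = (0.2, 0.3)
B = (1.1, 1.4)

def straight(t):
    """Straight line from A to B."""
    return (A[0] + t * (B[0] - A[0]), A[1] + t * (B[1] - A[1]))

def dogleg(t):
    """Move in x1 only for the first half, then in x2 only."""
    if t < 0.5:
        return (A[0] + 2 * t * (B[0] - A[0]), A[1])
    return (B[0], A[1] + (2 * t - 1) * (B[1] - A[1]))

print("F(B) - F(A):   ", F(*B) - F(*A))
print("straight route:", integrate_along(dF, straight))
print("dogleg route:  ", integrate_along(dF, dogleg))
```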
| $B = f_i(x)\, dx_i$ (11) |
We are using the Einstein summation convention, i.e. implied summation over repeated indices, such as index i in this equation.
As explained in section 4, the integral of this is:
| $\int_C B = \int_C f_i(x)\, dx_i$ (12) |
To understand how we integrate a one-form B along the curve C, start by breaking the curve into small segments and integrating each segment separately:
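| $\int_C B = \int_{C(\theta_1)}^{C(\theta_2)} f_i(x)\, dx_i + \int_{C(\theta_2)}^{C(\theta_3)} f_i(x)\, dx_i + \cdots$ (13) |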
and if f is a sufficiently smooth function and if C is a sufficiently smooth curve, and if the points {θ1, θ2, ⋯} are sufficiently close together, then we can treat f as being locally constant and pull it out front of the integrals:
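| $\int_C B \approx f_i(C(\theta_1)) \int_{C(\theta_1)}^{C(\theta_2)} dx_i + f_i(C(\theta_2)) \int_{C(\theta_2)}^{C(\theta_3)} dx_i + \cdots$ (14) |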
Now we have grady forms inside the integral, so we can integrate them immediately using equation 10. We get
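| $\int_C B \approx f_i(C(\theta_1))\, [C_i(\theta_2) - C_i(\theta_1)] + f_i(C(\theta_2))\, [C_i(\theta_3) - C_i(\theta_2)] + \cdots$ (15) |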
where we have described the point C(θ) using an expansion in terms of the basis vectors:
| $C(\theta) = C_i(\theta)\, x_i$ (16) |
Equation 15 is beginning to look like a familiar Riemann integral. In fact it is just
| $\int_C B = \int f_i(C(\theta))\, \frac{\partial C_i}{\partial \theta}\, d\theta$ (17) |
In equation 17, do not think of the integrand as a dot product, even though it involves the same sum-of-products you would use for evaluating f · ∂C/∂θ. We do not have a dot product. The operation here is a contraction. A contraction involves a one-form acting on a pointy vector. In this case the one-form is f and the pointy vector is ∂C/∂θ. In equation 15, you can visualize [Ci(θ2) − Ci(θ1)] as a pointy vector with its tip at C(θ2) and its tail at C(θ1).
We can carry out the contraction of a one-form with a pointy vector. We cannot carry out the dot product of two one-forms, nor the dot product of two pointy vectors. Think of one-forms as 1×D matrices (one row and D columns) and pointy vectors as D×1 matrices.
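The recipe just described (contract the one-form with the tangent vector ∂C/∂θ, multiply by dθ, and add up) can be sketched in a few lines. The one-form used here, −x2 dx1 + x1 dx2, is a hypothetical counterclockwise-swirling form chosen for simplicity; it is not meant to reproduce equation 18. The function names and the radius are likewise our own.

```python
import math

def integrate_one_form(f, C, dC_dtheta, theta_a, theta_b, n=10000):
    """Riemann-sum version of the path integral of a one-form.

    f(x1, x2)     -> components of the one-form (a row).
    C(theta)      -> point on the curve.
    dC_dtheta(th) -> tangent vector (a column) at that point.
    Each term is a contraction of row with column, times d-theta.
    """
    total = 0.0
    dtheta = (theta_b - theta_a) / n
    for k in range(n):
        theta = theta_a + (k + 0.5) * dtheta
        row = f(*C(theta))
        column = dC_dtheta(theta)
        total += sum(w * v for w, v in zip(row, column)) * dtheta
    return total

R = 2.0  # radius of the circular path

def circle(theta):
    return (R * math.cos(theta), R * math.sin(theta))

def circle_tangent(theta):
    return (-R * math.sin(theta), R * math.cos(theta))

def swirl(x1, x2):
    """Hypothetical ungrady one-form: -x2 dx1 + x1 dx2."""
    return (-x2, x1)

result = integrate_one_form(swirl, circle, circle_tangent, 0.0, 2 * math.pi)
print("integral around the circle:", result)
print("2 pi R^2 for comparison:   ", 2 * math.pi * R**2)
```

For this particular form the contraction equals R² at every point of the circle, so the integral around the loop is 2πR², which is not zero. That is another way of seeing that the form is ungrady.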
As an example, consider integrating the one-form
| $f := (\cdots)\, dx_1 + (\cdots)\, dx_2$ (18) |
where $r := \sqrt{x_1^2 + x_2^2}$. This one-form is depicted, with fair accuracy, in figure 9. We wish to integrate it along a curve C which is a circular path of radius R, centered on the origin, so that along C:
| $C_1(\theta) = R \cos\theta, \quad C_2(\theta) = R \sin\theta$ (19) |
Plugging in to equation 17 we find
| (20) |
(beware: at some points this assumes the existence of a dot product.)