
# 12  Spontaneity, Reversibility, and Equilibrium

## 12.1  Fundamental Notions

### 12.1.1  Equilibrium

See section 9.1 for a discussion of the fundamental concept of equilibrium.

Figure 12.1: Equilibrium – Forces in Balance

### 12.1.2  Stability

See section 9.1 for a discussion of the fundamental concept of stability.

Figure 12.2: Equilibrium and Stability

### 12.1.3  Irreversible by State or by Rate

Consider the contrast:

• In figure 12.3, there is no attempt to make the process reversible. The descent of the anvil is grossly dissipative. You can tell how much energy is dissipated during the process just by looking at the initial state and the final state.
• In figure 12.4, the process is very nearly reversible. There will probably always be “some” friction, but we may be able to engineer the bearing so that the friction is small, maybe even negligible. Typically, to a good approximation, the power dissipation will be second order in the rate of the process, and the total energy dissipated per cycle will be first order in the rate.

• We can call the first case “irreversible by state” and say that the amount of dissipation is zeroth order in the rate (i.e. independent of the rate).
• We can call the second case “irreversible by rate” and say that the amount of dissipation is first order in the rate.

Figure 12.3: Irreversible by State

Figure 12.4: Irreversible by Rate
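
The scaling claims above can be checked numerically. Here is a minimal sketch, assuming a viscous drag force proportional to velocity (so the dissipated power is second order in the rate); it confirms that the energy dissipated per cycle is first order in the rate:

```python
import math

def energy_per_cycle(omega, c=1.0, n=100_000):
    """Energy dissipated per cycle by a viscous drag force F = c*v,
    for sinusoidal motion x(t) = sin(omega*t).
    The instantaneous dissipated power P = c*v**2 is second order
    in the rate omega; one cycle takes a time 2*pi/omega."""
    dt = (2 * math.pi / omega) / n
    total = 0.0
    for i in range(n):
        v = omega * math.cos(omega * i * dt)
        total += c * v * v * dt
    return total

# Doubling the rate doubles the energy dissipated per cycle
# (analytically, the integral is c*pi*omega: first order in omega).
ratio = energy_per_cycle(2.0) / energy_per_cycle(1.0)
```

Note that the peak power quadruples when the rate doubles; it is the shorter cycle time that brings the per-cycle dissipation down to first order.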

### 12.1.4  Transformations, One-Dimensional or Otherwise

It is often interesting to ask, in a given situation, whether the situation is unstable, i.e. expected to change spontaneously – and if so, to ask in which direction it is expected to change. Let’s start by examining the idea of “direction” in thermodynamic state-space.

For starters, consider reactions involving carbon, oxygen, and carbon dioxide. Under some conditions we have a simple reaction that proceeds in the direction suggested by equation 12.1, which for the moment we choose to call the “forward” direction:

 C + O2 → CO2              (12.1)

Meanwhile, under other conditions the reverse reaction occurs, i.e. the decomposition of carbon dioxide to form carbon and oxygen.

More generally, however, we need to consider other possibilities, such as the possible presence of carbon monoxide.

 C + O2 → CO + ½O2 → CO2              (12.2)

This is now a multi-dimensional situation, as shown schematically in figure 12.5.

Figure 12.5: Reaction Space with Entropy Contours

Consider the contrast:

• In one dimension, we can speak of a given transformation proceeding forward or backward. Forward means proceeding left-to-right as written in equation 12.1, while backward means proceeding right-to-left.
• In multiple dimensions, we cannot speak of forward, backward, right, or left. We need to specify in detail what sort of step is taken when the transformation proceeds.

Terminology note: In this document, the term “transformation” is meant to be very general, including chemical reactions and phase transitions among other things. (We do not consider it useful, or even possible, to distinguish “chemical processes” from “physical processes”.)

### 12.1.5  Conditionally Allowed and Unconditionally Disallowed

Entropy and the second law of thermodynamics will play a central role in our analysis of spontaneity and reversibility.

Keep in mind that the second law is only one law among many. Other laws include conservation of energy (aka the first law of thermodynamics), other conservation laws, various symmetries, spectroscopic selection rules, mathematical theorems, et cetera.

A process can proceed only if it complies with all the laws. Therefore if a process is forbidden by one of the laws, it is unconditionally forbidden. In contrast, if the process is allowed by one of the laws, it is only conditionally allowed, conditioned on compliance with all the other laws.

Based on a second-law analysis alone, we can determine that a process absolutely will not proceed spontaneously, in situations where doing so would violate the second law. In contrast, a second-law analysis does not allow us to say that a process “will” proceed spontaneously. Until we do a more complete analysis, the most we can say is that it might proceed spontaneously.

We are on much firmer ground when it comes to reversibility. In the context of ordinary chemical reactions,1 if we know2 the reaction can proceed in one direction, it is reversible if and only if it is isentropic. That’s because of the reversibility of all the fundamental laws governing such reactions, except the second law. So if there is no barrier to the forward reaction, there should be no barrier to the reverse reaction, other than the second law.

Terminology note: If the world were logical, an irreversible reaction would be called exentropic (in analogy with exergonic and similar terms) … but usually people just say “entropic” instead of “exentropic”.

### 12.1.6  General Analysis

I am reminded of the immortal words of David Goodstein. In reference 36, the section on “Variational Principles in Thermodynamics” begins by saying

Fundamentally there is only one variational principle in thermodynamics. According to the Second Law, an isolated body in equilibrium has the maximum entropy that physical circumstances will allow. However, given in this form, it is often inconvenient to use.

It is usually more practical to extremize something like the energy, free energy, enthalpy, or free enthalpy, depending on circumstances. However, you should keep in mind that any such thing is only a proxy for the thing you really should be extremizing, namely the entropy.

This is meant to be a review article, not a tutorial, so we will turn the normal pedagogical sequence on its head and start with the most general and most reliable formulation of the problem. Later we will discuss various simplified versions, and show why (subject to suitable restrictions) they are useful approximations.

Here’s the fundamental, essential criterion: A transformation will not proceed in any direction that reduces the amount of total entropy. This is essentially a statement of the second law of thermodynamics. (The second law, and the definition of entropy, are discussed in chapter 2.)

 ΔStotal ≥ 0              (12.3)

To say the same thing the other way: Any step that produces a positive amount of total entropy is irreversible in the thermodynamic sense.

Suppose a transformation can proceed in a direction that increases the total entropy. Then the transformation in all likelihood will proceed in such a direction, in preference to directions that leave the total entropy unchanged.

In thermodynamic equilibrium, no such direction exists. The system might already be in a configuration of globally maximal entropy ... or it might be restricted by huge activation barriers, symmetries, selection rules, or other laws of physics such that it cannot move toward higher entropy at any non-negligible rate.

This can be expressed as a variational principle. It says that at equilibrium, you can take a small step in any direction, and to first order the entropy doesn’t change. There are standard procedures for using variational principles as the basis for analytical and computational techniques. These techniques are often elegant and powerful. The details are beyond the scope of this document.

Note that the spontaneity and reversibility criteria are based on the total entropy (Stotal), which consists of the entropy (S) inside the region of interest plus whatever entropy, if any, leaked out of the region across the boundaries during the transformation.

To quantify which directions correspond to increasing total entropy, it suffices to look at the exterior derivative, dStotal. This is a vector; in particular it is a one-form, as opposed to a pointy vector. As such, it is best visualized as a set of contours, namely contours of constant Stotal, as shown schematically in figure 12.5. (For details on one-forms and their application to thermodynamics, see reference 4.)

In equation 12.3, there are two possibilities:

• The system may sit at a point of maximal Stotal (thermodynamic equilibrium), or it may skate along a contour of constant Stotal (a reversible transformation), in which case

 dStotal = 0              (12.4)

• The system may step across the contours in any direction that increases Stotal (an irreversible transformation), in which case

 dStotal > 0              (12.5)

This completes the general analysis. This is the whole story. This must be correct, since it is a direct consequence of the law of paraconservation of entropy. For an explanation of this law, see section 2.1.

## 12.2  Example: Heat Transfer

Suppose we have an object at a temperature T2 and we want to transfer some so-called “heat” to an object at some lesser temperature T1. As usual, trying to quantify “heat” is a losing strategy; it is easier and better to formulate the analysis in terms of energy and entropy.

Under the given conditions we can write

 dE1 = T1 dS1        dE2 = T2 dS2              (12.6)

In more-general situations, there would be other terms on the RHS of such equations, but for present purposes we require all other terms to be negligible compared to the T dS term. This requirement essentially defines what we mean by heat transfer or equivalently thermal transfer of energy.

By conservation of energy we have

 dE1 + dE2 = 0        dE1 = − dE2              (12.7)

One line of algebra tells us that the total entropy of the world changed by an amount

 dS1 + dS2 = (1/T1 − 1/T2) dE1              (12.8)

or equivalently

 dS1 + dS2 = [(T2 − T1) / (T1 T2)] dE1 > 0        (since T2 > T1 and dE1 > 0)              (12.9)

From the structure of equation 12.9 we can see that a thermal transfer from the hotter object to the cooler object gives rise to an increase in the entropy of the world. Therefore such a transfer can proceed spontaneously. A transfer in the other direction cannot proceed spontaneously, since that would violate the second law of thermodynamics.
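
Equation 12.8 is easy to evaluate numerically. A minimal sketch (the temperatures and the amount of energy are illustrative numbers, not taken from the text):

```python
def total_entropy_change(dE1, T1, T2):
    """Equation 12.8: dS1 + dS2 = (1/T1 - 1/T2) * dE1,
    where dE1 is the energy (in joules) received thermally by the
    cooler object at temperature T1 from the hotter object at T2."""
    return (1.0 / T1 - 1.0 / T2) * dE1

# 100 J flowing from a 400 K object to a 300 K object creates
# entropy, so the transfer can proceed spontaneously.
dS = total_entropy_change(100.0, T1=300.0, T2=400.0)   # ≈ +0.083 J/K
# The reverse transfer would destroy entropy, so it cannot proceed.
dS_reverse = total_entropy_change(-100.0, T1=300.0, T2=400.0)
```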

We can also see that

• If the two temperatures are very close together, a thermal transfer of energy can be very nearly reversible.
• Conversely, if the temperature difference is large, the transfer creates a lot of entropy, and is therefore strongly irreversible, strongly dissipative.

The reversible case will be rather slow. This can be understood in terms of the small temperature difference in conjunction with a finite thermal conductivity. In practice, people usually accept a goodly amount of inefficiency as part of the price for going fast. This involves engineering tradeoffs. We still need the deep principles of physics, but we need engineering on top of that.

## 12.3  Carnot Efficiency Formula

Reasoning along lines similar to section 12.2, we can derive a famous formula that places limits on the efficiency of any so-called heat-engine. We will define the notion of efficiency in such a way that it applies to any thermodynamic cycle, not just the Carnot cycle described in section 7.7.

### 12.3.1  Definition of Heat Engine

Figure 12.6 is a sketch of such an engine. The details don’t matter. The key concept is that the heat engine, by definition, has three connections, highlighted by magenta labels in the figure.

Figure 12.6: Heat Engine

Even though this is called a heat engine, trying to quantify the “heat” is a losing strategy. It is simpler and in every way better to quantify the energy and the entropy.

We could write the change in the energy content of the engine using equation 12.10, including all the terms on the RHS. However, as part of our definition of what we mean by heat engine, we require that only the terms shown in magenta are significant.

 dE = T3 dS3 − F3 · dX3 + ⋯     (connection #3)
    + T2 dS2 − F2 · dX2 + ⋯     (connection #2)
    + T1 dS1 − F1 · dX1 + ⋯     (connection #1)
(12.10)

More specifically, we require that connection #3 be 100% thermal, connection #2 be 100% mechanical i.e. nonthermal, and connection #1 be 100% thermal. The engine is designed to segregate the thermal energy-transfers from the mechanical energy-transfers.

We have just constructed a theoretical model of an engine. This is a remarkably good model for a wide range of real-world engines. However, it must be emphasized that it does not apply to all engines. In particular, it does not apply to batteries or to electrochemical fuel cells.

We must impose one more restriction: We require that the engine be able to operate in a cycle, such that at the end of the cycle, after doing something useful, all the internal parts of the engine return to their initial state. In particular, we require that the engine not have a “hollow leg” where it can hide unlimited amounts of entropy for unlimited amounts of time. This requirement makes sense, because if entropy could be hidden, it would defeat the spirit of the second law of thermodynamics.

Without loss of generality we assume that T3 ≥ T1. There is no loss of generality, because the engine is symmetrical. If necessary, just relabel the connections to make T3 ≥ T1. Relabeling is just a paperwork exercise, and doesn’t change the physics.

### 12.3.2  Analysis

The engine starts by taking in a certain amount of energy via connection #3. We might hope to convert all of this heat-energy to useful work and ship it out via connection #2, but we can’t do that. The problem is that because connection #3 is a thermal connection, when we took in the energy, we also took in a bunch of entropy, and we need to get rid of that somehow. We can’t get rid of it through connection #2, so the only option is to get rid of it through connection #1.

For simplicity, we assume that T3 and T1 are constant throughout the cycle. We also make the rather mild assumption that T1 is greater than zero. Section 12.3.3 discusses ways to loosen these requirements.

Under these conditions, pushing entropy out through connection #1 costs energy. We call this the waste heat:

 waste heat = − ∫Γ T1 dS1              (12.11)

where the path Γ represents one cycle of operating the engine.

We can now use conservation of energy to calculate the mechanical work done by the engine:

 work out = ∫Γ F2 · dX2              (12.12a)
          = ∫Γ T3 dS3 + ∫Γ T1 dS1              (12.12b)
          = T3 ΔS3 + T1 ΔS1              (12.12c)

where on the last line we have used the fact that the temperatures are unchanging, which allows us to do the entropy-integrals. The deltas implicitly depend on the path Γ.

The second law tells us that over the course of a cycle, the entropy going out via connection #1 (−ΔS1) must be at least as much as the entropy coming in via connection #3 (+ΔS3). For a reversible engine, the two are equal, and we can write:

 ΔS3 = +ΔS▸        ΔS1 = −ΔS▸              (12.13)

where ΔS▸ is pronounced “delta S through” and denotes the amount of entropy that flows through the engine, in the course of one cycle. We define the efficiency as

 η := (mechanical transfer out) / (thermal transfer in)              (12.14)

Still assuming the temperatures are greater than zero, the denominator in this expression is just T3 ΔS3. Combining results, we obtain:

 η := (T3 ΔS3 + T1 ΔS1) / (T3 ΔS3)              (12.15)

The maximum efficiency is obtained for a thermodynamically reversible engine, in which case

 ηrev = (T3 ΔS▸ − T1 ΔS▸) / (T3 ΔS▸)              (12.16a)
      = (T3 − T1) / T3              (12.16b)

where equation 12.16b is the famous Carnot efficiency formula, applicable to a reversible heat engine. The meaning of the formula is perhaps easier to understand by looking at equation 12.16a or equation 12.15, wherein the second term in the numerator is just the waste heat, in accordance with equation 12.11.

Let us now consider an irreversible engine. Dissipation within the engine will create extra entropy, which will have to be pushed out via connection #1. This increases the waste heat, and decreases the efficiency η. In other words, equation 12.16b is the exact efficiency for a reversible heat engine, and an upper bound on the efficiency for any heat engine.
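
The efficiency formulas are simple enough to encode directly. A minimal sketch of equations 12.15 and 12.16b; the temperatures and entropy amounts in the usage lines are illustrative:

```python
def carnot_efficiency(T3, T1):
    """Equation 12.16b: efficiency of a reversible heat engine taking
    in heat at temperature T3 and rejecting waste heat at T1."""
    return (T3 - T1) / T3

def heat_engine_efficiency(T3, dS3, T1, dS1):
    """Equation 12.15: eta = (T3*dS3 + T1*dS1) / (T3*dS3).
    Here dS3 > 0 is the entropy taken in and dS1 < 0 is the entropy
    pushed out; for an irreversible engine, -dS1 > dS3."""
    return (T3 * dS3 + T1 * dS1) / (T3 * dS3)

# Reversible engine between 600 K and 300 K: the two agree (eta = 0.5).
eta_rev = heat_engine_efficiency(600.0, 1.0, 300.0, -1.0)
# Internal dissipation forces 10% extra entropy out the cold side,
# increasing the waste heat and dropping eta below the Carnot value.
eta_irrev = heat_engine_efficiency(600.0, 1.0, 300.0, -1.1)
```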

### 12.3.3  Discussion

1.
If you were wondering whether it is possible to construct even one thermodynamic cycle that complies with all the restrictions in section 12.3.2, fear not. The job can be done using a Carnot cycle, as described in section 7.7. On the other hand, as mentioned in that section, the Carnot cycle often gets more attention than it deserves.

2.
Furthermore, the Carnot efficiency formula often gets more attention than it deserves.
• For one thing, not all engines are heat engines. The Carnot efficiency formula, equation 12.16b, applies only to heat engines, as defined by the magenta terms in equation 12.10.

As a specific, important counterexample, a battery or an electrochemical fuel cell can have an efficiency enormously greater than what you would guess based on equation 12.16b ... even in situations where the chemicals used to run the fuel cell could have been used to run a heat engine instead.

• Consider the following scenario: Some guy buys a bunch of coal. He uses it to boil some steam at temperature T3. He uses that to drive a steam engine. He uses river water to cool the condenser at temperature T1. He manages to operate the engine in such a way that its efficiency is close to the Carnot efficiency. He is very pleased with himself, and runs spots on television advertising how efficient he is. “I’m as efficient as the laws of physics allow, other things being equal.”

The problem is, he’s asking the wrong question. Rather than asking how well he is doing relative to the Carnot efficiency, he should be asking how well he is doing relative to the best that could be done using the same amount of fuel.

Specifically, in this scenario, the guy next door is operating a similar engine, but at a higher temperature T3. This allows him to get more power out of his engine. His Carnot efficiency is 30% higher. His actual efficiency is only 20% higher, because there are some unfortunate parasitic losses. So this guy is not running as close to the Carnot limit as the previous guy. Still, this guy is doing better in every way that actually matters.

[This assumes that at least part of the goal is to minimize the amount of coal consumed (for a given amount of useful work), which makes sense given that coal is a non-renewable resource. Similarly part of the goal is to minimize the amount of CO2 dumped into the atmosphere. The competing engines are assumed to have comparable capital cost.]

Suppose you are trying to improve your engine. You are looking for inefficiencies. Let’s be clear: Carnot efficiency tells you one place to look ... but it is absolutely not the only place to look.
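
To make the coal-burning scenario concrete, here is a sketch with purely hypothetical numbers (none of these figures appear in the text):

```python
# Hypothetical numbers for the two competing steam engines.
eta_carnot_A = 0.40                  # first guy's Carnot limit
eta_actual_A = 0.38                  # he runs very close to that limit
eta_carnot_B = 1.3 * eta_carnot_A    # neighbor's limit, 30% higher
eta_actual_B = 1.2 * eta_actual_A    # neighbor's actual, only 20% higher

# Fraction of the Carnot limit achieved: A wins by this metric.
fraction_A = eta_actual_A / eta_carnot_A     # 0.95
fraction_B = eta_actual_B / eta_carnot_B     # about 0.88

# Coal burned per joule of useful work scales like 1/eta_actual,
# and by that metric (the one that actually matters) B wins.
coal_A = 1.0 / eta_actual_A
coal_B = 1.0 / eta_actual_B
```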

3.
Looking at the structure of the result in equation 12.16b, one can understand why we were tempted to require that T1 and T3 have definite, constant values.

However, any heat engine still has a well-defined thermal efficiency, as defined by equation 12.14, even if the temperatures are changing in peculiar ways over the course of the cycle.

Furthermore, with some extra work, you can convince yourself that the integrals in equation 12.12b take the form of a ΔS times an average temperature. It’s a peculiar type of weighted average. This allows us to interpret the Carnot efficiency formula (equation 12.16b) in a more general, more flexible way.

4.
As a slightly different line of reasoning that leads to the same conclusion, start with an ordinary Carnot cycle, such as we see in figure 7.5 or figure 7.6. You can imagine covering the entire space with a mosaic of tiny Carnot cycles, as shown in figure 12.7. There is one big cycle made up of nine tiny cycles, such that the left edge of one cycle coincides with the right edge of another, and the top of one coincides with the bottom of another. For an ideal reversible engine, going around each of the nine tiny cycles once is identical to going around the big cycle once, because all of the interior edges cancel.

By selecting a suitable set of the tiny Carnot cycles, you can approximate a wide class of reversible cycles. The efficiency can be calculated from the definition, equation 12.14, by summing the thermal and mechanical energy, summing over all the sub-cycles.

Figure 12.7: Carnot Cycle with Sub-Cycles

5.
A hot gas has more entropy than a cool gas. This is partly because the probability distribution is more “spread out” in phase space, more spread out along the momentum direction. It is also partly because the unit of measure in phase space is smaller, as discussed in section 11.3.

Because of the higher entropy, you might think the gas has less “available” energy (whatever that means). However, the hot gas contains more energy as well as more entropy, and it is easy to find situations where the hot gas is more useful as an energy-source, for instance if the gas is applied to connection #3 in a heat engine. Indeed, in order to increase efficiency in accordance with equation 12.16, engine designers are always looking for ways to make the hot section of the engine run hotter, as hot as possible consistent with reliability and longevity.

Of course, when we examine the cold side of the heat engine, i.e. connection #1, then the reverse is true: The colder the gas, the more valuable it is for producing useful work. This should make it clear that any notion of “available energy” cannot possibly be a function of state. You cannot look at a bottle of gas and determine how much of its energy is “available” – not without knowing a whole lot of additional information about the way in which the gas is to be used. This is discussed in more detail in section 1.5.3.

6.
If you take equation 12.16b and mindlessly extrapolate it to the case where T1 is negative, you might hope to achieve an efficiency greater than 100%. However, before doing that note that the defining property of negative temperature is that when entropy goes out via connection #1, energy comes in. Therefore we must include this energy input in the denominator in the definition of efficiency, equation 12.14. When we do that, instead of getting equation 12.15 we get:

 η := (T3 ΔS3 + T1 ΔS1) / (T3 ΔS3 + T1 ΔS1) = 100%    (always)              (12.17)

Both terms in the denominator here are positive; the second term involves a double negative.

We can understand this as follows: Whenever there are two thermal connections, one at a positive temperature and one at a negative temperature, the engine takes in energy via both thermal connections, and sends it all out via the mechanical connection. Therefore the efficiency is always exactly 100%, never more, never less.

Even if the engine is irreversible, such that −ΔS1 is greater than −ΔS3, the engine is still 100% efficient in its use of energy, because absolutely all of the energy that is thermally transferred in gets mechanically transferred out.

These conclusions are restricted to a model that assumes the engine has only three connections to the rest of the world. A more realistic engine model would allow for additional connections; see next item.

7.
A term involving T4 ΔS4 could be added to our model, to represent losses due to friction, losses due to heat leaks, et cetera.

 η := (T3 ΔS3 + T1 ΔS1 + T4 ΔS4) / (T3 ΔS3 + T1 ΔS1) < 100%    (since T4 ΔS4 < 0)              (12.18)

In the real world, the only known ways of producing a heat bath at negative temperature are horrendously inefficient, so the 100% efficiency mentioned in the previous items is nowhere near being relevant to a complete, practical system.
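
Equations 12.17 and 12.18 can be sketched as follows. The sign conventions follow the text (entropy flowing in through a connection counts as positive), and the temperatures and entropy amounts in the usage lines are illustrative:

```python
def efficiency(T3, dS3, T1, dS1, T4=0.0, dS4=0.0):
    """Equation 12.18, for the case where T1 is negative, so that
    T1*dS1 > 0 and both thermal connections feed energy in.
    The loss term T4*dS4 is zero for a lossless engine and negative
    otherwise; it does not add to the thermal intake."""
    thermal_in = T3 * dS3 + T1 * dS1
    work_out = thermal_in + T4 * dS4
    return work_out / thermal_in

# Lossless engine between T3 = +400 K and T1 = -100 K: exactly 100%.
eta_ideal = efficiency(400.0, 1.0, -100.0, -1.0)
# A friction / heat-leak term T4*dS4 < 0 pulls it below 100%.
eta_lossy = efficiency(400.0, 1.0, -100.0, -1.0, T4=300.0, dS4=-0.2)
```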

## 12.4  Properties of the Equilibrium State

In this section, we derive a couple of interesting results. Consider a system that is isolated from the rest of the universe, and can be divided into two parcels. We imagine that parcel #1 serves as a heat bath for parcel #2, and vice versa. Then:

• If/when the two parcels have reached equilibrium by exchanging energy, they will have the same temperature.
• If/when the two parcels have reached equilibrium by exchanging particles as well as energy, they will have the same chemical potential (and the same temperature).

With great generality, we can say that at thermodynamic equilibrium, the gradient of the entropy vanishes, as expressed in equation 12.4.

We write the gradient as dS rather than ∇S for technical reasons, but either way, the gradient is a vector. It is a vector in the abstract thermodynamic state-space (or more precisely, the tangent space thereof).

As usual, subject to mild conditions, we can expand dS using the chain rule:

 dS = (∂S/∂N)|E dN + (∂S/∂E)|N dE              (12.19)

We assume constant volume throughout this section. We also assume all the potentials are sufficiently differentiable.

We recognize the partial derivative in front of dE as being the inverse temperature, as defined by equation 22.4, which we repeat here:

 β := (∂S/∂E)|N              (12.20)

We can rewrite the other partial derivative by applying the celebrated cyclic triple partial derivative rule:

 (∂S/∂N)|E (∂N/∂E)|S (∂E/∂S)|N = −1              (12.21)

For an explanation of where this rule comes from, see section 12.11. We can re-arrange equation 12.21 to obtain:

 (∂S/∂N)|E = − (∂E/∂N)|S (∂S/∂E)|N              (12.22)

Note that if you weren’t fastidious about keeping track of the “constant E”, “constant S”, and “constant N” specifiers, it would be very easy to get equation 12.22 wrong by a factor of −1.

We recognize one of the factors on the RHS as the chemical potential, as defined by equation 6.27, which we repeat here:

 µ := (∂E/∂N)|S              (12.23)

Putting together all the ingredients we can write:

 dS = 0 = − µ β dN + β dE              (12.24)

Since we can choose dN and dE independently, both terms on the RHS must vanish separately.

If we divide the system into two parcels, #1, and #2, then

 dS1 = − dS2        (since dS = 0 at equilibrium)
 dN1 = − dN2        (since N is conserved)
 dE1 = − dE2        (since E is conserved)
(12.25)

Plugging that into the definitions of β and µ, we conclude that at equilibrium:

 β1 = β2              (if parcels exchange energy)
 β1 µ1 = β2 µ2        (if parcels exchange energy and particles)
 T1 = T2
 µ1 = µ2
(12.26)

The last two lines assume nonzero β i.e. non-infinite temperature.

So, we have accomplished the goal of this section. If/when the two parcels have reached equilibrium by exchanging energy, they will have the same inverse temperature. Assuming the inverse temperature is nonzero, then:

• If/when the two parcels have reached equilibrium by exchanging energy, they will have the same temperature. In other words, equilibrium is isothermal.
• If/when the two parcels have reached equilibrium by exchanging particles as well as energy, they will have the same chemical potential (and the same temperature).

One way to visualize this is in terms of the gradient vector dS. The fact that dS = 0 implies that the projection of dS in every feasible direction must vanish, including the dE direction and the dN direction among others. Otherwise the system would be out of equilibrium with respect to excursions in the direction(s) of non-vanishing dS.
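
The conclusion that equilibrium equalizes the temperature can be checked with a toy model. A minimal sketch, assuming a parcel entropy of the monatomic-ideal-gas form S = (3/2) N ln E (with k = 1 and constant volume), so that the temperature is T = (2/3) E/N:

```python
import math

def parcel_entropy(E, N):
    """Energy-dependent part of the entropy of a monatomic ideal gas
    parcel at constant volume, with k = 1: S = (3/2) * N * ln(E)."""
    return 1.5 * N * math.log(E)

E_total, N1, N2 = 10.0, 1.0, 3.0

# Grid search over ways of splitting the energy between the parcels,
# looking for the split that maximizes the total entropy.
splits = [E_total * i / 10_000 for i in range(1, 10_000)]
best_E1 = max(splits, key=lambda E1: parcel_entropy(E1, N1)
                                     + parcel_entropy(E_total - E1, N2))

# The maximum lands where the energy per particle, and hence the
# temperature T = (2/3) E/N, is the same in both parcels:
# best_E1/N1 == (E_total - best_E1)/N2, i.e. best_E1 = 2.5 here.
```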

## 12.5  The Approach to Equilibrium

### 12.5.1  Non-Monotonic Case

Figure 12.8 shows the position and momentum of a damped harmonic oscillator. The system starts from rest at a position far from equilibrium, namely (position, momentum) = (1, 0). It then undergoes a series of damped oscillations before settling into the equilibrium state (0, 0). In this plot, time is an implicit parameter, increasing as we move clockwise along the curve.

Figure 12.8: Phase Space : Under-Damped Harmonic Oscillator

You will note that neither variable moves directly toward the equilibrium position. If we divide the phase space into quadrants, and look at the sequence of events, ordered by time:

• In quadrant IV, the momentum is negative and becoming more so, i.e. moving away from equilibrium.
• In quadrant III, the position is negative and becoming more so, i.e. moving away from equilibrium.
• In quadrant II, the momentum is positive and becoming more so, i.e. moving away from equilibrium.
• In quadrant I, the position is positive and becoming more so, i.e. moving away from equilibrium.

This should convince you that the approach to equilibrium is not necessarily monotonic. Some variables approach equilibrium monotonically, but others do not.

When analyzing a complex system, it is sometimes very useful to identify a variable that changes monotonically as the system evolves. In the context of ordinary differential equations, such a variable is sometimes called a Lyapunov function.

In any physical system, the overall entropy (of the system plus environment) must be a monotone-increasing Lyapunov function, as we know by direct application of the second law of thermodynamics. For a system with external damping, such as the damped harmonic oscillator, decreasing system energy is a convenient proxy for increasing overall entropy, as discussed in section 12.6.4; see especially equation 12.40. Note that contours of constant energy are circles in figure 12.8, so you can see that energy decreases as the system evolves toward equilibrium.
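
This behavior is easy to reproduce numerically. A minimal sketch (the parameter values are illustrative), integrating the damped oscillator equation x″ + γ x′ + ω₀² x = 0 and tracking the energy:

```python
def simulate(gamma, omega0=1.0, x0=1.0, v0=0.0, dt=1e-3, steps=20_000):
    """Damped harmonic oscillator x'' + gamma*x' + omega0**2 * x = 0,
    integrated with semi-implicit Euler.  Returns the position history
    and the energy history."""
    x, v = x0, v0
    xs, energies = [], []
    for _ in range(steps):
        v += (-gamma * v - omega0**2 * x) * dt
        x += v * dt
        xs.append(x)
        energies.append(0.5 * v * v + 0.5 * omega0**2 * x * x)
    return xs, energies

# Under-damped case (gamma < 2*omega0), as in figure 12.8: the
# position overshoots zero and oscillates, yet the energy (the proxy
# for entropy production) decays toward zero as the system settles.
xs, energies = simulate(gamma=0.2)
```

Calling `simulate(gamma=2.0)` gives the critically damped case of figure 12.9, where the position decays toward zero without ever crossing it.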

### 12.5.2  Monotonic Case

For a critically damped or overdamped system, the approach to equilibrium is non-oscillatory. If the system is initially at rest, the position variable is monotonic, and the momentum variable is “almost” monotonic, in the sense that its absolute value increases to a maximum and thereafter decreases monotonically. More generally, each variable can cross through zero at most once. The critically damped system is shown in figure 12.9.

Figure 12.9: Phase Space : Critically-Damped Harmonic Oscillator

### 12.5.3  Approximations and Misconceptions

We know that if two or more regions have reached equilibrium by the exchange of energy, they have the same temperature, subject to mild restrictions, as discussed in section 12.4. This is based on fundamental notions such as the second law and the definition of temperature.

We now discuss some much less fundamental notions:

• Roughly speaking, ordinarily, hot things tend to cool off and cold things tend to warm up.
• Roughly speaking, ordinarily, hot things cool off monotonically and cold things warm up monotonically. In other words, in each region, temperature “ordinarily” behaves like a Lyapunov function.

Beware that some authorities go overboard and elevate these rules of thumb to the status of axioms. They assert that heat can never spontaneously flow from a lower-temperature region to a higher-temperature region. Sometimes this overstatement is even touted as “the” second law of thermodynamics.

This overstatement is not reliably true, as we can see from the following examples.

As a simple first example, consider a spin system. Region #1 is at a moderate positive temperature, while region #2 is at a moderate negative temperature. We allow the two regions to move toward equilibrium by exchanging energy. During this process, the temperature in region #1 will become more positive, while the temperature in region #2 will become more negative. The temperature difference between the two regions will initially increase, not decrease. Energy will flow from the negative-temperature region into the positive-temperature region.

This example tells us that the concept of inverse temperature is more fundamental than temperature itself. In this example, the difference in inverse temperature between the two regions tends to decrease. This is the general rule when two regions are coming into equilibrium by exchanging energy. It is a mistake to misstate this in terms of temperature rather than inverse temperature, but it is something of a technicality, and the mistake is easily corrected.
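
The spin example can be verified with a microcanonical toy model. A minimal sketch, assuming N two-level spins per region and counting microstates directly (with k = 1, and with energy measured in units where exciting one spin costs one unit):

```python
import math

def spin_entropy(n_up, N):
    """Microcanonical entropy (k = 1) of N two-level spins with n_up
    of them excited: S = ln C(N, n_up).  Fewer than half excited means
    positive temperature; more than half means negative temperature."""
    return (math.lgamma(N + 1) - math.lgamma(n_up + 1)
            - math.lgamma(N - n_up + 1))

N = 1000
n1, n2 = 200, 800   # region 1: positive T; region 2: negative T

# Move one unit of energy out of the negative-temperature region and
# into the positive-temperature region (one more spin up in region 1,
# one fewer in region 2), and see what happens to the total entropy:
dS = (spin_entropy(n1 + 1, N) + spin_entropy(n2 - 1, N)
      - spin_entropy(n1, N) - spin_entropy(n2, N))

# dS > 0: the transfer is spontaneous, and it runs from the
# negative-temperature region into the positive-temperature region.
```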

Here is another example that raises a much more fundamental issue: Consider the phenomenon of second sound in superfluid helium. The temperature oscillates like one of the variables in figure 12.8. The approach to equilibrium is non-monotonic.

Let’s be clear: For an ordinary material such as a hot potato, the equation of thermal conductivity is heavily overdamped, which guarantees that temperature approaches equilibrium monotonically ... but this is a property of the material, not a fundamental law of nature. It should not be taken as the definition of the second law of thermodynamics, or even a corollary thereof.

## 12.6  Useful Proxies for Predicting Spontaneity, Reversibility, Equilibrium, etc.

In this section, we apply the general law to some important special cases. We derive some simplified laws that are convenient to apply in such cases.

### 12.6.1  Reduced Dimensionality

As our first simplification, we don’t need to consider all imaginable ways in which Stotal could change, just changes along directions in which the system is free to move. To say the same thing the other way, if there are constraints which forbid certain changes, we can neglect the corresponding components of dStotal. This is a mild and harmless simplification.

As a further simplification, let’s restrict attention to situations where there are lots of constraints, enough so that the “reaction coordinate space” is effectively one-dimensional, such as the simple example expressed in equation 12.1. In such a situation, it suffices to look at the directional derivative of Stotal, namely the derivative in the direction in which the reaction coordinate is free to move. This is a corollary of the statement in the previous paragraph. It is slightly less general, but has the advantage of being easier to explain to students who don’t know what an exterior derivative is.

As a less-general corollary, which is even easier to explain to the mathematically unwashed, we can approximate the directional derivative by a finite difference, ΔStotal. As always, we define

 ΔStotal := Stotal(after) − Stotal(before)
(12.27)

and as always, we must be completely clear as to what we mean by “before” and “after”.

In this case, the only sensible choice is two points on the locus of the reaction coordinate, both very close to the “current” reaction coordinate, and ordered according to the “forward” direction of the reaction in question. This version is particularly easy to sketch, along the lines of figure 12.10. In this example, ΔStotal is positive, so the reaction will proceed forward, left-to-right as written in equation 12.1.

Figure 12.10: Simple Reaction Space with Entropy Curve
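The finite-difference recipe can be sketched in a few lines of code. The entropy curve used here is a hypothetical stand-in for the one in figure 12.10:

```python
def delta_S_total(S, x, eps=1e-6):
    """Finite difference of the total entropy along the reaction coordinate,
    taken between two nearby points ordered in the 'forward' direction
    (equation 12.27)."""
    return S(x + eps) - S(x)

def reaction_direction(S, x):
    """Sign test: positive Delta S_total means the reaction proceeds forward."""
    dS = delta_S_total(S, x)
    if dS > 0:
        return "forward"
    if dS < 0:
        return "reverse"
    return "equilibrium"

# Hypothetical entropy curve with its maximum at x = 0.7:
S = lambda x: -(x - 0.7) ** 2

print(reaction_direction(S, 0.2))   # forward (uphill in entropy)
print(reaction_direction(S, 0.9))   # reverse
```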

### 12.6.2  Constant V and T

We now rescind the simplifications mentioned in section 12.6.1, so that we can consider a different genre of simplifications.

In this section, we restrict attention to conditions of constant volume and constant positive temperature. (Section 12.6.1 was free of such restrictions.) We shall see that this allows us to answer questions about spontaneity using the system Helmholtz free energy F as a proxy for the overall entropy Stotal.

The situation is shown in figure 12.11. We have divided the universe into three regions:

• the interior region – inside the blue cylinder;
• the neighborhood – outside the cylinder but inside the black rectangle; and
• the rest of the universe – outside the black rectangle.

The combination of interior region + neighborhood will be called the local region.

Figure 12.11: Constant Volume and Temperature; Cylinder + Neighborhood

Inside the blue cylinder is some gas. In the current scenario, the volume of the cylinder is constant. (Compare this to the constant-pressure scenario in section 12.6.3, where the volume is not constant.)

Inside the neighborhood is a heat bath, as represented by the magenta region in the figure. It is in thermal contact with the gas inside the cylinder. We assume the heat capacity of the heat bath is very large.

We assume the combined local system (interior + neighborhood) is isolated from the rest of the universe. Specifically, no energy or entropy can flow across the boundary of the local system (the black rectangle in figure 12.11).

We use unadorned symbols such as F and S etc. to denote the free energy and entropy etc. inside the interior region. We use a subscript “n” as in En and Sn etc. to represent the energy and entropy etc. in the neighborhood.

Here’s an outline of the usual calculation that shows why dF is interesting. Note that F is the Helmholtz free energy inside the interior region. Similarly S is the entropy inside the interior region (in contrast to Stotal, which includes all the local entropy, Stotal = S + Sn). This is important, because it is usually much more convenient to keep track of what’s going on in the interior region than to keep track of everything in the neighborhood and the rest of the universe.

We start by doing some math:

 F := E − TS                     by definition of F
 dF = dE − T dS − S dT           by differentiating
    = dE − T dS                  since dT=0 by hypothesis
 dS − dE/T = −dF/T               by rearranging
(12.28)

Next we relate certain inside quantities to the corresponding outside quantities:

 T = Tn                          temperature the same everywhere
 E + En = const                  local conservation of energy
 dE = −dEn                       by differentiating
(12.29)

Next we assume that the heat bath is in internal equilibrium. This is a nontrivial assumption. We are making use of the fact that it is a heat bath, not a bubble bath. We are emphatically not assuming that the interior region is in equilibrium, because one of the major goals of the exercise is to see what happens when it is not in equilibrium. In particular we are emphatically not going to assume that E is a function of S and V alone. Therefore we cannot safely expand dE = T dS − P dV in the interior region. We can, however, use the corresponding expansion in the neighborhood region, because it is in equilibrium:

 dEn = T dSn − Pn dVn            bath in equilibrium
     = T dSn                     constant V by hypothesis
 dSn = dEn/T                     by rearranging
     = −dE/T                     by conservation of energy, equation 12.29
(12.30)

Next, we assume the entropy is an extensive quantity. That is tantamount to assuming that the probabilities are uncorrelated, specifically that the distribution that characterizes the interior is uncorrelated with the distribution that characterizes the neighborhood. This is usually a very reasonable assumption, especially for macroscopic systems.

We are now in a position to finish the calculation.

 dStotal = dS + dSn              entropy is extensive
         = dS − dE/T             by equation 12.30
         = −dF/T                 by equation 12.28
(12.31)

This tells us that under the stated constraints (constant V and constant positive T), anything we could predict based on maximizing Stotal we can equally well predict based on minimizing F.

Beware: Under other conditions (other than constant V and constant positive T), you cannot reliably predict the direction of spontaneity based on F or derivatives of F. Indeed there are plenty of cases where S is well defined but F is not even definable.
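The identity in equation 12.31 is easy to check numerically. The following is a minimal sketch for a monatomic ideal gas at fixed volume in contact with a large bath; the values of N, T, and the energies are made up for illustration:

```python
import math

k = 1.380649e-23   # Boltzmann constant, J/K
N = 1e20           # number of gas molecules (made-up value)
T = 300.0          # bath temperature, K (constant by hypothesis)

def S_interior(E):
    """Entropy of the interior (monatomic ideal gas at fixed N and V),
    up to an E-independent constant that cancels out of all differences."""
    return 1.5 * N * k * math.log(E)

def F_interior(E):
    """Helmholtz free energy of the interior, F = E - T*S."""
    return E - T * S_interior(E)

E0 = 1.5 * N * k * 350.0   # interior starts hotter than the bath
dE = 1e-6 * E0             # a small transfer of energy into the interior

# The bath is in equilibrium, so dSn = dEn/T = -dE/T (equation 12.30):
dS_total = (S_interior(E0 + dE) - S_interior(E0)) - dE / T
dF = F_interior(E0 + dE) - F_interior(E0)

# Equation 12.31 says these two numbers are equal:
print(dS_total, -dF / T)
```

Note that dS_total comes out negative here: pushing energy into the already-hotter interior decreases the total entropy, so the spontaneous direction is the opposite one.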

### 12.6.3  Constant P and T

We now restrict attention to conditions of constant pressure and constant positive temperature. This is closely analogous to section 12.6.2; the only difference is constant pressure instead of constant volume. We shall see that this allows us to answer questions about spontaneity using the system’s Gibbs free enthalpy G as a proxy for the overall entropy Stotal.

In figure 12.12, we have divided the universe into three regions:

• the interior region – inside the blue cylinder;
• the neighborhood – outside the cylinder but inside the black rectangle; and
• the rest of the universe – outside the black rectangle.

The combination of interior region + neighborhood will be called the local region.

Figure 12.12: Constant Pressure and Temperature; Cylinder + Neighborhood

Inside the blue cylinder is some gas. The cylinder is made of two pieces that can slide up and down relative to each other, thereby changing the boundary between the interior region and the neighborhood. (Compare this to the constant-volume scenario in section 12.6.2.)

Inside the neighborhood is a heat bath, as represented by the magenta region in the figure. It is in thermal contact with the gas inside the cylinder. We assume the heat capacity of the heat bath is very large.

Also in the neighborhood there is a complicated arrangement of levers and springs, which maintains a constant force (and therefore a constant force per unit area, i.e. pressure) on the cylinder.

We also assume that the kinetic energy of the levers and springs is negligible. This is a nontrivial assumption. It is tantamount to assuming that whatever changes are taking place are not too sudden, and that the springs and levers are somehow kept in thermal equilibrium with the heat bath.

We assume the combined local system (interior + neighborhood) is isolated from the rest of the universe. Specifically, no energy or entropy can flow across the boundary of the local system (the black rectangle in figure 12.12).

Note that G is the free enthalpy inside the interior region. Similarly S is the entropy inside the interior region (in contrast to Stotal, which includes all the local entropy, Stotal = S + Sn). This is important, because it is usually much more convenient to keep track of what’s going on in the interior region than to keep track of everything in the neighborhood and the rest of the universe.

We start by doing some math:

 G := H − TS                     by definition of G
 dG = dH − T dS − S dT           by differentiating
    = dH − T dS                  since dT=0 by hypothesis
 dS − dH/T = −dG/T               by rearranging
(12.32)

Next we relate certain inside quantities to the corresponding outside quantities. We will make use of the fact that in this situation, enthalpy is conserved, as discussed in section 12.6.5.

 T = Tn                          temperature the same everywhere
 H + Hn = const                  local conservation of enthalpy
 dH = −dHn                       by differentiating
(12.33)

Next we assume that the heat bath is in internal equilibrium. This is a nontrivial assumption. We are making use of the fact that it is a heat bath, not a bubble bath. We are emphatically not assuming that the interior region is in equilibrium, because one of the major goals of the exercise is to see what happens when it is not in equilibrium. In particular we are emphatically not going to assume that H is a function of S and P alone. Therefore we cannot safely expand dH = T dS + V dP in the interior region. We can, however, use the corresponding expansion in the neighborhood region, because it is in equilibrium:

 dHn = T dSn + Vn dPn            bath in equilibrium
     = T dSn                     constant P by hypothesis
 dSn = dHn/T                     by rearranging
     = −dH/T                     by conservation of enthalpy, equation 12.33
(12.34)

Again we assume the entropy is extensive. We are now in a position to finish the calculation.

 dStotal = dS + dSn              entropy is extensive
         = dS − dH/T             by equation 12.34
         = −dG/T                 by equation 12.32
(12.35)

This tells us that under the stated constraints (constant P and constant positive T), anything we could predict based on maximizing Stotal we can equally well predict based on minimizing G.

Beware: Under other conditions (other than constant P and constant positive T), you cannot reliably predict the direction of spontaneity based on G or derivatives of G. Indeed there are plenty of cases where S is well defined but G is not even definable.

### 12.6.4  Externally Damped Oscillator: Constant S and Decoupled V

In section 12.6.2 and section 12.6.3 we showed that F/T and G/T could be used to predict spontaneity and irreversibility, under appropriate conditions. The astute reader may be wondering, what about E/T and H/T? Are there conditions where they can be used as proxies for Stotal?

The answer is yes, under appropriate conditions, as we now discuss. The physical systems that we will consider have some important differences from the systems considered in section 12.6.2 and section 12.6.3.

In figure 12.13, we have divided the universe into three regions:

• the interior region – the mass and the spring
• the neighborhood – the damper, separate from the interior region but inside the black rectangle; and
• the rest of the universe – outside the black rectangle.

The combination of interior region + neighborhood will be called the local region. We have arranged for the linkage from the internal region to the damper to be thermally insulating, so that the interior region cannot exchange entropy with the damper.

Figure 12.13: Oscillator with Damper

The decision to consider the damper as not part of the interior was a somewhat arbitrary choice. Note that there are plenty of real-world systems where this makes some sense, such as a charged harmonic oscillator (where radiative damping is not considered interior to the system) or a marble oscillating in a bowl full of fluid (where the viscous damping is not considered interior to the marble).

Local conservation of energy tells us:

 dE = −dEn
(12.36)

As in previous examples, we assume the probabilities are uncorrelated, so that entropy is extensive:

 dStotal = dS + dSn
(12.37)

Now, suppose the oscillator starts out with a large amount of energy, large compared to kT. As the oscillator moves, energy will be dissipated in the damper. The entropy of the damper will increase. The entropy of the interior region is unknown and irrelevant, because it remains constant:

 dS = 0
(12.38)

We imagine that the damper’s energy is a one-to-one function of its entropy. That allows us to write

 dEn = Tn dSn
(12.39)

with no other terms on the RHS. There are undoubtedly other equally-correct ways of expanding dEn, but we need not bother with them, because equation 12.39 is correct and sufficient for present purposes.

Physically, the simplicity of equation 12.39 depends on (among other things) the fact that the energy of the neighborhood does not depend on the position of the piston within the damper (x), so we do not need an F·dx term in equation 12.39. The frictional force depends on the velocity (dx/dt) but not on the position (x).

At this point, we notice another striking reminder:

 Thinking in terms of energy and entropy is good practice. Thinking in terms of “heat” and “work” would be a fool’s errand.

 Energy is conserved. Neither heat nor work is separately conserved.

 We can easily understand that the linkage that connects the interior to the damper carries zero entropy and carries nonzero energy. We see “work” leaving the interior in the form of PdV or F·dx. We see no heat leaving the interior. Meanwhile, we see heat showing up in the damper, in the form of TdS. This would be confusing, if we cared about heat and work, but we don’t care, so we escape unharmed.

Fortunately, keeping track of the energy and the entropy is sufficient to solve the problem. Combining the previous equations, we find:

 dStotal = −dE/Tn
(12.40)

That means that anything we could have predicted by maximizing Stotal we can equally well predict by minimizing the interior energy E ... under appropriate conditions.

Note: In elementary non-thermal mechanics, there is an unsophisticated rule that says “balls roll downhill” or something like that. That is not an accurate statement of the physics, because in the absence of dissipation, a ball that rolls down the hill will immediately roll up the hill on the other side of the valley.

If you want the ball to roll down and stay down, you need some dissipation, and we can understand this in terms of equation 12.40.

Note that the relevant temperature is the temperature of the damper, Tn. The non-dissipative components need not even have a well-defined temperature. The temperature (if any) of the non-dissipative components is irrelevant, because it doesn’t enter into the calculation of the desired result (equation 12.40). This is related to the fact that dS is zero, so we know TdS is zero even if we don’t know T.
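The conclusions of this section can be illustrated with a small simulation. The following sketch integrates a damped oscillator with made-up parameters and tracks the damper entropy via dSn = −dE/Tn (equation 12.40):

```python
# Toy version of figure 12.13: mass + spring in the interior, viscous damper
# in the neighborhood at temperature Tn. All parameter values are made up.
m, k_spring, c = 1.0, 4.0, 0.1   # mass (kg), spring constant (N/m), damping (N s/m)
Tn = 300.0                       # damper temperature, K

def energy(x, v):
    """Interior energy: kinetic plus spring potential."""
    return 0.5 * m * v**2 + 0.5 * k_spring * x**2

x, v = 1.0, 0.0                  # start with energy stored in the spring
dt = 1e-3
S_n = 0.0                        # entropy accumulated in the damper
E_prev = energy(x, v)

for _ in range(200_000):
    a = (-k_spring * x - c * v) / m   # Newton's law with viscous damping
    v += a * dt                       # semi-implicit Euler step
    x += v * dt
    E_now = energy(x, v)
    S_n += (E_prev - E_now) / Tn      # dS_n = -dE / T_n  (equation 12.40)
    E_prev = E_now

print(E_prev, S_n)   # interior energy ~ 0; damper entropy has increased
```

This is the “ball rolls down and stays down” story in miniature: the interior entropy never changes, yet the total entropy increases monotonically as E drains into the damper.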

### 12.6.5  Lemma: Conservation of Enthalpy, Maybe

Energy is always strictly and locally conserved. It is conserved no matter whether the volume is changing or not, no matter whether the ambient pressure is changing or not, no matter whatever.

Enthalpy is sometimes conserved, subject to a few restrictions and provisos. The primary, crucial restriction requires us to work under conditions of constant pressure.

Another look at figure 12.12 will help us fill in the details.

As always, the enthalpy is:

 H = E + PV                      (interior region)
 Hn = En + PVn                   (neighborhood region)
(12.41)

By conservation of energy, we have

 E + En = E0 = const
(12.42)

Next we are going to argue for a “local conservation of volume” requirement. There are several ways to justify this. One way is to argue that it is a corollary of the previous assumption that the local system is isolated and not interacting with the rest of the universe. Another way is to just impose it as a requirement, explicitly requiring that the volume of the local region (the black rectangle in figure 12.12) is not changing. A third way is to arrange, as we have in this case, that there is no pressure acting on the outer boundary, so for purposes of the energy calculation we don’t actually care where the boundary is; this would require extra terms in equation 12.44, but the extra terms would all turn out to be zero.

We quantify the idea of constant volume in the usual way:

 V + Vn = V0 = const
(12.43)

Now we do some algebra:

 E + En = E0                              by equation 12.42
 E + PV + En − PV = E0                    add and subtract PV
 E + PV + En − P(V0 − Vn) = E0            by equation 12.43
 E + PV + En + PVn = E0 + PV0             by rearranging
 H + Hn = const                           by equation 12.41
(12.44)
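The algebra of equation 12.44 is easy to check numerically. In the following sketch, the pressure and the splits of energy and volume between interior and neighborhood are arbitrary made-up values:

```python
P = 101325.0            # constant pressure, Pa
E0 = 500.0              # total local energy, J (made-up)
V0 = 0.010              # total local volume, m^3 (made-up)

def enthalpies(E, V):
    """Interior and neighborhood enthalpies for a given split of E0 and V0."""
    En = E0 - E         # local conservation of energy  (equation 12.42)
    Vn = V0 - V         # local conservation of volume  (equation 12.43)
    return E + P * V, En + P * Vn

# Two different splits, as if the piston had moved and energy had flowed:
H1, Hn1 = enthalpies(300.0, 0.004)
H2, Hn2 = enthalpies(450.0, 0.007)

print(H1 + Hn1, H2 + Hn2)   # both equal E0 + P*V0: H + Hn is conserved
```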

### 12.6.6  Local Conservation

In ultra-simple situations, it is traditional to divide the universe into two regions: “the system” versus “the environment”. Sometimes other terminology is used, such as “interior” versus “exterior”, but the idea is the same.

In more complicated situations, such as fluid dynamics, we must divide the universe into a great many regions, aka parcels. We can ask about the energy, entropy, etc. internal to each parcel, and also ask about the transfer of energy, entropy, etc. to adjacent parcels.

This is important because, as discussed in section 1.4, a local conservation law is much more useful than a global conservation law. If some energy disappears from my system, it does me no good to have a global law that says the energy will eventually reappear “somewhere” in the universe. The local law says that energy is conserved right here, right now.

For present purposes, we can get by with only three regions: the interior, the immediate neighborhood, and the rest of the universe. Examples of this can be seen in section 12.6.2, section 12.6.3, and section 12.6.4.

## 12.7  Natural Variables, or Not

### 12.7.1  The “Big Four” Thermodynamic Potentials

Suppose we assume, hypothetically and temporarily, that you are only interested in questions of stability, reversibility, and spontaneity. Then in this scenario, you might choose to be interested in one of the following thermodynamic potentials:

 ΔE at constant S and V : Energy
 ΔF at constant T and V : Helmholtz Free Energy
 ΔG at constant T and P : Gibbs Free Enthalpy
 ΔH at constant S and P : Enthalpy
(12.45)

For more about these potentials and the relationships between them, see chapter 13.

There is nothing fundamental about the choice of what you are interested in, or what you choose to hold constant. All that is a choice, not a law of nature. The only fundamental principle here is the non-decrease of overall entropy, Stotal.

In particular, there is no natural or fundamental reason to think that there are any “natural variables” associated with the big four potentials. Do not believe any assertions such as the following:

 E is “naturally” E(S, V) ☠
 F is “naturally” F(T, V) ☠
 G is “naturally” G(T, P) ☠
 H is “naturally” H(S, P) ☠
(12.46)

I typeset equation 12.46 on a red background with skull-and-crossbones symbols to emphasize my disapproval. I have never seen any credible evidence to support the idea of “natural variables”. Some evidence illustrating why it cannot be generally true is presented in section 12.7.2.

### 12.7.2  A Counterexample: Heat Capacity

Consider an ordinary heat capacity measurement that measures ΔT as a function of ΔE. This is perfectly well behaved operationally and conceptually.

The point is that E is perfectly well defined even when it is not treated as the dependent variable. Similarly, T is perfectly well defined, even when it is not treated as the independent variable. We are allowed to express T as T(V,E). This doesn’t directly tell us much about stability, reversibility, or spontaneity, but it does tell us about the heat capacity, which is sometimes a perfectly reasonable thing to be interested in.
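As a concrete illustration, here is a sketch of such a measurement for a monatomic ideal gas, where T(V,E) = 2E/(3Nk) happens not to depend on V at all; the sample size and energies are made-up values:

```python
k = 1.380649e-23        # Boltzmann constant, J/K
N = 6.02214076e23       # number of atoms: one mole (made-up sample size)

def T_of_VE(V, E):
    """Temperature as a function of (V, E) for a monatomic ideal gas,
    from E = (3/2) N k T. Here T is the dependent variable."""
    return 2.0 * E / (3.0 * N * k)

V = 0.0224              # volume, m^3, held constant during the measurement
E = 3740.0              # initial energy, J (roughly room temperature)
dE = 1.0                # small measured energy input, J

dT = T_of_VE(V, E + dE) - T_of_VE(V, E)
C_V = dE / dT
print(C_V)              # ~ (3/2) N k ~ 12.47 J/K for this gas
```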

## 12.8  Going to Completion

Suppose we are interested in the following reaction:

 N2 + O2 → 2NO
 (x=0)      (x=1)

(12.47)

which we carry out under conditions of constant pressure and constant temperature, so that when analyzing spontaneity, we can use G as a valid proxy for Stotal, as discussed in section 12.6.3.

Equation 12.47 serves some purposes but not all.

• It defines what we mean by reactants and products.
• It defines a direction (a very specific direction) in parameter space.
• In some cases but not all, the reaction will essentially go to completion, so that the ΔG of interest will be G(RHS)−G(LHS) where RHS and LHS refer to this equation, namely equation 12.47.

Let x represent some notion of reaction coordinate, proceeding in the direction specified by equation 12.47, such that x=0 corresponds to 100% reactants, and x=1 corresponds to 100% products.

Equation 12.47 is often interpreted as representing the largest possible Δx, namely leaping from x=0 to x=1 in one step.

That’s OK for some purposes, but when we are trying to figure out whether a reaction will go to completion, we usually need a more nuanced notion of what a reaction is. We need to consider small steps in reaction-coordinate space. One way of expressing this is in terms of the following equation:

 a N2 + b O2 + c NO → (a−є) N2 + (b−є) O2 + (c+2є) NO
(12.48)

where the parameters a, b, and c specify the “current conditions” and є represents a small step in the x-direction. We see that the RHS of this equation has been displaced from the LHS by an amount є in the direction specified by equation 12.47.

To say the same thing another way, equation 12.47 is the derivative of equation 12.48 with respect to x (or, equivalently, with respect to є). Equation 12.47 is obviously more compact, and is more convenient for most purposes, but you should not imagine that it describes everything that is going on; it only describes the local derivative of what’s going on.

The change in free enthalpy caused by equation 12.48 will be denoted δG. It will be infinitesimal, in proportion to є. If we divide δG by є, we get the directional derivative ∇x G.

Terminology note: In one dimension, the directional derivative ∇x G is synonymous with the ordinary derivative dG/dx.

This tells us what we need to know. If ∇x G is negative, the reaction is allowed to proceed spontaneously by (at least) an infinitesimal amount in the +x direction, since that lowers G. We allow it to do so, then re-evaluate ∇x G at the new “current” conditions. If ∇x G is still negative, we take another step. We iterate until we come to a set of conditions where ∇x G is no longer negative. At this point we have found the equilibrium conditions (subject of course to the initial conditions, and the constraint that the reaction equation 12.47 is the only allowed transformation).

Naturally, if we ever find that ∇x G is positive, we take a small step in the −x direction and iterate.

If the equilibrium conditions are near x=1, we say that the reaction goes to completion as written. By the same token, if the equilibrium conditions are near x=0, we say that the reaction goes to completion in the opposite direction, opposite to equation 12.47.
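The iterate-until-equilibrium recipe can be sketched as follows. The free-enthalpy derivatives used here are hypothetical curves, and the sign convention matches equation 12.50 (negative dG/dx means the reaction proceeds in the +x direction):

```python
def equilibrate(dG_dx, x=0.0, step=1e-3, tol=1e-2, max_iter=100_000):
    """Step along the reaction coordinate in whichever direction lowers G,
    re-evaluating dG/dx at the new 'current' conditions after each step."""
    for _ in range(max_iter):
        g = dG_dx(x)
        if abs(g) <= tol:
            break                        # dG/dx essentially zero: equilibrium
        if g < 0:
            x = min(x + step, 1.0)       # spontaneous in the +x direction
        else:
            x = max(x - step, 0.0)       # spontaneous in the -x direction
        if x == 1.0 and dG_dx(x) < 0:
            break                        # goes to completion as written
        if x == 0.0 and dG_dx(x) > 0:
            break                        # goes to completion in reverse
    return x

# Hypothetical free-enthalpy curve with its minimum at x = 0.8:
x_eq = equilibrate(lambda x: 10.0 * (x - 0.8))
print(x_eq)                              # close to 0.8

# A curve with dG/dx < 0 everywhere: the reaction goes to completion:
print(equilibrate(lambda x: -5.0))       # 1.0
```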

## 12.9  Example: Shift of Equilibrium

Let’s consider the synthesis of ammonia:

 N2 + 3 H2 ⇔ 2 NH3
 (x=0)        (x=1)

(12.49)

We carry out this reaction under conditions of constant P and T. We let the reaction reach equilibrium. We arrange the conditions so that the equilibrium is nontrivial, i.e. the reaction does not go to completion in either direction.

The question for today is, what happens if we increase the pressure? Will the reaction remain at equilibrium, or will it now proceed to the left or right?

We can analyze this using the tools developed in the previous sections. At constant P and T, subject to mild restrictions, the reaction will proceed in whichever direction minimizes the free enthalpy:

 dG/dx > 0 ⇒ proceed to the left
 dG/dx < 0 ⇒ proceed to the right
 dG/dx = 0 ⇒ equilibrium
(12.50)

where x is the reaction coordinate, i.e. the fraction of the mass that is in the form of NH3, on the RHS of equation 12.49. See section 12.6.3 for an explanation of where equation 12.50 comes from.

Note that P and T do not need to be constant “for all time”, just constant while the d/dx equilibration is taking place.

As usual, the free enthalpy is defined to be:

 G = E + PV − TS              (12.51)

so the gradient of the free enthalpy is:

 dG = dE + PdV − TdS              (12.52)

There could have been terms involving VdP and SdT, but these are not interesting since we are working at constant P and T.

In more detail: If (P1,T1) corresponds to equilibrium, then we can combine equation 12.52 with equation 12.50 to obtain:

 dG/dx |(P1,T1)  =  [ dE/dx + P dV/dx − T dS/dx ] |(P1,T1)  =  0
(12.53)

To investigate the effect of changing the pressure, we need to compute

 dG/dx |(P2,T1)  =  [ dE/dx + P dV/dx − T dS/dx ] |(P2,T1)
(12.54)

where P2 is slightly different from the equilibrium pressure; that is:

 P2 = (1+δ)P1              (12.55)

We now argue that E and dE are insensitive to pressure. The potential energy of a given molecule depends on the bonds in the given molecule, independent of other molecules, hence independent of density (for an ideal gas). Similarly the kinetic energy of a given molecule depends on temperature, not on pressure or molar volume. Therefore:

 dE/dx |(P2,T1)  =  dE/dx |(P1,T1)
(12.56)

Having examined the first term on the RHS of equation 12.54, we now examine the next term, namely the P dV/dx term. It turns out that this term is insensitive to pressure, just as the previous term was. We can understand this as follows: Let N denote the number of gas molecules on hand. Let’s say there are N1 molecules when x=1. Then for general x we have:

 N = N1 (2−x)              (12.57)

That means dN/dx = −N1, independent of pressure. Then the ideal gas law tells us that

 d(PV)/dx |(P,T)  =  d(N kT)/dx |(P,T)
(12.58)

Since the RHS is independent of P, the LHS must also be independent of P, which in turn means that P dV/dx is independent of P. Note that the V dP/dx term is automatically zero.

Another way of reaching the same conclusion is to recall that PV is proportional to the kinetic energy of the gas molecules: PV = N kT = (γ−1) E, as discussed in section 24.5. So when the reaction proceeds left to right, for each mole of gas that we get rid of, we have to account for RT/(γ−1) of energy, independent of pressure.

This is an interesting result, because you might have thought that by applying pressure to the system, you could simply “push” the reaction to the right, since the RHS has a smaller volume. But it doesn’t work that way. Pressurizing the system decreases the volume on both sides of the equation by the same factor. In the PdV term, the P is larger but the dV is smaller.

Also note that by combining the pressure-independence of the dE/dx term with the pressure-independence of the P dV/dx term, we find that dH/dx is pressure-independent also, where H is the enthalpy.

Now we come to the −TdS/dx term. Entropy depends on volume.

By way of background, let us temporarily consider a slightly different problem, namely one that has only N1 molecules on each side of the equation. Consider what would happen if we were to run this new reaction backwards, i.e. right to left, i.e. from x=1 to x=0. The system volume would double, and we would pick up N1 k ln(2) units of entropy ... independent of pressure. It is the volume ratio that enters into this calculation, inside the logarithm. The ratio is independent of pressure, and therefore cannot contribute to explaining any pressure-related shift in equilibrium.
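The pressure-independence claimed here is easy to check numerically. In this sketch the particle count N1 and the volumes are made-up values; only the volume ratio matters:

```python
import math

k = 1.380649e-23   # Boltzmann constant, J/K
N1 = 1e22          # number of molecules (made-up value)

def dS_isothermal(V_initial, V_final):
    """Entropy change of N1 ideal-gas molecules expanding isothermally.
    Only the volume *ratio* appears, inside the logarithm."""
    return N1 * k * math.log(V_final / V_initial)

# Doubling the volume at low pressure (large volumes) and at high pressure
# (small volumes) yields the same entropy change, namely N1*k*ln(2):
dS_low_pressure  = dS_isothermal(1.0, 2.0)       # volumes in m^3
dS_high_pressure = dS_isothermal(1e-3, 2e-3)

print(dS_low_pressure, dS_high_pressure)
```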

Returning to the main problem of interest, we have not N1 but 2N1 molecules when x=0. So when we run the real reaction backwards, in addition to simply letting N1 molecules expand, we have to create another N1 molecules from scratch.

For that we need the full-blown Sackur-Tetrode equation, equation 24.17, which we repeat here. For a pure monatomic nondegenerate ideal gas in three dimensions:

 S/N  =  k ln( V/(N Λ3) )  +  (5/2) k
(12.59)

which gives the entropy per particle in terms of the volume per particle, in an easy-to-remember form.

For the problem at hand, we can re-express this as:

 Si  =  Ni(x) [ k ln( kT/(P Λi3) )  +  (5/2) k ]
(12.60)

where the index i runs over the three types of molecules present (N2, H2, and NH3). We have also used the ideal gas law to eliminate the V dependence inside the logarithm in favor of our preferred variable P. We have (finally!) identified a contribution that depends on pressure and also depends on x.

We can understand the qualitative effect of this term as follows: The −TS term always contributes a drive to the left. According to equation 24.17, at higher pressure this drive will be less. So if we are in equilibrium at pressure P1 and move to a higher pressure P2, there will be a net drive to the right.

We can quantify all this as follows: It might be tempting to just differentiate equation 12.60 with respect to x and examine the pressure-dependence of the result. However, it is easier if we start by subtracting equation 12.53 from equation 12.54, and then plug in equation 12.60 before differentiating. A lot of terms are unaffected by the change from P1 to P2, and it is helpful if we can get such terms to drop out of the calculation sooner rather than later:

 dG/dx |(P2) − 0  =  [ dE/dx + P dV/dx − T dS/dx ] |(P2)  −  [ dE/dx + P dV/dx − T dS/dx ] |(P1)
                  =  0 + 0 − T (d/dx) [ S(P2) − S(P1) ]
                  =  −kT (d/dx) ∑i Ni(x) [ ln(1/P2) − ln(1/P1) ]
                  =  +kT (d/dx) ∑i Ni(x) ln(1+δ)
                  =  kT ln(1+δ) (d/dx) ∑i Ni(x)
(12.61)

This quantitative result reinforces the previous qualitative analysis: for this reaction d(∑i Ni)/dx is negative, so if P2 is greater than P1, the shift makes dG/dx negative, and the reaction will proceed in the +x direction, since that is what will minimize G.

The calculation involved many steps, but each step was reasonably easy.
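Here is a numerical sketch of the final result, using made-up values for the scale N1, the temperature, and δ. It checks that d(∑Ni)/dx is negative for this reaction, so the pressure-induced shift in dG/dx is negative, driving the reaction to the right:

```python
import math

k = 1.380649e-23   # Boltzmann constant, J/K
T = 500.0          # temperature, K (made-up value)
N1 = 1e22          # number of NH3 molecules at x = 1 (made-up scale)

def N_total(x):
    """Total number of gas molecules along N2 + 3 H2 -> 2 NH3."""
    n_N2  = (N1 / 2) * (1 - x)       # N2 remaining
    n_H2  = (3 * N1 / 2) * (1 - x)   # H2 remaining
    n_NH3 = N1 * x                   # NH3 formed
    return n_N2 + n_H2 + n_NH3

# d(N_total)/dx by central difference (the function is linear, so this is exact
# up to rounding):
eps = 1e-6
dN_dx = (N_total(0.5 + eps) - N_total(0.5 - eps)) / (2 * eps)

# Pressure-induced shift of dG/dx, proportional to kT ln(1+delta) dN/dx:
delta = 0.1                          # 10% pressure increase, P2 = 1.1*P1
shift = k * T * math.log(1 + delta) * dN_dx

print(dN_dx, shift)  # dN/dx < 0, so the shift is negative: drive to the right
```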

Remark: The result is surprisingly simple. Whenever a complicated calculation produces a simple result, I take it as a sign that I don’t really understand what’s going on. I suspect there is an easier way to obtain this result. In particular, since we have figured out that the entropy term is running the show, I conjecture that it may be possible to start from first principles, all the way back to the entropy in equation 12.3, and just keep track of the entropy.

Remark: In equation 12.61, by a first order expansion of the logarithm on the last line, you can verify that when the reaction is pushed toward equilibrium, the amount of push is proportional to δ, which makes sense.

Exercise: Use a similar argument to show that increasing the temperature will shift the equilibrium of equation 12.49 to the left. Hint: In a gas-phase reaction such as this, the side with more moles of gas will have more entropy.

## 12.10  Le Châtelier’s Principle, Or Not

One sometimes sees equilibrium (and/or the shift of equilibrium) “explained” by reference to Le Châtelier’s principle. This is highly problematic.

Le Châtelier in his lifetime gave two inconsistent statements of his so-called principle. Restating them in modern terms:

1.
The first version says, in effect, that all chemical equilibria are stable.
2.
The second version says, in effect, that all stable chemical equilibria are stable.

Version 2 is tautological. As such, it is not wrong ... but it is utterly uninformative.

Version 1 is just wrong, as we can see from the following examples:

Example #1: As a familiar, important situation, consider ice in equilibrium with liquid water, under the usual constant-pressure conditions. Let x represent the reaction coordinate, i.e. the fraction of the mass of the system that is in the form of ice. If you perturb the system by changing x – perhaps by adding water, adding ice, or adding energy – the system will exhibit zero tendency to return to its previous x-value. This system exhibits zero stability, aka neutral instability, aka neutral stability, as defined in figure 12.2.

Example #2: Consider an equilibrium mixture of helium and neon. Let the reaction coordinate (x) be the fraction of the mass that is helium. If you perturb the system by increasing x, there will be no tendency for the system to react so as to decrease x. This is another example of zero stability aka neutral instability.

Example #3: Consider the decomposition of lead azide, as represented by the following reaction:

 Pb(N3)2 → Pb + 3 N2              (12.62)
  (x=0)        (x=1)

Initially we have x=0. That is, we have a sample of plain lead azide. It is in thermodynamic equilibrium, in the sense that it has a definite pressure, definite temperature, et cetera.

If we perturb the system by increasing the temperature a small amount, the system will not react so as to counteract the change, not even a little bit. Indeed, if we increase the temperature enough, the system will explode, greatly increasing the temperature.

We say that this system is unstable. More specifically, it exhibits negative stability, as defined in figure 12.2.

Suggestion: If you want to talk about equilibrium and stability, use the standard terminology, namely equilibrium and stability, as defined in figure 12.1 and figure 12.2. There is no advantage to mentioning Le Châtelier’s ideas in any part of this discussion, because the ideas are wrong. If you want to remark that “most” chemical equilibria encountered in the introductory chemistry course are stable equilibria, that’s OK ... but such a remark must not be elevated to the status of a “principle”, because there are many counterexamples.

Note that Lyapunov’s detailed understanding of what stability means actually predates Le Châtelier’s infamous “principle” by several years.

When first learning about equilibrium, stability, and damping, it is best to start with a one-dimensional system, such as the bicycle wheel depicted in figure 12.2. Then move on to multi-dimensional systems, such as an egg, which might be stable in one direction but unstable in another direction. Also note that in a multi-dimensional system, even if the system is stable, there is no reason to expect that the restoring force will be directly antiparallel to the perturbation. Very commonly, the system reacts by moving sideways, as discussed in section 12.5.

## 12.11  Appendix: The Cyclic Triple Derivative Rule

In this section, we derive a useful identity involving the partial derivatives of three variables. This goes by various names, including cyclic chain rule, cyclic partial derivative rule, cyclic identity, Euler’s chain rule, et cetera. We derive it twice: once graphically in section 12.11.1 and once analytically in section 12.11.3. For an example where this rule is applied, see section 12.4.

### 12.11.1  Graphical Derivation

In figure 12.14 the contours of constant x are shown in blue, the contours of constant y are shown in black, and the contours of constant z are shown in red. Even though there are three variables, there are only two degrees of freedom, so the entire figure lies in the plane.

Figure 12.14: Cyclic Triple Derivative

The red arrow corresponds to:

 (∂x/∂y)|z  =  3 / (−1)  =  (# of blue contours crossed by red arrow) / (# of black contours crossed by red arrow)              (12.63)

and can be interpreted as follows: The arrow runs along a contour of constant z, which is the z on the LHS of equation 12.63. The arrow crosses three contours of constant x, which is the numerator in equation 12.63. It crosses one contour of constant y, in the direction of decreasing y, which is the denominator.

Collecting results for all three vectors, we have:

 (∂x/∂y)|z  =  3 / (−1)  =  (# of blue contours crossed by red arrow) / (# of black contours crossed by red arrow)
 (∂y/∂z)|x  =  1 / (−2)  =  (# of black contours crossed by blue arrow) / (# of red contours crossed by blue arrow)
 (∂z/∂x)|y  =  2 / (−3)  =  (# of red contours crossed by black arrow) / (# of blue contours crossed by black arrow)              (12.64)

and if we multiply those together, we get:

 (∂x/∂y)|z · (∂y/∂z)|x · (∂z/∂x)|y  =  −1              (12.65)

### 12.11.2  Validity is Based on Topology

Note the contrast:

• The cyclic triple derivative identity is a topological property. That is, you can rotate or stretch figure 12.14 however you like, and the result will be the same: the product of the three partial derivatives will always be −1. All we have to do is count contours, i.e. the number of contours crossed by each of the arrows.
• The result does not depend on any geometrical properties of the situation. No metric is required. No dot products are required. No notion of length or angle is required. For example, as drawn in figure 12.14, the x contours are not vertical, the y contours are not horizontal, and the x and y contours are not mutually perpendicular ... but more generally, we don’t even need to have a way of knowing whether the contours are horizontal, vertical, or perpendicular.

Similarly, if you rescale one set of contours, perhaps by making the contours twice as closely spaced, it has no effect on the result, because it just increases one of the numerators and one of the denominators in equation 12.65.

The validity of equation 12.65 depends on the following topological requirement: The three vectors must join up to form a triangle. This implies that the contours {dx, dy, dz} must not be linearly independent. In particular, you cannot apply equation 12.65 to the Cartesian X, Y, and Z axes.

Validity also depends on another topological requirement: The contour lines must not begin or end within the triangle formed by the three vectors. We are guaranteed this will always be true, because of the fundamental theorem that says d(anything) is exact ... or, equivalently, d(d(anything)) = 0. In words, the theorem says “the boundary of a boundary is zero” or “a boundary cannot peter out”. This theorem is discussed in reference 4.

We apply this idea as follows: Every contour line that goes into the triangle has to go out again. From now on, let’s only count net crossings, which means if a contour goes out across the same edge where it came in, that doesn’t count at all. Then we can say that any blue line that goes in across the red arrow must go out across the black arrow. (It can’t go out across the blue arrow, since the blue arrow lies along a blue contour, and contours can’t cross.) To say the same thing more quantitatively, the number of net blue crossings inward across the red arrow equals the number of net blue crossings outward across the black arrow. This number of crossings shows up on the LHS of equation 12.65 twice, once as a numerator and once as a denominator. In one place or the other, it will show up with a minus sign. Assuming this number is nonzero, its appearance in a numerator cancels its appearance in a denominator, so all in all it contributes a factor of −1 to the product. Taking all three variables into account, we get three factors of −1, which is the right answer.

Here is yet another way of saying the same thing. To simplify the language, let’s interpret the x-value as “height”. The blue arrow lies along a contour of constant height. The black arrow goes downhill a certain amount, while the red arrow goes uphill by the same amount. The amount must be the same, for the following two reasons: At one end, the red and black arrows meet at a point, and x must have some definite value at this point. At the other end, the red and black arrows terminate on the same contour of constant x. This change in height, this Δx, shows up on the LHS of equation 12.65 twice, once as a numerator and once as a denominator. In one place or the other, it shows up with a minus sign. This is guaranteed by the fact that when the arrows meet, they meet tip-to-tail, so if one of the pair is pointed downhill, the other must be pointed uphill.

### 12.11.3  Analytic Derivation

Let’s start over, and derive the result again. Assuming z can be expressed as a function of x and y, and assuming everything is sufficiently differentiable, we can expand dz in terms of dx and dy using the chain rule:

 dz  =  (∂z/∂x)|y dx  +  (∂z/∂y)|x dy              (12.66)

By the same token, we can expand dx in terms of dy and dz:

 dx  =  (∂x/∂y)|z dy  +  (∂x/∂z)|y dz              (12.67)

Using equation 12.67 to eliminate dx from equation 12.66, we obtain:

 dz  =  (∂z/∂x)|y ( (∂x/∂y)|z dy  +  (∂x/∂z)|y dz )  +  (∂z/∂y)|x dy              (12.68)

hence

 ( 1 − (∂z/∂x)|y (∂x/∂z)|y ) dz  =  ( (∂z/∂x)|y (∂x/∂y)|z  +  (∂z/∂y)|x ) dy              (12.69)

We are free to choose dz and dy arbitrarily and independently, so the only way that equation 12.69 can hold in general is if the parenthesized factors on each side are identically zero. From the LHS of this equation, we obtain the rule for the reciprocal of a partial derivative. This rule is more-or-less familiar from introductory calculus, but it is nice to know how to properly generalize it to partial derivatives:

 (∂z/∂x)|y · (∂x/∂z)|y  =  1              (12.70)

Meanwhile, from the parenthesized expression on the RHS of equation 12.69, with a little help from equation 12.70, we obtain the cyclic triple chain rule, the same as in section 12.11.1:

 (∂x/∂y)|z · (∂y/∂z)|x · (∂z/∂x)|y  =  −1              (12.71)
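The reciprocal rule and the cyclic rule are easy to check numerically. Here is a minimal sketch, using the ideal gas law P V = T (in units where n R = 1) as an illustrative constraint relating three variables; the specific point (V, T) = (2, 300) is an arbitrary choice.

```python
# Numeric check of the reciprocal rule (eq 12.70) and the cyclic rule
# (eq 12.71), using P*V = T (units with n*R = 1) as the constraint.

def d(f, x, h=1e-6):
    """Central-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)

V0, T0 = 2.0, 300.0
P0 = T0 / V0                        # the corresponding point on the surface

dP_dV = d(lambda V: T0 / V, V0)     # (dP/dV) at constant T
dV_dT = d(lambda T: T / P0, T0)     # (dV/dT) at constant P
dT_dP = d(lambda P: P * V0, P0)     # (dT/dP) at constant V
dV_dP = d(lambda P: T0 / P, P0)     # (dV/dP) at constant T

print(dP_dV * dV_dP)                # reciprocal rule: very close to +1
print(dP_dV * dV_dT * dT_dP)        # cyclic rule: very close to -1
```

Note that the cyclic product is −1 even though each factor individually depends on where you are on the surface; only the product is universal.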

### 12.11.4  Independent and Dependent Variables, or Not

In this situation, it is clearly not worth the trouble of deciding which are the “independent” variables and which are the “dependent” variables. If you decide based on equation 12.66 (which treats z as depending on x and y) you will have to immediately change your mind based on equation 12.67 (which treats x as depending on y and z).

Usually it is best to think primarily in terms of abstract points in thermodynamic state-space. You can put your finger on a point in figure 12.14 and thereby identify a point without reference to its x, y, or z coordinates. The point doesn’t care which coordinate system (if any) you choose to use. Similarly, the vectors in the figure can be added graphically, tip-to-tail, without reference to any coordinate system or basis.

If and when we have established a coordinate system:

• Given a point, you can determine its x, y, and z coordinates.
• Given x and y, you can locate the point and determine its properties, including its z coordinate.
• Equally well, given y and z, you can locate the point and determine its properties, including its x coordinate.

### 12.11.5  Axes, or Not

Note that, strictly speaking, there are no axes in figure 12.14. There are contours of constant x, constant y, and constant z, but no actual x-axis, y-axis, or z-axis.

In other, simpler situations, you can of course get away with a plain horizontal axis and a plain vertical axis, but you don’t want to become too attached to this approach. Even in cases where you can get away with plain axes, it is a good habit to plot the grid also (unless there is some peculiar compelling reason not to). Modern software makes it super-easy to include the grid.

 Make it a habit to include the contours.

For more on this, see reference 37.
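As the section says, including the grid costs essentially nothing. Here is a minimal sketch (assuming matplotlib is available); the output filename grid_demo.png is an arbitrary choice for this illustration.

```python
# Minimal example: one extra line buys you the grid.
import matplotlib
matplotlib.use("Agg")              # non-interactive backend, writes to file
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
x = [i / 10 for i in range(101)]
ax.plot(x, [xi ** 2 for xi in x])
ax.grid(True)                      # include the grid
fig.savefig("grid_demo.png")
```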

## 12.12  Entropy versus “Irreversibility” in Chemistry

In chemistry, the word “irreversible” is commonly used in connection with multiple inconsistent ideas, including:

• The reaction is spontaneous.
• The reaction strongly goes to completion.
• The reaction is thermodynamically irreversible.

Those ideas are not completely unrelated … but they are not completely identical, and there is potential for serious confusion.

You cannot look at a chemical reaction (as written in standard form) and decide whether it is spontaneous, let alone whether it goes to completion. For example, consider the reaction

 3 Fe + 4 H2O → Fe3O4 + 4 H2              (12.72)

If you flow steam over hot iron, you produce iron oxide plus hydrogen. This reaction is used to produce hydrogen on an industrial scale. It goes to completion in the sense that the iron is used up. Conversely, if you flow hydrogen over hot iron oxide, you produce iron and H2O. This is the reverse of equation 12.72, and it also goes to completion, in the sense that the iron oxide is used up.

What’s more, none of that has much to do with whether the reaction was thermodynamically reversible or not.

In elementary chemistry classes, people tend to pick up wrong ideas about thermodynamics, because the vast preponderance of the reactions that they carry out are grossly irreversible, i.e. irreversible by state, as discussed in section 12.1.3. The reactions are nowhere near isentropic.

Meanwhile, there are plenty of chemical reactions that are very nearly reversible, i.e. irreversible by rate, as discussed in section 12.1.3. In everyday life, we see examples of this, such as electrochemical reactions, e.g. storage batteries and fuel cells. Another example is the CO2/carbonate reaction discussed below. Alas, there is a tendency for people to forget about these reversible reactions and to unwisely assume that all reactions are grossly irreversible. This unwise assumption can be seen in the terminology itself: widely-used tables list the “standard heat of reaction” (rather than the standard energy of reaction), apparently under the unjustifiable assumption that the energy liberated by the reaction will always show up as heat. Similarly, reactions are referred to as “exothermic” and “endothermic”, even though it would be much wiser to refer to them as exergonic and endergonic.
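The exergonic/exothermic distinction can be made concrete with a back-of-the-envelope sketch. The numbers below are rough, illustrative assumptions (approximately the reaction H2 + ½ O2 → H2O(l), as in a hydrogen fuel cell), not authoritative data:

```python
# Energy of reaction versus heat of reaction, for a nearly reversible
# electrochemical cell.  Numbers are rough/illustrative, not authoritative.

dH = -285.8e3        # enthalpy of reaction, J/mol  (assumed value)
dS = -163.0          # entropy of reaction, J/(mol K)  (assumed value)
T  = 298.15          # temperature, K

dG    = dH - T * dS  # max non-pV work the cell can deliver (exergonic: dG < 0)
q_rev = T * dS       # heat exchanged when the cell runs reversibly

print(dG / 1e3)      # about -237 kJ/mol delivered as electrical work
print(q_rev / 1e3)   # about -49 kJ/mol as heat -- far less than |dH|
```

Run reversibly, most of the reaction energy comes out as work, not heat; calling such a reaction “exothermic” badly misdescribes what the cell actually does.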

It is very difficult, perhaps impossible, to learn much about thermodynamics by studying bricks that fall freely and smash against the floor. Instead, thermodynamics is most understandable and most useful when applied to situations that have relatively little dissipation, i.e. that are nearly isentropic.

Lots of people get into the situation where they have studied tens or hundreds or thousands of reactions, all of which are irreversible by state. That’s a trap for the unwary. It would be unwise to leap to the conclusion that all reactions are far from isentropic … and it would be even more unwise to leap to the conclusion that “all” natural processes are far from isentropic.

Chemists are often called upon to teach thermodynamics, perhaps under the guise of a “P-Chem” course (i.e. physical chemistry). This leads some people to ask for purely chemical examples to illustrate entropy and other thermodynamic ideas. I will answer the question in a moment, but first let me register my strong objections to the question. Thermodynamics derives its great power and elegance from its wide generality. Specialists who cannot cope with examples outside their own narrow specialty ought not be teaching thermodynamics.

Here’s a list of reasons why a proper understanding of entropy is directly or indirectly useful to chemistry students.

1. Consider electrochemical reactions. Under suitable conditions, some electrochemical reactions can be made very nearly reversible in the thermodynamic sense. (See reference 38 for some notes on how such cells work.) In these cases, the heat of reaction is very much less than the energy of reaction, and the entropy is very much less than the energy divided by T.
2. Consider the reaction that children commonly carry out, adding vinegar to baking soda, yielding sodium acetate and carbon dioxide gas. Let’s carry out this reaction in a more grown-up apparatus, namely a sealed cylinder with a piston. By pushing on the piston with weights and springs, we can raise the pressure of the CO2 gas. If we raise the pressure high enough, we push CO2 back into solution. This in turn raises the activity of the carbonic acid, and at some point it becomes a strong enough acid to attack the sodium acetate and partially reverse the reaction, liberating acetic acid. So this is clearly and inescapably a chemistry situation.

Much of the significance of this story revolves around the fact that if we arrange the weights and springs just right, the whole process can be made thermodynamically reversible (nearly enough for practical purposes). Adding a tiny bit of weight will make the reaction go one way, just as removing a tiny bit of weight will make the reaction go the other way.

Now some interesting questions arise: Could we use this phenomenon to build an engine, in analogy to a steam engine, but using CO2 instead of steam, using the carbonate ↔ CO2 chemical reaction instead of the purely physical process of evaporation? How does the CO2 pressure in this system vary with temperature? How much useful work would this CO2 engine generate? How much waste heat? What is the best efficiency it could possibly have? Can we run the engine backwards so that it works as a refrigerator?

There are more questions of this kind, but you get the idea: once we have a reaction that is more-or-less thermodynamically reversible, we can bring to bear the entire machinery of thermodynamics.

3. Consider the colligative effects of a solute on the freezing point, boiling point, and vapor pressure of a solvent. The fact that they’re colligative – i.e. insensitive to the chemical properties of the solute – is strong evidence that entropy is what’s driving these effects, not enthalpy, energy, or free energy.
4. Similarly: consider the Gibbs Gedankenexperiment (section 10.6). Starting with a sample of 4He, we get an increase in entropy if we mix it with 3He, or Ne, or Xe … but we get no effect if we “mix” it with more of the same 4He.
5. People who take chemistry classes often go on to careers in other fields. For example, you might need knowledge of chemistry, physics, and engineering in order to design a rocket engine, or a jet engine, or a plain old piston engine. Such things commonly involve a chemical reaction followed by a more-or-less isentropic expansion. Even though the chemical reaction is grossly irreversible, understanding the rest of the system requires understanding thermodynamics.

To be really specific, suppose you are designing something with multiple heat engines in series. This case is considered as part of the standard “foundations of thermodynamics” argument, as illustrated in figure 12.15. Entropy is conserved as it flows down the totem-pole of heat engines. The crucial conserved quantity that is the same for all the engines is entropy … not energy, free energy, or enthalpy. No entropy is lost during the process, because entropy cannot be destroyed, and no entropy (just work) flows out through the horizontal arrows. No entropy is created, because we are assuming the heat engines are 100% reversible. For more on this, see reference 6.

Figure 12.15: Heat Engines In Series
6. Consider “Design of Experiment”, as discussed in reference 11. In this case the entropy of interest is not the entropy of the reaction, but still it is entropy, calculated in accordance with equation 2.2, and it is something a chemist ought to know. Research chemists and especially chemical engineers are often in the situation where experiments are very expensive, and someone who doesn’t understand Design of Experiment will be in big trouble.
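The totem-pole of heat engines in item 5 can be sketched numerically. The temperatures and the entropy current below are made-up numbers, purely for illustration: the same entropy S flows through every stage, and each reversible engine straddling T[i] and T[i+1] delivers work W = S·(T[i] − T[i+1]).

```python
# Entropy flowing down a totem-pole of reversible heat engines in series.
# All numbers here are made up for illustration.

S = 2.0                            # entropy current, J/K per cycle (assumed)
T = [600.0, 450.0, 350.0, 300.0]   # stage temperatures, K, hot to cold (assumed)

# The same entropy S flows through every engine; each engine converts
# part of the incoming heat to work and passes the entropy downward.
work     = [S * (T[i] - T[i + 1]) for i in range(len(T) - 1)]
heat_in  = S * T[0]                # heat absorbed at the top
heat_out = S * T[-1]               # waste heat rejected at the bottom

print(work)                        # [300.0, 200.0, 100.0]
print(sum(work))                   # 600.0
print(heat_in - heat_out)          # 600.0 -- energy balance checks out
```

Note what is conserved here: the entropy current S is the same at every stage, while the heat current S·T shrinks as T drops, the difference having been skimmed off as work.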

1. We exclude consideration of irreversible weak nuclear processes, such as decay of neutral kaons.
2. Conversely, if the reaction cannot proceed in any direction, because of conservation laws or some such, it is pointless to ask whether it is reversible and/or isentropic.
[Previous] [Contents] [Next]