24 Partition Function

24.1 Basic Properties

In thermal equilibrium, the probability of each microstate is proportional to its Boltzmann factor:

$$P_i \propto \exp(-\hat{E}_i / kT) \tag{24.1}$$

where P_i is the probability of the ith microstate, and Ê_i is the energy of the ith microstate. You can think of the Boltzmann factor exp(−Ê_i/kT) as an unnormalized probability. In some cases an unnormalized probability is satisfactory, or even desirable, but in other cases you really want the normalized probability, normalized so that ∑P_i = 1. That is easily arranged:

$$P_i = \frac{\exp(-\hat{E}_i / kT)}{\sum_j \exp(-\hat{E}_j / kT)} \tag{24.2}$$

The normalization denominator in equation 24.2 is something we are going to encounter again and again, so we might as well give it a name. It is called the partition function and is denoted Z. That is:

$$Z := \sum_j \exp(-\hat{E}_j / kT) \tag{24.3}$$

Actually there is more to the story; we shall see that Z serves in many roles, not just as a normalization denominator. However, that is more than we need to know at the moment. For the time being, it suffices to think of Z as the normalization denominator. Additional motivation for caring about Z will accumulate in the next few sections.
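As a minimal sketch of the normalization step, the following Python computes the Boltzmann factors, their sum Z, and the normalized probabilities for a small system. The three energy levels and the value of kT are hypothetical, chosen only for illustration, and the function name is our own.

```python
import math

def boltzmann_probabilities(energies, kT):
    """Return (Z, probabilities) for the given microstate energies."""
    # Unnormalized probabilities: one Boltzmann factor per microstate.
    factors = [math.exp(-E / kT) for E in energies]
    # The partition function is the normalization denominator.
    Z = sum(factors)
    return Z, [f / Z for f in factors]

# Hypothetical energy levels, in the same units as kT.
energies = [0.0, 1.0, 2.0]
Z, probs = boltzmann_probabilities(energies, kT=1.0)
print(Z, probs, sum(probs))
```

Dividing by Z is the only step that turns the Boltzmann factors into probabilities; the ratios between them are unchanged.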

The probability of the ith state is P_i.

The Boltzmann factor for state i is exp(−β Ê_i), where Ê_i is the energy of the state and β := 1/kT is the inverse temperature.

The probabilities are normalized such that

$$\sum_i P_i = 1 \tag{24.5}$$

The sum of the Boltzmann factors is called the partition function:

$$Z := \sum_i e^{-\beta \hat{E}_i} \tag{24.6}$$

Knowing the probability P_i for every state is somewhat useful, but as we shall see, it is not nearly as useful as knowing the Boltzmann factors exp(−β Ê_i).

If you know the Boltzmann factors, you can calculate all the probabilities in accordance with equation 24.7, but the converse does not hold: knowing all the probabilities does not suffice to calculate the Boltzmann factors.

$$P_i = \frac{e^{-\beta \hat{E}_i}}{Z} \tag{24.7}$$

In fact, we shall see that if you know the partition function, you can calculate everything there is to know about Locrian thermodynamics.

24.2 Calculations Using the Partition Function

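$$
\begin{aligned}
\langle X \rangle &= \sum_i x_i P_i && \text{for any probability distribution} \\
&= \frac{1}{Z} \sum_i x_i e^{-\beta \hat{E}_i} && \text{for a Boltzmann distribution}
\end{aligned}
\tag{24.8}
$$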

where ⟨⋯⟩ denotes the expectation value of some property. The idea of expectation value applies to the macrostate. Here x_i is the value of the X-property in the ith microstate. So we see that equation 24.8 is a weighted average, such that each x_i is weighted by the probability of state i. This averaging process relates a macroscopic property X to the corresponding microscopic property x_i.

As a sanity check, you should verify that ⟨1⟩ = 1 by plugging into equation 24.8.
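The weighted average and the sanity check can be sketched numerically as follows. The energy levels and the value of β are hypothetical, and `expectation` is our own helper name.

```python
import math

def expectation(values, energies, beta):
    """Boltzmann-weighted average of values[i] over the microstates."""
    weights = [math.exp(-beta * E) for E in energies]
    Z = sum(weights)
    return sum(x * w for x, w in zip(values, weights)) / Z

energies = [0.0, 1.0, 2.0]   # hypothetical microstate energies
beta = 0.7                   # hypothetical inverse temperature

# Sanity check: a property equal to 1 in every microstate averages to 1.
print(expectation([1.0] * len(energies), energies, beta))
# Taking x_i to be the microstate energy gives the macroscopic energy.
print(expectation(energies, energies, beta))
```

The first result is 1 only because of the 1/Z factor in the weights.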

We now begin to explore the real power of the partition function, using it for much more than just a normalization factor.

We can start from the observation that Z, as defined by equation 24.6, is a perfectly good state function, just as P, V, T, S, et cetera are state functions. We will soon have more to say about the physical significance of this state function.

We now illustrate what Z is good for. Here is a justly-famous calculation that starts with ln(Z) and differentiates with respect to β:

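$$
\begin{aligned}
\left.\frac{\partial \ln(Z)}{\partial \beta}\right|_{\{\hat{E}_i\}} &= \frac{1}{Z} \sum_i (-\hat{E}_i)\, e^{-\beta \hat{E}_i} \\
&= -\langle \hat{E} \rangle \\
&= -E
\end{aligned}
\tag{24.9}
$$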

Recall that Ê_i is the energy of the ith microstate, while E is the energy of the macrostate.

Equation 24.9 tells us that one of the directional derivatives of the partition function is related to the energy. For a particle in a box, or for an ideal gas, all the energy levels are determined by the volume of the box, in which case we can write E = −∂ln(Z) / ∂β at constant volume.

You have to pay attention to understand what is happening here. How can the macroscopic energy ⟨E⟩ be changing when we require all the Ê_i to be constant? The answer is that the expectation value ⟨⋯⟩ is a weighted average, weighted according to the probability of finding the system in the ith microstate, and by changing the inverse temperature β we change the weighting.
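A quick numerical check of this relation compares a finite-difference derivative of ln(Z) against the directly computed weighted average. This is only a sketch: the energy levels, the value of β, and the step size h are hypothetical choices.

```python
import math

def ln_Z(beta, energies):
    """Logarithm of the partition function at fixed energy levels."""
    return math.log(sum(math.exp(-beta * E) for E in energies))

def mean_energy(beta, energies):
    """Boltzmann-weighted average of the microstate energies."""
    weights = [math.exp(-beta * E) for E in energies]
    return sum(E * w for E, w in zip(energies, weights)) / sum(weights)

energies = [0.0, 1.0, 3.0]   # hypothetical levels, held fixed while beta varies
beta, h = 0.5, 1e-6

# Central-difference estimate of -d ln(Z) / d beta at fixed energy levels.
E_from_Z = -(ln_Z(beta + h, energies) - ln_Z(beta - h, energies)) / (2 * h)
print(E_from_Z, mean_energy(beta, energies))

# Same levels at larger beta (colder): the weights shift toward low energy.
print(mean_energy(2.0, energies))
```

The levels never change in this calculation. Only the weights do, which is why the macroscopic energy can vary with β.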

As another example calculation using the partition function, it is amusing to express the entropy in terms of the partition function. We start with the workhorse expression for entropy, equation 2.2 or equation 9.3, and substitute the probability from equation 24.7.

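$$
\begin{aligned}
S[P] &= -k \sum_i P_i \ln(P_i) && \text{(24.10a)} \\
&= -k \sum_i \frac{e^{-\beta \hat{E}_i}}{Z} \ln\!\left(\frac{e^{-\beta \hat{E}_i}}{Z}\right) && \text{(24.10b)} \\
&= -k \sum_i \frac{e^{-\beta \hat{E}_i}}{Z} \left[-\beta \hat{E}_i - \ln(Z)\right] && \text{(24.10c)} \\
&= k\beta \sum_i \hat{E}_i \frac{e^{-\beta \hat{E}_i}}{Z} + k \ln(Z) \sum_i \frac{e^{-\beta \hat{E}_i}}{Z} && \text{(24.10d)} \\
&= k\beta \langle \hat{E} \rangle + k \ln(Z) \langle 1 \rangle && \text{(24.10e)} \\
&= k\beta E + k \ln(Z) && \text{(24.10f)} \\
&= -k \left.\frac{\partial \ln(Z)}{\partial \ln(\beta)}\right|_{\{\hat{E}_i\}} + k \ln(Z) && \text{(24.10g)}
\end{aligned}
$$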

We obtained the last line by plugging in the value of E obtained from equation 24.9. This gives us a handy formula for calculating the entropy directly from the partition function.

Here we have used the fact that ⟨ln(Z)⟩≡ln(Z), as it must be since Z is not a function of the dummy index i.
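To see that the two ends of this derivation agree, here is a small numerical sketch that evaluates the entropy both from the probabilities directly and from kβE + k ln(Z). The energy levels are hypothetical, and we set k = 1 for convenience.

```python
import math

def entropy_two_ways(beta, energies, k=1.0):
    """Return (S from -k sum P ln P, S from k*beta*E + k*ln Z)."""
    weights = [math.exp(-beta * E) for E in energies]
    Z = sum(weights)
    probs = [w / Z for w in weights]
    E_mean = sum(E * p for E, p in zip(energies, probs))
    S_direct = -k * sum(p * math.log(p) for p in probs)   # equation 24.10a
    S_from_Z = k * beta * E_mean + k * math.log(Z)        # equation 24.10f
    return S_direct, S_from_Z

energies = [0.0, 1.0, 3.0]   # hypothetical microstate energies
for beta in (0.0, 0.5, 2.0):
    print(beta, entropy_two_ways(beta, energies))
```

At β = 0 all three states are equally likely, so both expressions give k ln(3).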

The next-to-last line of equation 24.10 tells us that E − TS = −kTln(Z) … and equation 15.13 tells us that the free energy is F := E − TS. Combining these expressions yields a surprisingly simple expression for the free energy:

$$F = -kT \ln(Z) \tag{24.11}$$

As an exercise in algebra, you can find the entropy in terms of the free energy, namely

$$S[P] = \frac{E - F}{T} \tag{24.12}$$

We have just established a connection between the free energy F, the temperature T, and the partition function Z. If at any point you know two of the three, you can immediately calculate the third.
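The "two out of three" statement can be sketched as three one-line conversions, all rearrangements of F = −kT ln(Z). The function names and the numerical values are hypothetical, and temperature is represented by the product kT.

```python
import math

def free_energy(Z, kT):
    """F from Z and T."""
    return -kT * math.log(Z)

def partition_function(F, kT):
    """Z from F and T."""
    return math.exp(-F / kT)

def thermal_energy(F, Z):
    """kT from F and Z; undefined when Z == 1, where F vanishes at every T."""
    return -F / math.log(Z)

Z, kT = 2.5, 0.8   # hypothetical values
F = free_energy(Z, kT)
print(F, partition_function(F, kT), thermal_energy(F, Z))
```

The one caveat, noted in the code, is the degenerate case Z = 1, where F = 0 carries no information about T.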

As another example, consider the case where the microstate energy depends linearly on some parameter B:

$$\hat{E}_i(B) = \hat{E}_i(0) + B M_i \qquad \text{for all } i \tag{24.13}$$

The notation was chosen to suggest that B might be an overall applied magnetic field, and M_i might be the magnetization of the ith state … but this interpretation is not mandatory. The idea applies for any parameter that affects the energy linearly as in equation 24.13. Remember Feynman’s proverb: the same equations have the same solutions.
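As a worked illustration of what this linear dependence buys us, we can differentiate ln(Z) with respect to B at fixed β. This sketch assumes that equation 24.13 has the form Ê_i(B) = Ê_i(0) + B M_i; with the opposite sign convention for M_i, the sign of the result flips.

```latex
\begin{aligned}
\left.\frac{\partial \ln(Z)}{\partial B}\right|_{\beta}
  &= \frac{1}{Z} \sum_i \frac{\partial}{\partial B}\, e^{-\beta [\hat{E}_i(0) + B M_i]} \\
  &= \frac{1}{Z} \sum_i (-\beta M_i)\, e^{-\beta \hat{E}_i(B)} \\
  &= -\beta \langle M \rangle
\end{aligned}
\qquad\Longrightarrow\qquad
\langle M \rangle = -\frac{1}{\beta} \left.\frac{\partial \ln(Z)}{\partial B}\right|_{\beta}
```

This has the same structure as equation 24.9: a derivative of ln(Z) yields the expectation value of whatever quantity multiplies the differentiation variable in the exponent.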

24.3 Example: Harmonic Oscillator

The partition function Z is defined in terms of a series, but sometimes it is possible to sum the series analytically to obtain a closed-form expression for Z. The partition function of a quantum harmonic oscillator is a simple example of this. As discussed in reference 54, it involves summing a geometric series, which is about as easy as anything could be. The result is

$$Z = \tfrac{1}{2} \operatorname{csch}(\tfrac{1}{2} \beta\hbar\omega) \tag{24.15}$$

where csch is the hyperbolic cosecant, i.e. the reciprocal of the hyperbolic sine.
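The geometric series can be checked numerically against the closed form ½ csch(½βℏω). The sketch below assumes the standard oscillator levels (n + ½)ℏω, which the text does not spell out. The dimensionless variable x = βℏω and the truncation at 2000 terms are our own choices.

```python
import math

def Z_series(x, n_terms=2000):
    """Truncated sum over levels (n + 1/2)*hbar*omega, with x = beta*hbar*omega."""
    return sum(math.exp(-(n + 0.5) * x) for n in range(n_terms))

def Z_closed(x):
    """Closed form (1/2) csch(x/2), written via the hyperbolic sine."""
    return 0.5 / math.sinh(0.5 * x)

for x in (0.1, 1.0, 5.0):
    print(x, Z_series(x), Z_closed(x))
```

The truncation is harmless for these values of x because the neglected terms are exponentially small. At much smaller x, more terms would be needed.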

Using methods described in section 24.2 we can easily find the energy of the harmonic oscillator in thermal equilibrium. The result is given by equation 24.16 and diagrammed in figure 24.1.

$$E = \tfrac{1}{2} \hbar\omega \coth(\tfrac{1}{2} \beta\hbar\omega) \tag{24.16}$$

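$$
\begin{aligned}
S &= k\beta E + k \ln(Z) \\
S/k &= \tfrac{1}{2} \beta\hbar\omega \coth(\tfrac{1}{2} \beta\hbar\omega) + \ln\!\left[\tfrac{1}{2} \operatorname{csch}(\tfrac{1}{2} \beta\hbar\omega)\right] \\
&= \beta\hbar\omega\, \frac{e^{-\beta\hbar\omega}}{1 - e^{-\beta\hbar\omega}} - \ln\!\left(1 - e^{-\beta\hbar\omega}\right)
\end{aligned}
\tag{24.17}
$$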

The microstates of a harmonic oscillator are definitely not equally populated, but we remark that in the high-temperature limit the entropy (equation 24.18), S/k ≈ 1 + ln(kT/ℏω), is the same as what we would get for a system with e kT/ℏω equally-populated microstates. In particular it does not correspond to a picture where every microstate with energy Ê < kT is occupied and others are not; the probability is spread out over approximately e times that many states.
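The claim about e kT/ℏω states can be checked numerically. The sketch below evaluates the last line of equation 24.17 and compares it with the high-temperature form 1 + ln(kT/ℏω). The variable x = βℏω and the function names are our own.

```python
import math

def entropy_over_k(x):
    """Oscillator entropy S/k with x = beta*hbar*omega (last line of equation 24.17)."""
    one_minus_exp = -math.expm1(-x)   # 1 - exp(-x), accurate for small x
    return x * math.exp(-x) / one_minus_exp - math.log(one_minus_exp)

def entropy_over_k_high_T(x):
    """High-temperature form 1 + ln(kT / hbar*omega) = 1 - ln(x)."""
    return 1.0 - math.log(x)

for x in (1.0, 0.1, 0.01):
    S = entropy_over_k(x)
    # exp(S/k) is the number of equally populated states with the same entropy.
    print(x, S, entropy_over_k_high_T(x), math.exp(S), math.e / x)
```

As x decreases, exp(S/k) approaches e/x, i.e. e times kT/ℏω.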

The low-temperature behavior is most easily understood by reference to the definition of entropy, as expressed by e.g. equation 2.3. At low temperature, all of the probability is in the ground state, except for a very very small bit of probability in the first excited state, so the entropy is very small.

24.4 Example: Two-State System

Suppose we have a two-state system. Specifically, consider a particle such as an electron or proton, which has two spin states, up and down, or equivalently |↑⟩ and |↓⟩. Let’s apply a magnetic field B, so that the two states have energy

$$\hat{E} = \pm \mu B$$

where µ is called the magnetic moment. For a single particle, the partition function is simply:

$$Z_1 = e^{-\beta\mu B} + e^{+\beta\mu B} = 2 \cosh(\beta\mu B)$$

Next let us consider N such particles, and assume that they are very weakly interacting, so that when we calculate the energy we can pretend they are non-interacting. Then the overall partition function is

$$Z = Z_1^N$$

Applying equation 24.9 and equation 24.10f to this Z gives the energy and the entropy:

$$
\begin{aligned}
E &= -N \mu B \tanh(\beta\mu B) \\
S &= N k \left[\ln(2\cosh(\beta\mu B)) - \beta\mu B \tanh(\beta\mu B)\right]
\end{aligned}
$$

You can easily verify that at high temperature (β = 0), the entropy reduces to S/N = kln(2) i.e. one bit per spin, as it should. Meanwhile, at low temperatures (β → ∞), it reduces to S = 0.
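The two limits can be checked with a few lines of Python. This sketch assumes the per-spin entropy S/(Nk) = ln(2 cosh(βµB)) − βµB tanh(βµB), which follows from S = kβE + k ln(Z) with a single-spin partition function 2 cosh(βµB). The variable x = βµB is our own shorthand.

```python
import math

def entropy_per_spin_over_k(x):
    """S/(N k) for independent two-state spins, with x = beta*mu*B."""
    return math.log(2.0 * math.cosh(x)) - x * math.tanh(x)

for x in (0.0, 0.5, 2.0, 30.0):
    print(x, entropy_per_spin_over_k(x))
print(math.log(2.0))   # high-temperature value, one bit per spin
```

The function is even in x, so negative β (negative temperature) gives the same entropy as the corresponding positive β.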

It is interesting to plot the entropy as a function of energy, as in figure 24.2.

In this figure, the slope of the curve is β, i.e. the inverse temperature. It may not be obvious from the figure, but the slope of the curve is infinite at both ends. That is, at the low-energy end the temperature is positive but only slightly above zero, whereas at the high-energy end the temperature is negative but only slightly below zero. Meanwhile, the peak of the curve corresponds to infinite temperature, i.e. β=0. The temperature is shown in figure 24.3.

In this system, the curve of T as a function of E has infinite slope when E=E_min. You can prove that by considering the inverse function, E as a function of T, and expanding to first order in T. To get a fuller understanding of what is happening in the neighborhood of this point, we can define a new variable b := exp(−µB/kT) and develop a Taylor series as a function of b. That gives us

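$$
\begin{aligned}
\frac{E - E_{\min}}{N} &= 2\mu B\, e^{-2\mu B/kT} && \text{for } T \text{ near zero} \\
kT &= \frac{2\mu B}{\ln(2N\mu B) - \ln(E - E_{\min})}
\end{aligned}
\tag{24.25}
$$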

which is what we would expect from basic principles: The energy of the excited state is 2µB above the ground state, and the probability of the excited state is given by a Boltzmann factor.
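To get a feel for how good the low-temperature approximation is, the sketch below compares it with the exact per-spin excess energy. It assumes the exact energy E/N = −µB tanh(µB/kT), which follows from equation 24.9 for independent spins with energies ±µB. We work in units where µB = 1, and the function names are our own.

```python
import math

def excess_energy_exact(kT, muB=1.0):
    """Per-spin energy above the ground state, from E/N = -muB*tanh(muB/kT)."""
    return muB * (1.0 - math.tanh(muB / kT))

def excess_energy_low_T(kT, muB=1.0):
    """Low-temperature approximation: excitation energy times Boltzmann factor."""
    return 2.0 * muB * math.exp(-2.0 * muB / kT)

def kT_from_excess(excess_per_spin, muB=1.0):
    """Invert the low-temperature approximation to recover kT."""
    return 2.0 * muB / math.log(2.0 * muB / excess_per_spin)

for kT in (0.1, 0.2, 0.5):
    print(kT, excess_energy_exact(kT), excess_energy_low_T(kT))
print(kT_from_excess(excess_energy_low_T(0.2)))
```

The logarithm in the inversion is why T has infinite slope as a function of E at E_min: a tiny change in energy corresponds to a comparatively large change in temperature there.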

Let us briefly mention the pedestrian notion of “equipartition” (i.e. 1/2 kT of energy per degree of freedom, as suggested by equation 25.7). This notion makes absolutely no sense for our spin system. We can understand this as follows: The pedestrian result calls for 1/2 kT of energy per quadratic degree of freedom in the classical limit, whereas (a) this system is not classical, and (b) it doesn’t have any quadratic degrees of freedom.

For more about the advantages and limitations of the idea of equipartition, see chapter 25.

Indeed, one could well ask the opposite question: Given that we are defining temperature via equation 7.7, how could “equipartition” ever work at all? Partly the answer has to do with “the art of the possible”. That is, people learned to apply classical thermodynamics to problems where it worked, and learned to stay away from systems where it didn’t work. If you hunt around, you can find systems that are both harmonic and non-quantized, such as the classical ideal gas, the phonon gas in a solid (well below the melting point), and the rigid rotor (in the high temperature limit). Such systems will have 1/2 kT of energy in each quadratic degree of freedom. On the other hand, if you get the solid too hot, it becomes anharmonic, and if you get the rotor too cold, it becomes quantized. Furthermore, the two-state system is always anharmonic and always quantized. Bottom line: Sometimes equipartition works, and sometimes it doesn’t.

24.5 Rescaling the Partition Function

This section is a bit of a digression. Feel free to skip it if you’re in a hurry.

We started out by saying that the probability P_i is “proportional” to the Boltzmann factor exp(−βÊ_i).

If P_i is proportional to one thing, it is proportional to lots of other things. So the question arises, what reason do we have to prefer exp(−βÊ_i) over other expressions, such as the pseudo-Boltzmann factor α exp(−βÊ_i)?

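We assume the fudge factor α is the same for every microstate, i.e. for every term in the partition function. That means that the probability P_i^† we calculate based on the pseudo-Boltzmann factor is the same as what we would calculate based on the regular Boltzmann factor:

$$P_i^\dagger = \frac{\alpha \exp(-\beta \hat{E}_i)}{\sum_j \alpha \exp(-\beta \hat{E}_j)} = \frac{\exp(-\beta \hat{E}_i)}{\sum_j \exp(-\beta \hat{E}_j)} = P_i \tag{24.26}$$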

All the microstate probabilities are the same, so anything – such as entropy – that depends directly on microstate probabilities will be the same, whether or not we rescale the Boltzmann factors.

Our next steps depend on whether α depends on β or not. If α is a constant, independent of β, then rescaling the Boltzmann factors by a factor of α has no effect on the entropy, energy, or anything else. You should verify that any factor of α would drop out of equation 24.9 on the first line.

We now consider the case where α depends on β. (We are still assuming that α is the same for every microstate, i.e. independent of i, but it can depend on β.)

If we were only using Z as a normalization denominator, having a fudge factor that depends on β would not matter. We could just pull the factor out front in the numerator and denominator of equation 24.26 whereupon it would drop out.

In contrast, if we are interested in derivatives, the derivatives of Z′ := β Z are different from the derivatives of plain Z. You can easily verify this by plugging Z′ into equation 24.9. The β-dependence matters in equation 24.9 even though it doesn’t matter in equation 24.10. We summarize this by saying that Z is not just a normalization factor.
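As a worked illustration of this point, we can plug Z′ := β Z into the energy formula of equation 24.9, holding the energy levels fixed and using β := 1/kT:

```latex
\begin{aligned}
-\frac{\partial \ln(Z')}{\partial \beta}
  &= -\frac{\partial \ln(\beta)}{\partial \beta} - \frac{\partial \ln(Z)}{\partial \beta} \\
  &= -\frac{1}{\beta} + E \\
  &= E - kT
\end{aligned}
```

So the rescaled partition function would assign the system an energy that differs from E by kT, even though every microstate probability is unchanged.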

A particularly interesting type of fudge factor is exp(−βφ) for some constant φ. You can easily verify that this corresponds to shifting all the energies in the problem by φ. This can be considered a type of gauge invariance. In situations where relativity is not involved, such as the present situation, you can shift all the energies in the problem by some constant without changing the observable physics. The numerical value of the energy is changed, but this has no observable consequences. In particular, shifting the energy does not shift the entropy.
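The effect of a uniform energy shift can be sketched numerically. The levels, the value of β, and the shift φ below are hypothetical, and the entropy is reported in units of k.

```python
import math

def thermo(beta, energies):
    """Return (Z, E, S/k) for the given microstate energies."""
    weights = [math.exp(-beta * E) for E in energies]
    Z = sum(weights)
    probs = [w / Z for w in weights]
    E_mean = sum(E * p for E, p in zip(energies, probs))
    S_over_k = -sum(p * math.log(p) for p in probs)
    return Z, E_mean, S_over_k

energies = [0.0, 1.0, 3.0]   # hypothetical levels
beta, phi = 0.5, 2.0         # hypothetical inverse temperature and shift
shifted = [E + phi for E in energies]

Z0, E0, S0 = thermo(beta, energies)
Z1, E1, S1 = thermo(beta, shifted)
print(Z1 / Z0, math.exp(-beta * phi))   # Z picks up the factor exp(-beta*phi)
print(E1 - E0, phi)                     # the energy shifts by phi
print(S1 - S0)                          # the entropy does not shift
```

The probabilities are identical in the two cases, which is why the entropy, computed from the probabilities alone, cannot change.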
