Sunday, 24 August 2014

statistical mechanics - Deriving the Boltzmann distribution using the information entropy


I was going through my lecture notes and found something I could not quite understand. First, they derive an expression for the information entropy (as used in physics?):



Let $p_i$ be the probability of finding the system in microstate $i$. With the total number of accessible microstates $N(A)$, this can be written as $p_i = \frac{1}{N(A)}$ for all microstates compatible with the macrostate $A$. We can write $\ln N(A) = -\ln p_i = -\left(\sum_{i = 1}^{N(A)}p_i\right)\ln p_i$ due to the normalization of the $p_i$. [...]


Therefore, we can write for the information entropy of a macrostate $A$: $$S(A) = -k_\mathrm{B} \sum_{i = 1}^{N(A)}p_i\ln p_i$$
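(For a quick sanity check of the equal-probability case, here is a minimal Python sketch; the function name `gibbs_entropy`, the value $N(A) = 1000$, and the SI units are illustrative assumptions, not part of the notes.)

```python
import numpy as np

k_B = 1.380649e-23  # Boltzmann constant in J/K

def gibbs_entropy(p, k=k_B):
    """S = -k * sum_i p_i ln p_i; terms with p_i = 0 contribute nothing."""
    p = np.asarray(p, dtype=float)
    nonzero = p > 0
    return -k * np.sum(p[nonzero] * np.log(p[nonzero]))

# For a uniform distribution over N(A) microstates the formula reduces to k_B ln N(A)
N_A = 1000                               # hypothetical number of accessible microstates
p_uniform = np.full(N_A, 1.0 / N_A)
print(gibbs_entropy(p_uniform))          # -k_B * sum_i p_i ln p_i
print(k_B * np.log(N_A))                 # k_B ln N(A); the two values agree
```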



Later, it tries to derive the Boltzmann distribution for the ideal gas:




We will do so by finding the extremum of $$\phi = S(A) - \lambda_1\left(\sum_i p_i - 1\right) - \lambda_2 \left(\sum_i p_i E_i - E_\mathrm{avg}\right) $$ using the method of Lagrange multipliers, where $S(A) = -k_\mathrm{B} \sum_{i = 1}^{N(A)}p_i\ln p_i$ as above.



It goes on to find the correct formula.
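For reference, here is a short sketch of the extremization (my own reconstruction of what the notes presumably do, not a quotation from them): setting $\partial\phi/\partial p_i = 0$ gives $$-k_\mathrm{B}\left(\ln p_i + 1\right) - \lambda_1 - \lambda_2 E_i = 0 \quad\Longrightarrow\quad p_i = \exp\!\left(-1 - \tfrac{\lambda_1}{k_\mathrm{B}}\right) e^{-\lambda_2 E_i / k_\mathrm{B}},$$ and after identifying $\lambda_2/k_\mathrm{B} = \beta = 1/(k_\mathrm{B}T)$ and fixing the prefactor by normalization, $$p_i = \frac{e^{-\beta E_i}}{Z}, \qquad Z = \sum_i e^{-\beta E_i}.$$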


My question is: why can this expression for the entropy $S(A)$ be used, even though in the second example the $p_i$ are obviously not constant and equal to $\frac{1}{N(A)}$?



Answer



You can actually derive the Gibbs entropy from purely mathematical concerns and the properties of probability. The properties we require entropy to have are:



  1. Extensivity - the entropy of two independent systems, considered as a whole, should be the sum of the entropies of the individual systems $$S(A\cap B) = S(A) + S(B).$$

  2. Continuity - the entropy should be a smoothly differentiable function of the probabilities assigned to each state.

  3. Minimum - the entropy should be zero if and only if the system is in a single state with probability $1$.


  4. Maximum - the entropy should be maximized when every state is equally probable.


It follows from probability theory that, when $A$ and $B$ are independent, $$P(A\cap B) = P(A)\, P(B).$$ The unique continuous function that turns multiplication into addition is the logarithm, and which base is chosen is a matter of convention. Requirements 1 and 4 also imply that the entropy increases as states become less probable, so the constant multiplying $\ln p_i$ must be negative. Since the microstate is a random variable, the macroscopic entropy has to be an expectation value taken over the microstates. Therefore \begin{align} S & = k\,\langle -\ln p_i \rangle \\ & = -k\sum_{i=1}^N p_i \ln p_i. \end{align}
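(As a small numerical check of requirement 1, not part of the original argument, the following Python sketch verifies that this formula is additive over independent systems; the constant is set to $k = 1$ and the two distributions are arbitrary made-up examples.)

```python
import numpy as np

def entropy(p, k=1.0):
    """S = -k * sum_i p_i ln p_i, i.e. k times the expectation value of -ln p_i."""
    p = np.asarray(p, dtype=float)
    return -k * np.sum(p * np.log(p))

rng = np.random.default_rng(0)

# Two independent systems with arbitrary normalized probability distributions
p_A = rng.random(4); p_A /= p_A.sum()
p_B = rng.random(6); p_B /= p_B.sum()

# Joint distribution of the combined system: P(A and B) = P(A) * P(B)
p_AB = np.outer(p_A, p_B)

print(entropy(p_AB))                   # entropy of the combined system
print(entropy(p_A) + entropy(p_B))     # sum of the individual entropies; the two agree
```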


In physics we choose the constant of proportionality to be $k_\mathrm{B}$, Boltzmann's constant, with units of $\mathrm{J\,K^{-1}}$ (joules per kelvin), in order to match Clausius's formula for classical entropy. When all of the $p_i$ are equal, the formula reduces to the Boltzmann entropy $S = k_\mathrm{B}\ln N$.


You get the classical canonical ensembles and their corresponding distributions when you maximize the entropy of a system that is interacting with a 'bath' in a way that constrains the average value of a parameter (e.g. energy, volume, particle number) without specifying the value that parameter takes. The Maxwell-Boltzmann distribution arises, as the questioner saw, when the average energy is constrained but the total energy is allowed to fluctuate; fixing the total energy exactly, rather than only on average, would instead produce the microcanonical ensemble.
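(To make the maximization concrete, here is a Python sketch that maximizes $-\sum_i p_i \ln p_i$ numerically under the normalization and average-energy constraints and compares the result with the analytic Boltzmann form $e^{-\beta E_i}/Z$. The energy levels, the target average energy, and the solver choices are made-up assumptions for illustration only.)

```python
import numpy as np
from scipy.optimize import minimize, brentq

# Hypothetical discrete energy levels and target average energy (arbitrary units)
E = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
E_avg = 1.2

def neg_entropy(p):
    """Negative of S/k_B = sum_i p_i ln p_i; minimizing this maximizes the entropy."""
    p = np.clip(p, 1e-12, None)
    return np.sum(p * np.log(p))

constraints = [
    {"type": "eq", "fun": lambda p: np.sum(p) - 1.0},       # normalization
    {"type": "eq", "fun": lambda p: np.dot(p, E) - E_avg},  # fixed average energy
]
p0 = np.full(E.size, 1.0 / E.size)                          # start from the uniform distribution
res = minimize(neg_entropy, p0, constraints=constraints,
               bounds=[(0.0, 1.0)] * E.size)

# Analytic maximum-entropy solution: p_i = exp(-beta E_i) / Z, with beta chosen
# so that the average energy equals E_avg.
def energy_mismatch(beta):
    w = np.exp(-beta * E)
    return np.dot(w, E) / w.sum() - E_avg

beta = brentq(energy_mismatch, 1e-6, 50.0)
p_boltzmann = np.exp(-beta * E)
p_boltzmann /= p_boltzmann.sum()

print(res.x)           # numerically maximized distribution
print(p_boltzmann)     # Boltzmann distribution; the two agree to solver tolerance
```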

