# The Density Matrix

*Michael Fowler*

### Pure States and Mixed States

Our treatment here more or less follows that of Sakurai,
beginning with two imagined Stern-Gerlach experiments. In that experiment, a stream of (non-ionized)
silver atoms from an oven is directed through an inhomogeneous vertical
magnetic field, and the stream splits into two.
The silver atoms have nonzero magnetic moments, and a magnetic moment in
an inhomogeneous magnetic field experiences a nonzero force, causing the atom
to veer from its straight line path, the
magnitude of the deflection being proportional to the component of the atom’s
magnetic moment in the vertical (field) direction. The observation of the beam splitting into
two, and no more, means that the vertical component of the magnetic moment, and
therefore the associated angular momentum, can only have *two* different
values. From the basic analysis of
rotation operators and the properties of angular momentum that follow, this
observation forces us to the conclusion that the total angular momentum of a
silver atom is ${\scriptscriptstyle \frac{1}{2}}\hslash $. Ordinary orbital angular momenta cannot have
half-integer values; this experiment was one of the first indications that the
electron has a spin degree of freedom, an angular momentum that cannot be
interpreted as orbital angular momentum of constituent parts. The silver atom has 47 electrons: 46 of them
have total spin and orbital angular momenta that separately cancel, while the 47th
has no orbital angular momentum, so its spin is the entire angular momentum of
the atom.

Here we shall use the
Stern-Gerlach stream as an example of a large collection of quantum systems
(the atoms) to clarify just how to describe such a collection, often called an *ensemble*.
To avoid unnecessary complications, we only consider the *spin* degrees of
freedom. We begin by examining two
different streams:

Suppose experimentalist $A$ prepares a stream of silver atoms such that each atom is in the spin state ${\psi}_{A}$:

$|{\psi}_{A}\rangle =\frac{1}{\sqrt{2}}\left(|\uparrow \rangle +|\downarrow \rangle \right)$.

Meanwhile, experimentalist $B$ prepares a stream of silver atoms which is a *mixture*:
half the atoms are in state $|\uparrow \rangle $ and half are in the state $|\downarrow \rangle $:
call this mix $B.$

*Question*: can we distinguish the $A$ stream from the $B$ stream?

Evidently, not by measuring the spin in the $z\text{-}$direction! Both will give “up” 50% of the time and “down” 50% of the time.

But we *can* distinguish them by measuring the spin in
the $x\text{-}$direction: the ${\psi}_{A}$ quantum state is in fact just that of a spin
in the $x\text{-}$direction, so it will give “up” in the $x\text{-}$direction every time (from now on we
call it $|{\uparrow}_{x}\rangle $),
whereas the state $|\uparrow \rangle $ (“up” in the $z\text{-}$direction) will yield “up” in the $x\text{-}$direction only 50% of the time, as will $|\downarrow \rangle $.
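The comparison of the two streams can be checked numerically. The following is a minimal sketch, assuming the usual column-vector representation with $|\uparrow \rangle =(1,0)$ and $|\downarrow \rangle =(0,1)$; the helper name `prob` is purely illustrative.

```python
import numpy as np

# z-basis column vectors: |up> = (1, 0), |down> = (0, 1)
up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])
up_x = (up + down) / np.sqrt(2)  # state of every atom in stream A

def prob(outcome, state):
    """Born-rule probability |<outcome|state>|^2 for normalized vectors."""
    return abs(np.vdot(outcome, state)) ** 2

# Stream A: a single quantum state, so one Born-rule calculation per axis.
p_A_up_z = prob(up, up_x)
p_A_up_x = prob(up_x, up_x)

# Stream B: half the atoms in |up>, half in |down>, weighted classically.
p_B_up_z = 0.5 * prob(up, up) + 0.5 * prob(up, down)
p_B_up_x = 0.5 * prob(up_x, up) + 0.5 * prob(up_x, down)

print("P(up along z): A =", p_A_up_z, " B =", p_B_up_z)
print("P(up along x): A =", p_A_up_x, " B =", p_B_up_x)
```

Up to floating-point rounding, the two streams agree for the $z$ measurement and differ for the $x$ measurement.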

The state ${\psi}_{A}=|{\uparrow}_{x}\rangle $ is called a **pure** state: it’s the kind of quantum state we’ve been
studying this whole course.

The stream $B,$ in contrast, is in a **mixed** state: the kind that actually occurs to a greater or
lesser extent in a real life stream of atoms, with different pure quantum states
occurring with different probabilities, but with no phase coherence between
them. In other words, these relative
probabilities in $B$ of different quantum states do *not* derive from probability amplitudes, as they do in finding the probability of spin up in stream $A:$ the probabilities of the different quantum states in the mixed state $B$ are exactly like classical probabilities.

That being said, though, to find the probability of
measuring spin up in some such mixed state, one *first* uses the
classical-type probability for each component state, *then* for each
quantum state in the mix, one finds the probability of spin up *in that state* by the standard quantum
technique.

Therefore, for a mixed state in which the system is in state $|{\psi}_{i}\rangle $ with probability ${w}_{i}$, where $\sum _{i}{w}_{i}=1$, the expectation value of an operator $\widehat{A}$ is

$\langle \widehat{A}\rangle ={\displaystyle \sum {w}_{i}\langle {\psi}_{i}|\widehat{A}|{\psi}_{i}\rangle}$

and we should emphasize that these $|{\psi}_{i}\rangle $ do *not* need to be orthogonal (but they
are of course normalized): for example one could be $|{\uparrow}_{x}\rangle $,
another $|{\uparrow}_{z}\rangle $.
(We put the usually omitted $z$ in for emphasis.) The reason we put a hat on $\widehat{A}$ here is to emphasize that this is an operator,
but the ${w}_{i}$ are just numbers.
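As a small worked illustration (the weights here are hypothetical, chosen only to keep the arithmetic simple), take a mixture with ${w}_{1}=\frac{1}{4}$ in $|{\uparrow}_{x}\rangle $ and ${w}_{2}=\frac{3}{4}$ in $|{\uparrow}_{z}\rangle $. Since $\langle {\uparrow}_{x}|{s}_{z}|{\uparrow}_{x}\rangle =0$ and $\langle {\uparrow}_{z}|{s}_{z}|{\uparrow}_{z}\rangle =\hslash /2$, and correspondingly $\langle {\uparrow}_{x}|{s}_{x}|{\uparrow}_{x}\rangle =\hslash /2$ and $\langle {\uparrow}_{z}|{s}_{x}|{\uparrow}_{z}\rangle =0$,

$$\langle {s}_{z}\rangle =\frac{1}{4}\cdot 0+\frac{3}{4}\cdot \frac{\hslash}{2}=\frac{3\hslash}{8},\qquad \langle {s}_{x}\rangle =\frac{1}{4}\cdot \frac{\hslash}{2}+\frac{3}{4}\cdot 0=\frac{\hslash}{8}.$$

The two component states are not orthogonal, yet the weights still combine exactly like classical probabilities.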

### The Density Matrix

The equation for the expectation value $\langle \widehat{A}\rangle $ can be written:

$\langle \widehat{A}\rangle =\mathrm{Trace}\left(\widehat{\rho}\widehat{A}\right)\quad \text{where}\quad \widehat{\rho}={\displaystyle \sum _{i}{w}_{i}|{\psi}_{i}\rangle \langle {\psi}_{i}|}.$

To see exactly how this comes about, recall that for an operator $\widehat{B}$ in a finite-dimensional vector space with an orthonormal basis set $|j\rangle $, $\text{Tr}\widehat{B}={\displaystyle \sum _{j=1}^{n}\langle j|\widehat{B}|j\rangle}={B}_{jj}$, where the repeated suffix implies summation of the diagonal matrix elements of the operator.

Therefore,

$\begin{array}{c}\mathrm{Tr}\left(\widehat{\rho}\widehat{A}\right)={\displaystyle \sum _{j=1}^{n}{\displaystyle \sum _{i}{w}_{i}\langle j|{\psi}_{i}\rangle \langle {\psi}_{i}|\widehat{A}|j\rangle}}\\ ={\displaystyle \sum _{j=1}^{n}{\displaystyle \sum _{i}{w}_{i}\langle {\psi}_{i}|\widehat{A}|j\rangle \langle j|{\psi}_{i}\rangle}}\\ ={\displaystyle \sum _{i}{w}_{i}\langle {\psi}_{i}|\widehat{A}|{\psi}_{i}\rangle}\end{array}$

since $\sum _{j}|j\rangle \langle j|=I$, the identity.
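The identity can also be checked numerically. Below is a minimal sketch, assuming a mixture of two deliberately non-orthogonal spin states with illustrative weights 0.3 and 0.7, and an arbitrary Hermitian matrix standing in for $\widehat{A}$; none of these particular numbers come from the text.

```python
import numpy as np

# Two normalized but non-orthogonal states in the z basis.
up_z = np.array([1.0, 0.0], dtype=complex)
up_x = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2)

states = [up_x, up_z]
weights = [0.3, 0.7]  # illustrative classical probabilities, summing to 1

# An arbitrary Hermitian operator (chosen only for the check).
A = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, -1.0]])

# Route 1: classical average of the quantum expectation values.
# np.vdot conjugates its first argument, giving <psi|A|psi>.
weighted_sum = sum(w * np.vdot(psi, A @ psi) for w, psi in zip(weights, states))

# Route 2: build rho = sum_i w_i |psi_i><psi_i| and take Tr(rho A).
rho = sum(w * np.outer(psi, psi.conj()) for w, psi in zip(weights, states))
via_trace = np.trace(rho @ A)

print(weighted_sum, via_trace)
```

Both routes should agree to rounding error, and the imaginary part should vanish because $\widehat{A}$ is Hermitian.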

This $\widehat{\rho}$ is called the *density matrix*: its
matrix form is made explicit by considering states $|{\psi}_{i}\rangle $ in a finite $n\text{-}$dimensional vector space (such as spins or
angular momenta)

$$|{\psi}_{i}\rangle ={\displaystyle \sum _{j}{\left({V}_{i}\right)}_{j}}|j\rangle $$

where the $|j\rangle $ are an orthonormal basis set, and ${\left({V}_{i}\right)}_{j}$ is the ${j}^{\text{th}}$ component of a normalized vector ${V}_{i}.$ It is convenient to express $\widehat{\rho}$ in terms of kets and bras belonging to this orthonormal basis,

$\widehat{\rho}={\displaystyle \sum _{i}{w}_{i}|{\psi}_{i}\rangle \langle {\psi}_{i}|}={\displaystyle \sum _{i,j,k}{w}_{i}{\left({V}_{i}\right)}_{j}{\left({V}_{i}^{\dagger}\right)}_{k}|j\rangle \langle k|}={\displaystyle \sum _{j,k}{\rho}_{jk}|j\rangle \langle k|}$

and evidently

$\langle \widehat{A}\rangle =\mathrm{Trace}\left(\widehat{\rho}\widehat{A}\right)={\displaystyle \sum _{n,j,k}\langle n|{\rho}_{jk}}|j\rangle \langle k|\widehat{A}|n\rangle ={\displaystyle \sum _{j,k}{\rho}_{jk}}\langle k|\widehat{A}|j\rangle ={\displaystyle \sum _{j,k}{\rho}_{jk}}{A}_{kj}.$

(Since ${\rho}_{jk}$ is just a *number*,
$\langle n|{\rho}_{jk}|j\rangle ={\rho}_{jk}\langle n|j\rangle ={\rho}_{jk}{\delta}_{nj}$.)

$\mathrm{Trace}\left(\widehat{\rho}\widehat{A}\right)$ is *basis-independent*, the trace of a
matrix being unchanged by a unitary transformation, since it follows from $\mathrm{Tr}\left(ABC\right)=\mathrm{Tr}\left(BCA\right)$ that

$\mathrm{Tr}\,{U}^{\dagger}AU=\mathrm{Tr}\,AU{U}^{\dagger}=\mathrm{Tr}\,A\quad \text{for}\quad U{U}^{\dagger}=1$.

Note that since the vectors ${V}_{i}$ are normalized, $\sum _{j}{\left({V}_{i}\right)}_{j}{\left({V}_{i}^{\dagger}\right)}_{j}=1,$ with the $i$ not summed over, and $\sum _{i}{w}_{i}=1,$ it follows that

$\mathrm{Tr}\widehat{\rho}=1$

(also evident by putting $A=1$ in the equation for $\langle A\rangle $ ).
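A short numerical sketch of these two properties, unit trace and basis independence of $\mathrm{Tr}\left(\widehat{\rho}\widehat{A}\right)$, is given here. The component vectors, weights, operator and unitary matrix are all illustrative choices, not taken from the text.

```python
import numpy as np

# Component vectors V_i (normalized) and classical weights w_i.
V = [np.array([1, 1], dtype=complex) / np.sqrt(2),
     np.array([1, 0], dtype=complex)]
w = [0.3, 0.7]

# rho_jk = sum_i w_i (V_i)_j (V_i)_k^*
rho = sum(wi * np.outer(Vi, Vi.conj()) for wi, Vi in zip(w, V))
trace_rho = np.trace(rho).real

# An arbitrary Hermitian operator and an arbitrary unitary change of basis.
A = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, -1.0]])
U = np.array([[1, 1j],
              [1j, 1]]) / np.sqrt(2)

# Transform both matrices to the new basis and compare the traces.
rho_new = U.conj().T @ rho @ U
A_new = U.conj().T @ A @ U
before = np.trace(rho @ A)
after = np.trace(rho_new @ A_new)

print("Tr rho =", trace_rho)
print("Tr(rho A) before and after:", before, after)
```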

For a system in a **pure** quantum state $|\psi \rangle $, $\widehat{\rho}=|\psi \rangle \langle \psi |$,
just the projection operator onto that state, and

${\widehat{\rho}}^{2}=\widehat{\rho}$,

as for all projection operators.

It’s worth spelling out how this differs from the mixed state by looking at the form of the density matrix.

For the pure state $|\psi \rangle $, if a basis is chosen so that $|\psi \rangle $ is a member of the basis (this can always be done), $\widehat{\rho}$ is a matrix with every element zero except the one diagonal element corresponding to $|\psi \rangle \langle \psi |$, which will be unity. Obviously, ${\widehat{\rho}}^{2}=\widehat{\rho}$. This is less obvious in a general basis, where $\widehat{\rho}$ will not necessarily be diagonal. But the statement ${\widehat{\rho}}^{2}=\widehat{\rho}$ remains true under a transformation to a new basis.

For a mixed state, let’s say for example a mixture of orthogonal states $|{\psi}_{1}\rangle ,\text{\hspace{0.17em}}|{\psi}_{2}\rangle $, if we choose a basis including both states, the density matrix will be diagonal with just two entries ${w}_{1},{w}_{2}.$ Both these numbers must be less than unity, so ${\widehat{\rho}}^{2}$, which has diagonal entries ${w}_{1}^{2},{w}_{2}^{2}$, cannot equal $\widehat{\rho}$: ${\widehat{\rho}}^{2}\ne \widehat{\rho}.$ A mix of nonorthogonal states is left as an exercise for the reader.
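A numerical check can serve as a sanity test for the exercise, though it is not a proof. The sketch below uses $\mathrm{Tr}\,{\widehat{\rho}}^{2}$ (often called the purity, a name not used in the text) as a single-number test of whether ${\widehat{\rho}}^{2}=\widehat{\rho}$; the 50-50 weights and the choice of states are assumptions made for illustration.

```python
import numpy as np

def projector(psi):
    """Return |psi><psi| for a normalized vector psi."""
    return np.outer(psi, psi.conj())

def purity(rho):
    """Tr(rho^2): equals 1 when rho^2 = rho, and is below 1 for a mixture."""
    return np.trace(rho @ rho).real

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
up_x = (up + down) / np.sqrt(2)

rho_pure = projector(up_x)                                 # pure state
rho_orth = 0.5 * projector(up) + 0.5 * projector(down)     # orthogonal mix
rho_nonorth = 0.5 * projector(up_x) + 0.5 * projector(up)  # nonorthogonal mix

for name, rho in [("pure", rho_pure),
                  ("orthogonal mix", rho_orth),
                  ("nonorthogonal mix", rho_nonorth)]:
    print(name, "Tr rho^2 =", purity(rho),
          " rho^2 == rho:", np.allclose(rho @ rho, rho))
```

For these particular inputs the nonorthogonal mixture also fails ${\widehat{\rho}}^{2}=\widehat{\rho}$, with $\mathrm{Tr}\,{\widehat{\rho}}^{2}$ lying between the orthogonal-mix and pure-state values.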

### Some Simple Examples

**First, our case $A$ above (pure state)**: all spins in state $|{\uparrow}_{x}\rangle =\left(1/\sqrt{2}\right)\left(|\uparrow \rangle +|\downarrow \rangle \right)$.

In the standard $|\uparrow \rangle ,\text{\hspace{0.17em}}|\downarrow \rangle $ basis,

$\widehat{\rho}=|{\uparrow}_{x}\rangle \langle {\uparrow}_{x}|=\left(\begin{array}{c}1/\sqrt{2}\\ 1/\sqrt{2}\end{array}\right)\left(1/\sqrt{2}\text{\hspace{1em}}1/\sqrt{2}\right)=\left(\begin{array}{cc}1/2& 1/2\\ 1/2& 1/2\end{array}\right)$

and

$\begin{array}{l}\langle {s}_{x}\rangle =\mathrm{Tr}\left(\widehat{\rho}{s}_{x}\right)=\frac{\hslash}{2}\mathrm{Tr}\left(\begin{array}{cc}1/2& 1/2\\ 1/2& 1/2\end{array}\right)\left(\begin{array}{cc}0& 1\\ 1& 0\end{array}\right)=\frac{\hslash}{2}\\ \\ \langle {s}_{z}\rangle =\mathrm{Tr}\left(\widehat{\rho}{s}_{z}\right)=\frac{\hslash}{2}\mathrm{Tr}\left(\begin{array}{cc}1/2& 1/2\\ 1/2& 1/2\end{array}\right)\left(\begin{array}{cc}1& 0\\ 0& -1\end{array}\right)=0.\end{array}$

Notice that ${\widehat{\rho}}^{2}=\widehat{\rho}$.

**Now, case $B$ (50-50 mixed up and down)**: 50% in the state $|\uparrow \rangle $, 50% $|\downarrow \rangle $.

The density matrix is

$\begin{array}{l}\widehat{\rho}=\frac{1}{2}|\uparrow \rangle \langle \uparrow |+\frac{1}{2}|\downarrow \rangle \langle \downarrow |\\ \\ =\frac{1}{2}\left(\begin{array}{c}1\\ 0\end{array}\right)\left(1\text{\hspace{1em}}0\right)+\frac{1}{2}\left(\begin{array}{c}0\\ 1\end{array}\right)\left(\begin{array}{cc}0& 1\end{array}\right)=\frac{1}{2}\left(\begin{array}{cc}1& 0\\ 0& 1\end{array}\right).\end{array}$

This is proportional to the unit matrix, so

$\mathrm{Tr}\widehat{\rho}{s}_{x}=\frac{1}{2}\frac{\hslash}{2}\mathrm{Tr}{\sigma}_{x}=0,$

and similarly for ${s}_{y}$ and ${s}_{z},$ since the Pauli $\sigma \text{-}$matrices are all traceless. Note also that ${\widehat{\rho}}^{2}={\scriptscriptstyle \frac{1}{2}}\widehat{\rho}\ne \widehat{\rho}$; the inequality ${\widehat{\rho}}^{2}\ne \widehat{\rho}$ holds for all mixed states.

**Finally, a 50-50
mixed state relative to the x-axis**:

That is, 50% of the spins in the state $|{\uparrow}_{x}\rangle =\left(1/\sqrt{2}\right)\left(|\uparrow \rangle +|\downarrow \rangle \right)$, “up” along the $x\text{-}$ axis, and 50% in $|{\downarrow}_{x}\rangle =\left(1/\sqrt{2}\right)\left(|\uparrow \rangle -|\downarrow \rangle \right)$, “down” in the $x\text{-}$direction.

It is easy to check that

$\widehat{\rho}=\frac{1}{2}|{\uparrow}_{x}\rangle \langle {\uparrow}_{x}|+\frac{1}{2}|{\downarrow}_{x}\rangle \langle {\downarrow}_{x}|=\frac{1}{2}\left(\begin{array}{cc}1/2& 1/2\\ 1/2& 1/2\end{array}\right)+\frac{1}{2}\left(\begin{array}{cc}1/2& -1/2\\ -1/2& 1/2\end{array}\right)=\frac{1}{2}\left(\begin{array}{cc}1& 0\\ 0& 1\end{array}\right).$

This is exactly the same density matrix we found for 50% in the state $|\uparrow \rangle $, 50% $|\downarrow \rangle $!

The reason is that both formulations describe a state about
which we know nothing: we are in a state
of *total ignorance*, the spins are
completely random, and all directions are equally likely. The density matrix describing such a state
cannot depend on the direction we choose for our axes.
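The three examples can be reproduced in a few lines. This is a sketch under the same conventions as the examples ($z$ basis, spin components measured in units of $\hslash /2$, so that ${s}_{i}\to {\sigma}_{i}$); the helper names are illustrative.

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

def projector(psi):
    """Return |psi><psi| for a normalized vector psi."""
    return np.outer(psi, psi.conj())

def spin_expectations(rho):
    """(<s_x>, <s_y>, <s_z>) in units of hbar/2, via Tr(rho sigma_i)."""
    return [np.trace(rho @ s).real for s in (sigma_x, sigma_y, sigma_z)]

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
up_x = (up + down) / np.sqrt(2)
down_x = (up - down) / np.sqrt(2)

rho_A = projector(up_x)                                      # pure case A
rho_z_mix = 0.5 * projector(up) + 0.5 * projector(down)      # case B
rho_x_mix = 0.5 * projector(up_x) + 0.5 * projector(down_x)  # x-axis mix

print("z mix equals x mix:", np.allclose(rho_z_mix, rho_x_mix))
print("case A:", spin_expectations(rho_A))
print("random mix:", spin_expectations(rho_z_mix))
```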

Another two-state quantum system that can be analyzed in the same way is the polarization state of a beam of light, the basis states being polarization in the $x\text{-}$direction and polarization in the $y\text{-}$direction, for a beam traveling parallel to the $z\text{-}$ axis. Ordinary unpolarized light corresponds to the random mixed state, with the same density matrix as in the last example above.

### Time Evolution of the Density Matrix

In the mixed state, the quantum states evolve independently according to Schrödinger’s equation, so

$i\hslash \frac{d\widehat{\rho}}{dt}={\displaystyle \sum {w}_{i}H|{\psi}_{i}\rangle \langle {\psi}_{i}|}\text{\hspace{0.17em}}-\text{\hspace{0.17em}}{\displaystyle \sum {w}_{i}|{\psi}_{i}\rangle \langle {\psi}_{i}|H}=\left[H,\widehat{\rho}\right].$

Note that this has the opposite sign from the evolution of a Heisenberg operator, not surprising since the density operator is made up of Schrödinger bras and kets.
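The evolution equation can be tested numerically by evolving each ket with $U\left(t\right)={e}^{-iHt/\hslash}$, so that $\widehat{\rho}\left(t\right)=U\widehat{\rho}\left(0\right){U}^{\dagger}$, and comparing a finite-difference time derivative with the commutator. The sketch below assumes units with $\hslash =1$ and a hypothetical Hamiltonian $H={\scriptscriptstyle \frac{1}{2}}\hslash \omega {\sigma}_{z}$ (a spin in a field along $z$); the initial mixture and the sample time are likewise illustrative.

```python
import numpy as np

hbar = 1.0    # assumption: units with hbar = 1
omega = 1.0   # hypothetical precession frequency
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
H = 0.5 * hbar * omega * sigma_z

# Initial mixed state: 30% "up along x", 70% "up along z" (illustrative).
up_z = np.array([1, 0], dtype=complex)
up_x = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho0 = 0.3 * np.outer(up_x, up_x.conj()) + 0.7 * np.outer(up_z, up_z.conj())

# Diagonalize H once so that U(t) = exp(-i H t / hbar) is cheap to build.
energies, vecs = np.linalg.eigh(H)

def propagator(t):
    phases = np.exp(-1j * energies * t / hbar)
    return vecs @ np.diag(phases) @ vecs.conj().T

def rho_at(t):
    U = propagator(t)
    return U @ rho0 @ U.conj().T

# Compare i*hbar*d(rho)/dt (central difference) with the commutator [H, rho].
t, dt = 0.7, 1e-5
lhs = 1j * hbar * (rho_at(t + dt) - rho_at(t - dt)) / (2 * dt)
rhs = H @ rho_at(t) - rho_at(t) @ H
max_diff = np.max(np.abs(lhs - rhs))

print("largest mismatch:", max_diff)
```

The mismatch should be at the level of finite-difference and rounding error, and the trace of $\widehat{\rho}$ stays equal to one throughout.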

The equation is the quantum analogue of *Liouville’s theorem* in statistical mechanics. Liouville’s theorem describes the evolution in
time of an ensemble of identical classical systems, such as many boxes each
filled with the same amount of the same gas at the same temperature, but the
positions and momenta of the individual atoms are randomly different in
each. Each box can be classically
described by a single point in a huge dimensional space, a space having six
dimensions for each atom (position and momentum, we ignore possible internal
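degrees of freedom). The whole ensemble,
then, is a gas of these points in this huge space, and the rate of change of
local density of this gas is given by Liouville’s equation, $\partial \rho /\partial t=\left\{H,\rho \right\}$, the bracket here being a Poisson bracket (in the usual sign convention) rather than a commutator (see the *Classical
Mechanics* notes). Anyway, this
is the classical precursor of, and the reason for the name of, the density
matrix.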

### Thermal Equilibrium

A system in thermal equilibrium is represented in
statistical mechanics by a *canonical ensemble*. If the eigenstate $|i\rangle $ of the Hamiltonian has energy ${E}_{i},$ the relative probability of the system being
in that state is ${e}^{-{E}_{i}/kT}={e}^{-\beta {E}_{i}}$ in the standard notation. Therefore the density matrix is:

$\widehat{\rho}=\frac{1}{Z}{\displaystyle \sum _{i}{e}^{-\beta {E}_{i}}|i\rangle \langle i|}=\frac{{e}^{-\beta H}}{Z},\text{}$

where

$Z={\displaystyle \sum _{i}{e}^{-\beta {E}_{i}}}=\mathrm{Tr}{e}^{-\beta H}.$

Notice that in this formulation, apart from the normalization constant $Z,$ the density operator is analogous to the propagator $U\left(t\right)={e}^{-iHt/\hslash}$ for an imaginary time $t=-i\hslash \beta $. Incidentally, for interacting quantum fields, the propagator can be constructed as a set of Feynman diagrams corresponding to all possible sequences of particle scatterings by interaction. To find the thermodynamic properties of a field theory at finite temperature, essentially the same set of diagrams is used to find the free energy: the diagrams now describe the system propagating for a finite imaginary time, so the same mathematical tools can be used.

At zero temperature ( $\beta =\infty $ ) the probability coefficients ${w}_{i}={e}^{-\beta {E}_{i}}/Z$ are all zero except for the ground state: the
system is in a pure state, and the density matrix has every element zero except
for a single element on the diagonal. At
infinite temperature, all the ${w}_{i}$ are equal: the density matrix is just $1/N$ times the unit matrix, where $N$ is the total number of states available to the
system. In fact, the *entropy* of
the system can be expressed in terms of the density matrix: $S=-k\mathrm{Tr}\left(\widehat{\rho}\mathrm{ln}\widehat{\rho}\right)$.
This is not as bad as it looks: both operators are diagonal in the energy
basis, so that $S=-k\sum _{i}{w}_{i}\mathrm{ln}{w}_{i}$.
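The two temperature limits and the entropy formula can be illustrated with a small sketch. It assumes a hypothetical three-level system with a nondegenerate ground state, works in the energy basis where $\widehat{\rho}$ is diagonal, and reports the entropy in units of $k$; the function names are illustrative.

```python
import numpy as np

def thermal_weights(energies, beta):
    """Boltzmann weights w_i = exp(-beta E_i) / Z."""
    E = np.asarray(energies, dtype=float)
    # Subtracting the ground-state energy avoids overflow at large beta;
    # the common factor cancels between numerator and Z.
    boltz = np.exp(-beta * (E - E.min()))
    return boltz / boltz.sum()

def entropy_over_k(weights):
    """S/k = -sum_i w_i ln w_i, skipping zero weights (0 ln 0 -> 0)."""
    w = weights[weights > 0]
    return -np.sum(w * np.log(w))

energies = [0.0, 1.0, 2.0]  # hypothetical energy levels, arbitrary units

w_hot = thermal_weights(energies, beta=0.0)    # infinite temperature
w_cold = thermal_weights(energies, beta=50.0)  # very low temperature

rho_hot = np.diag(w_hot)
rho_cold = np.diag(w_cold)

print("hot:  rho diagonal =", w_hot, " S/k =", entropy_over_k(w_hot))
print("cold: rho diagonal =", w_cold, " S/k =", entropy_over_k(w_cold))
```

At $\beta =0$ the matrix is $1/N$ times the unit matrix with $S=k\mathrm{ln}N$, and at large $\beta $ it approaches the ground-state projector with $S$ approaching zero.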