# Angular Momentum Operator Algebra

*Michael Fowler*

### Preliminaries: Translation and Rotation Operators

As a warm-up to analyzing how a wave function transforms
under rotation, we review the effect of *linear translation* on a single
particle wave function $\psi \left(x\right)$. We have already seen an example of this: the
coherent states of a simple harmonic oscillator discussed earlier were (at $t=0$ ) identical to the ground state except that
they were centered at some point displaced from the origin. In fact, the
operator creating such a state from the ground state is a translation operator.

The *translation operator* $T\left(a\right)$ is *defined* as that operator which when
it acts on a wave function ket $|\psi \left(x\right)\rangle $ gives the ket corresponding to that wave
function moved over by $a,$ that is,

$T\left(a\right)|\psi \left(x\right)\rangle =|\psi \left(x-a\right)\rangle ,$

so, for example, if $\psi \left(x\right)$ is a wave function centered at the origin, $T\left(a\right)$ moves it to be centered at the point $a.$

We have written the wave function as a ket here to emphasize
the parallels between this operation and some later ones, but it is simpler at
this point to just work with the wave function as a function, so we will drop
the ket bracket for now. The form of $T\left(a\right)$ as an operator on a function is made evident
by rewriting the Taylor series in operator form:

$\begin{array}{c}\psi \left(x-a\right)=\psi \left(x\right)-a\frac{d}{dx}\psi \left(x\right)+\frac{{a}^{2}}{2!}\frac{{d}^{2}}{d{x}^{2}}\psi \left(x\right)-\dots \\ ={e}^{-a\frac{d}{dx}}\psi \left(x\right)\\ =T\left(a\right)\psi \left(x\right).\end{array}$

Now for the quantum connection: the differential operator appearing in the exponential is in quantum mechanics proportional to the momentum operator ( $\widehat{p}=-i\hslash d/dx$ ) so the translation operator

$T\left(a\right)={e}^{-ia\widehat{p}/\hslash}.$
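The exponential form of the translation operator can be checked numerically. The following is a minimal sketch, assuming a periodic grid wide enough that the wave function is negligible at the edges. On such a grid $d/dx$ acts on each Fourier mode as multiplication by $ik$, so ${e}^{-a\,d/dx}$ becomes multiplication by ${e}^{-ika}$. The helper name `translate`, the grid, and the Gaussian test function are illustrative choices, not part of the original notes.

```python
import numpy as np

def translate(psi, x, a):
    """Apply T(a) = exp(-a d/dx) to samples psi on a uniform periodic grid x."""
    dx = x[1] - x[0]
    # Angular wavenumbers of the FFT modes; d/dx acts as multiplication by i*k
    k = 2 * np.pi * np.fft.fftfreq(x.size, d=dx)
    return np.fft.ifft(np.exp(-1j * k * a) * np.fft.fft(psi))

# Gaussian centered at the origin on a grid much wider than the packet
x = np.linspace(-20.0, 20.0, 1024, endpoint=False)
psi = np.exp(-x**2 / 2)

a = 3.0
shifted = translate(psi, x, a)
expected = np.exp(-(x - a)**2 / 2)   # psi(x - a): same packet centered at x = a

max_err = np.max(np.abs(shifted - expected))
print("largest deviation from psi(x - a):", max_err)
```

The deviation should be at the level of floating-point rounding, which illustrates that the operator moves the packet to be centered at $+a,$ in agreement with the sign convention above.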

An important special case is that of an infinitesimal translation,

$T\left(\epsilon \right)={e}^{-i\epsilon \widehat{p}/\hslash}=1-i\epsilon \widehat{p}/\hslash .$

The momentum operator $\widehat{p}$ is said to be the *generator* of the
translation.

(*A note on possibly confusing notation*: Shankar
writes (page 281) $T\left(\epsilon \right)|x\rangle =|x+\epsilon \rangle .$ Here $|x\rangle $ denotes a delta-function type wave function
centered at $x.$ It might be better if he had written $T\left(\epsilon \right)|{x}_{0}\rangle =|{x}_{0}+\epsilon \rangle $,
then we would see right away that this translates into the wave function
transformation $T\left(\epsilon \right)\delta \left(x-{x}_{0}\right)=\delta \left(x-{x}_{0}-\epsilon \right)$,
the sign of $\epsilon $ now obviously consistent with our usage
above.)

It is important to be clear about whether the *system*
is being translated by $a,$ as we have done above, or whether, alternatively,
the *coordinate axes* are being translated by $a$; the latter would result in the *opposite*
change in the wave function. Translating the coordinate axes, along with the
apparatus and any external fields by $-a$ relative to the wave function would of course
give the same physics as translating the wave function by $+a.$ In fact, these two equivalent operations are
analogous to the time development of a wave function being described either in
the Schrödinger picture, in which the bras and kets change in time but not the
operators, or in the Heisenberg picture, in which the operators develop but the
bras and kets do not change. To pursue
this analogy a little further, in the “Heisenberg” case

$\widehat{x}\to {T}^{-1}\left(\epsilon \right)\widehat{x}T\left(\epsilon \right)={e}^{i\epsilon \widehat{p}/\hslash}\widehat{x}{e}^{-i\epsilon \widehat{p}/\hslash}=\widehat{x}+i\epsilon \left[\widehat{p},\widehat{x}\right]/\hslash =\widehat{x}+\epsilon $

and $\widehat{p}$ is unchanged since it commutes with the
operator. So there are two possible ways
to deal with translations: transform the bras and kets, *or* transform the
operators. We shall almost always leave the operators alone, and transform the
bras and kets.

We have established that *the momentum operator is the
generator of spatial translations* (the generalization to three dimensions
is trivial). We know from earlier work
that the Hamiltonian is the generator of *time* translations, by which we
mean

$\psi \left(t+a\right)={e}^{-iHa/\hslash}\psi \left(t\right).$

It is tempting to conclude that the *angular momentum*
must be the operator generating *rotations* of the system, and, in fact,
it is easy to check that this is correct.
Let us consider an infinitesimal rotation $\delta \overrightarrow{\theta}$ about some axis through the origin (the
infinitesimal vector being in the direction of the axis). A wave function $\psi \left(\overrightarrow{r}\right)$ initially localized at ${\overrightarrow{r}}_{0}$ will shift to be localized at ${\overrightarrow{r}}_{0}+\delta {\overrightarrow{r}}_{0}$,
where $\delta {\overrightarrow{r}}_{0}=\delta \overrightarrow{\theta}\times {\overrightarrow{r}}_{0}.$ So, how does a wave function transform under
this small rotation? Just as for the
translation case, $\psi \left(\overrightarrow{r}\right)\to \psi \left(\overrightarrow{r}-\delta \overrightarrow{r}\right)$. If you don’t understand the minus sign,
reread the discussion on translations and the sign of $\epsilon $.

Thus

$\psi \left(\overrightarrow{r}\right)\to \psi \left(\overrightarrow{r}\right)-\frac{i}{\hslash}\delta \overrightarrow{r}.\widehat{\overrightarrow{p}}\psi \left(\overrightarrow{r}\right)$

to first order in the infinitesimal quantity, so the rotation operator

$\begin{array}{c}R\left(\delta \overrightarrow{\theta}\right)\psi \left(\overrightarrow{r}\right)=\left(1-\frac{i}{\hslash}\delta \overrightarrow{\theta}\times \overrightarrow{r}.\widehat{\overrightarrow{p}}\right)\psi \left(\overrightarrow{r}\right)\\ =\left(1-\frac{i}{\hslash}\delta \overrightarrow{\theta}.\overrightarrow{r}\times \widehat{\overrightarrow{p}}\right)\psi \left(\overrightarrow{r}\right)\\ =\left(1-\frac{i}{\hslash}\delta \overrightarrow{\theta}.\widehat{\overrightarrow{L}}\right)\psi \left(\overrightarrow{r}\right).\end{array}$

If we write this as

$R\left(\delta \overrightarrow{\theta}\right)\psi \left(\overrightarrow{r}\right)={e}^{-\frac{i}{\hslash}\delta \overrightarrow{\theta}.\widehat{\overrightarrow{L}}}\psi \left(\overrightarrow{r}\right)$

it is clear that a finite rotation is given by multiplying together a large number of these operators, which just amounts to replacing $\delta \overrightarrow{\theta}$ by $\overrightarrow{\theta}$ in the exponential. Another way of going from the infinitesimal rotation to a full rotation is to use the identity

$\underset{N\to \infty}{\mathrm{lim}}{\left(1+\frac{A\theta}{N}\right)}^{N}={e}^{A\theta}$

which is clearly valid even if $A$ is an operator.
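As an illustration of this limit for a matrix-valued $A,$ the sketch below takes $A$ to be the $3\times 3$ generator of ordinary rotations about the $z$-axis and compares the product of $N$ small rotations with the exact finite rotation matrix. The function names and the choice $\theta =1$ are assumptions made for the example.

```python
import numpy as np

# Generator of rotations about z: to first order R_z(eps) = 1 + eps * A_z
A_z = np.array([[0.0, -1.0, 0.0],
                [1.0,  0.0, 0.0],
                [0.0,  0.0, 0.0]])

def rotation_by_product(theta, n_steps):
    """Compose n_steps first-order rotations, each through theta / n_steps."""
    step = np.eye(3) + (theta / n_steps) * A_z
    return np.linalg.matrix_power(step, n_steps)

def R_z(theta):
    """Exact rotation matrix about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

theta = 1.0
errors = [np.max(np.abs(rotation_by_product(theta, n) - R_z(theta)))
          for n in (10, 1000, 100000)]
print(errors)   # the error shrinks roughly like 1/N
```

The discrepancy falls off roughly in proportion to $1/N,$ consistent with the product converging to the exponential.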

We have therefore established that the orbital angular
momentum operator $\widehat{\overrightarrow{L}}$ is the generator of spatial rotations, by
which we mean that if we rotate our apparatus, and the wave function with it,
the appropriately transformed wave function is generated by the action of $R\left(\overrightarrow{\theta}\right)$ on the original wave function. It is perhaps worth
giving an explicit example: suppose we rotate the system, and therefore the
wave function, through an infinitesimal angle $\delta {\theta}_{z}$ about the $z$-axis. Denote the rotated wave function by ${\psi}_{rot}\left(x,y\right)$. Then

$\begin{array}{c}{\psi}_{rot}\left(x,y\right)=\left(1-\frac{i}{\hslash}\left(\delta {\theta}_{z}\right){\widehat{L}}_{z}\right)\psi \left(x,y\right)\\ =\left(1-\frac{i}{\hslash}\left(\delta {\theta}_{z}\right)\left(-i\hslash \left(x\frac{d}{dy}-y\frac{d}{dx}\right)\right)\right)\psi \left(x,y\right)\\ =\left(1-\left(\delta {\theta}_{z}\right)\left(x\frac{d}{dy}-y\frac{d}{dx}\right)\right)\psi \left(x,y\right)\\ =\psi \left(x+\left(\delta {\theta}_{z}\right)y,\text{\hspace{0.17em}}y-\left(\delta {\theta}_{z}\right)x\right).\end{array}$

That is to say, the value of the new wave function at $\left(x,y\right)$ is the value of the old wave function at the point which was rotated into $\left(x,y\right).$
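A quick numerical check of this chain of equalities is sketched below. It assumes a hypothetical test function (a Gaussian centered at $\left(1,0\right)$) and compares three things at one sample point: the differential-operator form, the shifted-argument form, and the old function evaluated at the point obtained by rotating $\left(x,y\right)$ back through $-\delta {\theta}_{z}.$ All three should agree up to terms of second order in the angle.

```python
import numpy as np

def psi(x, y):
    # Hypothetical test wave function: Gaussian centered at (1, 0)
    return np.exp(-((x - 1.0)**2 + y**2))

def grad_psi(x, y):
    # Analytic partial derivatives of the Gaussian above
    p = psi(x, y)
    return -2.0 * (x - 1.0) * p, -2.0 * y * p

dtheta = 1e-4          # small rotation angle about the z-axis
x, y = 0.7, 0.4        # arbitrary sample point

# (1 - dtheta * (x d/dy - y d/dx)) psi
dpsi_dx, dpsi_dy = grad_psi(x, y)
operator_form = psi(x, y) - dtheta * (x * dpsi_dy - y * dpsi_dx)

# psi(x + dtheta * y, y - dtheta * x)
shifted_form = psi(x + dtheta * y, y - dtheta * x)

# Old function at the point that an exact rotation by +dtheta carries into (x, y)
c, s = np.cos(dtheta), np.sin(dtheta)
exact = psi(c * x + s * y, -s * x + c * y)

print(operator_form - shifted_form, shifted_form - exact)  # both O(dtheta**2)
```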

### Quantum Generalization of the Rotation Operator

However, it has long been known that in quantum mechanics,
orbital angular momentum is *not* the whole story. Particles like the electron are found
experimentally to have an internal angular momentum, called spin. In contrast to the spin of an ordinary
macroscopic object like a spinning top, the electron’s spin is *not* just
the sum of orbital angular momenta of internal parts, and any attempt to
understand it in that way leads to contradictions.

To take account of this new kind of angular momentum, we
generalize the orbital angular momentum $\widehat{\overrightarrow{L}}$ to an operator $\widehat{\overrightarrow{J}}$ which is *defined* as the generator of
rotations on *any* wave function, including possible spin components, so

$R\left(\delta \overrightarrow{\theta}\right)\psi \left(\overrightarrow{r}\right)={e}^{-\frac{i}{\hslash}\delta \overrightarrow{\theta}.\widehat{\overrightarrow{J}}}\psi \left(\overrightarrow{r}\right).$

This is of course identical to the equation we found for $\widehat{\overrightarrow{L}},$ but there we derived it from the quantum
angular momentum operator including the momentum components written as
differentials. But up to this point $\psi \left(\overrightarrow{r}\right)$ has just been a complex valued function of
position. From now on, the wave function at a point can have *several components*, so it is in some
vector space, and the rotation operator will operate in this space as well as
being a differential operator with respect to position. For example, the wave function could be a
vector at each point, so rotation of the system could rotate this vector as
well as moving it to a different $\overrightarrow{r}$.

To summarize: $\psi \left(\overrightarrow{r}\right)$ is in general an $n$ -component function at each point in space, $R\left(\delta \overrightarrow{\theta}\right)$ is an $n\times n$ matrix in the component space, and the above
equation is the *definition* of $\widehat{\overrightarrow{J}}.$ Starting from this definition, we will find $\widehat{\overrightarrow{J}}$ ’s properties.

The first point to make is that in contrast to translations, rotations do not commute even for a classical system. Rotating a book through $\pi /2$ first about the $z$ -axis then about the $x$ -axis leaves it in a different orientation from that obtained by rotating from the same starting position first $\pi /2$ about the $x$ -axis then $\pi /2$ about the $z$ -axis. Even small rotations do not commute, although the commutator is second order. Since the $R$ -operators are representations of rotations, they will reflect this commutativity structure, and we can see just how they do that by considering ordinary classical rotations of a real vector in three-dimensional space.

The matrices rotating a vector by $\theta $ about the $x,y$ and $z$ axes are

${R}_{x}\left(\theta \right)=\left(\begin{array}{ccc}1& 0& 0\\ 0& \mathrm{cos}\theta & -\mathrm{sin}\theta \\ 0& \mathrm{sin}\theta & \mathrm{cos}\theta \end{array}\right),\text{\hspace{1em}}{R}_{y}\left(\theta \right)=\left(\begin{array}{ccc}\mathrm{cos}\theta & 0& \mathrm{sin}\theta \\ 0& 1& 0\\ -\mathrm{sin}\theta & 0& \mathrm{cos}\theta \end{array}\right),\text{\hspace{1em}}{R}_{z}\left(\theta \right)=\left(\begin{array}{ccc}\mathrm{cos}\theta & -\mathrm{sin}\theta & 0\\ \mathrm{sin}\theta & \mathrm{cos}\theta & 0\\ 0& 0& 1\end{array}\right).$
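The book example can be made concrete with these matrices. In the sketch below, the vector `spine` is a hypothetical marker for the book's orientation, initially along the $x$-axis; the two orderings of quarter-turn rotations send it to different final directions.

```python
import numpy as np

def R_x(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0,   c,  -s],
                     [0.0,   s,   c]])

def R_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[  c,  -s, 0.0],
                     [  s,   c, 0.0],
                     [0.0, 0.0, 1.0]])

quarter = np.pi / 2
z_then_x = R_x(quarter) @ R_z(quarter)   # rightmost factor acts first
x_then_z = R_z(quarter) @ R_x(quarter)

spine = np.array([1.0, 0.0, 0.0])        # hypothetical marker along the x-axis
after_zx = z_then_x @ spine              # ends up along the z-axis
after_xz = x_then_z @ spine              # ends up along the y-axis
print(np.round(after_zx, 12), np.round(after_xz, 12))
```

Rotating first about $z$ and then about $x$ leaves the marker along the $z$-axis, while the opposite order leaves it along the $y$-axis.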

In the limit of rotations about infinitesimal angles (ignoring higher order terms),

${R}_{x}\left(\epsilon \right)=1+\epsilon \left(\begin{array}{ccc}0& 0& 0\\ 0& 0& -1\\ 0& 1& 0\end{array}\right),\text{\hspace{1em}}{R}_{y}\left(\epsilon \right)=1+\epsilon \left(\begin{array}{ccc}0& 0& 1\\ 0& 0& 0\\ -1& 0& 0\end{array}\right),\text{\hspace{1em}}{R}_{z}\left(\epsilon \right)=1+\epsilon \left(\begin{array}{ccc}0& -1& 0\\ 1& 0& 0\\ 0& 0& 0\end{array}\right).\text{\hspace{1em}}$

It is easy to check that

$\left[{R}_{x}\left(\epsilon \right),{R}_{y}\left(\epsilon \right)\right]={\epsilon}^{2}\left(\begin{array}{ccc}0& -1& 0\\ 1& 0& 0\\ 0& 0& 0\end{array}\right)={R}_{z}\left({\epsilon}^{2}\right)-1.$
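This commutator can also be checked with the exact finite rotation matrices rather than their first-order forms. The sketch below, with the hypothetical helper `commutator_error`, measures how far $\left[{R}_{x}\left(\epsilon \right),{R}_{y}\left(\epsilon \right)\right]$ is from ${R}_{z}\left({\epsilon}^{2}\right)-1$; the relation is expected to hold only to second order, so the residual should scale like ${\epsilon}^{3}.$

```python
import numpy as np

def R_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def R_y(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def R_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def commutator_error(eps):
    """Largest entry of [R_x(eps), R_y(eps)] - (R_z(eps**2) - 1)."""
    lhs = R_x(eps) @ R_y(eps) - R_y(eps) @ R_x(eps)
    rhs = R_z(eps**2) - np.eye(3)
    return np.max(np.abs(lhs - rhs))

errs = {eps: commutator_error(eps) for eps in (1e-1, 1e-2, 1e-3)}
for eps, err in errs.items():
    print(eps, err, err / eps**3)   # last column stays roughly constant
```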

The rotation operators on quantum mechanical kets must, like all rotations, follow this same pattern, that is, we must have

$\left(\left(1-\frac{i}{\hslash}\epsilon {J}_{x}\right)\left(1-\frac{i}{\hslash}\epsilon {J}_{y}\right)-\left(1-\frac{i}{\hslash}\epsilon {J}_{y}\right)\left(1-\frac{i}{\hslash}\epsilon {J}_{x}\right)+\frac{i}{\hslash}{\epsilon}^{2}{J}_{z}\right)|\psi \rangle =0$

where we have used the definition of the infinitesimal rotation operator on kets, $R\left(\delta \overrightarrow{\theta}\right)\psi \left(\overrightarrow{r}\right)={e}^{-\frac{i}{\hslash}\delta \overrightarrow{\theta}.\widehat{\overrightarrow{J}}}\psi \left(\overrightarrow{r}\right)$. The zeroth and first-order terms in $\epsilon $ all cancel; the second-order term gives $\left[{J}_{x},{J}_{y}\right]=i\hslash {J}_{z}$. The general statement is:

$\left[{J}_{i},{J}_{j}\right]=i\hslash {\epsilon}_{ijk}{J}_{k}$
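Two familiar sets of matrices can be tested against this relation. The sketch below assumes units with $\hslash =1$ and checks (a) the spin one-half matrices $\hslash \sigma /2$ built from the Pauli matrices, and (b) the matrices ${J}_{k}=i\hslash {A}_{k}$ obtained from the classical infinitesimal rotation generators ${A}_{k}$ by matching $1+\epsilon {A}_{k}$ with $1-\frac{i}{\hslash}\epsilon {J}_{k}.$ The helper names are illustrative.

```python
import numpy as np
from itertools import product

hbar = 1.0   # assumption: units with hbar = 1

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2

def satisfies_algebra(J):
    """Check [J_i, J_j] = i hbar eps_ijk J_k for a list of three matrices."""
    for i, j in product(range(3), repeat=2):
        lhs = J[i] @ J[j] - J[j] @ J[i]
        rhs = sum(1j * hbar * levi_civita(i, j, k) * J[k] for k in range(3))
        if not np.allclose(lhs, rhs):
            return False
    return True

# (a) spin one-half: J = (hbar / 2) * Pauli matrices
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
spin_half = [hbar / 2 * s for s in (sx, sy, sz)]

# (b) J_k = i hbar A_k from the classical generators A_x, A_y, A_z
A_x = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], dtype=complex)
A_y = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], dtype=complex)
A_z = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], dtype=complex)
vector_rep = [1j * hbar * a for a in (A_x, A_y, A_z)]

print(satisfies_algebra(spin_half), satisfies_algebra(vector_rep))
```

Both sets satisfy the relation, even though one acts on two components and the other on three.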

This is one of the most important formulas in quantum mechanics.

### Consequences of the Commutation Relations

The commutation formula $\left[{J}_{i},{J}_{j}\right]=i\hslash {\epsilon}_{ijk}{J}_{k},$ which is, after all, a straightforward extension of the result for ordinary classical rotations, has surprisingly far-reaching consequences: it leads directly to the directional quantization of spin and angular momentum observed in atoms subject to a magnetic field.

It is by now very clear that in quantum mechanical systems
such as atoms the total angular momentum, and also the component of angular
momentum in a given direction, can only take certain values. Let us try to construct a basis set of
angular momentum states for a given system: a complete set of kets
corresponding to all allowed values of the angular momentum. Now, angular momentum is a *vector* quantity:
it has magnitude and direction. Let’s
begin with the magnitude; the natural parameter is the length squared:

${J}^{2}={J}_{x}^{2}+{J}_{y}^{2}+{J}_{z}^{2}$.

Now we must specify direction -- but here we run into a
problem. ${J}_{x},{J}_{y}$ and ${J}_{z}$ are all mutually non-commuting, so we cannot
construct a set of common eigenkets of any two of them, which we would need for
a precise specification of direction.
They *do* all commute with ${J}^{2}$,
since it is spherically symmetric and therefore cannot be affected by any
rotation (and, it’s easy to check this commutation explicitly).
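The explicit check takes only a few lines. As a sketch for ${J}_{z}$ (the other components follow by cyclic permutation), note that ${J}_{z}^{2}$ trivially commutes with ${J}_{z},$ and use the identity $\left[{A}^{2},B\right]=A\left[A,B\right]+\left[A,B\right]A$ together with $\left[{J}_{x},{J}_{z}\right]=-i\hslash {J}_{y}$ and $\left[{J}_{y},{J}_{z}\right]=i\hslash {J}_{x}$:

$\begin{array}{c}\left[{J}^{2},{J}_{z}\right]=\left[{J}_{x}^{2},{J}_{z}\right]+\left[{J}_{y}^{2},{J}_{z}\right]\\ ={J}_{x}\left[{J}_{x},{J}_{z}\right]+\left[{J}_{x},{J}_{z}\right]{J}_{x}+{J}_{y}\left[{J}_{y},{J}_{z}\right]+\left[{J}_{y},{J}_{z}\right]{J}_{y}\\ =-i\hslash \left({J}_{x}{J}_{y}+{J}_{y}{J}_{x}\right)+i\hslash \left({J}_{y}{J}_{x}+{J}_{x}{J}_{y}\right)=0.\end{array}$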

The bottom line, then, is that in attempting to construct
eigenkets describing the different possible angular momentum states of a
quantum system, the best we can do is to find the common eigenkets of ${J}^{2}$ and *one* direction, say ${J}_{z}.$ The commutation relations do not allow us to
be more precise about direction, analogous to the Uncertainty Principle for
position and momentum, which also comes from noncommutativity of the relevant
operators.

We conclude that the appropriate angular momentum basis is the set of common eigenkets of the commuting Hermitian matrices ${J}^{2},{J}_{z}$ :

$\begin{array}{l}{J}^{2}|a,b\rangle =a|a,b\rangle \\ {J}_{z}|a,b\rangle =b|a,b\rangle .\end{array}$

Our next task is to find the allowed values of $a$ and $b.$

### Ladder Operators

The sets of allowed eigenvalues $a,b$ can be found using the “ladder operator” trick previously discussed for the simple harmonic oscillator. It turns out that the operators

${J}_{\pm}={J}_{x}\pm i{J}_{y}$

are closely analogous to the simple harmonic oscillator raising and lowering operators ${a}^{\dagger}$ and $a.$

${J}_{+}$ and ${J}_{-}$ have commutation relations with ${J}_{z}$:

$\left[{J}_{z},{J}_{\pm}\right]=\pm \hslash \text{\hspace{0.05em}}{J}_{\pm}$

and they of course *commute* with ${J}^{2}$,
as do ${J}_{z},{J}_{x}$ and ${J}_{y}.$

Therefore, ${J}_{\pm}$ operating on $|a,b\rangle $ cannot affect the value of $a.$ But they *do* change the value of $b:$

$\begin{array}{c}{J}_{z}{J}_{\pm}|a,b\rangle =\left[{J}_{z},{J}_{\pm}\right]|a,b\rangle +{J}_{\pm}{J}_{z}|a,b\rangle \\ =\pm \hslash \text{\hspace{0.05em}}{J}_{\pm}|a,b\rangle +b{J}_{\pm}|a,b\rangle \\ =\left(b\pm \hslash \right){J}_{\pm}|a,b\rangle \end{array}$

so if $|a,b\rangle $ is an eigenket of ${J}_{z}$ with eigenvalue $b,$ ${J}_{\pm}|a,b\rangle $ is either zero or an eigenket of ${J}_{z}$ with eigenvalue $b\pm \hslash $, that is, ${J}_{\pm}|a,b\rangle ={C}_{\pm}|a,b\pm \hslash \rangle $ where ${C}_{\pm}\left(a,b\right)$ is a normalization constant, taking the initial $|a,b\rangle $ to be normalized. Just as with the simple harmonic oscillator, we have to find these normalization constants in order to compute matrix elements. All the physics is in the matrix elements.

The squared norm of ${J}_{\pm}|a,b\rangle $ is

${\Vert {J}_{\pm}|a,b\rangle \Vert}^{2}=\langle a,b|{J}_{\pm}^{\dagger}{J}_{\pm}|a,b\rangle =\langle a,b|{J}_{\mp}{J}_{\pm}|a,b\rangle $

and

$\begin{array}{c}{J}_{\mp}{J}_{\pm}=\left({J}_{x}\mp i{J}_{y}\right)\left({J}_{x}\pm i{J}_{y}\right)={J}_{x}^{2}+{J}_{y}^{2}\pm i\left[{J}_{x},{J}_{y}\right]\\ ={J}^{2}-{J}_{z}^{2}\mp \hslash {J}_{z}\end{array}$

from which

${\Vert {J}_{\pm}|a,b\rangle \Vert}^{2}=\langle a,b|{J}^{2}-{J}_{z}^{2}\mp \hslash {J}_{z}|a,b\rangle =a-{b}^{2}\mp \hslash b,$

recalling that $\langle a,b|a,b\rangle =1.$

Now $a,$ being the eigenvalue of a sum of squares of
Hermitian operators, is necessarily nonnegative, and $b$ is real.
Moreover, ${J}^{2}-{J}_{z}^{2}={J}_{x}^{2}+{J}_{y}^{2}$ has nonnegative expectation values, so $a-{b}^{2}\ge 0.$ Hence for a given $a,$ $b$ is *bounded*: there must be a ${b}_{\text{max}}$ and a (negative or zero) ${b}_{\text{min}}.$ But this must mean that

$\begin{array}{l}{\Vert {J}_{+}|a,{b}_{\mathrm{max}}\rangle \Vert}^{2}=a-{b}_{\mathrm{max}}^{2}-\hslash {b}_{\mathrm{max}}=0,\\ {\Vert {J}_{-}|a,{b}_{\mathrm{min}}\rangle \Vert}^{2}=a-{b}_{\mathrm{min}}^{2}+\hslash {b}_{\mathrm{min}}=0.\end{array}$

Note that for a given $a,$ ${b}_{\text{max}}$ and ${b}_{\text{min}}$ are determined uniquely -- there cannot be two kets with the same $a$ but different $b$ annihilated by ${J}_{+}.$ It also follows immediately that $a={b}_{\mathrm{max}}\left({b}_{\mathrm{max}}+\hslash \right)$ and ${b}_{\mathrm{min}}=-{b}_{\mathrm{max}}.$ Furthermore, we know that if we keep operating on $|a,{b}_{\mathrm{min}}\rangle $ with ${J}_{+},$ we generate a sequence of kets with ${J}_{z}$ eigenvalues ${b}_{\mathrm{min}}+\hslash ,\quad {b}_{\mathrm{min}}+2\hslash ,\quad {b}_{\mathrm{min}}+3\hslash ,\dots $. This series must terminate, and the only possible way for that to happen is for ${b}_{\mathrm{max}}$ to be equal to ${b}_{\mathrm{min}}+n\hslash $ with $n$ a nonnegative integer, so that $2{b}_{\mathrm{max}}=n\hslash ,$ from which it follows that ${b}_{\text{max}}$ is either an integer or half an odd integer times $\hslash .$
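The relation between ${b}_{\mathrm{min}}$ and ${b}_{\mathrm{max}}$ stated above can be spelled out as follows. Equating the two expressions for $a$ obtained from the vanishing norms and factoring gives

$\begin{array}{c}{b}_{\mathrm{max}}^{2}+\hslash {b}_{\mathrm{max}}={b}_{\mathrm{min}}^{2}-\hslash {b}_{\mathrm{min}}\\ \left({b}_{\mathrm{max}}+{b}_{\mathrm{min}}\right)\left({b}_{\mathrm{max}}-{b}_{\mathrm{min}}+\hslash \right)=0.\end{array}$

Since ${b}_{\mathrm{max}}\ge {b}_{\mathrm{min}},$ the second factor is strictly positive, so the first must vanish, which gives ${b}_{\mathrm{min}}=-{b}_{\mathrm{max}}.$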

At this point, we switch to the standard notation. We have established that the eigenvalues of ${J}_{z}$ form a finite ladder with spacing $\hslash $. We write them as $m\hslash $,
and $j$ is used to denote the maximum value of $m,$ so the eigenvalue of ${J}^{2},\text{\hspace{0.17em}}a=j\left(j+1\right){\hslash}^{2}.$ Both $j$ and $m$ will be integers or half odd integers, but the
*spacing* of the ladder of $m$ values is always unity. Although we have been
writing $|a,b\rangle $ with $a=j\left(j+1\right){\hslash}^{2},\text{\hspace{0.17em}}\text{\hspace{0.17em}}b=m\hslash $ we shall henceforth follow convention and
write $|j,m\rangle $.

### Summary

The operators ${\overrightarrow{J}}^{2},{J}_{z}$ have a common set of orthonormal eigenkets $|j,m\rangle $,

$\begin{array}{c}{\overrightarrow{J}}^{2}|j,m\rangle =j\left(j+1\right){\hslash}^{2}|j,m\rangle \\ {J}_{z}|j,m\rangle =m\hslash |j,m\rangle \\ \langle j,m|j,m\rangle =1\end{array}$

where $j,m$ are integers or half integers. The allowed quantum numbers $m$ form a ladder with step spacing unity, the maximum value of $m$ is $j,$ the minimum value is $-j.$
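The ladder of $m$ values is simple to enumerate. The sketch below uses exact rational arithmetic so that half-integer values are represented without rounding; the function name `allowed_m` is a hypothetical choice for illustration.

```python
from fractions import Fraction

def allowed_m(j):
    """Return the list m = -j, -j + 1, ..., j for integer or half-integer j."""
    j = Fraction(j)
    if j < 0 or (2 * j).denominator != 1:
        raise ValueError("j must be a nonnegative integer or half-integer")
    # 2j + 1 rungs, each one unit above the previous
    return [-j + k for k in range(int(2 * j) + 1)]

for j in (0, Fraction(1, 2), 1, Fraction(3, 2)):
    print(j, [str(m) for m in allowed_m(j)])
```

For each $j$ there are $2j+1$ values of $m,$ so, for example, $j=3/2$ gives the four values $-3/2,-1/2,1/2,3/2.$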

### Normalizing ${J}_{+}$ and ${J}_{-}$

It is now straightforward to compute the normalization factors needed to find matrix elements:

${\Vert {J}_{\pm}|j,m\rangle \Vert}^{2}=\langle j,m|{J}^{2}-{J}_{z}^{2}\mp \hslash {J}_{z}|j,m\rangle =\left(j\left(j+1\right){\hslash}^{2}-m\left(m\pm 1\right){\hslash}^{2}\right)\langle j,m|j,m\rangle ,$

and $\langle j,m|j,m\rangle =1$, so

$\begin{array}{l}{J}_{+}|j,m\rangle =\sqrt{j\left(j+1\right)-m\left(m+1\right)}\hslash |j,m+1\rangle \\ {J}_{-}|j,m\rangle =\sqrt{j\left(j+1\right)-m\left(m-1\right)}\hslash |j,m-1\rangle .\end{array}$

With these formulas, and the base set of normalized eigenkets $|j,m\rangle $, we are in a position to construct explicit matrix representations of the angular momentum algebra for any integer or half integer value of angular momentum $j.$
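One possible construction of these matrices is sketched below, assuming units with $\hslash =1$ and a basis ordered from $m=j$ down to $m=-j$ (both are choices made for this example, as is the function name). The only nonzero elements of ${J}_{+}$ are those connecting $|j,m\rangle $ to $|j,m+1\rangle ,$ with the normalization factor found above; ${J}_{-}$ is its Hermitian conjugate.

```python
import numpy as np

hbar = 1.0   # assumption: units with hbar = 1

def angular_momentum_matrices(j):
    """Return (Jx, Jy, Jz) in the |j, m> basis ordered m = j, j-1, ..., -j."""
    dim = int(round(2 * j)) + 1
    m = j - np.arange(dim)                      # m value of each basis ket
    Jz = hbar * np.diag(m).astype(complex)
    Jplus = np.zeros((dim, dim), dtype=complex)
    for col in range(1, dim):
        # J+ |j, m> = sqrt(j(j+1) - m(m+1)) hbar |j, m+1>;
        # |j, m+1> sits one position earlier in this basis ordering
        Jplus[col - 1, col] = hbar * np.sqrt(j * (j + 1) - m[col] * (m[col] + 1))
    Jminus = Jplus.conj().T
    Jx = (Jplus + Jminus) / 2
    Jy = (Jplus - Jminus) / (2 * 1j)
    return Jx, Jy, Jz

def check(j):
    """Verify [Jx, Jy] = i hbar Jz and J^2 = j(j+1) hbar^2 times the identity."""
    Jx, Jy, Jz = angular_momentum_matrices(j)
    commutator_ok = np.allclose(Jx @ Jy - Jy @ Jx, 1j * hbar * Jz)
    J2 = Jx @ Jx + Jy @ Jy + Jz @ Jz
    casimir_ok = np.allclose(J2, j * (j + 1) * hbar**2 * np.eye(len(Jz)))
    return commutator_ok and casimir_ok

print([check(j) for j in (0.5, 1.0, 1.5, 2.0)])
```

For $j=1/2$ this construction reproduces $\hslash /2$ times the Pauli matrices.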

*Historical note*: the use of $m$ to denote the component of angular momentum in
one direction came about because a Bohr-type electron in orbit is a current
loop, with a magnetic moment parallel to its angular momentum, so the $m$ measured the component of magnetic moment in a
chosen direction, usually along an external magnetic field, and $m$ is often called the *magnetic* quantum number.