Michael Fowler,
UVa 8/20/07
De Broglie’s doctoral thesis, defended at the end of 1924,
created a lot of excitement in European physics circles. Shortly after it was published in the fall of
1925 Pieter Debye, a theorist in
There is no rigorous derivation of Schrödinger’s equation from previously established theory, but it can be made very plausible by thinking about the connection between light waves and photons, and constructing an analogous structure for de Broglie’s waves and electrons (and, later, other particles).
Let us examine what Maxwell’s equations tell us about the motion of the simplest type of electromagnetic wave—a monochromatic wave in empty space, with no currents or charges present.
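As we discussed in the last lecture, Maxwell found the wave equation
$$\nabla^2\vec{E} - \frac{1}{c^2}\frac{\partial^2\vec{E}}{\partial t^2} = 0$$
which reduces to
$$\frac{\partial^2 E}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2 E}{\partial t^2} = 0$$
for a plane wave moving in the x-direction, with solution
$$E(x,t) = E_0 e^{i(kx-\omega t)}.$$
Applying the wave equation differential operator to this plane wave solution gives
$$\left(\frac{\partial^2}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)E_0 e^{i(kx-\omega t)} = \left(-k^2 + \frac{\omega^2}{c^2}\right)E_0 e^{i(kx-\omega t)} = 0,$$
so
$$\omega = ck.$$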
This is just the familiar statement that the wave must travel at c.
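We know from the photoelectric effect and Compton scattering that the photon energy and momentum are related to the frequency and wavelength of the light by
$$E = hf = \hbar\omega, \qquad p = \frac{h}{\lambda} = \hbar k.$$
Notice, then, that the wave equation tells us that $\omega = ck$, and hence E = cp.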
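To put it another way, if we think of $e^{i(kx-\omega t)}$ as describing a particle (photon), it would be more natural to write the plane wave as
$$E_0 e^{i(px-Et)/\hbar},$$
that is, in terms of the energy and momentum of the particle.
In these terms, applying the (Maxwell) wave equation operator to the plane wave yields
$$\left(\frac{\partial^2}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)E_0 e^{i(px-Et)/\hbar} = -\frac{1}{\hbar^2}\left(p^2 - \frac{E^2}{c^2}\right)E_0 e^{i(px-Et)/\hbar} = 0,$$
or
$$E^2 = c^2p^2.$$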
The wave equation operator applied to the plane wave describing the particle propagation yields the energy-momentum relationship for the particle.
The discussion above suggests how we might extend the wave equation operator from the photon case (zero rest mass) to a particle having rest mass $m_0$. We need a wave equation operator that, when it operates on a plane wave, yields
$$E^2 = c^2p^2 + m_0^2c^4.$$
Writing the plane wave function
$$\varphi(x,t) = Ae^{i(px-Et)/\hbar},$$
where A is a constant, we find we can get $E^2 = c^2p^2 + m_0^2c^4$ by adding a constant (mass) term to the differentiation terms in the wave operator:
$$\left(\frac{\partial^2}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \frac{m_0^2c^2}{\hbar^2}\right)Ae^{i(px-Et)/\hbar} = -\frac{1}{\hbar^2}\left(p^2 - \frac{E^2}{c^2} + m_0^2c^2\right)Ae^{i(px-Et)/\hbar} = 0.$$
This wave equation is called the Klein-Gordon equation and correctly describes the propagation of relativistic particles of mass $m_0$. However, it’s a bit inconvenient for nonrelativistic particles, like the electron in the hydrogen atom, just as $E^2 = m_0^2c^4 + c^2p^2$ is less useful than $E = p^2/2m$ for this case.
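As a rough numerical illustration of that last remark, the sketch below compares the kinetic energy implied by the relativistic relation with $p^2/2m$ for a slow electron. The chosen speed (about 2.2 × 10⁶ m/s, roughly that of the electron in the lowest Bohr orbit), the rounded constants, and the function names are assumptions made for this example only.

```python
import math

C = 299_792_458.0        # speed of light, m/s
M_E = 9.1093837e-31      # electron rest mass, kg (rounded)

def kinetic_relativistic(p, m0=M_E):
    """Kinetic energy E - m0 c^2, with E^2 = m0^2 c^4 + c^2 p^2."""
    rest = m0 * C**2
    total = math.sqrt(rest**2 + (C * p)**2)
    # E - m0 c^2 rewritten as (cp)^2 / (E + m0 c^2) to avoid
    # subtracting two nearly equal numbers.
    return (C * p)**2 / (total + rest)

def kinetic_nonrelativistic(p, m0=M_E):
    """The familiar E = p^2 / 2m."""
    return p**2 / (2 * m0)

v = 2.2e6                # assumed electron speed, m/s
p = M_E * v              # momentum, adequate for v much less than c
rel = kinetic_relativistic(p)
nonrel = kinetic_nonrelativistic(p)
fractional_difference = (nonrel - rel) / nonrel
print(f"fractional difference: {fractional_difference:.2e}")
```

For this speed the two expressions differ by roughly $v^2/4c^2$, about one part in a hundred thousand, which is why the simpler nonrelativistic relation is the natural starting point for the hydrogen atom.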
Continuing along the same lines, let us assume that a nonrelativistic electron in free space (no potentials, so no forces) is described by a plane wave:
$$\psi(x,t) = Ae^{i(px-Et)/\hbar}.$$
We need to construct a wave equation operator which, applied to this wave function, just gives us the ordinary nonrelativistic energy-momentum relationship, $E = p^2/2m$. The $p^2$ obviously comes as usual from differentiating twice with respect to x, but the only way we can get E is by having a single differentiation with respect to time, so this looks different from previous wave equations:
$$i\hbar\frac{\partial\psi(x,t)}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\psi(x,t)}{\partial x^2}.$$
This is Schrödinger’s equation for a free particle. It is easy to check that if $\psi(x,t)$ has the plane wave form given above, the condition for it to be a solution of this wave equation is just $E = p^2/2m$.
Notice one remarkable feature of the above equation: the i on the left means that $\psi$ cannot be a real function.
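The "easy to check" claim can also be tested numerically. The following is a minimal sketch, assuming units in which $\hbar = m = 1$ and using central finite differences in place of the derivatives; the function names and the sample point are illustrative choices, not part of the lecture.

```python
import cmath

HBAR = 1.0   # assumed units: hbar = 1
M = 1.0      # assumed units: m = 1

def plane_wave(x, t, p, E):
    """psi(x, t) = exp(i (p x - E t) / hbar), taking A = 1."""
    return cmath.exp(1j * (p * x - E * t) / HBAR)

def schrodinger_residual(p, E, x=0.3, t=0.7, h=1e-4):
    """|i hbar dpsi/dt + (hbar^2 / 2m) d2psi/dx2| at one point,
    estimated with central differences of step h."""
    psi = lambda xx, tt: plane_wave(xx, tt, p, E)
    dpsi_dt = (psi(x, t + h) - psi(x, t - h)) / (2 * h)
    d2psi_dx2 = (psi(x + h, t) - 2 * psi(x, t) + psi(x - h, t)) / h**2
    return abs(1j * HBAR * dpsi_dt + HBAR**2 / (2 * M) * d2psi_dx2)

p = 2.0
on_shell = schrodinger_residual(p, E=p**2 / (2 * M))   # E = p^2 / 2m
off_shell = schrodinger_residual(p, E=p**2 / M)        # E too large by a factor 2
print(on_shell, off_shell)
```

The residual is close to zero only when $E = p^2/2m$; for any other energy it is approximately $|E - p^2/2m|$ times the modulus of the wave.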
The effect of a potential on a de Broglie wave was considered by Sommerfeld in an attempt to generalize the rather restrictive conditions in Bohr’s model of the atom. Since the electron was orbiting in an inverse square force, just like the planets around the sun, Sommerfeld couldn’t understand why Bohr’s atom had only circular orbits, no Kepler-like ellipses. (Recall that all the observed spectral lines of hydrogen were accounted for by energy differences between circular orbits.)
De Broglie’s analysis of the allowed circular orbits can be formulated by assuming that, at some instant in time, the spatial variation of the wave function on going around the orbit includes a phase term of the form $e^{ipq/\hbar}$, where here the parameter q measures distance around the orbit. Now for an acceptable wave function, the total phase change on going around the orbit must be $2n\pi$, where n is an integer. For the usual Bohr circular orbit, p is constant on going around and q changes by $2\pi r$, where r is the radius of the orbit, giving
$$\frac{1}{\hbar}\,p\cdot 2\pi r = 2n\pi, \quad\text{so}\quad pr = n\hbar,$$
the usual angular momentum quantization.
What Sommerfeld did was to consider a general Kepler ellipse orbit, and visualize the wave going around such an orbit. Assuming the usual relationship $p = h/\lambda$, the wavelength will vary as the particle moves around the orbit, being shortest where the particle moves fastest, at its closest approach to the nucleus. Nevertheless, the phase change on moving a short distance $\Delta q$ should still be $p\,\Delta q/\hbar$, and requiring the wave function to link up smoothly on going once around the orbit gives
$$\oint p\,dq = nh.$$
Thus only certain elliptical orbits are allowed. The mathematics is nontrivial, but it turns out that every allowed elliptical orbit has the same energy as one of the allowed circular orbits. That is why Bohr’s theory gave all the energy levels. Actually, this whole analysis is old fashioned (it’s called the “old quantum theory”) but we’ve gone over it to introduce the idea of a wave with variable wavelength, changing with the momentum as the particle moves through a varying potential.
The reader may well be wondering at this point why it is at all useful to visualize a real wave going round an orbit, when we have stated that any solution of Schrödinger’s equation is necessarily a complex function. As we shall see, it is often possible to find solutions, including those corresponding to Bohr’s energy levels, in which the complex nature of the wave function only appears in a time varying phase factor, $e^{-iEt/\hbar}$. We should also add that if the spatial dependence is a real function, such as sin kx, it represents a standing wave, not a particle circling in one direction, which would be $e^{ikx}$, or $e^{ipq/\hbar}$. Bearing all this in mind, it is still often instructive to sketch real wave functions, especially for one-dimensional problems.
Let us consider first the one-dimensional situation of a particle going in the x-direction subject to a “roller coaster” potential. What do we expect the wave function to look like? We would expect the wavelength to be shortest where the potential is lowest, in the valleys, because that’s where the particle is going fastest—maximum momentum.
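To make the expectation concrete, here is a small sketch that evaluates the local de Broglie wavelength $\lambda = h/p$ with $p = \sqrt{2m(E - V(x))}$ at a hilltop and in a valley. The cosine-shaped potential, the energy value, and the units $\hbar = m = 1$ are all assumptions chosen purely for illustration.

```python
import math

HBAR = 1.0   # assumed units: hbar = 1
M = 1.0      # assumed units: m = 1

def potential(x):
    """Hypothetical roller coaster: hilltops at x = 0, 2*pi, ...,
    valleys at x = pi, 3*pi, ..."""
    return math.cos(x)

def local_wavelength(x, energy):
    """lambda(x) = h / p(x), with p(x) = sqrt(2 m (E - V(x)))."""
    kinetic = energy - potential(x)
    if kinetic <= 0:
        raise ValueError("classically forbidden at this x: E <= V(x)")
    p = math.sqrt(2 * M * kinetic)
    return 2 * math.pi * HBAR / p

E = 2.0                                  # assumed total energy, above every hilltop
hill = local_wavelength(0.0, E)          # V = +1 here
valley = local_wavelength(math.pi, E)    # V = -1 here
print(f"hilltop: {hill:.3f}, valley: {valley:.3f}")
```

With these numbers the kinetic energy is three times larger in the valley than on the hilltop, so the wavelength there is shorter by a factor of $\sqrt{3}$.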
With a nonzero potential present, the energy-momentum relationship for the particle becomes the energy equation
$$E = \frac{p^2}{2m} + V(x).$$
We need to construct a wave equation which leads naturally to this relationship. In contrast to the free particle cases discussed above, the relevant wave function here will no longer be a plane wave, since the wavelength varies with the potential. However, at a given x, the momentum is determined by the “local wavelength”, that is,
$$p\,\psi(x,t) = -i\hbar\frac{\partial\psi(x,t)}{\partial x}.$$
It follows that the appropriate wave equation is:
$$i\hbar\frac{\partial\psi(x,t)}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\psi(x,t)}{\partial x^2} + V(x)\,\psi(x,t).$$
This is the standard one-dimensional Schrödinger equation.
In three dimensions, the argument is precisely analogous. The only difference is that the square of the momentum is now a sum of three squared components, for the x, y and z directions, so
$$E = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right) + V(x,y,z),$$
so now
$$i\hbar\frac{\partial\psi(x,y,z,t)}{\partial t} = -\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)\psi(x,y,z,t) + V(x,y,z)\,\psi(x,y,z,t).$$
This is the complete Schrödinger equation. So far, of course, it is based on plausibility arguments and hand-waving. Why should anyone believe that it really describes an electron wave? Schrödinger’s test of his equation was the hydrogen atom. He looked for Bohr’s “stationary states”: states in which the electron was localized somewhere near the proton, and having a definite energy. The time dependence would be the same as for a plane wave of definite energy, $e^{-iEt/\hbar}$; the spatial dependence would be a time-independent function decreasing rapidly at large distances from the proton. That is, he took
$$\psi(x,y,z,t) = e^{-iEt/\hbar}\,\psi(x,y,z).$$
He took advantage of the spherical symmetry by re-expressing the spatial wave function in spherical polar coordinates, and found his equation became a standard differential equation solved in the nineteenth century. The solution gave the shape of possible wave functions, and also allowed values of energy and angular momentum. These values were exactly the same as Bohr’s (except that the lowest allowed state in the new theory had zero angular momentum): impressive evidence that the new theory was correct.
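When Schrödinger published this result in 1926, he also wrote down the complex conjugate equation, and showed that, taking the two together, it was not difficult to deduce a continuity equation:
$$\frac{\partial\rho}{\partial t} + \nabla\cdot\vec{j} = 0, \quad\text{where}\quad \rho = \psi^*\psi = |\psi|^2, \quad \vec{j} = \frac{\hbar}{2mi}\left(\psi^*\nabla\psi - \psi\nabla\psi^*\right).$$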
Schrödinger believed the above continuity equations represented the conservation of electric charge, and had no further significance. He thought that after all his own equation showed the electron to be just a smooth classical wave at the deepest level. In fact, he succeeded in solving the three-dimensional equation with a Coulomb potential and he found the Bohr energy levels of the hydrogen atom. Obviously, he was on the right track! This classical revival approach, however, couldn’t deal with the unpredictability of quantum mechanics, such as where a single photon—or electron—would land in a two-slit diffraction pattern.
The truth is, Schrödinger didn’t understand his own equation. Another physicist, Max Born, published a paper a few days after Schrödinger’s in which he suggested that $|\psi(x,y,z,t)|^2\,dxdydz$ was the relative probability of finding the electron in a small volume dxdydz at (x,y,z) at time t. This interpretation was based directly on the analogy with light waves and photons, and has turned out to be correct.
Notation note: $\psi$ is called the “amplitude” or sometimes the “probability amplitude”.
We have seen that electrons and photons behave in a very similar fashion: both exhibit diffraction effects, as in the double slit experiment, and both have particle-like, or quantum, behavior. As we have already discussed, we now have a framework for understanding photons. We first figure out how the electromagnetic wave propagates, using Maxwell’s equations; that is, we find E as a function of x,y,z,t. Having evaluated E(x,y,z,t), the probability of finding a photon in a given small volume of space dxdydz, at time t, is proportional to $|E(x,y,z,t)|^2\,dxdydz$, the energy density.
Born assumed that Schrödinger’s wave function for the electron corresponded to the electromagnetic wave for the photon in the sense that the square of the modulus of the Schrödinger wave amplitude at a point was the relative probability density for finding the electron at that point. So the routine is the same: for given boundary conditions and a given potential, Schrödinger’s differential equation can be solved and the wave function $\psi(x,y,z,t)$ evaluated. Then, $|\psi(x,y,z,t)|^2\,dxdydz$ gives the relative probability of finding the electron at (x,y,z) at time t.
Notice, though, that this interpretation of the wave function is not essential in finding the allowed energy levels in a given potential, such as the Bohr orbit energies, which Schrödinger derived before the physical significance of his wave function was understood.
We mentioned above that for an electron traveling along a roller coaster potential, the local wavelength is related to the momentum of the electron as it passes that point.
Perhaps slightly less obvious is that the amplitude of the wave varies: it will be largest at the tops of the hills (provided the particle has enough energy to get there) because that’s where the particle is moving slowest, and therefore is most likely to be found.
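Suppose, following de Broglie, we write down the relation between the “particle properties” of the electron and its “wave properties”:
$$E = \tfrac{1}{2}mv^2 = hf, \qquad p = mv = \frac{h}{\lambda}.$$
It would seem that we can immediately figure out the speed of the wave, just using $\lambda f = c$, say. We find:
$$\lambda f = \frac{h}{mv}\cdot\frac{\tfrac{1}{2}mv^2}{h} = \tfrac{1}{2}v.$$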
So the speed of the wave seems to be only half the speed of the electron! How could they stay together? What’s wrong with this calculation?
To answer this question, it is necessary to think a little more carefully about the wave function corresponding to an electron traveling through a cathode ray tube, say. The electron leaves the cathode, shoots through the vacuum, and impinges on the screen. At an intermediate point in this process, it is moving through the vacuum and the wave function must be nonzero over some volume, but zero in the places the electron has not possibly reached yet, and zero in the places it has definitely left.
However, if the electron has a precise energy, say exactly a thousand electron volts, it also has a precise momentum. This necessarily implies that the wave has a precise wavelength. But the only wave with a precise wavelength has the form
$$\psi(x,t) = Ae^{i(kx-\omega t)},$$
where $k = 2\pi/\lambda$ and $\omega = 2\pi f$. The problem is that this plane sine wave extends to infinity in both spatial directions, so cannot represent a particle whose wave function is nonzero in a limited region of space.
Therefore, to represent a localized particle, we must superpose waves having different wavelengths. Now, the waves representing electrons, unlike the light waves representing photons, travel at different speeds for different energies. Any intuition gained by thinking about superposing light waves of different wavelengths can be misleading if applied to electron waves!
Fortunately, there are many examples in nature of waves whose speed depends on wavelength. A simple example is water waves on the ocean. We consider waves having a wavelength much shorter than the depth of the ocean. What is the $\omega$, k relationship for these waves? We know it’s not $\omega = Ck$, with a constant C, because waves of different wavelengths move at different speeds. In fact, it’s easy to figure out the $\omega$, k relationship, known as the dispersion relation, for these waves from a simple dimensional argument. What physical parameters can the wave frequency depend on? Obviously, the wavelength $\lambda$. We will use $k = 2\pi/\lambda$ as our variable. k has dimensions $L^{-1}$.
These waves are driven by gravity, so g, with dimensions $LT^{-2}$, is relevant. Actually, that’s all. For ocean waves, surface tension is certainly negligible, as is the air density, and the water’s viscosity. You might think the density of the water matters, but these waves are rather like a pendulum, in that they are driven by gravity, so increasing the density would increase both force and inertial mass by the same amount.
For these deepwater waves, then, dimensional analysis immediately gives:
$$\omega^2 = Cgk,$$
where C is some dimensionless constant we cannot fix by dimensional argument, but actually it turns out to be 1.
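A short sketch of what this dispersion relation implies for the speed of individual crests, taking C = 1 as stated above. The value g = 9.81 m/s² and the three sample wavelengths are assumptions chosen for the example.

```python
import math

G = 9.81   # gravitational acceleration, m/s^2 (assumed value)

def omega_deep_water(k, g=G):
    """Deep-water dispersion relation with C = 1: omega^2 = g k."""
    return math.sqrt(g * k)

def crest_speed(wavelength, g=G):
    """Speed of an individual crest, omega / k, for a given wavelength in metres."""
    k = 2 * math.pi / wavelength
    return omega_deep_water(k, g) / k

# Sample wavelengths in metres (illustrative choices).
speeds = {lam: crest_speed(lam) for lam in (1.0, 100.0, 400.0)}
for lam, speed in speeds.items():
    print(f"wavelength {lam:6.1f} m -> crest speed {speed:5.2f} m/s")
```

Since $\omega/k = \sqrt{g/k} = \sqrt{g\lambda/2\pi}$, quadrupling the wavelength doubles the crest speed: long swells outrun short chop.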
To return momentarily to the electron traveling through a vacuum, it is clear physically that it must have a wave function that goes to zero far away in either direction (we’ll still work in one dimension, for simplicity). A localized wave function of this type is called a “wavepacket”. We shall discover that a wavepacket can be constructed by adding plane waves together. Now, the plane waves we add together will individually be solutions of the Schrödinger equation.
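But does it follow that the sum of such solutions of the Schrödinger equation is itself a solution to the equation? The answer is yes. In other words, the Schrödinger equation
$$i\hbar\frac{\partial\psi(x,t)}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\psi(x,t)}{\partial x^2} + V(x)\,\psi(x,t)$$
is a linear equation, that is to say, if $\psi_1(x,t)$ and $\psi_2(x,t)$ are both solutions of the equation, then so is
$$\psi(x,t) = c_1\psi_1(x,t) + c_2\psi_2(x,t),$$
where $c_1$ and $c_2$ are arbitrary constants, as is easy to check. This is called the Principle of Superposition.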
The essential point is that in Schrödinger’s equation every term contains a factor $\psi$, but no term contains a factor $\psi^2$ (or a higher power). That’s what is meant by a “linear” equation. If the equation did contain a constant term, or a term including $\psi^2$, superposition wouldn’t work: the sum of two solutions to the equation would not itself be a solution to the equation.
In fact, we have been assuming this linearity all along: when we analyze interference and diffraction of waves, we just add the two wave amplitudes at each spot. For the double slit, we take it that if the wave radiating from one slit satisfies the wave equation, then adding the two waves together will give a new wave which also satisfies the equation.
If we add together two sine waves with frequencies close together, we get beats. This pattern can be viewed as a string of wavepackets, and is useful for gaining an understanding of why the electron speed calculated from $\lambda f = c$ above is apparently half what it should be.
We use the trigonometric addition formula:
$$\sin\big((k-\Delta k)x - (\omega-\Delta\omega)t\big) + \sin\big((k+\Delta k)x - (\omega+\Delta\omega)t\big) = 2\sin(kx-\omega t)\cos(\Delta k\,x - \Delta\omega\,t).$$
This formula represents the phenomenon of beats between waves close in frequency. The first term, $\sin(kx-\omega t)$, oscillates at the average of the two frequencies. It is modulated by the slowly varying second term, often called the “envelope function”, which oscillates once over a spatial extent of order $\pi/\Delta k$. This is the distance over which waves initially in phase at the origin become completely out of phase. Of course, going a further distance of order $\pi/\Delta k$, the waves will become synchronized again.
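The addition formula and the position of the first cancellation can be spot-checked numerically. This is only a sketch: the wavenumbers, frequencies, and sample points are arbitrary illustrative values.

```python
import math

def sum_of_two_waves(x, t, k, dk, w, dw):
    """Left-hand side: two sine waves at (k - dk, w - dw) and (k + dk, w + dw)."""
    return (math.sin((k - dk) * x - (w - dw) * t)
            + math.sin((k + dk) * x - (w + dw) * t))

def carrier(x, t, k, w):
    """Rapidly oscillating factor at the average wavenumber and frequency."""
    return math.sin(k * x - w * t)

def envelope(x, t, dk, dw):
    """Slowly varying factor that shapes the beats."""
    return 2 * math.cos(dk * x - dw * t)

k, dk, w, dw = 10.0, 0.5, 7.0, 0.2                   # illustrative values
samples = [(0.1 * i, 0.37 * i) for i in range(200)]  # arbitrary (x, t) points
max_mismatch = max(
    abs(sum_of_two_waves(x, t, k, dk, w, dw)
        - carrier(x, t, k, w) * envelope(x, t, dk, dw))
    for x, t in samples
)
first_node = math.pi / (2 * dk)   # at t = 0 the envelope first vanishes here
print(max_mismatch, envelope(first_node, 0.0, dk, dw))
```

The two sides agree to rounding error, and at t = 0 the envelope passes through zero at $x = \pi/2\Delta k$, where the two waves are exactly out of phase.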
That is, beating two close frequencies together breaks up the continuous wave into a series of packets, the beats. To describe a single electron moving through space, we need a single packet. This can be achieved by superposing waves having a continuous distribution of wavelengths, or wave numbers within of order $\Delta k$, say, of k. In this case, the waves will be out of phase after a distance of order $\pi/\Delta k$, but since they have many different wavelengths, they will never get back in phase again.
The best way to understand how these waves add up is to view
my applet. It will immediately become apparent that
there are two different velocities in the dynamics: first, the velocity
with which the individual peaks move to the right, and second the
velocity at which the slowly varying envelope function—the beat pattern—moves.
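The individual peak velocity is determined by the term $\sin(kx-\omega t)$: it is $\omega/k$, and this is called the phase velocity. The speed with which the beat pattern moves, on the other hand, is determined by the term $\cos(\Delta k\,x - \Delta\omega\,t)$: this speed is $\Delta\omega/\Delta k = d\omega/dk$ for close frequencies.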
Going back one more time to the electron wavepacket, armed with this new insight, we can see immediately that the wave speed we calculated from $\lambda f = c$ was the phase velocity of the waves. The packet itself will of course move at the group velocity, and it is easy to check that this is just v: since $E = \hbar\omega$ and $p = \hbar k$, we have $d\omega/dk = dE/dp = p/m = v$.
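The same check can be done numerically. The sketch below assumes the free-particle relation $\hbar\omega = (\hbar k)^2/2m$ that follows from $E = p^2/2m$, uses rounded SI constants, and estimates $d\omega/dk$ by a central difference; the electron speed is an arbitrary illustrative value.

```python
HBAR = 1.054571817e-34   # reduced Planck constant, J s (rounded)
M_E = 9.1093837e-31      # electron mass, kg (rounded)

def omega(k, m=M_E):
    """Free-particle dispersion: hbar * omega = (hbar * k)^2 / 2m."""
    return HBAR * k**2 / (2 * m)

def phase_velocity(k):
    """Speed of individual peaks, omega / k."""
    return omega(k) / k

def group_velocity(k, rel_step=1e-6):
    """Speed of the envelope, d(omega)/dk, by central difference."""
    dk = k * rel_step
    return (omega(k + dk) - omega(k - dk)) / (2 * dk)

v = 1.0e6                # assumed electron speed, m/s
k = M_E * v / HBAR       # de Broglie wavenumber, from p = hbar * k
print(phase_velocity(k) / v, group_velocity(k) / v)
```

The phase velocity comes out as half the particle speed, reproducing the puzzle, while the group velocity equals v.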
We’ve seen how two sine waves of equal amplitude close together in frequency produce beats: if the waves are in phase at the origin, as we go along the x-axis they gradually fall out of phase, and cancel each other at a distance $x = \pi/2\Delta k$, where $2\Delta k$ is the difference in k of the two sin kx waves. (For the moment, we are ignoring the time development of these waves: we’re just looking at t = 0.) If we continue along the x-axis to $x = \pi/\Delta k$, the two waves will be back in phase again; this is the next beat.
Now, if instead of adding two waves, we add many waves, all of different k, but with the k’s taken from some small interval of size of order $\Delta k$, and all these waves are in phase at the origin, then, again, they will all stay more or less in phase for a distance of order $x = \pi/2\Delta k$. However, as we proceed past that point, the chances of them all getting back in phase again get rapidly smaller as we increase the number of different waves.
This suggests a way to construct a wavepacket: add together a lot of waves from within a narrow frequency range, and they will only be in phase in a region containing the origin.
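A minimal sketch of this construction, assuming 201 cosine waves with wavenumbers evenly spaced across the interval and equal weights (both choices are for illustration only):

```python
import math

def superposition(x, k0, dk, n_waves):
    """Average of n_waves cosines with wavenumbers evenly spaced in
    [k0 - dk, k0 + dk]; every wave has a crest at x = 0."""
    ks = [k0 - dk + 2 * dk * i / (n_waves - 1) for i in range(n_waves)]
    return sum(math.cos(k * x) for k in ks) / n_waves

k0, dk, n_waves = 20.0, 1.0, 201    # illustrative values
at_origin = superposition(0.0, k0, dk, n_waves)
far_away = superposition(10.5 * math.pi / dk, k0, dk, n_waves)
print(at_origin, far_away)
```

At the origin the waves add fully, while ten or so dephasing lengths away the sum has dropped to a few percent of that. With evenly spaced wavenumbers the waves do eventually come back into phase, at a distance of $2\pi$ divided by the spacing between neighboring wavenumbers, and that distance grows without limit as more waves are packed into the same interval.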
Adding waves in this way leads to a more general derivation of the formula $d\omega/dk$ for the group velocity. The standard approach is to replace the sum over plane waves by an integral, with the wavenumber k as the variable of integration, and the convention is to put a factor $2\pi$ in the denominator:
$$\psi(x,t) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,e^{i(kx-\omega(k)t)}\,\phi(k).$$
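Since we are constructing a wavepacket with a fairly well-defined momentum, we will take the function $\phi(k)$ to be strongly peaked at $k_0$, and going rapidly to zero away from that value, so the only significant contribution to the integral is from the neighborhood of $k_0$. Therefore, if $\omega(k)$ is reasonably smooth (which it is) it is safe to put
$$\omega(k) = \omega(k_0) + (k-k_0)\,\omega'(k_0)$$
in the exponential.
This gives
$$\psi(x,t) = e^{i(k_0x-\omega(k_0)t)}\int_{-\infty}^{\infty}\frac{dk}{2\pi}\,e^{i(k-k_0)(x-\omega'(k_0)t)}\,\phi(k).$$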
The first term just represents a single wave at $k_0$, and the peaks move at the phase velocity
$$v_{\text{phase}} = \frac{\omega(k_0)}{k_0}.$$
The second term, the integral, is the envelope function: here x only appears in the combination
$$x - \omega'(k_0)\,t,$$
so the envelope, and hence the wavepacket, moves to the right at the group velocity:
$$v_{\text{group}} = \omega'(k_0) = \left.\frac{d\omega}{dk}\right|_{k=k_0}.$$
Note that if the next term in the Taylor expansion of $\omega(k)$ is also included, that amounts to adding wavepackets with slightly different group velocities together, and the initial (total) wavepacket will gradually widen.
Fortunately, there is a simple explicit mathematical realization of the addition of plane waves to form a localized function: the Gaussian wavepacket,
$$\psi(x,t=0) = Ae^{ik_0x}e^{-x^2/2\Delta^2},$$
where $p_0 = \hbar k_0$. For this wavepacket to represent one electron, with the probability of finding the electron in a small section of length dx at x equal to $|\psi|^2dx$, and the total probability of finding the electron somewhere equal to one, the constant A is uniquely determined (apart from a possible phase multiplier $e^{i\delta}$, which would not affect the probability).
Using the standard result
$$\int_{-\infty}^{\infty}e^{-ax^2}\,dx = \sqrt{\frac{\pi}{a}},$$
we find $|A|^2 = (\pi\Delta^2)^{-1/2}$, so
$$\psi(x,t=0) = \frac{1}{(\pi\Delta^2)^{1/4}}\,e^{ik_0x}e^{-x^2/2\Delta^2}.$$
But how do we construct this particular wavepacket by superposing plane waves? That is to say, we need a representation of the form:
$$\psi(x) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,e^{ikx}\,\phi(k).$$
The function $\phi(k)$ represents the weighting of plane waves in the neighborhood of wavenumber k. This is a particular example of a Fourier transform; we will be discussing the general case in detail a little later in the course. Note that if $\phi(k)$ is a bounded function, any particular k value gives a vanishingly small contribution: the plane-wave contribution to $\psi(x)$ from a range dk is $e^{ikx}\phi(k)\,dk/2\pi$. In fact, $\phi(k)$ is given in terms of $\psi(x)$ by
$$\phi(k) = \int_{-\infty}^{\infty}dx\,e^{-ikx}\,\psi(x).$$
It is perhaps worth mentioning at this point that this can be understood qualitatively by observing that the plane wave prefactor $e^{-ikx}$ will interfere destructively with all plane wave components of $\psi(x)$ except that of wavenumber k. There it may at first appear that the contribution is infinite, but recall that, as stated above, any particular k component has a vanishingly small weight. In fact, this is the right answer, as we shall show in more convincing fashion later.
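In the present case, the above handwaving argument is unnecessary, because both the integrals can be carried out exactly, using the standard result:
$$\int_{-\infty}^{\infty}e^{-(ax^2+bx)}\,dx = e^{b^2/4a}\sqrt{\frac{\pi}{a}},$$
giving
$$\phi(k) = (4\pi\Delta^2)^{1/4}\,e^{-\Delta^2(k-k_0)^2/2}.$$
Putting this back in the integral for $\psi(x)$ shows that the integral equations are consistent.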
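As a cross-check on the closed form, the transform integral can be evaluated numerically. The sketch below assumes the normalized packet $\psi(x) = (\pi\Delta^2)^{-1/4}e^{ik_0x}e^{-x^2/2\Delta^2}$ and compares a midpoint-rule estimate of $\int dx\,e^{-ikx}\psi(x)$ with $(4\pi\Delta^2)^{1/4}e^{-\Delta^2(k-k_0)^2/2}$. The parameter values, the integration range, and the name `width` (standing for $\Delta$) are choices made for this example.

```python
import cmath
import math

def psi(x, k0, width):
    """Normalized Gaussian packet at t = 0; width plays the role of Delta."""
    norm = (math.pi * width**2) ** -0.25
    return norm * cmath.exp(1j * k0 * x) * math.exp(-x**2 / (2 * width**2))

def phi_numeric(k, k0, width, half_range=12.0, n=4000):
    """Midpoint-rule estimate of phi(k) = integral dx exp(-i k x) psi(x)."""
    limit = half_range * width        # the integrand is negligible beyond this
    dx = 2 * limit / n
    total = 0j
    for i in range(n):
        x = -limit + (i + 0.5) * dx
        total += cmath.exp(-1j * k * x) * psi(x, k0, width)
    return total * dx

def phi_analytic(k, k0, width):
    """Closed form: (4 pi Delta^2)^(1/4) exp(-Delta^2 (k - k0)^2 / 2)."""
    return (4 * math.pi * width**2) ** 0.25 * math.exp(-width**2 * (k - k0)**2 / 2)

k0, width = 5.0, 1.5                  # illustrative values
mismatch = max(
    abs(phi_numeric(k, k0, width) - phi_analytic(k, k0, width))
    for k in (4.0, 5.0, 5.7)
)
print(mismatch)
```

The numerical and analytic values agree to rounding error at each sampled k, and the numerical result is real to that accuracy, as the closed form requires.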
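Note the normalization integrals in x-space and k-space are:
$$\int_{-\infty}^{\infty}|\psi(x)|^2\,dx = 1, \qquad \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,|\phi(k)|^2 = 1.$$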
The physical significance of the second equation above is that if the wavepacket goes through a diffraction grating so that the different k-components are dispersed in different directions, like the colors in white light, and a detector is arranged to register the electron if it has wavenumber between k and k + dk, the probability of finding it in that wavenumber range is $|\phi(k)|^2\,dk/2\pi$.
It is clear from the expressions for $\psi(x)$ and its Fourier transform $\phi(k)$ above that the spreading of the wave function in x-space is inversely related to its spreading in k-space: the x-space wavefunction has spread $\sim\Delta$, the k-space wavefunction $\sim 1/\Delta$. This is perhaps the simplest example of Heisenberg’s famous Uncertainty Principle: in quantum mechanics, both the position and momentum of a particle cannot be known precisely at the same moment; the more exactly one is specified the less well the other is known. This is an inevitable consequence of the wave nature of the probability distribution. As we have already seen, a particle with an exact momentum has a wave of specific wavelength, and the only such wave is a plane wave extending from minus infinity to infinity, so the position of the particle is completely undetermined. A particle with precisely defined position is described by a wavepacket having all wavelengths included with equal weight, so the momentum is completely undefined. We shall give more examples of the Uncertainty Principle, of efforts to evade it and of its uses in estimates, in the next lecture.
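The standard notation for the expectation value of an operator in a given quantum state is
$$\langle x\rangle = \int_{-\infty}^{\infty}x\,|\psi(x)|^2\,dx.$$
In other words, $\langle x\rangle$ would be the statistical average outcome of making many measurements of x on identically prepared systems all in the quantum state $\psi(x)$ (ignoring the time dependence here for the sake of simplicity).
When we talk about the “uncertainty” in x, we mean in quantum mechanics the root mean square deviation in the measurements. This is usually written $\Delta x$ (unfortunate in view of our, also standard, use of $\Delta$ in the Gaussian function above, so the reader should watch carefully!).
Therefore
$$\Delta x = \sqrt{\int_{-\infty}^{\infty}\left(x-\langle x\rangle\right)^2|\psi(x)|^2\,dx}.$$
For our wavepacket, $\langle x\rangle = 0$. It is easy to check that
$$\Delta x = \frac{\Delta}{\sqrt{2}}, \qquad \Delta k = \frac{1}{\Delta\sqrt{2}}, \qquad\text{giving}\qquad \Delta x\,\Delta p = \frac{\hbar}{2}.$$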
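The root mean square spreads can be estimated numerically as well. This sketch assumes the Gaussian packet, for which $|\psi(x)|^2$ is proportional to $e^{-x^2/\Delta^2}$ and $|\phi(k)|^2$ to $e^{-\Delta^2(k-k_0)^2}$, and computes the mean and rms deviation of each distribution with a midpoint rule. The parameter values and the name `width` (standing for the Gaussian parameter $\Delta$) are choices made for the example.

```python
import math

def mean_and_rms(density, center, half_range, n=20000):
    """Midpoint-rule mean and rms deviation of a density that need not be
    normalized, sampled on [center - half_range, center + half_range]."""
    dx = 2 * half_range / n
    xs = [center - half_range + (i + 0.5) * dx for i in range(n)]
    ws = [density(x) for x in xs]
    norm = sum(ws)
    mean = sum(x * w for x, w in zip(xs, ws)) / norm
    variance = sum((x - mean) ** 2 * w for x, w in zip(xs, ws)) / norm
    return mean, math.sqrt(variance)

k0, width = 5.0, 1.5     # illustrative values; width is the Gaussian parameter

def x_density(x):
    """Proportional to |psi(x)|^2."""
    return math.exp(-x**2 / width**2)

def k_density(k):
    """Proportional to |phi(k)|^2."""
    return math.exp(-width**2 * (k - k0) ** 2)

mean_x, delta_x = mean_and_rms(x_density, 0.0, 10 * width)
mean_k, delta_k = mean_and_rms(k_density, k0, 10 / width)
product = delta_x * delta_k
print(delta_x, delta_k, product)
```

The product of the two rms spreads comes out as 1/2 whatever width is chosen, so that multiplying by $\hbar$ to convert wavenumber to momentum gives $\hbar/2$.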
©
2007 Michael Fowler