100 Years of Quantum Physics: Rise of the Matrix (1925)

Niels Bohr’s atom, by late 1925, was a series of kludges cobbled together into a Rube Goldberg contraption. On the one hand, there were the Bohr-Sommerfeld quantization conditions that let Bohr’s originally circular orbits morph into ellipses like planets in the solar system. On the other hand, there was Pauli’s exclusion principle, which partially explained the building up of electron orbits in many-electron atoms but which struck many as an ad hoc rule.

The time was ripe for a new perspective. Enter the wunderkind, Werner Heisenberg.

Heisenberg’s Trajectory

Werner Heisenberg (1901 – 1976) was the golden boy—smart, dashing, ambitious. He excelled at everything he did and was a natural leader among his group of young friends. He entered the University of Munich in 1920 at the age of 19 to begin working towards his doctorate in mathematics, but he quickly became entranced with an advanced seminar course given by Arnold Sommerfeld (1868 – 1951) on quantum mechanics. His studies under Sommerfeld advanced quickly, and he was proficient enough to be “lent out” to the group of Max Born and David Hilbert at the University of Göttingen for the winter semester of 1922–1923, when Sommerfeld was on sabbatical at the University of Wisconsin, Madison, in the United States. Born was impressed with the young student and promised him a post-doc position upon his graduation with a doctoral degree in theoretical physics the next year (when Heisenberg would be just 22 years old).

Unfortunately, his brilliantly ascending career ran headlong into “Willy” Wien, who had won the Nobel Prize in 1911 for his displacement law of black-body radiation. Wien was a hard-nosed experimentalist who had little patience with the speculative flights of theoretical physics. Heisenberg, in contrast, had little patience with the mundane details of experimental science. The two were heading for a collision.

The collision came during the oral examination for Heisenberg’s doctoral degree. Wien, determined to put Heisenberg in his place, opened with a difficult question about experimental methods. Heisenberg could not answer, so Wien asked a slightly less difficult but still detailed question that Heisenberg also could not answer. The examination went on like this until finally Wien asked Heisenberg to derive the resolving power of a simple microscope. Heisenberg was so flustered by this time that he could not do even that. Wien, in disgust, turned to Sommerfeld and pronounced a failing grade for Heisenberg. After Heisenberg stepped out of the room, the professors wrangled over the single committee grade that would need to be recorded. Sommerfeld’s top grade for Heisenberg’s mathematical performance and Wien’s bottom grade for his experimental performance led to the compromise grade of a “C” for the exam—the minimum grade sufficient to pass.

Heisenberg was mortified. Accustomed always to excelling and being lauded for his talents, Heisenberg left town that night, taking the late train to Göttingen where a surprised Born found him outside his office early the next morning—fully two months ahead of schedule. Heisenberg told him everything and asked if Born would still have him. After learning more about Wien’s “ambush”, Born assured Heisenberg that he still had a place for him.

Heisenberg was so successful at Göttingen that, when Born planned to spend a sabbatical year at MIT in the United States in 1924–1925, Heisenberg was “lent out” to Niels Bohr in Copenhagen. While there, Heisenberg, Bohr, Pauli and Kramers had intense discussions about the impending crisis in quantum theory. Bohr was fully aware of the precarious patches that made up the quantum theory of the many-electron atom, and the four physicists attempted to patch it yet again with a theoretical effort led by Kramers to reconcile the optical transitions in atomic spectra. But no one was satisfied, and the theory had serious internal inconsistencies, not the least of which was the need to sacrifice the sacrosanct principle of conservation of energy.

Through it all, Heisenberg was thrilled by his deep involvement in the most fundamental questions of physics of the day and was even more thrilled by his interactions with the great minds he found in Copenhagen. When he returned to Göttingen on April 27, 1925, the arguments and inconsistencies were still ringing in his head, and he infected the Göttingen group with the challenge, especially Max Born and Pascual Jordan.

Little headway could be made until Heisenberg had a serious attack of hay fever that sent him, on June 7, for respite to the remote island of Helgoland in the North Sea, far off the coast near Bremerhaven. The trip cleared Heisenberg’s head—literally and figuratively—as he had time to come to grips with the core difficulties of quantum theory.

Trajectory’s End

The Mythology of Physics recounts the tale of when Heisenberg had his epiphany, watching from the beach as the sun rose over the sea. The repeated retelling has solidified the moment into revealed “truth”, but the origins are probably more prosaic. Strip a problem bare of all its superficial coverings and what remains must be the minimal set of what can be known. Yet to do so requires courage, for much of that covering is established dogma, embedded so deeply in the thought of the day that angry reactions must be expected.

Map of Heligoland, Germany
Fig. 1 Heligoland, Germany. (From Google Maps and Wikipedia.)

At some moment, Heisenberg realized that the superficial covering of atomic theory was the slavish devotion to the electron trajectory—to the Bohr-Sommerfeld electron orbits. Ever since Kepler, the mental image of masses in orbit around their force center had dominated physical theory. Quantum theory likewise was obsessed with images of trajectories—the obsession persists to this day in the universal logo of atomic energy. Heisenberg now rejected this image as unknowable and hence not relevant for a successful theory. But if electron orbits were out, what was the minimal set of what could be known to be true? Heisenberg decided that it was simply the wavelengths and intensities of light absorbed and emitted by atoms. But what then? How do you create a theory constructed on transition energies and intensities alone? The epiphany was the answer—construct a dynamics by which the quantum system proceeds step by step, transition by transition, while retaining the sacrosanct energy conservation that had been discarded by Kramers’ unsuccessful theory.

The result, written after his return to Göttingen, was Heisenberg’s paper, submitted on July 29, 1925 to Zeitschrift für Physik and titled Über quantentheoretische Umdeutung kinematischer und mechanischer Beziehungen (On the quantum-theoretical reinterpretation of kinematic and mechanical relations).

Heisenberg 1925 Over quantum theoretical meaning of kinematic and mechanical relationships
Fig. 2 The heading of Heisenberg’s 1925 Zeitschrift für Physik article. The abstract reads: “This work seeks to find fundamental principles for a quantum-theoretical mechanics that is based exclusively on relationships among quantities that are observable in principle.” [1]

Heisenberg begins with the fundamental energy relationship between the frequency of light and the energy difference in a transition

Heisenberg 1925 Zeitschrift fur Physik paper on transition energies
Fig. 3 Dynamics emerges from the transitions among different energy states in the atom [1].

His goal is to remove the electron orbit from the theory, yet positions cannot be removed entirely, so he takes the step of transforming position into a superposition of amplitudes with frequencies related to the optical transitions.

Heisenberg replacing the electron orbits with their Fourier coefficients based on transition frequencies.
Fig. 4 Replace the electron orbits with their Fourier coefficients based on transition frequencies [1].

Armed with the amplitude coefficients and the transition frequencies, he constructs sums of transitions that proceed step by step between initial and final states.

Heisenberg connecting initial and final states with series of steps that conserve energy.
Fig. 5 Consider all the possible paths between initial and final states that obey energy conservation [1].

After introducing the electric field, Heisenberg calculates the polarizability of the atom, the induced moment, using Kramers’ dispersion formula combined with his new superposition.

Heisenberg transition amplitude based on sums over possible energy steps
Fig. 6 Transition amplitude between initial and final states based on a series of energy-conserving transition steps [1].

Heisenberg applied his new theoretical approach to one-dimensional quantum systems, using as an explicit example the anharmonic oscillator, and it worked! Heisenberg had invented a new theoretical approach to quantum physics that relied only on transition frequencies and amplitudes—only what could be measured without any need to speculate on what types of motions electrons might be executing. Heisenberg published his new theory on his own, as sole author befitting his individual breakthrough. Yet it was done under the guidance of his supervisor Max Born, who recognized something within Heisenberg’s mathematics.

The Matrix

Heisenberg’s derivations involved numerous summations as amplitudes multiplied amplitudes in complicated sequences. The mathematical steps themselves were straightforward—just products and sums—but the numbers of permutations were daunting, and their sequential order mattered, requiring skill and care not to miss terms or to get minus signs wrong.

Yet Born recognized within Heisenberg’s mathematics the operations of matrix multiplication. The ordered sums over products of amplitudes were exactly what one obtains from the row-by-column rule for multiplying matrices, and it was well known that the order of matrix multiplication matters, where a*b ≠ b*a in general. With his assistant Pascual Jordan, the two reworked Heisenberg’s paper in the mathematical language of matrices, submitting their “mirror” paper to Zeitschrift on Sept. 27, 1925. Their title was prophetic: Towards Quantum Mechanics (Zur Quantenmechanik). This was the first time that the phrase “quantum mechanics” was used to encompass all of the widely varying aspects of quantum systems.

Born and Jordan 1925 Zur Quantenmechanik
Fig. 7 The header for Born and Jordan’s reworking of Heisenberg’s paper into matrix mathematics [2].

In the abstract, they state:

The approaches recently put forward by Heisenberg (initially for systems with one degree of freedom) are developed into a systematic theory of quantum mechanics. The mathematical tool is matrix calculus. After this is briefly outlined, the mechanical equations of motion are derived from a variational principle, and the proof is carried out that, on the basis of Heisenberg’s quantum condition, the energy theorem and Bohr’s frequency condition follow from the mechanical equations.  Using the example of the anharmonic oscillator, the question of the uniqueness of the solution and the significance of the phases in the partial oscillations are discussed.  The conclusion describes an attempt to incorporate the laws of the electromagnetic field into the new theory.

Born and Jordan begin by creating a matrix form for the Hamiltonian subject to Hamilton’s dynamical equations

Born Hamiltonian and Hamilton's equations
Fig. 8 Defining the Hamiltonian with matrix operators [2].

Armed with matrix quantities for position and momentum, Born and Jordan construct the commutator of p with q to arrive at one of the most fundamental quantum relationships: the non-zero difference in the permuted products related to Planck’s constant. This commutation relationship would become the foundation for many quantum theories to come.

Born 1925 matrix commutator
Fig. 9 Page 871 of Born and Jordan’s 1925 Zeitschrift article that introduces the commutation relationship between p and q [2].
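The relation is easy to see in action today. Here is a minimal numerical sketch (my own illustration, not drawn from the 1925 papers themselves) that builds truncated matrices for q and p in the familiar harmonic-oscillator basis and checks that qp − pq = iħ·1 everywhere except at the truncation edge; the units and basis size are illustrative assumptions.

```python
import numpy as np

# Truncated matrix mechanics: build q and p from the harmonic-oscillator ladder operator
# and verify the Born-Jordan commutation relation q p - p q = i*hbar*1 (away from the edge).
N = 60                                        # basis size (truncation is the only approximation)
a = np.diag(np.sqrt(np.arange(1, N)), k=1)    # lowering operator in the energy basis
hbar = m = omega = 1.0                        # illustrative units
q = np.sqrt(hbar / (2 * m * omega)) * (a + a.T)
p = 1j * np.sqrt(hbar * m * omega / 2) * (a.T - a)
comm = q @ p - p @ q
print(np.allclose(comm[:N-1, :N-1], 1j * hbar * np.eye(N - 1)))   # True
```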

As Heisenberg had done in his paper, Born and Jordan introduce the electric field of light to derive the dispersion of an atomic gas.

Max Born quantum transition amplitude from matrix elements
Fig. 10 Expression for the dispersion of light in an atomic gas [2].

The Born and Jordan paper appeared in the November issue of Zeitschrift für Physik, although a pre-print was picked up in England by Paul Dirac, who was working towards his doctoral degree under the mentorship of Ralph Fowler (1889 – 1944) at Cambridge. Dirac was deeply knowledgeable in classical mechanics, and he recognized as soon as he saw it that the new quantum commutator was intimately connected to a quantity in classical mechanics known as a Poisson bracket. The Poisson bracket is part of Hamiltonian mechanics that defines how two variables, known as conjugate variables, are connected. For instance, the Poisson bracket of x with px is non-zero, meaning that these are conjugate variables, while the Poisson bracket of x with py is zero, meaning that these variables are fully independent. Conjugate variables are not “dependent” in an algebraic sense, but are linked through the structure of Hamilton’s equations—they are the “p’s and q’s” of phase space.

Dirac 1926 Proceedings of the Royal Society comparison of Poisson bracket to quantum commutator.
Fig. 11 The Poisson bracket in Dirac’s paper submitted on Nov. 7, 1925 [3].

Dirac submitted a paper on Nov. 7, 1925 to the Proceedings of the Royal Society of London where he showed that the Heisenberg commutator (a quantum quantity) is directly proportional to the Poisson bracket (the classical quantity), with a proportionality factor that depends on Planck’s constant.

Dirac quantum commutation relationship
Fig. 12 Dirac relating the quantum commutator to the classical Poisson (Jacobi) bracket [3].
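The classical side of Dirac’s comparison is easy to spell out. The sketch below (a modern illustration, not Dirac’s own notation) computes Poisson brackets symbolically for the phase-space pairs (x, p_x) and (y, p_y).

```python
import sympy as sp

# The Poisson bracket over the conjugate pairs (x, p_x) and (y, p_y)
x, y, px, py = sp.symbols('x y p_x p_y')
pairs = [(x, px), (y, py)]

def poisson(f, g):
    return sum(sp.diff(f, q) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, q) for q, p in pairs)

print(poisson(x, px))   # 1 -> x and p_x are conjugate variables
print(poisson(x, py))   # 0 -> x and p_y are fully independent
```

Dirac’s observation was that replacing this bracket by the commutator divided by iħ carries the whole structure of Hamiltonian mechanics over into the new quantum mechanics.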

The Drei-Männer Quantum Mechanics Paper: Born, Heisenberg, and Jordan

Meanwhile, back in Göttingen, the three quantum physicists Born, Heisenberg and Jordan now combined forces to write a third foundational paper that established the full range of the new matrix mechanics. Heisenberg’s first paper had been the insight. Born and Jordan’s following paper had re-expressed Heisenberg’s formulas into matrix algebra. But both papers had used simple one-dimensional problems as test examples. Working together, they extended the new quantum mechanics to systems with many degrees of freedom.

Quantum Mechanics. Born, Heisenberg and Jordan three-man paper heading.
Fig. 13 Header for the completed new theory on quantum mechanics by Born, Heisenberg and Jordan [4].

With this paper, the matrix properties of dynamical variables are defined and used in their full form.

Quantum Mechanics. Born, Heisenberg and Jordan. View of a matrix.
Fig. 14 An explicit form for a dynamical matrix in the “three-man” paper [4].

With the theory out in the open, Pauli in Hamburg and Dirac at Cambridge used the new quantum mechanics to derive the transition energies of hydrogen, while Lucy Mensing and J. Robert Oppenheimer in Göttingen extended it to the spectra of more complicated molecules.
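The whole program can be rerun today in a few lines: construct matrices for q and p, assemble a Hamiltonian, and read transition energies off the differences of its eigenvalues. The sketch below uses a quartic anharmonic oscillator as a stand-in (the 1925 papers worked through their own specific anharmonic examples); the coupling strength and basis size are illustrative assumptions.

```python
import numpy as np

# Matrix mechanics in practice: H = p^2/2 + q^2/2 + lam*q^4 in a truncated oscillator basis
# (hbar = m = omega = 1). Transition energies are differences of eigenvalues.
N, lam = 200, 0.1
a = np.diag(np.sqrt(np.arange(1, N)), k=1)     # lowering operator
q = (a + a.T) / np.sqrt(2)
p = 1j * (a.T - a) / np.sqrt(2)
H = 0.5 * (p @ p + q @ q).real + lam * np.linalg.matrix_power(q, 4)
E = np.linalg.eigvalsh(H)
print(np.diff(E[:6]))    # lowest transition energies -- no longer equally spaced
```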

Open Issues

Heisenberg’s matrix mechanics might have taken exclusive hold of the quantum theory community, and we might all be using matrices today to perform our calculations. But within a few months of the success of matrix mechanics, an alternative quantum theory would be proposed by Erwin Schrödinger based on waves, a theory that came to be called wave mechanics. There was a minor battle fought over matrix mechanics versus wave mechanics, but in the end, Bohr compromised with his complementarity principle, allowing each to stand as equivalent viewpoints of quantum phenomena (but more about Schrödinger and his waves in my next blog post).

Further Reading

For more stories about the early days of quantum physics read Chapter 8, “On the Quantum Footpath” in D. D. Nolte, “Galileo Unbound: A Path Across Life, the Universe and Everything” (Oxford University Press, 2018)

For definitive accounts of Heisenberg’s life see D. Cassidy “Beyond Uncertainty: Heisenberg, Quantum Physics and the Bomb” (Bellevue Press, 2009)

References

[1] Heisenberg, W. (1925). “Über quantentheoretische Umdeutung kinematischer und mechanischer Beziehungen”. Zeitschrift für Physik, 33(1), 879–893.

[2] Born, M., & Jordan, P. (1925). “Zur Quantenmechanik”. Zeitschrift für Physik, 34(1), 858–888.

[3] Dirac, P. A. M. (1925). “The fundamental equations of quantum mechanics”. Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character, 109(752), 642–653.

[4] Born, M., Heisenberg, W., & Jordan, P. (1926). “Zur Quantenmechanik. II”. Zeitschrift für Physik, 35(8–9), 557–615.

[5] Dirac, P. A. M. (1926). “Quantum mechanics and a preliminary investigation of the hydrogen atom”. Proceedings of the Royal Society of London. Series A, 110(755), 561–579.

[6] Pauli, W. (1926). “The hydrogen spectrum from the viewpoint of the new quantum mechanics”. Zeitschrift für Physik, 36(5), 336–363.

[7] Mensing, L. (1926). “Die Rotations-Schwingungsbanden nach der Quantenmechanik”. Zeitschrift für Physik, 36(11), 814–823.

[8] Born, M., & Oppenheimer, J. R. (1927). “Zur Quantentheorie der Molekeln”. Annalen der Physik, 389(20), 457–484.

Holonomy, Parallel Transport and Foucault's Pendulum

Whole World Holonomy

Once upon a time the World was flat, and sailors feared falling off the edge if they sailed too far…at least that is how the fairy tale is told.

Flat World was a simple World.  Its inhabitants, the Flatworlders, carried vectors about with them, vectors that were thick bundles of arrows, bound in a sheaf, that tended to point in a fixed direction.  In Flat World, once a vector had been oriented one way, it kept that orientation, no matter what path the Flatworlder took, always being exactly the same when it returned to its starting point.

Then came Cristobal Colon, and Carl Gauss and Bernhard Riemann, and Flat World became Round World, and the carrying around of vectors was no longer such a simple thing.  The surprising part was, when vectors returned to their starting point, even if they were carried with the greatest of care never to turn them or twist them, carrying them always parallel to themselves, they were never the same.  At the end of every different journey, they pointed in a different direction. 

How can this be?  How can a vector, that was transported always parallel to itself, end up pointing in a different direction when it was carried around a closed loop?

The answer is Holonomy!

Great Circle Routes

We were taught in Euclidean Geometry that the shortest path between two points is a straight line.  This lesson was fine for Flat World.  But now that we live in Round Riemann World, we know that the shortest distance between two points on the surface of the Earth is along a great circle route.  All great circles are Earth circumferences.  They are defined by three points: the starting point, the ending point and the center of the Earth.  The three points define a plane that intersects the surface of the Earth along a great circle.

Fig. 1. Great circle route from Rio de Janeiro, Brazil, to Seoul, Korea on a Winkel Tripel projection of the Earth.

The practical demonstration that great circles are shortest paths can be done with a string and a globe.  Pick any two points on the surface of the globe and stretch the string tightly between them so that the string lies taut on the surface of the globe.  If the two points are far enough apart (but not so far that they are nearly antipodal), then the string takes on the arc of a circle centered on the center of the globe.

Shortest paths on a curved surface, like a globe, are also known as geodesics.  And now that we have our geodesic paths for the Earth, we can start playing around with the parallel transportation of vectors.

Parallel Transport

The game is simple.  Start at the Equator with a vector that is pointing due North.  Now, move the vector due North along a line of longitude (a geodesic) until you hit the North Pole.  All the while, as you carry it along, the vector continues pointing due North on the surface of the Earth, never deviating.  When you reach the North Pole, take a sharp turn right by 90 degrees, being careful not to twist or turn the vector in any way.  Now, carry the vector with you due South along a new line of longitude until you hit the Equator, where again you take a sharp right turn by 90 degrees, and once again careful not to twist or turn the vector in any way.  Then return to your starting point. 

What you find, when you return home, is that the vector has turned through an angle of 90 degrees.   Despite how careful you were never to twist or turn it—it has turned nonetheless. This is holonomy.

Fig. 2. Parallel transport around a geodesic triangle. Start at the equator with the vector pointing due North. Transport it without twisting it to the North Pole. Take a right turn, careful not to twist the vector, and proceed to the equator, where you take another right turn and return to the starting point. Without ever twisting the vector, it has nonetheless rotated by 90 degrees over the closed path.

Holonomy

Holonomy, or more specifically Riemann holonomy, is the intrinsic twisting of vectors as they are transported parallel to themselves around a closed loop on a curved surface.  The twist is something “outside” of the local transport of the vector.  In the case of the Earth, this “outside” element is the curvature of the Earth’s surface.  Locally, the Earth looks flat, and the vector is moved so that it always points in the same direction.  But globally, the vector can slowly tilt as it moves along a geodesic path.

For the example of parallel transport on the Earth, look at the vector at its starting point and the vector when it reaches the North Pole.  Clearly the vector has rotated by 90 degrees, despite, or actually because of, its perfect Northward orientation along the line of longitude.

In this specific example, the solid angle of the closed path is a perfect eighth part of the total 4π solid angle of the surface, or 4π/8 = π/2.  The angle by which the vector rotated on this path is precisely the same as the subtended solid angle.  This is no coincidence—it is the consequence of the Gauss-Bonnet Theorem.

This theorem holds for any arbitrary closed path because any path can be viewed as if it were made up of lots of little segments of great circles.  You can even pick a path that crosses itself, taking care to keep track of minus signs as solid angles add and subtract.  For example, a perfect figure eight, if followed around smoothly, has no holonomy, because the two halves cancel.

Here, alas, we must leave our simple geometric games with great circles on the globe. To delve deeper into the origins of holonomy, it is time to turn to differential geometry.

Differential Geometry

Differential geometry is the application of differential calculus to geometry, in particular to geometric subspaces, also known as manifolds.  A good example is the surface of a sphere embedded in three-dimensional space.  The surface of a sphere has intrinsic curvature, where the curvature is defined as the inverse of the radius of the sphere.

One of the cornerstones of differential geometry is the operator known as the covariant derivative.  It is defined as

∇_b V^a = ∂_b V^a + Γ^a_bc V^c

This covariant derivative is a master at bookkeeping.  Notice how the item on the left has an a-up and a b-down, as does the first term on the right.  And if we think of the c-up and c-down as canceling in the last term, then again we have a-up and b-down.  The small-case “del” (∂) on the right is the usual partial derivative of V^a with respect to x^b.  The second term on the right takes care of the intrinsic “twist” of the coordinates caused by curvature.  As a bookkeeping device, covariant derivatives take care of the variation of a function as well as of the underlying variation of the coordinate frame.  (The covariant derivative was crucial for Einstein when he was developing his General Theory of Relativity applied to gravity.)

One of the most important discoveries in differential geometry was made by Tullio Levi-Civita in 1917. Years before, Levi-Civita had helped develop tensor calculus with his advisor Gregorio Ricci-Curbastro at the University of Padua, and they published a seminal review paper in 1901 that Marcel Grossmann brought to Einstein’s attention when he was struggling to reconcile special relativity with gravity. Einstein and Levi-Civita corresponded in a series of famous letters in early 1915 as Einstein zeroed in on the final theory of General Relativity. Interestingly, that correspondence had as profound an effect on Levi-Civita as it had on Einstein. Once Einstein’s new theory was published in late 1915, Levi-Civita returned to tensor calculus to answer a critical question: What was the geometric meaning of the covariant derivative that was so crucial to the new theory of gravity?

To answer this question, Levi-Civita defined a new form of parallelism that held for vector fields on curved manifolds. The new definition stated that during the parallel transport of a vector along a path, its covariant derivative along that path vanishes. This definition is contained in the expression

u^b ∇_b V^a = 0

where the u^b on the left is the tangent vector of the path. Expanding this expression gives

u^b ∂_b V^a + Γ^a_bc u^b V^c = 0

and simplifying the first term yields the equation for parallel transport

dV^a/ds + Γ^a_bc (dx^b/ds) V^c = 0

For the surface of the sphere, with the two variables θ and φ, this leads to two coupled ODEs

dV^θ/ds + Γ^θ_φφ (dφ/ds) V^φ = 0
dV^φ/ds + Γ^φ_θφ [ (dθ/ds) V^φ + (dφ/ds) V^θ ] = 0

The Christoffel symbols for a spherical surface are

Γ^θ_φφ = −sin θ cos θ        Γ^φ_θφ = Γ^φ_φθ = cot θ

yielding the equations for parallel transport of a vector on the Earth

dV^θ/ds − sin θ cos θ (dφ/ds) V^φ = 0
dV^φ/ds + cot θ [ (dθ/ds) V^φ + (dφ/ds) V^θ ] = 0

These equations are all you need to calculate how much a vector rotates for any path taken across the face of the Earth.
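They can also be integrated numerically for any route. The sketch below (my own illustration, assuming the standard Christoffel symbols of the unit sphere quoted above) transports a vector around the geodesic triangle of Fig. 2, with the polar corner rounded off at a small colatitude so the coordinates stay regular, and recovers the 90-degree rotation.

```python
import numpy as np

def parallel_transport(path, V0, n=100_000):
    """Transport coordinate components V = (V^theta, V^phi) along path(s) -> (theta, phi), s in [0, 1],
    by stepping dV^a = -Gamma^a_bc dx^b V^c with the unit-sphere Christoffel symbols."""
    V = np.array(V0, dtype=float)
    pts = [path(s) for s in np.linspace(0.0, 1.0, n + 1)]
    for (th1, ph1), (th2, ph2) in zip(pts[:-1], pts[1:]):
        th, dth, dph = 0.5 * (th1 + th2), th2 - th1, ph2 - ph1
        V += [np.sin(th) * np.cos(th) * dph * V[1],
              -(np.cos(th) / np.sin(th)) * (dth * V[1] + dph * V[0])]
    return V

eps = 1e-2                                     # round the polar corner off at this colatitude
def octant(s):
    if s < 0.25:  return np.pi/2 + (eps - np.pi/2) * (s / 0.25), 0.0            # up the prime meridian
    if s < 0.50:  return eps, (np.pi/2) * ((s - 0.25) / 0.25)                   # around the pole
    if s < 0.75:  return eps + (np.pi/2 - eps) * ((s - 0.50) / 0.25), np.pi/2   # down to the equator
    return np.pi/2, (np.pi/2) * (1.0 - (s - 0.75) / 0.25)                       # west along the equator

V = parallel_transport(octant, [-1.0, 0.0])    # start due North at the equator (North = -theta direction)
north, east = -V[0], np.sin(np.pi/2) * V[1]    # physical components back at the starting point
print(np.degrees(np.arctan2(east, north)))     # ~90 degrees, matching the solid angle of the octant
```

The same routine handles a line of latitude: passing path = lambda s: (np.radians(60.0), 2 * np.pi * s), a full circuit of the 30th parallel (colatitude 60°), returns a rotation of 180 degrees, matching 2π sin λ.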

Example: Parallel Transport Around a line of Latitude

One particularly simple path is a line of latitude. Lines of latitude are not geodesics (except for the Equator) and hence there will be a contribution to the twist of the vector caused by the curvature of the Earth. For lines of latitude (θ fixed), the equations of Parallel Transport become

dV^θ/dφ + Γ^θ_φφ V^φ = 0
dV^φ/dφ + Γ^φ_θφ V^θ = 0

The line element is

ds² = R²(dθ² + sin²θ dφ²)

leading to the flow equations

dV^θ/dφ = sin θ cos θ V^φ
dV^φ/dφ = −cot θ V^θ

where θ is fixed (in this case the angle measured from the North Pole) and φ is the independent variable.

Fig. 3. A vector that originally points due East on the 30th Parallel is rotated by 90-degrees during parallel transport, ending by pointing due South.

These coupled ODEs are recognized as simple oscillations with the flow

d²V^θ/dφ² = −ω² V^θ
d²V^φ/dφ² = −ω² V^φ

and the initial value problem has the solution

V^θ(φ) = V^θ(0) cos(ωφ) + sin θ · V^φ(0) sin(ωφ)
V^φ(φ) = V^φ(0) cos(ωφ) − (1/sin θ) · V^θ(0) sin(ωφ)

where the angular frequency is the geometric mean of the coefficients

ω = √(sin θ cos θ · cot θ) = cos θ = sin λ

where λ is the latitude.

Carrying the vector through a change of longitude Δφ around the line of latitude simply advances the phase in these solutions to ωΔφ = sin λ · Δφ.

As an example, take an initial vector [0, 1] pointing due East along the line of latitude and parallel-transport it half-way around the 30th parallel, through Δφ = 180° of longitude, so that ωΔφ = 90°. This gives the final vector the components

[V^θ, V^φ] = [cos λ, 0]

which is now pointing due South. The vector has been rotated by 90 degrees even though it was transported always parallel to itself on the surface of the Earth. (Carried all the way around the closed 30th parallel, it would instead rotate through 2π sin λ = 180 degrees, consistent with the Gauss-Bonnet theorem and the solid angle enclosed by the parallel.)

One final bookkeeping step is needed. It looks like the magnitude of the vector changed during the transport: y0 = 1, but xf = cosλ. However, cosλ = sinθ, which is exactly the metric coefficient on the dφ term of the line element, and the vector retains its magnitude in spherical coordinates.

Foucault’s Pendulum

Holonomy is a subtle effect, and it’s hard to find good examples in real life where it matters. But there is one demonstration that almost anyone has seen, at least anyone with an interest in science: Foucault’s pendulum. This is the very long pendulum that is often found in science museums around the world. As the Earth turns, the plane of oscillation of the pendulum slowly turns, and the pendulum bob often knocks down blocks that the museum staff set up in the morning to track the precessing plane.

In a classical mechanics class, the precession of Foucault’s pendulum is usually derived through the effect of the Coriolis force on the moving pendulum bob. The answer for the precession frequency, after difficult integrations between non-inertial frames, is

ω_precession = Ω_Earth sin λ

Alternatively, the normal vector to the plane of oscillation can be viewed as parallel-transported around a closed loop at constant latitude. The amount of precession per day is

Δφ = 2π sin λ
which is exactly the same thing. Therefore, Foucault’s pendulum is a striking and exact physical demonstration of Whole World Holonomy.
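For a back-of-the-envelope check (an illustration with an assumed latitude, not a value from this post), the precession per day is just 360° × sin(latitude):

```python
import numpy as np

latitude = np.radians(48.85)              # e.g., Paris, site of Foucault's 1851 demonstration
deg_per_day = 360.0 * np.sin(latitude)    # plane of oscillation turns by 360 * sin(latitude) per day
print(f"{deg_per_day:.0f} deg/day = {deg_per_day / 24:.1f} deg/hour")   # ~271 deg/day, ~11.3 deg/hour
```

At the poles the plane turns through a full 360° per day, while at the equator it does not precess at all, which is exactly the sin λ dependence that the holonomy argument gives for free.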

Additional Reference: Physics Stack Exchange

Examples of conformal maps with transform functions for field lines and potentials.

The Craft of Conformal Maps

Somewhere, within the craft of conformal maps, lie the answers to dark, difficult problems of physics.

For instance, can you map out the electric field lines around one of Benjamin Franklin’s pointed lightning rods?

Can you calculate the fluid velocities in a channel making a sharp right angle?

Now take it up a notch in difficulty: Can you find the distribution of the sizes of magnetic domains within a flat magnet that is about to depolarize?

Or take it to the max: Can you find the vibration frequencies of the cosmic strings of string theory?

The answer to each of these questions starts with a simple physics solution within simple boundaries—sometimes a problem so simple even a freshman physics student can solve it—and then maps that solution, point by point, onto the geometry of the desired problem.

Once the right mapping function is found, you can solve some of the stickiest, ugliest, crankiest problems of physics like a pro.

The Earliest Conformal Maps

What is a conformal map?  It is a transformation that takes one picture into another, keeping all local angles unchanged, no matter how distorted the overall transformation is.

This property was understood by the very first mathematicians, wrangling their sexagesimal numbers by the waters of Babylon as they mapped the heavens onto charts to foretell the coming of the seasons.

Stereographic projection of the celestial sphere onto a plane
Fig. 1 Geometry of stereographic projection

Hipparchus of Rhodes, around 150 BCE during the Hellenistic transition from Alexander to Caesar, was the first to describe the stereographic projection by which the locations of the stars were mapped onto a plane: each star is located on the celestial sphere, a line is traced from the bottom of the sphere through the star, and the point where that line intersects the mid-plane is plotted on the chart.  Why this method should be conformal, preserving the angles, was probably beyond his mathematical powers, but he likely knew intuitively that it did.

Astrolabe map of the stars based on stereographic projection
Fig. 2 An astrolabe for the location of Brussels, Belgium, generated by a stereographic projection

Ptolemy of Alexandria, around 150 CE, expanded on Hipparchus’ star charts and then introduced his own conic-like projection to map all the known world of his day.  The Ptolemaic projection is almost conformal, but not quite.  He was more interested in keeping areas faithful than angles.

Ptolemy's map of the world as reconstructed in the Renaissance.
Fig. 3 A Renaissance rendering of Ptolemy’s map of the known world in pseudo-conic projection.

Mercator’s Rules of Rhumb

The first conformal mapping of the Earth’s surface onto the plane was constructed by Gerard Mercator in 1569.  His goal, as a map maker, was to construct a map that traced out a ship’s course of constant magnetic bearing as a straight line, known as a rhumb line.  This mapping property had important utility for navigators, especially on long voyages at sea beyond the sight of land, and was a hallmark of navigation maps of the Mediterranean, known as Portolan Charts.  Rhumb lines were easy to draw on the small scales of the middle ocean, but on the scale of the Earth, no one knew how to do it.

Mercator's North Atlantic map
Fig. 4 Mercator’s North Atlantic with compass roses and rhumb lines. (Note the fictitious islands south of Iceland and the large arctic continent north of Greenland.)

Though Mercator’s life and career have been put under the biographer’s microscope numerous times, the exact moment when he realized how to make his map—the now-famous Mercator Projection—is not known.  It is possible that he struck a compromise between a cylindrical point projection, that stretched the arctic regions, and a cylindrical line projection that compressed the arctic regions.  He also was a maker of large globes on which rhumb lines (actually curves) could be measured and transferred to a flat map.  Either way, he knew that he had invented something entirely new, and he promoted his map as an aid for the Age of Exploration.  There is some evidence that Frobisher took Mercator’s map with him during his three famous arctic expeditions seeking the Northwest Passage.

Mercator projection of the Earth
Fig. 5 Modern Mercator conformal projection of the Earth.

Mercator never explained nor described the mathematical function behind his projection.  This was first discovered by the English mathematician Thomas Harriot in 1589, 20 years after Mercator published his map, as Harriot was helping Sir Walter Raleigh with his New World projects.  Like most of what Harriot did during his lifetime, he was years (sometimes decades) ahead of anyone else, but no one ever knew because he never published.  His genius remained buried in his personal notes until they were uncovered in the late 1800s, long after others had claimed credit for things he did first.

The rhumb lines of Mercator’s map maintain constant angle relative to all lines of longitude and hence the Mercator projection is a conformal map.  The mathematical proof of this fact was first given by James Gregory in 1668 (almost a century after Mercator’s feat) followed by a clearer proof by Isaac Barrow in 1670.  It was 25 years later that Edmund Halley (of Halley’s Comet fame) proved that the stereographic projection was also conformal.
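The function behind the map is the logarithm of a tangent, and the straight-rhumb-line property is easy to check numerically. The sketch below is my own illustration, assuming the standard Mercator ordinate y = ln tan(π/4 + λ/2) (Mercator himself never published a formula); the bearing chosen is arbitrary.

```python
import numpy as np

def mercator_y(lat):
    """Standard Mercator ordinate for latitude lat (radians)."""
    return np.log(np.tan(np.pi / 4 + lat / 2))

# A rhumb line (constant compass bearing) satisfies d(lon)/d(lat) = tan(bearing)/cos(lat).
bearing = np.radians(60.0)                           # arbitrary course, 60 degrees east of north
lat = np.radians(np.linspace(-60, 60, 1201))
lon = np.cumsum(np.tan(bearing) / np.cos(lat)) * (lat[1] - lat[0])

# On the Mercator chart (x = lon, y = mercator ordinate) the course plots as a straight line
slope = np.polyfit(mercator_y(lat), lon, 1)[0]
print(np.isclose(slope, np.tan(bearing), rtol=1e-2))   # True: slope equals tan(bearing)
```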

A hundred years passed after Halley before anyone again looked into the conformal properties of mapping—and then the field exploded.

The Rube in Frederick’s Berlin

In 1761, the Swiss contingent of the Prussian Academy of Sciences nominated a little-known self-taught Swiss mathematician to the membership of the Academy.  The process was pro forma, but everyone nominated was interviewed personally by Frederick the Great who had restructured the Academy years before from a backwater society to a leading scientific society in Europe.  When Frederick met Johann Lambert, he thought it must be a practical joke.  Lambert looked strange, dressed strangely, and his manners were even stranger.  He was born poor, had never gone to school, and he looked it and he talked it. 

Portrait of Johann Heinrich Lambert
Fig. 6 Portrait of Johann Heinrich Lambert

Frederick rejected the nomination.

But the Swiss contingent, led by Leonhard Euler himself, persisted, because they knew what Frederick did not—Lambert was a genius.  He was an autodidact who had pulled himself up so thoroughly that he had self-published some of the greatest works of philosophy and science of his generation.  One of these was on the science of optics, which established standards of luminance that we still use today.  (In my own laboratory, my students and I routinely refer to Lambertian surfaces in our research on laser speckle.  And we use the Lambert-Beer law of optical attenuation every day in our experiments.)

Frederick finally relented after a delay of two years, and admitted Lambert to his Academy, where Lambert went on a writing rampage, publishing a paper a month over the next ten years, like a dam letting loose.

One of Lambert’s many papers was on projection maps of the Earth.  He not only picked up where Halley had left off a hundred years earlier, but he invented seven new projections, three of which were conformal and four of which were equal-area.  Three of Lambert’s projections are in standard use today in cartography.

The Lambert conformal conic projection of the Earth
Fig. 7 The Lambert conformal conic projection centered on the 36th parallel.

Although Lambert worked at the time of Euler, Euler’s advances in complex-valued mathematics were still young and not well known, so Lambert worked out his projections using conventional calculus.  It would be another 100 years before the power of complex analysis was brought fully to bear on the problem of conformal mappings.

Riemann’s Sphere

It seems like the history of geometry can be divided into two periods: the time before Bernhard Riemann and the time after Bernhard Riemann.

Bernhard Riemann portrait
Fig. 8 Bernhard Riemann.

Bernhard Riemann was a gentle giant, a shy and unimposing figure with a Herculean mind.  He transformed how everyone thought about geometry, both real and complex.  His doctoral thesis was the most complete exposition to date on the power of complex analysis, and his Habilitation Lecture on the foundations of geometry shook those very foundations to their core.

In the hands of Riemann, the stereographic projection became a complex transform of the simplest type

ζ = (x + iy)/(1 − z)

where x, y and z are the Cartesian coordinates of a point on the unit sphere.

Riemann sphere projection as a conformal map
Fig. 9 Conformal mapping of the surface of the Riemann sphere onto the complex plane.
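A quick numerical check of the conformal property (a sketch assuming the standard convention ζ = (x + iy)/(1 − z) for projection from the North Pole) shows that right angles on the sphere remain right angles in the plane:

```python
import numpy as np

def stereo(theta, phi):
    """Stereographic projection from the North Pole of the unit sphere onto the equatorial plane."""
    x, y, z = np.sin(theta) * np.cos(phi), np.sin(theta) * np.sin(phi), np.cos(theta)
    return (x + 1j * y) / (1.0 - z)

def image_angle(theta, phi, d1, d2, h=1e-6):
    """Angle (degrees) between the images of two tangent directions d = (dtheta, dphi)."""
    p0 = stereo(theta, phi)
    v1 = stereo(theta + h * d1[0], phi + h * d1[1]) - p0
    v2 = stereo(theta + h * d2[0], phi + h * d2[1]) - p0
    return abs(np.degrees(np.angle(v2 / v1)))

theta, phi = 1.0, 0.7                           # an arbitrary point (colatitude, longitude)
along_meridian = (1.0, 0.0)
along_parallel = (0.0, 1.0 / np.sin(theta))     # orthogonal to the meridian on the sphere
print(image_angle(theta, phi, along_meridian, along_parallel))   # ~90: the right angle survives
```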

The projection in Fig. 9 is from the North Pole, which represents Antarctica faithfully but distorts the lands of the Northern Hemisphere. Any projection can be centered on a chosen point of the Earth by projecting from the opposite point, called the antipode. For instance, the stereographic projection centered on Chicago is shown in Fig. 10.

Stereographic projection centered on Chicago.
Fig. 10 Stereographic projection centered on Chicago, Illinois. Note the size of Greenland relative to its size in the Mercator projection of Fig. 5.

Building on the work of Euler and Cauchy, Riemann dove into conformal maps and emerged in 1851 with one of the most powerful theorems in complex analysis, known as the Riemann Mapping Theorem:

Any non-empty, simply connected open subset of the complex plane (which is not the entire plane itself) can be conformally mapped to the open unit disk. 

An immediate consequence of this is that all non-empty, simply connected open subsets of the complex plane (other than the entire plane itself) are conformally equivalent, because any such domain can be mapped onto the unit disk, and then the unit disk can be mapped onto any other such domain.

The consequences of this are astounding:  Solve a simple physics problem in a simple domain and then use the Riemann mapping theorem to transform it into the most complex, ugly, convoluted, twisted problem you can think of (as long as it is simply connected) and then you have the answer.

The reason that conformal maps (which are purely mathematical) allow the transformation of physics problems (which are “real”) is that physics is based on orthogonal sets of fields and potentials that govern how physical systems behave.  In other words, a solution of Laplace’s equation on one domain can be transformed into a solution of Laplace’s equation on a different domain.

Powerful! Great!  But how do you do it?  Riemann’s theorem was an existence proof—not a solution manual.  The mapping transformations still needed to be found.

Schwarz-Christoffel

On the heels of Bernhard Riemann, who had altered the course of geometry, Hermann Schwarz at the University of Halle, Germany, and Elwin Bruno Christoffel at the Technical University in Zürich, Switzerland, took up Riemann’s mapping theorem to search for the actual mappings that would turn the theorem from “nice to know” into an actual formula.

Working independently, Christoffel in 1867 drew on his expertise in differential geometry while Schwarz in 1869 drew on his expertise in the calculus of variations, both with a solid background in geometry and complex analysis.  They focused on conformal maps of polygons because general domains on the complex plane can be described with polygonal boundaries.  The conformal map they sought would take a simple portion of the complex plane—the upper half-plane—and map it onto the interior of a polygon, reproducing the interior angle at each vertex.  With sufficient constraints, the goal was to map all the vertices and hence the entire domain.

The surprisingly simple result is known as the Schwarz-Christoffel equation

f(z) = A ∫^z (w − a)^(α/π − 1) (w − b)^(β/π − 1) (w − c)^(γ/π − 1) ⋯ dw + B

where a, b, c … are the points on the real axis that map to the vertices, and α, β, γ … are the interior angles at the vertices.  The integral needs to be carried out on the complex plane, but it has closed-form solutions for many common cases.

This equation solves the problem of “how” that allows any physics solution on one domain to be mapped to another domain.
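As a concrete (and hedged) illustration of how the recipe works in practice: with prevertices at ±1 and interior angles of π/2, the Schwarz-Christoffel integrand is (1 − w²)^(−1/2), and the map of the upper half-plane onto a half-strip has the closed form arcsin(z). The sketch below evaluates the integral numerically and compares it against that closed form; the sample points are arbitrary.

```python
import numpy as np
import cmath

def sc_halfstrip(z, n=4000):
    """Schwarz-Christoffel integral from 0 to z along a straight path for prevertices -1, +1
    with interior angles pi/2: integrand (1 - w^2)^(-1/2)."""
    t = (np.arange(n) + 0.5) / n          # midpoints of n sub-segments of [0, 1]
    w = z * t
    return np.sum(1.0 / np.sqrt(1.0 - w * w)) * (z / n)

for z in [0.3 + 0.4j, -0.5 + 1.2j, 2.0 + 0.1j]:
    print(sc_halfstrip(z), cmath.asin(z))   # the two columns agree: this SC map is just arcsin
```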

Conformal Maps

The list of possible conformal maps is literally limitless, yet there are a few that are so common that they deserve to be explored in some detail here.

One conformal map is famous enough to carry a name of its own: the Joukowski map, which takes the upper half-plane and transforms it (through an open strip) onto the full complex plane. The field lines and potentials are shown in Fig. 11 as a simple transform of straight lines. To calculate these fields and potentials directly would otherwise require the solution of a partial differential equation (PDE) by numerical methods.

Field and potential lines near an aperture in conducting plates by a conformal map.
Fig. 11 Field lines and potentials around a gap in a charged conductor by the Joukowski conformal map.
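Fig. 11 can be reproduced in a few lines by mapping the trivial uniform-field solution of the upper half-plane (horizontal equipotentials and vertical field lines) through the map. The sketch below assumes the Joukowski form w = (z + 1/z)/2; the normalization used in the original figure may differ.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_mapped_grid(f, ax, title):
    """Map the equipotentials (Im z = const) and field lines (Re z = const) of the
    uniform-field problem in the upper half-plane through the conformal map f."""
    for c in np.linspace(0.05, 2.0, 12):                       # equipotentials
        z = np.linspace(-4, 4, 2000) + 1j * c
        w = f(z)
        ax.plot(w.real, w.imag, 'b', lw=0.7)
    for c in np.linspace(-3, 3, 25):                           # field lines
        z = c + 1j * np.linspace(0.01, 4.0, 2000)
        w = f(z)
        ax.plot(w.real, w.imag, 'r', lw=0.7)
    ax.set_xlim(-3, 3); ax.set_ylim(-3, 3); ax.set_aspect('equal'); ax.set_title(title)

fig, ax = plt.subplots(figsize=(6, 6))
plot_mapped_grid(lambda z: 0.5 * (z + 1.0 / z), ax, 'Fringing field at a gap (Joukowski map)')
plt.show()
```

Swapping in f = lambda z: z**0.5, z**1.5 or z**2 for the Joukowski map generates, under the same assumptions, the inner-corner, outer-corner and thin-plate fields of Fig. 12 below.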

Other common conformal maps are power-law transformations of the upper half-plane. Fig. 12 shows three of these: the first maps onto an inner corner, the second onto an outer corner, and the third takes the upper half-plane onto the full plane. All three show the field lines and the potentials near charged conducting plates.

Conformal maps from the half-plane to the full plane  with field and potential lines.
Fig. 12 Maps from the half-plane to the full plane: a) Inner corner, b) outer corner, c) charged thin plate.

Conformal maps can also be “daisy-chained”. For instance, in Fig. 13, the unit circle is transformed into the upper half plane, providing the field lines and equipotentials of a point charge near a conducting plate. The fields are those of a point charge and its image charge, creating a dipole potential. This charge and its image are transformed again into the fields and potentials of a point charge near a conducting corner.

Compound conformal maps of the unit circle to the half-plane to the outside of a corner.
Fig. 13 Point charge fields and potentials near a conducting corner by a compound conformal map: a) Unit circle to the half-plane, b) Half-plane to the outside corner.

But we are not quite done with conformal maps. They have reappeared in recent years in exciting new areas of physics in the form of conformal field theory.

Conformal Field Theory

The importance of being conformal extends far beyond solutions to Laplace’s equation.  Physics is physics, regardless of how it comes about and how it is described, and transformations cannot change the physics.  As an example, when a many-body system is at a critical point, then the description of the system is scale independent.  In this case, changing scale is one type of transformation that keeps the physics the same.  Conformal maps also keep the physics the same by preserving angles.  Taking this idea into the quantum realm, a quantum field theory of a scale-invariant system can be conformally mapped onto other, more complex systems for which answers are not readily derived.

This is why conformal field theory (CFT) has become an important new field of physics with applications ranging as widely as quantum phase transitions and quantum strings.



Anant K. Ramdas in the Golden Age of Physics

The physicist, as a gentleman and a scholar, who, in his leisure, pursues physics as both vocation and hobby, is an endangered species, though the breed was once endemic.  Classic examples come from the turn of the last century, as Rayleigh and de Broglie and Raman built their own laboratories to follow their own ideas.  These were giants in their fields. But there are also many quiet geniuses, enthralled with the life of ideas and the society of scientists, working into the late hours, following the paths that lead them forward through complex concepts and abstract mathematics as a labor of love.

One of these quiet geniuses, of late, was a colleague of mine and a friend, Anant K. Ramdas.  He was the last PhD student of the Nobel Prize Laureate, C. V. Raman, and he may have been the last of his kind as a gentleman and a scholar physicist.

Anant K. Ramdas

Anant Ramdas was born in May, 1930, in Pune, India, not far from the megalopolis of Mumbai when it had just over a million inhabitants (the number is over 22 million today, nearly a hundred years later).  His father, Lakshminarayanapuram A. Ramdas, was a scientist, a meteorologist who had studied under C. V. Raman at the University of Calcutta.  Raman won the Nobel Prize in Physics the same year that Anant Ramdas was born. 

Ramdas received his BS in Physics from the University of Pune in 1950, then followed in his father’s footsteps by studying for his MS (1953) and PhD (1956) degrees in Physics under Raman, who had established the Raman Institute in Bangalore, India. 

While facing the decision, after his graduation, on what to do and where to go, Ramdas read a review article published by Prof. H. Y. Fan of Purdue University on infrared spectroscopy of semiconductors.  After corresponding with Fan, and with the Purdue Physics department head, Prof. Karl Lark-Horowitz, Ramdas decided to accept the offer of a research associate (a post-doc position), and he prepared to leave India.

Within only a few months, he met and married his wife, Vasanti, and they hopped on a propeller plane that stopped along the way in Cairo, Beirut, and Paris before arriving in London.  From there, they caught a cargo ship making a two-week passage across the Atlantic, after stopping at ports in France and Portugal.  In New York City, they took a train to Chicago, getting off during a brief stop in the little corn-town of Lafayette, Indiana, home of Purdue University.  It was 1956, and Anant and Vasanti were, ironically, the first Indians that some people in the Indiana town had ever seen.

Semiconductor Physics at Purdue

Semiconductors became the ascendant electronic material during the Second World War when it was discovered that their electrical properties were ideal for military radar applications.  Many of the top physicists of the time worked at the “Rad Lab”, the Radiation Laboratory of MIT, and collaborations spread out across the US, including to the Physics Department at Purdue University.  Researchers at Purdue were especially good at growing the semiconductor Germanium, which was used in radar rectifiers.  The research was overseen by Lark-Horowitz.

After the war, semiconductor research continued to be a top priority in the Purdue Physics department as groups around the world competed to find ways to use semiconductors instead of vacuum tubes for information and control.  Friendly competition often meant the exchange of materials and samples, and sometime in early 1947, several Germanium samples were shipped to the group of Bardeen and Brattain at Bell Labs, where, several months later, they succeeded in making the first point-contact transistor using Germanium (with some speculation today that it may have been with the samples sent from Purdue).  It was a close thing. Ralph Bray, later a professor at Purdue, had seen nonlinear current dependences in the Purdue-grown Germanium samples that were precursors of transistor action, but Bell made the announcement before Bray had a chance to take the next step. Lark-Horowitz (and Bray) never forgot how close Purdue had come to making the invention themselves [1].

In 1948, Lark-Horowitz hired H. Y. Fan, who had received his PhD at MIT in 1937 and had been teaching at Tsinghua University in China.  Fan was an experimental physicist specializing in the infrared properties of semiconductors, and when Ramdas arrived at Purdue in 1956, he worked directly under Fan.  They published their definitive work on the infrared absorption of irradiated silicon in 1959 [2].

Absorption spectrum of “effective-mass” shallow defect levels in irradiated silicon.

One day, while Ramdas was working in Fan’s lab, Lark-Horowitz stopped by, as he was accustomed to do, and he casually asked if Ramdas would be interested in becoming a professor at Purdue.  Ramdas of course said “Yes”, and Lark-Horowitz gave him the job on the spot.  Ramdas was appointed as an assistant professor in 1960.  These things were less formal in those days, and it was only later that Ramdas learned that Fan had already made a strong case for him.

The Golden Age of Physics

The period from 1960 to 2015, which spanned Ramdas’ career, start to finish, might be called “The Golden Age of Physics”. 

This time span saw the completion of the Standard Model of particle physics with the muon neutrino (1962), the theory of quarks (1964), electroweak unification (1968), quantum chromodynamics (1970s), the tau lepton (1975), the bottom quark (1977), the W and Z bosons (1983), the top quark (1995), the tau neutrino (2000), neutrino mass oscillations (2004), and of course capping it off with the detection of the Higgs boson (2012).

This was the period in solid state physics that saw the invention of the laser (1960), the quantum Hall effect (1980), the fractional quantum Hall effect (1982), scanning tunneling microscopy (1981), quasi-crystals (1982), high-temperature superconductors (1986), and graphene (2005).

This was also the period when astrophysics witnessed the discovery of the Cosmic Background Radiation (1964), the first black hole (1964), pulsars (1967), confirmation of dark matter (1970s), inflationary cosmology (1980s), Baryon Acoustic Oscillations (2005), and capping the era off with the detection of gravitational waves (2015).

The period from 1960 – 2015 stands out relative to the “first” Golden Age of Physics from 1900 – 1930 because this later phase is when the grand programs from early in the century were brought largely to completion.

But these are the macro-events of physics from 1960-2015.  This era was also a Golden Age in the micro-events of the everyday lives of the physicists.  It is this personal aspect where this later era surpassed the earlier era (when only a handful of physicists were making progress).  In the later part of the century, small armies of physicists were advancing rapidly along all the frontiers at the same time, and doing it with the greatest focus.

This was when a single NSF grant could support a single physicist with several grad students and an undergraduate or two.  The grants could be renewed with near certainty, as long as progress was made and papers were published.  Renewal applications, in those days, were three pages.  Contrast that to today when 25 pages need to be honed to perfection—and then the renewal rate is only about 10% (soon to be even lower with the recent budget cuts to science in the USA).  In those earlier days, the certainty of success, and the absence of the burden of writing multiple long grant proposals, bred confidence to dispose of the conventional, to try anything new.  In other words, the vast amount of time spent by physicists during this Golden Age was in the pursuit of physics, in the classroom and in the laboratory.

And this was the time when Anant Ramdas and his cohort—Sergio Rodriguez, Peter Fisher, Jacek Furdyna, Eugene Haller, the Chandrasekhars, Manuel Cardona, and the Dresselhauses—rode the wave of semiconductor physics when money was easy, good students were plentiful, and a vibrant intellectual community rallied around important problems.

Selected Topics of Research from Anant Ramdas

It is impossible to do justice to the breadth and depth of the research performed by Anant over his career. So here is my selection of some of my favorite examples of his work:

Diamond

Anant had a life-long fascination for diamonds. As a rock and gem collector, he was fond of telling stories about the famous Cullinan diamond (which weighed 1.3 pounds as a raw stone, at over 3000 carats) and the blue Hope diamond (discovered in India). One of his earliest and most cited papers was on the Raman spectrum of diamond [3], and he published several papers on his favorite color for diamonds—blue [4]!

Raman Spectrum of Diamond.

His work on diamond helped endear Anant to the husband-wife team of Milly Dresselhaus and Gene Dresselhaus at MIT. Milly was the “Queen” of carbon, known for her work on graphite, carbon nanotubes and fullerenes. Purdue had made an offer of an assistant professorship to Gene Dresselhaus when the two were looking for faculty positions after their post-docs at the University of Chicago, but Purdue would not give Milly a position (she was viewed as a “trailing” spouse). Anant was already at Purdue at that time and got to know both of them, maintaining a life-long friendship. Milly went on to become the president of the APS and was elected a member of the National Academy of Sciences, the National Academy of Engineering and the American Academy of Arts and Sciences.

Magneto-Optics

Purdue was a hotbed of II-VI semiconductor research in the 1980s, spearheaded by Jacek Furdyna. The substitution of the magnetic ion Mn for Zn, Cd or Hg created a unique class of highly magnetic semiconductors. Anant was the resident expert on the optical properties of these materials and collected one of the best examples of giant Faraday rotation [5].

Giant Faraday Effect in CdMnTe

Anant and the Purdue team were the world leaders in the physics and materials science of diluted magnetic semiconductors.

Shallow Defects in Semiconductors

My own introduction to Anant was through his work on shallow effective-mass defect states in semiconductors. I was working towards my PhD with Eugene “Gene” Haller at Lawrence Berkeley Lab (LBL) in the early 1980s, and Gene was an expert on the spectroscopy of the shallow levels in Germanium. My fellow physics graduate student was Joe Kahn, and the two of us were tasked with studying the review article that Anant had written with his long-time theoretical collaborator Sergio Rodriguez on the physics of effective-mass shallow defects in semiconductors [6]. We called it “The Bible”, and spent months studying it. Gene Haller’s principal technique was photothermal ionization spectroscopy (PTIS), and Joe was building the world’s finest PTIS instrument. Joe met Anant for dinner one night at the March meeting of the APS in 1986, and when he got back to the room, he waxed poetic about Anant for an hour. It was like he had met his hero. I don’t remember how I missed that dinner, so my personal introduction to Anant Ramdas would have to wait.

PTIS spectra of donors in GaAs

My own research went into deep-level transient spectroscopy (DLTS), working with Gene and his group’s theorist, Wladek Walukiewicz; together we discovered a universal pressure derivative in III-V semiconductors. This research led me to a post-doc position at Bell Labs under Alastair Glass and later to a faculty position at Purdue, where I did finally meet Anant, who became my long-time champion and mentor. But Joe had stayed with the shallow defects, and in particular defects that showed interesting dynamical properties, known as tunneling defects.

Dynamic Defects in Semiconductors

Dynamic defects in semiconductors are multicomponent defects (often involving vacancies or interstitials) in which one of the components tunnels quantum mechanically, or hops, on a time scale short compared to the measurement interaction time (an electric dipole transition), so that the measurement sees an increased symmetry compared to the instantaneous low-symmetry configuration of the defect.

Eugene Haller and his physics theory collaborator, Leo Falicov, were pioneers in tunneling defects related to hydrogen, building on earlier work by George Watkins who studied dynamical defects using EPR measurements. In my early days doing research under Eugene, we thought we had discovered a dynamical effect in FeB defects in silicon, and I spent two very interesting weeks at Lehigh University, visiting Watkins, to test out our idea, but it turned out to be a static effect. Later, Joe Kahn found that some of the early hydrogen defects in Germanium that Gene and Leo had proposed as dynamical defects were also, in fact, static. So the class of dynamical defects in semiconductors was actually shrinking over time rather than expanding. Joe did go on to find clear proof of a hydrogen-related dynamical defect in Germanium, saving the Haller-Falicov theory from the dust bin of Physics History.

In 2006 and 2008, Ramdas was working on oxygen-related defect complexes in CdSe when his student G. Chen [7-8] discovered a temperature-induced symmetry raising. It showed clear evidence of a lower-symmetry defect that converged into a higher-symmetry mode at high temperatures, very much in agreement with the Haller-Falicov theory of dynamical symmetry raising.

At that time, I was developing my course notes for my textbook Introduction to Modern Dynamics, where some of the textbook problems in synchronization looked just like Anant’s data. Using a temperature-dependent coupling in a model of nonlinear (anharmonic) oscillators, I obtained the following fits (solid curves) to the Ramdas data (data points):

Quantum synchronization in CdSe and CdTe.

The fit looks too good to be a coincidence, and Anant and I debated on whether the Haller-Falicov theory, or a theory based on nonlinear synchronization, would be better descriptions of the obviously dynamical properties of these defects. Alas, Anant is now gone, and so are Gene and Leo, so I am the last one left thinking about these things.

Beyond the Golden Age?

Anant Ramdas was fortunate to have spent his career during the Golden Age of Physics, when the focus was on the science and on the physics, as healthy communities helped support one another in friendly competition. He was a gentleman scholar, an avid reader of books on history and philosophy, much of it (but not all) on the history and philosophy of physics. His “Coffee Club” at 9:30 AM every day in the Physics Department at Purdue was a must-not-miss event that was attended by all of the Old Guard as well as by myself, where the topics of conversation ran the gamut, presided over by Anant. He had his NSF grant, year after year (and a few others), and that was all he needed to delve into the mysteries of the physics of semiconductors.

Is that age over? Was Anant one of the last of that era? I can only imagine what he would say about the current war against science and against rationality raging across the USA right now, and the impending budget cuts to all the science institutes. He spent his career and life upholding the torch of enlightenment. Today, I fear he would be holding it in the dark. He passed away Thanksgiving, 2024.

Vasanti and Anant, 2022.

References

[1] Ralph Bray, “A Case Study in Serendipity”, The Electrochemical Society, Interface, Spring 1997.

[2] H. Y. Fan and A. K. Ramdas, “Infrared absorption and photoconductivity in irradiated silicon,” Journal of Applied Physics, vol. 30, no. 8, pp. 1127-1134, 1959, doi: 10.1063/1.1735282.

[3] S. A. Solin and A. K. Ramdas, “Raman spectrum of diamond,” Physical Review B, vol. 1, no. 4, p. 1687, 1970, doi: 10.1103/PhysRevB.1.1687.

[4] H. J. Kim, Z. Barticevic, A. K. Ramdas, S. Rodriguez, M. Grimsditch, and T. R. Anthony, “Zeeman effect of electronic Raman lines of acceptors in elemental semiconductors: Boron in blue diamond,” Physical Review B, vol. 62, no. 12, pp. 8038-8052, Sep 2000, doi: 10.1103/PhysRevB.62.8038.

[5] D. U. Bartholomew, J. K. Furdyna, and A. K. Ramdas, “Interband Faraday rotation in diluted magnetic semiconductors: Zn1-xMnxTe and Cd1-xMnxTe,” Physical Review B, vol. 34, no. 10, pp. 6943-6950, Nov 1986, doi: 10.1103/PhysRevB.34.6943.

[6] A. K. Ramdas and S. Rodriguez, “Spectroscopy of the solid-state analogs of the hydrogen atom: donors and acceptors in semiconductors,” Reports on Progress in Physics, vol. 44, no. 12, pp. 1297-1387, 1981, doi: 10.1088/0034-4885/44/12/002.

[7] G. Chen, I. Miotkowski, S. Rodriguez, and A. K. Ramdas, “Stoichiometry driven impurity configurations in compound semiconductors,” Physical Review Letters, vol. 96, no. 3, Jan 2006, Art. no. 035508, doi: 10.1103/PhysRevLett.96.035508.

[8] G. Chen, J. S. Bhosale, I. Miotkowski, and A. K. Ramdas, “Spectroscopic Signatures of Novel Oxygen-Defect Complexes in Stoichiometrically Controlled CdSe,” Physical Review Letters, vol. 101, no. 19, Nov 2008, Art. no. 195502, doi: 10.1103/PhysRevLett.101.195502.

Other Notable Papers:

[9] E. S. Oh, R. G. Alonso, I. Miotkowski, and A. K. Ramdas, “Raman scattering from vibrational and electronic excitations in a II-VI quaternary compound: Cd1-x-yZnxMnyTe,” Physical Review B, vol. 45, no. 19, pp. 10934-10941, May 1992, doi: 10.1103/PhysRevB.45.10934.

[10] R. Vogelgesang, A. K. Ramdas, S. Rodriguez, M. Grimsditch, and T. R. Anthony, “Brillouin and Raman scattering in natural and isotopically controlled diamond,” Physical Review B, vol. 54, no. 6, pp. 3989-3999, Aug 1996, doi: 10.1103/PhysRevB.54.3989.

[11] M. H. Grimsditch and A. K. Ramdas, “Brillouin scattering in diamond,” Physical Review B, vol. 11, no. 8, pp. 3139-3148, 1975, doi: 10.1103/PhysRevB.11.3139.

[12] E. S. Zouboulis, M. Grimsditch, A. K. Ramdas, and S. Rodriguez, “Temperature dependence of the elastic moduli of diamond: A Brillouin-scattering study,” Physical Review B, vol. 57, no. 5, pp. 2889-2896, Feb 1998, doi: 10.1103/PhysRevB.57.2889.

[13] A. K. Ramdas, S. Rodriguez, M. Grimsditch, T. R. Anthony, and W. F. Banholzer, “Effect of isotopic constitution of diamond on its elastic constants: 13C diamond, the hardest known material,” Physical Review Letters, vol. 71, no. 1, pp. 189-192, Jul 1993, doi: 10.1103/PhysRevLett.71.189.


The Light in Einstein’s Elevator

Gravity bends light!

Of all the audacious proposals made by Einstein, and there were many, this one takes the cake because it should be impossible.

There can be no force of gravity on light because light has no mass.  Without mass, there is no gravitational “interaction”.  We all know Newton’s Law of gravity … it was one of the first equations of physics we ever learned

F = GMm/r²   (Newtonian gravitation)

which shows the interaction between the masses M and m through their product.  For light, this is strictly zero. 

How, then, did Einstein conclude, in 1907, only two years after he proposed his theory of special relativity, that gravity bends light? If it were us, we might take Newton’s other famous equation and equate the two

F = ma   (Newton's second law)

and guess that somehow the little mass m (though it equals zero) cancels out to give

a = GM/r²

so that light would fall in gravity with the same acceleration as anything else, massive or not. 

But this is not how Einstein arrived at his proposal, because this derivation is wrong!  To do it right, you have to think like an Einstein.

“My Happiest Thought”

Towards the end of 1907, Einstein was asked by Johannes Stark to contribute a review article on the state of the relativity theory to the Jahrbuch of Radioactivity and Electronics. There had been a flurry of activity in the field in the two years since Einstein had published his groundbreaking paper in Annalen der Physik in September of 1905 [1]. Einstein himself had written several additional papers on the topic, along with others, so Stark felt it was time to put things into perspective.

Photo of Einstein around 1905 during his Annus Mirabilis.
Fig. 1 Einstein around 1905.

Einstein was still working at the Patent Office in Bern, Switzerland, which must not have been too taxing, because it gave him plenty of time to think. It was while he was sitting in his armchair in his office in 1907 that he had what he later described as the happiest thought of his life. He had been struggling with the details of how to apply relativity theory to accelerating reference frames, a topic fraught with conceptual traps, when he had a flash of a simplifying idea:

“Then there occurred to me the ‘glücklichste Gedanke meines Lebens,’ the happiest thought of my life, in the following form. The gravitational field has only a relative existence in a way similar to the electric field generated by magnetoelectric induction. Because for an observer falling freely from the roof of a house there exists —at least in his immediate surroundings— no gravitational field. Indeed, if the observer drops some bodies then these remain relative to him in a state of rest or of uniform motion… The observer therefore has the right to interpret his state as ‘at rest.'”[2]

In other words, the freely falling observer believes he is in an inertial frame rather than an accelerating one, and by the principle of relativity, this means that all the laws of physics in the accelerating frame must be the same as for an inertial frame. Hence, his great insight was that there must be complete equivalence between a mechanically accelerating frame and a gravitational field. This is the very first conception of his Equivalence Principle.

Cover of the Jahrbuch for Radioactivity and Electronics from 1907.
Fig. 2 Front page of the 1907 volume of the Jahrbuch. The editor list reads like a “Who’s Who” of early modern physics.

Title page to Einstein's 1907 Jahrbuch review article
Fig. 3 Title page to Einstein’s 1907 Jahrbuch review article “On the Relativity Principle and its Consequences” [3]

After completing his review of the consequences of special relativity in his Jahrbuch article, Einstein took the opportunity to launch into his speculations on the role of the relativity principle in gravitation. He is almost apologetic at the start, saying that:

“This is not the place for a detailed discussion of this question.  But as it will occur to anybody who has been following the applications of the principle of relativity, I will not refrain from taking a stand on this question here.”

But he then launches into his first foray into general relativity with keen insights.

The beginning of the section where Einstein first discusses the effects of accelerating frames and effects of gravity
Fig. 4 The beginning of the section where Einstein first discusses the effects of accelerating frames and effects of gravity.

He states early in his exposition:

“… in the discussion that follows, we shall therefore assume the complete physical equivalence of a gravitational field and a corresponding accelerated reference system.”

Here is his equivalence principle. And using it, in 1907, he derives the effect of acceleration (and gravity) on ticking clocks, on the energy density of electromagnetic radiation (photons) in a gravitational potential, and on the deflection of light by gravity.
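In modern notation, the first-order forms of the first two of these effects look like the following. This is a summary in today's conventions, not Einstein's 1907 expressions; Φ is the Newtonian potential, which is negative inside a gravitational well.

```latex
% First-order (weak-field) forms of the 1907 results, in modern notation:
\frac{d\tau}{dt} \approx 1 + \frac{\Phi}{c^2}
\qquad \text{(clocks deeper in the well tick more slowly)}

\frac{\nu_{\mathrm{received}}}{\nu_{\mathrm{emitted}}} \approx
1 + \frac{\Phi_{\mathrm{emitter}} - \Phi_{\mathrm{receiver}}}{c^2}
\qquad \text{(light climbing out of a well is redshifted)}
```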

Over the next several years, Einstein was distracted by other things, such as obtaining his first university position and continuing his work on the early quantum theory. But by 1910 he was ready to tackle the general theory of relativity once again, when he discovered that his equivalence principle was missing a key element: the effects of spatial curvature. That realization launched him on a five-year program into the world of tensors and metric spaces that culminated in the completed general theory of relativity he published in November of 1915 [4].

The Observer in the Chest: There is no Elevator

Einstein was never a stone to gather moss. Shortly after delivering his triumphal exposition on the General Theory of Relativity, he wrote up a popular account of his Special and now General Theories to be published as a book in 1916, first in German [5] and then in English [6]. What passed for a “popular exposition” in 1916 is far from what is considered popular today. Einstein’s little book is full of equations that would be somewhat challenging even for specialists. But the book also showcases Einstein’s special talent to create simple analogies, like the falling observer, that can make difficult concepts of physics appear crystal clear.

In 1916, Einstein was not yet thinking in terms of an elevator. His mental image at this time, for a sequestered observer, was of someone inside a spacious chest filled with measurement apparatus that the observer could use at will. This observer in his chest was either floating off in space far from any gravitating bodies, or the chest was being pulled by a rope hooked to the ceiling such that the chest accelerates constantly. Based on the measurements he makes, he cannot distinguish between a gravitational field and acceleration, and hence they are equivalent. A bit later in the book, Einstein describes what a ray of light would do in an accelerating frame, but he does not have his observer attempt any such measurement, even in principle, because the deflection of the ray of light from a linear path would be far too small to measure.

But Einstein does go on to say that any curvature of the path of the light ray requires that the speed of light change with position. This is a shocking admission, because his fundamental postulate of relativity from 1905 was the invariance of the speed of light in all inertial frames. It was from this simple assertion that he was eventually able to derive E = mc². Whereas, on the one hand, he was ready to posit the invariance of the speed of light, on the other hand, as soon as he understood the effects of gravity on light, Einstein did not hesitate to cast this postulate adrift.

Position-dependent speed of light in relativity.

Fig. 5 Einstein’s argument for the speed of light depending on position in a gravitational field.
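A compact way to see the connection is the following summary of the standard argument (my paraphrase, not a quote from the book): a height-dependent wave speed tilts a wavefront as it propagates, exactly the way light refracts in a graded-index medium.

```latex
% Coordinate speed of light in a weak potential (the form Einstein used in 1907-1911):
c(y) \approx c_0\left(1 + \frac{\Phi(y)}{c_0^2}\right) = c_0\left(1 + \frac{g\,y}{c_0^2}\right)

% Huygens construction: the ray bends toward the slower side at a rate set by the
% transverse gradient of the wave speed:
\left|\frac{d\alpha}{dl}\right| = \frac{1}{c}\left|\frac{\partial c}{\partial y}\right|
\approx \frac{g}{c_0^2}
```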

(Einstein can be forgiven for taking so long to speak in terms of an elevator that could accelerate at a rate of one g, because it was not until 1946 that the rocket plane Bell X-1 achieved linear acceleration exceeding 1 g, and jet planes did not achieve 1 g linear acceleration until the F-15 Eagle in 1972.)

Aircraft with greater than 1:1 thrust to weight ratios
Fig. 6 Aircraft with greater than 1:1 thrust to weight ratios.

The Evolution of Physics: Enter Einstein’s Elevator

Years passed, and Einstein fled an increasingly autocratic and belligerent Germany for a position at Princeton’s Institute for Advanced Study. In 1938, at the instigation of his friend Leopold Infeld, the two decided to write a general-interest book on the new physics of relativity and quanta that had evolved so rapidly over the previous 30 years.

Title page of "Evolution of Physics" 1938 written with his friend Leopold Infeld at Princeton's Institute for Advanced Study.
Fig. 7 Title page of “Evolution of Physics” 1938 written with his friend Leopold Infeld at Princeton’s Institute for Advanced Study.

Here, in this obscure book that no one remembers today, we find Einstein’s elevator for the first time, and the exposition talks very explicitly about a small window that lets in a light ray, and what the observer sees (in principle) for the path of the ray.

One of the only figures in the Einstein and Infeld book: The origin of "Einstein's Elevator"!
Fig. 8 One of the only figures in the Einstein and Infeld book: The origin of “Einstein’s Elevator”!

By the equivalence principle, the observer cannot tell whether they are far out in space, being accelerated at the rate g, or whether they are stationary on the surface of the Earth subject to a gravitational field. In the first instance of the accelerating elevator, a photon moving in a straight line through space would appear to deflect downward in the elevator, as shown in Fig. 9, because the elevator is accelerating upwards as the photon transits the elevator. However, by the equivalence principle, the same physics should occur in the gravitational field. Hence, gravity must bend light. Furthermore, light falls inside the elevator with an acceleration g, just as any other object would.

The accelerating elevator and what an observer inside sees (From "Galileo Unbound" (Oxford, 2018).
Fig. 9 The accelerating elevator and what an observer inside sees (From “Galileo Unbound” (Oxford, 2018). [7])
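To see why Einstein never asked his observer in the chest to actually measure the deflection, here is a quick back-of-the-envelope estimate (my own numbers, not from the book, with an illustrative cabin width of a few meters): a photon crossing an elevator of width L falls by gL²/2c², which is sub-femtometer.

```python
# Back-of-the-envelope: how far does a photon "fall" while crossing an elevator?
g = 9.81          # acceleration, m/s^2
c = 2.998e8       # speed of light, m/s
L = 3.0           # width of the elevator cabin, m (illustrative value)

t = L / c                 # transit time of the photon across the cabin
drop = 0.5 * g * t**2     # free-fall distance accumulated during the transit

print(f"transit time: {t:.1e} s, photon drop: {drop:.1e} m")
# -> a drop on the order of 5e-16 m, hopelessly below anything measurable
```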

Light Deflection in the Equivalence Principle

A photon enters an elevator at right angles to its acceleration vector g.  Use the geodesic equation and the elevator (Equivalence Principle) metric [8]

to show that the trajectory is parabolic. (This is a classic HW problem from Introduction to Modern Dynamics.)

The geodesic equation with time as the dependent variable

This gives two coordinate equations

Note that x⁰ = ct and x¹ = x ≈ ct are both large relative to the y-motion of the photon.  The metric component that is relevant here is

and the others are unity.  The geodesic becomes (assuming dy/dt = 0)

The Christoffel symbols are

which give

Therefore

or

where the photon falls with acceleration g, as anticipated.
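Since the equations in this example are rendered as images, here is a minimal sketch of the calculation. It assumes the weak-field "elevator" metric in which only the time-time component is modified (the notation in Introduction to Modern Dynamics may differ), and it treats dt/dλ as constant to lowest order.

```latex
% Assumed elevator (Equivalence Principle) metric, weak field, only g_00 modified:
ds^2 = -\left(1 + \frac{2 g y}{c^2}\right) c^2 dt^2 + dx^2 + dy^2 + dz^2

% The only Christoffel symbol that matters for the transverse motion:
\Gamma^{y}_{\;00} \approx \frac{g}{c^2}

% Geodesic equation for y, with x^0 = ct and dy/dt \approx 0 initially:
\frac{d^2 y}{dt^2} \approx -\Gamma^{y}_{\;00}\, c^2 = -g

% Using x \approx ct for the photon, the trajectory is the parabola
y(x) \approx -\frac{g}{2 c^2}\, x^2
```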

Light Deflection in the Schwarzschild Metric

Do the same problem of the light ray in Einstein’s Elevator, but now using the full Schwarzschild solution to the Einstein Field equations.

Schwarzschild metric

Einstein’s elevator is the classic test of virtually all heuristic questions related to the deflection of light by gravity.  In the previous example, the deflection was attributed to the Equivalence Principle, by which the observer in the elevator cannot discern whether they are in an accelerating rocket ship or standing stationary on Earth.  In that case, the time-like metric component is the sole cause of the free-fall of light in gravity.  In the Schwarzschild metric, on the other hand, the curvature of the field near a spherical gravitating body also contributes.  In this case, the geodesic equation, assuming that dr/dt = 0 for the incoming photon, is

where, as before, the Christoffel symbol for the radial displacements are

Evaluating one of these

The other Christoffel symbol that contributes to the radial motion is

and the geodesic equation becomes

with

The radial acceleration of the light ray in the elevator is thus

The first term on the right is free-fall in gravity, just as was obtained from the Equivalence Principle.  The second term is a higher-order correction caused by curvature of spacetime.  The third term is the motion of the light ray relative to the curved ceiling of the elevator in this spherical geometry and hence is a kinematic (or geometric) artefact.  (It is interesting that the GR correction on the curved-ceiling correction is of the same order as the free-fall term, so one would need to be very careful doing such an experiment … if it were at all measurable.) Therefore, the second and third terms are curved-geometry effects while the first term is the free fall of the light ray.
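For completeness, here is one way to reconstruct the missing equations, assuming a photon moving transversely (dr/dt = 0) in the equatorial plane, with dt/dλ treated as constant at the instant considered; the grouping into three terms follows the discussion above.

```latex
% Schwarzschild metric in the equatorial plane:
ds^2 = -\left(1-\frac{2GM}{rc^2}\right)c^2 dt^2
       + \left(1-\frac{2GM}{rc^2}\right)^{-1} dr^2 + r^2 d\phi^2

% Radial geodesic for a transverse photon (dr/dt = 0; the null condition
% eliminates d\phi/dt in favor of dt):
\frac{d^2 r}{dt^2} \approx -\frac{GM}{r^2}\left(1-\frac{2GM}{rc^2}\right)
                           + \frac{c^2}{r}\left(1-\frac{2GM}{rc^2}\right)^2

% Expanding to first order in GM/rc^2:
\frac{d^2 r}{dt^2} \approx \underbrace{-\frac{GM}{r^2}}_{\text{free fall}}
 + \underbrace{\frac{2G^2M^2}{c^2 r^3}}_{\text{GR correction}}
 + \underbrace{\frac{c^2}{r}\left(1-\frac{4GM}{rc^2}\right)}_{\text{curved-ceiling (kinematic) term}}
```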


  

Post-Script: The Importance of Library Collections

I was amused to see the library card of the scanned Internet Archive version of Einstein’s Jahrbuch article, shown below. The volume was checked out in August of 1981 from the UC Berkeley Physics Library. It was checked out again 7 years later in September of 1988. These dates coincide with when I arrived at Berkeley to start grad school in physics, and when I graduated from Berkeley to start my post-doc position at Bell Labs. Hence this library card serves as the bookends to my time in Berkeley, a truly exhilarating place that was the top-ranked physics department at that time, with 7 active Nobel Prize winners on its faculty.

During my years at Berkeley, I scoured the stacks of the Physics Library looking for books and journals of historical importance, and was amazed to find the original volumes of Annalen der Physik from 1905 where Einstein published his famous works. This was the same library where, ten years before me, John Clauser was browsing the stacks and found the obscure paper by John Bell on his inequalities that led to Clauser’s experiment on entanglement that won him the Nobel Prize of 2022.

That library at UC Berkeley was recently closed, as was the Physics Library in my department at Purdue University (see my recent Blog), where I also scoured the stacks for rare gems. Some ancient books that I used to be able to check out on a whim, just to soak up their vintage ambience and to get a tactile feel for the real thing held in my hands, are now not even available through Interlibrary Loan. I may be able to get scans from Internet Archive online, but the palpable magic of the moment of discovery is lost.

References:

[1] Einstein, A. (1905). Zur Elektrodynamik bewegter Körper. Annalen der Physik, 17(10), 891–921.

[2] Pais, A (2005). Subtle is the Lord: The Science and Life of Albert Einstein (Oxford University Press). pg. 178

[3] Einstein, A. (1907). Über das Relativitätsprinzip und die aus demselben gezogenen Folgerungen. Jahrbuch der Radioaktivität und Elektronik, 4, 411–462.

[4] A. Einstein (1915), “On the general theory of relativity,” Sitzungsberichte der Königlich Preussischen Akademie der Wissenschaften, pp. 778-786, Nov.

[5] Einstein, A. (1916). Über die spezielle und die allgemeine Relativitätstheorie (Gemeinverständlich). Braunschweig: Friedr. Vieweg & Sohn.

[6] Einstein, A. (1920). Relativity: The Special and the General Theory (A Popular Exposition) (R. W. Lawson, Trans.). London: Methuen & Co. Ltd.

[7] Nolte, D. D. (2018). Galileo Unbound. A Path Across Life, the Universe and Everything. (Oxford University Press)

[8] Nolte, D. D. (2019). Introduction to Modern Dynamics: Chaos, Networks, Space and Time (Oxford University Press).

Read more in Books by David Nolte at Oxford University Press.

A Short History of Neural Networks

When it comes to questions about the human condition, the question of intelligence is at the top. What is the origin of our intelligence? How intelligent are we? And how intelligent can we make other things…things like artificial neural networks?

This is a short history of the science and technology of neural networks, not just artificial neural networks but also the natural, organic type, because theories of natural intelligence are at the core of theories of artificial intelligence. Without understanding our own intelligence, we probably have no hope of creating the artificial type.

Ramon y Cajal (1888): Visualizing Neurons

The story begins with Santiago Ramón y Cajal (1852 – 1934) who received the Nobel Prize in physiology in 1906 for his work illuminating natural neural networks. He built on work by Camillo Golgi, using a stain to give intracellular components contrast [1], and then went further to develop his own silver stains, akin to the emulsions of early photography (which was one of his hobbies). Cajal was the first to show that neurons were individual constituents of neural matter and that their contacts were sequential: axons of sets of neurons contacted the dendrites of other sets of neurons, never axon-to-axon or dendrite-to-dendrite, to create a complex communication network. This became known as the neuron doctrine, and it is a central idea of neuroscience today.

Fig. 1 One of Cajal’s published plates demonstrating neural synapses. From Link.

McCulloch and Pitts (1943): Mathematical Models

In 1941, Warren S. McCulloch (1898–1969) arrived at the Department of Psychiatry at the University of Illinois at Chicago where he met with the mathematical biology group at the University of Chicago led by Nicolas Rashevsky (1899–1972), widely acknowledged as the father of mathematical biophysics in the United States.

An itinerant member of Rashevsky’s group at the time was a brilliant, young and unusual mathematician, Walter Pitts (1923– 1969). He was not enrolled as a student at Chicago, but had simply “showed up” one day as a teenager at Rashevsky’s office door.  Rashevsky was so impressed by Pitts that he invited him to attend the group meetings, and Pitts became interested in the application of mathematical logic to biological information systems.

When McCulloch met Pitts, he realized that Pitts had the mathematical background that complemented his own views of brain activity as computational processes. Pitts was homeless at the time, so McCulloch invited him to live with his family, giving the two men ample time to work together on their mutual obsession to provide a logical basis for brain activity in the way that Turing had provided it for computation.

McCulloch and Pitts simplified the operation of individual neurons to their most fundamental character, envisioning a neural computing unit with multiple inputs (received from upstream neurons) and a single on-off output (sent to downstream neurons), with the additional possibility of feedback loops as downstream neurons fed back onto upstream neurons. They also discretized the dynamics in time, using discrete logic and time-difference equations, and succeeded in devising a logical structure with rules and equations for the general operation of nets of neurons.  They published their results in a 1943 paper titled “A logical calculus of the ideas immanent in nervous activity” [2], introducing computational language and logic to neuroscience.  Their simplified neural unit became the basis for discrete logic, picked up a few years later by von Neumann as an elemental example of a logic gate upon which he began constructing the theory and design of the modern electronic computer.
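A McCulloch-Pitts unit can be written in a few lines. This is a schematic reconstruction in modern code, not their original notation: the neuron fires (output 1) if the weighted sum of its excitatory inputs reaches a threshold and no inhibitory input is active.

```python
# Schematic McCulloch-Pitts neuron: all-or-nothing output from thresholded inputs.
def mcculloch_pitts(inputs, weights, threshold, inhibitory=()):
    """inputs: list of 0/1 signals; inhibitory: indices that veto firing."""
    if any(inputs[i] for i in inhibitory):        # absolute inhibition
        return 0
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# Example: a two-input AND gate built from a single unit
print(mcculloch_pitts([1, 1], [1, 1], threshold=2))   # -> 1
print(mcculloch_pitts([1, 0], [1, 1], threshold=2))   # -> 0
```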

Fig. 2 The only figure in McCulloch and Pitt’s “Logical Calculus”.

Donald Hebb (1949): Hebbian Learning

The basic model for learning and adjustment of synaptic weights among neurons was put forward in 1949 by the physiological psychologist Donald Hebb (1904-1985) of McGill University in Canada in a book titled The Organization of Behavior [3].

In Hebbian learning, an initially untrained network consists of many neurons with many synapses having random synaptic weights. During learning, a synapse between two neurons is strengthened when both the pre-synaptic and post-synaptic neurons are firing simultaneously. In this model, it is essential that each neuron makes many synaptic contacts with other neurons because it requires many input neurons acting in concert to trigger the output neuron. In this way, synapses are strengthened when there is collective action among the neurons. The synaptic strengths are therefore altered through a form of self-organization. A collective response of the network strengthens all those synapses that are responsible for the response, while the other synapses that do not contribute, weaken. Despite the simplicity of this model, it has been surprisingly robust, standing up as a general principle for the training of artificial neural networks.
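In its simplest modern form (a schematic sketch, not Hebb's own formulation), the rule strengthens a synaptic weight in proportion to the correlated activity of the pre- and post-synaptic neurons:

```python
import numpy as np

def hebbian_update(W, pre, post, eta=0.01):
    """One Hebbian learning step: dW_ij = eta * post_i * pre_j."""
    return W + eta * np.outer(post, pre)

# Example: random weights strengthened by a correlated pre/post activity pattern
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(3, 4))     # 4 pre-synaptic -> 3 post-synaptic neurons
pre = np.array([1.0, 0.0, 1.0, 0.0])
post = np.array([1.0, 1.0, 0.0])
W = hebbian_update(W, pre, post)
```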

Fig. 3. A Figure from Hebb’s textbook on psychology (1958). From Link.

Hodgkin and Huxley (1952): Neuron Transporter Models

Alan Hodgkin (1914 – 1998) and Andrew Huxley (1917 – 2012) were English biophysicists who received the 1963 Nobel Prize in physiology for their work on the physics behind neural activation.  They constructed a differential equation for the spiking action potential for which their biggest conceptual challenge was the presence of time delays in the voltage signals that were not explained by linear models of the neural conductance. As they began exploring nonlinear models, using their experiments to guide the choice of parameters, they settled on a dynamical model in a four-dimensional phase space. One dimension was the membrane voltage; the three remaining dimensions were gating variables that controlled the sodium and potassium conductances, since sodium and potassium were the major ions they had determined to be participating in the generation and propagation of the action potential. The nonlinear conductances of their model described the observed time delays and captured the essential neural behavior of the fast spike followed by a slow recovery. Huxley solved the equations on a hand-cranked calculator, taking over three months of tedious cranking to plot the numerical results.
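The current-balance equation at the heart of the model, in the standard modern notation (with gating variables m, h and n rather than the conductances themselves as the dynamical variables), is:

```latex
% Hodgkin-Huxley current balance:
C \frac{dV}{dt} = I
  - \bar{g}_{\mathrm{Na}}\, m^3 h\, (V - E_{\mathrm{Na}})
  - \bar{g}_{\mathrm{K}}\, n^4\, (V - E_{\mathrm{K}})
  - \bar{g}_{\mathrm{L}}\, (V - E_{\mathrm{L}})

% with each gating variable x \in \{m, h, n\} obeying
\frac{dx}{dt} = \alpha_x(V)\,(1 - x) - \beta_x(V)\, x
```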

Fig. 4 The Hodgkin-Huxley model of the neuron, including capacitance C, voltage V and bias current I along with the conductances of the potassium (K), sodium (Na) and leakage (L) channels.

Hodgkin and Huxley published [4] their measurements and their model (known as the Hodgkin-Huxley model) in a series of six papers in 1952 that led to an explosion of research in electrophysiology, for which Hodgkin and Huxley won the 1963 Nobel Prize in physiology or medicine. The four-dimensional Hodgkin–Huxley model stands as a classic example of the power of phenomenological modeling when combined with accurate experimental observation. Hodgkin and Huxley were able to ascertain not only the existence of ion channels in the cell membrane, but also their relative numbers, long before these molecular channels were ever directly observed using electron microscopes. The Hodgkin–Huxley model lent itself to simplifications that could capture the essential behavior of neurons while stripping off the details.

Frank Rosenblatt (1958): The Perceptron

Frank Rosenblatt (1928–1971) had a PhD in psychology from Cornell University and was in charge of the cognitive systems section of the Cornell Aeronautical Laboratory (CAL) located in Buffalo, New York.  He was tasked with fulfilling a contract from the Navy to develop an analog image processor. Drawing from the work of McCulloch and Pitts, his team first built a software simulation and then a hardware model that adaptively updated the strengths of the inputs, which they called neural weights, as it was trained on test images. The machine was dubbed the Mark I Perceptron, and its announcement in 1958 created a small media frenzy [5]. A New York Times article reported the perceptron was “the embryo of an electronic computer that [the navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.”

The perceptron had a simple architecture, with two layers of neurons consisting of an input layer and a processing layer, and it was programmed by adjusting the synaptic weights to the inputs. This computing machine was the first to adaptively learn its functions, as opposed to following predetermined algorithms like digital computers. It seemed like a breakthrough in cognitive science and computing, as trumpeted by the New York Times.  But within a decade, the development had stalled because the architecture was too restrictive.
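The learning rule itself is only a few lines. This is a schematic modern reconstruction, not Rosenblatt's hardware: the weights are nudged whenever the thresholded output disagrees with the desired label.

```python
import numpy as np

def train_perceptron(X, y, epochs=20, eta=0.1):
    """X: (n_samples, n_features) inputs; y: 0/1 labels. Returns weights and bias."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            w += eta * (target - pred) * xi      # adjust weights only on errors
            b += eta * (target - pred)
    return w, b

# Example: learn a linearly separable OR function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 1])
w, b = train_perceptron(X, y)
```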

Fig. 5 Frank Rosenblatt with his Perceptron. From Link.

Richard Fitzhugh and Jin-Ichi Nagumo (1961): Neural van der Pol Oscillators

In 1961 Richard FitzHugh (1922–2007), a neurophysiology researcher at the National Institute of Neurological Disease and Blindness (NINDB) of the National Institutes of Health (NIH), created a surprisingly simple model of the neuron that retained only a third order nonlinearity, just like the third-order nonlinearity that Rayleigh had proposed and solved in 1883, and that van der Pol extended in 1926. Around the same time that FitzHugh proposed his mathematical model [6], the electronics engineer Jin-Ichi Nagumo (1926-1999) in Japan created an electronic diode circuit with an equivalent circuit model that mimicked neural oscillations [7]. Together, this work by FitzHugh and Nagumo led to the so-called FitzHugh–Nagumo model. The conceptual importance of this model is that it demonstrated that the neuron was a self-oscillator, just like a violin string or wheel shimmy or the pacemaker cells of the heart. Once again, self-oscillators showed themselves to be common elements of a complex world—and especially of life.
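In one common modern form (parameter names vary between references), the FitzHugh-Nagumo equations are:

```latex
% FitzHugh-Nagumo model: fast voltage-like variable v, slow recovery variable w
\frac{dv}{dt} = v - \frac{v^3}{3} - w + I
\qquad
\frac{dw}{dt} = \varepsilon\,(v + a - b\,w)
```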

Fig. 6 The FitzHugh-Nagumo model of the neuron simplifies the Hodgkin-Huxley model from four dimensions down to two dimensions of voltage V and channel activation n.

John Hopfield (1982): Spin Glasses and Recurrent Networks

John Hopfield (1933–) received his PhD from Cornell University in 1958, advised by Al Overhauser in solid state theory, and he continued to work on a broad range of topics in solid state physics as he wandered from appointment to appointment at Bell Labs, Berkeley, Princeton, and Cal Tech. In the 1970s Hopfield’s interests broadened into the field of biophysics, where he used his expertise in quantum tunneling to study quantum effects in biomolecules, and expanded further to include information transfer processes in DNA and RNA. In the early 1980s, he became aware of aspects of neural network research and was struck by the similarities between McColloch and Pitts’ idealized neuronal units and the physics of magnetism. For instance, there is a type of disordered magnetic material called a spin glass in which a large number of local regions of magnetism are randomly oriented. In the language of solid-state physics, one says that the potential energy function of a spin glass has a large number of local minima into which various magnetic configurations can be trapped. In the language of dynamics, one says that the dynamical system has a large number of basins of attraction [8].
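The analogy can be made concrete with a tiny Hopfield network (a schematic reconstruction in modern notation, not Hopfield's 1982 notation): binary neurons update asynchronously so as to descend an energy function whose minima are the stored memories, just as a spin glass settles into one of its basins of attraction.

```python
import numpy as np

def store_patterns(patterns):
    """Hebbian storage: W is the sum of outer products of the (+1/-1) patterns."""
    W = sum(np.outer(p, p) for p in patterns).astype(float)
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, state, steps=100):
    """Asynchronous updates that descend the energy E = -1/2 s.W.s."""
    s = state.copy()
    rng = np.random.default_rng(1)
    for _ in range(steps):
        i = rng.integers(len(s))
        h = W[i] @ s                  # local field on neuron i
        if h != 0:
            s[i] = 1 if h > 0 else -1
    return s

patterns = np.array([[1, -1, 1, -1, 1, -1], [1, 1, -1, -1, 1, 1]])
W = store_patterns(patterns)
noisy = np.array([-1, -1, 1, -1, 1, -1])   # corrupted copy of the first memory
print(recall(W, noisy))                    # -> recovers the first stored pattern
```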

The Parallel Distributed Processing Group (1986): Backpropagation

David Rumelhart, a mathematical psychologist at UC San Diego, was joined by James McClelland in 1974 and then by Geoffrey Hinton in 1978 to form what they called the Parallel Distributed Processing (PDP) group. The central tenets of the PDP framework they developed were that: 1) processing is distributed across many semi-autonomous neural units, which 2) learn by adjusting the weights of their interconnections based on the strengths of their signals (i.e., Hebbian learning), and whose memories and behaviors are 3) an emergent property of the distributed learned weights.

PDP was an exciting framework for artificial intelligence, and it captured the general behavior of natural neural networks, but it had a serious problem: How could all of the neural weights be trained?

In 1986, Rumelhart and Hinton, with the mathematician Ronald Williams, developed a mathematical procedure for training neural weights called error backpropagation [9]. The idea is simple to state: define a mean squared error between the response of the neural network and the ideal response, then use the chain rule to compute, for every weight at once, how the error changes when that weight changes. Each weight is then nudged in the direction that decreases the error, iteration after iteration, until the mean squared error is minimized. In this way, large numbers of neural weights can be adjusted as the network is trained to perform a specified task.
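A minimal numerical illustration of the idea (modern notation and a made-up toy task, not the 1986 paper's example) looks like this: a forward pass, then a backward pass in which the chain rule delivers all the weight gradients at once.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny two-layer network trained by gradient descent on the mean squared error.
X = rng.normal(size=(200, 2))                     # inputs
y = (X[:, :1] * X[:, 1:] > 0).astype(float)       # XOR-like target labels
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
eta = 0.1

for epoch in range(2000):
    # forward pass
    h = np.tanh(X @ W1 + b1)
    out = h @ W2 + b2
    err = out - y                                  # derivative of 0.5*MSE w.r.t. out
    # backward pass: the chain rule gives every weight sensitivity at once
    dW2 = h.T @ err / len(X)
    db2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1 - h**2)                 # back-propagate through tanh
    dW1 = X.T @ dh / len(X)
    db1 = dh.mean(axis=0)
    W2 -= eta * dW2; b2 -= eta * db2
    W1 -= eta * dW1; b1 -= eta * db1

print("final MSE:", float(np.mean(err**2)))        # the error falls as training proceeds
```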

Error backpropagation has come a long way from that early 1986 paper, and it now lies at the core of the AI revolution we are experiencing today as billions of neural weights are trained on massive datasets.

Yann LeCun (1989): Convolutional Neural Networks

In 1988, I was a new post-doc at AT&T Bell Labs in Holmdel, New Jersey, fresh out of my PhD in physics from Berkeley. Bell Labs liked to give its incoming employees inspirational talks and tours of their facilities, and one of the tours I took was of the neural network lab run by Lawrence Jackel that was working on computer recognition of zip-code digits. The team’s new post-doc, arriving at Bell Labs the same time as me, was Yann LeCun. It is very possible that the demo our little group watched was run by him, or at least he was there, but at the time he was a nobody, so even if I had heard his name, it wouldn’t have meant anything to me.

Fast forward to today, and Yann LeCun’s name is almost synonymous with AI. He is the Chief AI Scientist at Facebook, and his Google Scholar page reports that he gets 50,000 citations per year.

LeCun is famous for developing the convolutional neural network (CNN) in work that he published from Bell Labs in 1989 [10]. It is a biomimetic neural network that takes its inspiration from the receptive fields of the neural networks in the retina. What you think you see, when you look at something, is actually reconstructed by your brain. Your retina is a neural processor with receptive fields that are a far cry from one-to-one. Most prominent in the retina are center-surround fields, or kernels, that respond to the derivatives of the focused image instead of the image itself. It’s the derivatives that are sent up your optic nerve to your brain, which then reconstructs the image. It works as a form of image compression so that broad uniform areas in an image are reduced to their edges.

The convolutional neural network works in the same way; it is just engineered specifically to produce compressed, multiscale codes that capture broad areas as well as the fine details of an image. By constructing many different “kernel” operators at many different scales, it creates a set of features that captures the nuances of the image in quantitative form, which is then processed by training neural weights in downstream neural networks.

Fig. 7 Example of a receptive field of a CNN. The filter is the kernel (in this case a discrete 3×3 Laplace operator) that is stepped sequentially across the image field to produce the Laplacian feature map of the original image. One feature map for every different kernel becomes the input for the next level of kernels in a hierarchical scaling structure.
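The kernel operation in the figure is just a discrete convolution. Here is a bare-bones version (an illustrative sketch, not LeCun's code), using the 3×3 Laplacian mentioned in the caption:

```python
import numpy as np

def convolve2d(image, kernel):
    """Valid-mode 2D convolution: step the kernel across the image."""
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

laplacian = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]])           # discrete 3x3 Laplace operator

image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0                         # a bright square on a dark background
feature_map = convolve2d(image, laplacian)    # responds only at the edges of the square
```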

Geoff Hinton (2006): Deep Belief

It seems like Geoff Hinton has had his finger in almost every pie when it comes to how we do AI today. Backpropagation? Geoff Hinton. Rectified Linear Units? Geoff Hinton. Boltzmann Machines? Geoff Hinton. t-SNE? Geoff Hinton. Dropout regularization? Geoff Hinton. AlexNet? Geoff Hinton. The 2024 Nobel Prize in Physics? Geoff Hinton! He may not have invented all of these, but he was in the midst of it all.

Hinton received his PhD in Artificial Intelligence (a rare field at the time) from the University of Edinburgh in 1978, after which he joined the PDP group at UCSD (see above) as a post-doc. After a time at Carnegie Mellon, he joined the University of Toronto, Canada, in 1987, where he established one of the leading groups in the world on neural network research. It was from there that he launched so many of the ideas and techniques that have become the core of deep learning.

A central idea of deep learning came from Hinton’s work on Boltzmann Machines that learn statistical distributions of complex data. This type of neural network is known as an energy-based model, similar to a Hopfield network, and it has strong ties to the statistical mechanics of spin-glass systems. Unfortunately, it is notoriously difficult to train! So Hinton simplified it into a Restricted Boltzmann Machine (RBM) that was much more tractable, and layers of RBMs could be stacked into “Deep Belief Networks” [11] with a hierarchical structure that allowed the neural nets to learn layers of abstractions. These were among the first deep networks able to perform complex tasks at the level of human capabilities (and sometimes beyond).

The breakthrough that propelled Geoff Hinton to worldwide acclaim was the success of AlexNet, a neural network constructed by his graduate student Alex Krizhevsky at Toronto in 2012, consisting of 650,000 neurons with 60 million parameters trained on two early Nvidia GPUs. It won the ImageNet challenge that year on the strength of its deep architecture, a marked advance in a trend that has continued unabated to today.

Deep learning is now the rule in AI, supported by the Attention mechanism and Transformers that underpin the large language models, like ChatGPT and others, that are poised to disrupt all the legacy business models based on the previous silicon revolution of 50 years ago.

Further Reading

(Sections of this article have been excerpted from Chapter 11 of Galileo Unbound (Oxford University Press).)

References

[1] Ramón y Cajal S. (1888). Estructura de los centros nerviosos de las aves. Rev. Trim. Histol. Norm. Pat. 1, 1–10.

[2] McCulloch, W.S. and W. Pitts, A Logical Calculus of the Ideas Immanent in Nervous Activity. Bull. Math. Biophys., 1943. 5: p. 115.

[3] Hebb, D. O. (1949). The Organization of Behavior: A Neuropsychological Theory. New York: Wiley and Sons. ISBN 978-0-471-36727-7 – via Internet Archive.

[4] Hodgkin AL, Huxley AF (August 1952). “A quantitative description of membrane current and its application to conduction and excitation in nerve”. The Journal of Physiology. 117 (4): 500–44.

[5] Rosenblatt, Frank (1957). “The Perceptron—a perceiving and recognizing automaton”. Report 85-460-1. Cornell Aeronautical Laboratory.

[6] FitzHugh, Richard (July 1961). “Impulses and Physiological States in Theoretical Models of Nerve Membrane”. Biophysical Journal. 1 (6): 445–466.

[7] Nagumo, J.; Arimoto, S.; Yoshizawa, S. (October 1962). “An Active Pulse Transmission Line Simulating Nerve Axon”. Proceedings of the IRE. 50 (10): 2061–2070.

[8] Hopfield, J. J. (1982). “Neural networks and physical systems with emergent collective computational abilities”. Proceedings of the National Academy of Sciences. 79 (8): 2554–2558.

[9] Rumelhart, D. E., Hinton, G. E., and Williams, R. J. (1986). Learning representations by back-propagating errors. Nature 323, 533-536.

[10] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard and L. D. Jackel: Backpropagation Applied to Handwritten Zip Code Recognition, Neural Computation, 1(4):541–551, Winter 1989.

[11] G. E. Hinton, S. Osindero, and Y. W. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation 18, 1527-1554 (2006).

Read more in Books by David D. Nolte at Oxford University Press

Frontiers of Physics (2024): Dark Energy Thawing

At the turn of the New Year, as I turn to the breakthroughs in physics of the previous year, sifting through the candidates, I usually narrow it down to about 4 to 6 that I find personally compelling (See, for instance 2023, 2022). In a given year, they may be related to things like supersolids, condensed atoms, or quantum entanglement. Often they relate to those awful, embarrassing gaps in physics knowledge that we give euphemistic names to, like “Dark Energy” and “Dark Matter” (although in the end they may be neither energy nor matter). But this year, as I sifted, I was struck by how many of the “physics” advances of the past year were focused on pushing limits—lower temperatures, more qubits, larger distances.

If you want something that is eventually useful, then engineering is the way to go, and many of the potential breakthroughs of 2024 did require heroic efforts. But if you are looking for a paradigm shift—a new way of seeing or thinking about our reality—then bigger, better and farther won’t give you that. We may be pushing the boundaries, but the thinking stays the same.

Therefore, for 2024, I have replaced “breakthrough” with a single “prospect” that may force us to change our thinking about the universe and the fundamental forces behind it.

This prospect is the weakening of dark energy over time.

It is a “prospect” because it is not yet absolutely confirmed. If it is confirmed in the next few years, then it changes our view of reality. If it is not confirmed, then it still forces us to think harder about fundamental questions, pointing where to look next.

Einstein’s Cosmological “Constant”

Like so much of physics today, the origins of this story go back to Einstein. At the height of WWI in 1917, as Einstein was working in Berlin, he “tweaked” his new theory of general relativity to allow the universe to be static. The tweak came in the form of a parameter he labelled Lambda (Λ), providing a counterbalance against the gravitational collapse of the universe, which at the time was assumed to have a time-invariant density. This cosmological “constant” of spacetime represented a pressure that kept the universe inflated like a balloon.

Fig. 1 Einstein’s “Field Equations” for the universe containing expressions for curvature, the metric tensor and energy density. Spacetime is warped by energy density, and trajectories within the warped spacetime follow geodesic curves. When Λ = 0, only gravitational attraction is present. When Λ ≠ 0, a “repulsive” background force exerts a pressure on spacetime, keeping it inflated like a balloon.
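In modern notation, the field equations described in the caption read (standard form and sign conventions):

```latex
% Einstein field equations with the cosmological constant:
G_{\mu\nu} + \Lambda\, g_{\mu\nu} = \frac{8\pi G}{c^4}\, T_{\mu\nu}
```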

Later, in 1929 when Edwin Hubble discovered that the universe was not static but was expanding, and not only expanding, but apparently on a free trajectory originating at some point in the past (the Big Bang), Einstein zeroed out his cosmological constant, viewing it as one of his greatest blunders.

And so it stood until 1998 when two teams announced that the expansion of the universe is accelerating—and Einstein’s cosmological constant was back in. In addition, measurements of the energy density of the universe showed that the cosmological constant was contributing around 68% of the total energy density, which has been given the name of Dark Energy. One of the ways to measure Dark Energy is through BAO.

Baryon Acoustic Oscillations (BAO)

If the goal of science communication is to be transparent, and to engage the public in the heroic pursuit of pure science, then the moniker Baryon Acoustic Oscillations (BAO) was perhaps the wrong turn of phrase. “Cosmic Ripples” might have been a better analogy (and a bit more poetic).

In the early moments after the Big Bang, slight density fluctuations set up a balance of opposing effects between gravitational attraction, which tends to clump matter, and the homogenizing effect of the hot photon background, which tends to disperse ionized matter. Matter consists of both dark matter and the matter we are composed of, known as baryonic matter. Only baryonic matter can be ionized and hence interact with photons, so only photons and baryons experience this balance. As the universe expanded, an initial clump of baryons and photons expanded outward together, like the ripples on a millpond caused by a thrown pebble. And because the early universe had many clumps (and anti-clumps where the density was lower than average), the millpond ripples were like those from a gentle rain, with many expanding ringlets overlapping.

Fig. 2 Overlapping ripples showing galaxies formed along the shells. The size of the shells is set by the speed of “sound” in the universe. From [Ref].
Fig. 3 Left. Galaxies formed on acoustic ringlets like drops of dew on a spider’s web. Right. Many ringlets overlapping. The characteristic size of the ringlets can still be extracted statistically. From [Ref].

Then, about 400,000 years after the Big Bang, as the universe expanded and cooled, it became cold enough for the free electrons and the ionized baryons to combine into neutral atoms that are transparent to light. Light suddenly flew free, decoupled from the matter that had constrained it. Removing the balance between light and matter in the BAO caused the baryonic ripples to freeze in place, as if a sudden arctic blast froze the millpond in an instant. The residual clumps of matter in the early universe became the clumps of galaxies in the modern universe that we can see and measure. We can also see the effects of those clumps in the temperature fluctuations of the cosmic microwave background (CMB).

Between these two—the BAO and the CMB—it is possible to measure cosmic distances, and with those distances, to measure how fast the universe is expanding.
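In equation form (standard cosmology notation, not symbols taken from the DESI papers), the frozen ripple acts as a "standard ruler": its comoving size r_d subtends an angle on the sky set by the distance to the galaxies, and that distance encodes the expansion history H(z).

```latex
% BAO as a standard ruler in a flat universe:
\theta(z) \approx \frac{r_d}{D_M(z)},
\qquad
D_M(z) = \int_0^{z} \frac{c\, dz'}{H(z')}
```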

Acceleration Slowing

The Dark Energy Spectroscopic Instrument (DESI) on top of Kitt Peak in Arizona is measuring the distances to millions of galaxies using automated fiber optic arrays containing thousands of optical fibers. In one year it measured the distances to about 6 million galaxies.

Fig. 4 The Kitt Peak observatory, the site of DESI. From [Ref].

By focusing on seven “epochs” in galaxy formation in the universe, it measures the sizes of the BAO ripples over time, ranging in ages from 3 billion to 11 billion years ago. (The universe is about 13.8 billion years old.) The relative sizes are then compared to the predictions of the LCDM (Lambda-Cold-Dark-Matter) model. This is the “consensus” model of the day—agreed upon as being “most likely” to explain observations. If Dark Energy is a true constant, then the relative sizes of the ripples should all be the same, regardless of how far back in time we look.

But what the DESI data show is that the relative sizes in more recent epochs (a few billion years ago) are smaller than predicted by LCDM. Given that LCDM includes the acceleration of the expansion of the universe caused by Dark Energy, this means that Dark Energy has been slightly weaker over the past few billion years than it was 10 billion years ago—it is weakening, or “thawing”.
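The standard way to quantify this is the w0-wa (CPL) parametrization of the dark-energy equation of state, which I believe is also what the DESI analysis fits: a cosmological constant has w = -1 at all times, while the reported fits prefer w0 > -1 with wa < 0, i.e., dark energy that behaves less like a constant at late times.

```latex
% CPL parametrization of the dark-energy equation of state (a = scale factor):
w(a) = w_0 + w_a\,(1 - a)
% Lambda-CDM corresponds to w_0 = -1, w_a = 0.
```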

The measurements as they stand today are shown in Fig. 5, showing the relative sizes as a function of how far back in time they look, with a dashed line showing the deviation from the LCDM prediction. The error bars in the figure are not yet that impressive, and statistical effects may be driving the trend, so it might be erased by more measurements. But the BAO results have been augmented by recent measurements of supernovae (SNe) that provide additional support for thawing Dark Energy. Combined, the BAO+SNe results currently stand at about 3.4 sigma. The gold standard for “discovery” is about 5 sigma, so there is still room for this effect to disappear. So stay tuned—the final answer may be known within a few years.

Fig. 5 Seven “epochs” in the evolution of galaxies in the universe. This plot shows relative galactic distances as a function of time looking back towards the Big Bang (older times closer to the Big Bang are to the right side of the graph). In more recent times, relative distances are smaller than predicted by the consensus theory known as Lambda-Cold-Dark-Matter (LCDM), suggesting that Dark Energy is slightly weaker today than it was billions of years ago. The three left-most data points (with error bars from early 2024) are below the LCDM line. From [Ref].
Fig. 6 Annotated version of the previous figure. From [Ref].

The Future of Physics

The gravitational constant G is considered to be a constant property of nature, as is Planck’s constant h, and the charge of the electron e. None of these fundamental properties of physics are viewed as time dependent and none can be derived from basic principles. They are simply constants of our reality. But if Λ is time dependent, then it is not a fundamental constant and should be derivable and explainable.

And that will open up new physics.

100 Years of Quantum Physics:  Pauli’s Exclusion Principle (1924)

One hundred years ago this month, in December 1924, Wolfgang Pauli submitted a paper to Zeitschrift für Physik that provided the final piece of the puzzle that connected Bohr’s model of the atom to the structure of the periodic table.  In the process, he introduced a new quantum number into physics that governs how matter as extreme as neutron stars, or as perfect as superfluid helium, organizes itself.

He was led to this crucial insight, not by his superior understanding of quantum physics, which he was grappling with as much as Bohr and Born and Sommerfeld were at that time, but through his superior understanding of relativistic physics that convinced him that the magnetism of atoms in magnetic fields could not be explained through the orbital motion of electrons alone.

Encyclopedia Article on Relativity

Bored with the topics he was being taught in high school in Vienna, Pauli was already reading Einstein on relativity and Emil Jordan on functional analysis before he arrived at the university in Munich to begin studying with Arnold Sommerfeld.  Pauli was still merely a student when Felix Klein approached Sommerfeld to write an article on relativity theory for his Encyclopedia of Mathematical Sciences.  Sommerfeld by that time was thoroughly impressed with Pauli’s command of the subject and suggested that he write the article.


Pauli’s encyclopedia article on relativity expanded to 250 pages and was published in Klein’s fifth volume in 1921 when Pauli was only 21 years old—just 5 years after Einstein had published his definitive work himself!  Pauli’s article is still considered today one of the clearest explanations of both special and general relativity.

Pauli’s approach established the methodical use of metric space concepts that is still used today when teaching introductory courses on the topic.  This contrasts with articles written only a few years earlier that seem archaic by comparison—even Einstein’s paper itself.  As I recently read through his article, I was struck by how similar it is to what I teach from my textbook on modern dynamics to my class at Purdue University for junior physics majors.

Fig. 1 Wolfgang Pauli [Image]

Anomalous Zeeman Effect

After completing his doctoral thesis on the quantum theory of the hydrogen molecule ion, Pauli in 1922 began studying a phenomenon known as the anomalous Zeeman effect.  The Zeeman effect is the splitting of optical transitions in atoms under magnetic fields.  The electron orbital motion couples with the magnetic field through a semi-classical interaction between the magnetic moment of the orbital and the applied magnetic field, producing a contribution to the energy of the electron that is observed when it absorbs or emits light. 

The Bohr model of the atom had already concluded that the angular momentum of electron orbitals was quantized into integer units.  Furthermore, the Stern-Gerlach experiment of 1922 had shown that the projection of these angular momentum states onto the direction of the magnetic field was also quantized.  This was known at the time as “space quantization”.  Therefore, in the Zeeman effect, the quantized angular momentum created quantized energy interactions with the magnetic field, producing the splittings in the optical transitions.
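The size of the normal splittings follows directly from this picture (standard notation): each level shifts by an amount proportional to its space-quantization number m and the Bohr magneton.

```latex
% Normal Zeeman shift of a level with magnetic quantum number m in a field B:
\Delta E = m\, \mu_B B,
\qquad
\mu_B = \frac{e\hbar}{2 m_e}
% The anomalous effect required an extra factor (the Lande g-factor) and half-integer m.
```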

Fig. 2 The magnetic Zeeman splitting of Rb-87 from the weak-field to the strong-field (Paschen-Back) regime

So far so good.  But then comes the problem with the anomalous Zeeman effect.

In the Bohr model, all angular momenta have integer values.  But in the anomalous Zeeman effect, the splittings could only be explained with half integers.  For instance, if total angular momentum were equal to one-half, then in a magnetic field it would produce a “doublet” with +1/2 and -1/2 space quantization.  An integer like L = 1 would produce a triplet with +1, 0, and -1 space quantization.  Although doublets of the anomalous Zeeman effect were often observed, half-integers were unheard of (so far) in the quantum numbers of early quantum physics.

But half integers were not the only problem with “2”s in the atoms and elements.  There was also the problem of the periodic table. It, too, seemed to be constructed out of “2”s, multiplying a sequence of the difference of squares.

The Difference of Squares

The difference of squares has a long history in physics stretching all the way back to Galileo Galilei who performed experiments around 1605 on the physics of falling bodies.  He noted that the distance traveled in successive time intervals varied as the differences 1² – 0² = 1, then 2² – 1² = 3, then 3² – 2² = 5, then 4² – 3² = 7 and so on.  In other words, the distances traveled in each successive time interval varied as the odd integers.  Galileo, ever the astute student of physics, recognized that the distance traveled by an accelerating body in a time t varied as the square of time t².  Today, after Newton, we know that this is simply the dependence of distance for an accelerating body on the square of time, s = (1/2)gt².

By early 1924 there was another law of the difference of squares.  But this time the physics was buried deep inside the new science of the elements, put on graphic display through the periodic table. 

The periodic table is constructed on the difference of squares.  First there is 2 for hydrogen and helium.  Then another 2 for lithium and beryllium, followed by 6 for B, C, N, O, F and Ne to make a total of 8.  After that there is another 8 plus 10 for the sequence of Sc, Ti, V, Cr, Mn, Fe, Co, Ni, Cu and Zn to make a total of 18.  The sequence of 2-8-18 is 2·1² = 2, 2·2² = 8, 2·3² = 18, that is, the sequence 2n².
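The two number patterns in play here are simple to state side by side (my summary):

```latex
% Galileo: distances in successive equal time intervals go as the odd numbers,
% because of the difference of squares
(n+1)^2 - n^2 = 2n + 1 \;\;\Rightarrow\;\; 1,\, 3,\, 5,\, 7, \dots

% Periodic table: the capacities of the filled shells go as
2n^2 \;\;\Rightarrow\;\; 2,\, 8,\, 18,\, 32, \dots
```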

Why the periodic table should be constructed out of the number 2 times the square of the principal quantum number n was a complete mystery.  Sommerfeld went so far as to call the number sequence of the periodic table a “cabalistic” rule. 

The Bohr Model for Many Electrons

It is easy to picture how confusing this all was to Bohr and Born and others at the time.  From Bohr’s theory of the hydrogen atom, it was clear that there were different energy levels associated with the principal quantum number n, and that this was related directly to angular momentum through the motion of the electrons in the Bohr orbitals. 

But as the periodic table is built up from H to He and then to Li and Be and B, adding in successive additional electrons, one of the simplest questions was why the electrons did not all reside in the lowest energy level.  But even if that question could not be answered, there was the question of why, after He, the elements Li and Be behaved differently than B, N, O and F, leading to the noble gas Ne.  From normal Zeeman spectroscopy as well as x-ray transitions, it was clear that the noble gases behaved as the core of the succeeding elements, like He for Li and Be, and Ne for Na and Mg.

To grapple with all of this, Bohr had devised a “building up” rule for how electrons were “filling” the different energy levels as each new electron of the next element was considered.  The noble-gas core played a key role in this model, and the core was also assumed to be contributing to both the normal Zeeman effect as well as the anomalous Zeeman effect with its mysterious half-integer angular momenta.

But frankly, this core model was a mess, with ad hoc rules on how the additional electrons were filling the energy levels and how they were contributing to the total angular momentum.

This was the state of the problem when Pauli, with his exceptional understanding of special relativity, began to dig deep into the problem.  Since the Zeeman splittings were caused by the orbital motion of the electrons, the strongly bound electrons in high-Z atoms would be moving at speeds near the speed of light.  Pauli therefore calculated what the systematic effects would be on the Zeeman splittings as the Z of the atoms got larger and the relativistic effects got stronger.

He calculated this effect to high precision, and then waited for Landé to make the measurements.  When Landé finally got back to him, it was to say that there were absolutely no relativistic corrections for thallium (Z = 81).  The splitting remained simply fixed by the Bohr magneton value with no relativistic effects.

Pauli had no choice but to reject the existing core model of angular momentum and to ascribe the Zeeman effects to the outer valence electron.  But this was just the beginning.

Pauli’s Breakthrough

Fig. 5 Wolfgang Pauli

By November of 1924 Pauli had concluded, in a letter to Landé:

“In a puzzling, non-mechanical way, the valence electron manages to run about in two states with the same k but with different angular momenta.”

And in December of 1924 he submitted his work on the relativistic effects (or lack thereof) to Zeitschrift für Physik,

“From this viewpoint the doublet structure of the alkali spectra as well as the failure of Larmor’s theorem arise through a specific, classically non-describable sort of Zweideutigkeit (two-foldness) of the quantum-theoretical properties of the valence electron.” (Pauli, 1925a, p. 385)

Around this time, he read a paper by Edmund Stoner in the Philosophical Magazine of London published in October of 1924.  Stoner’s insight was a connection between the number of states observed in a magnetic field and the number of states filled in the successive positions of elements in the periodic table.  Stoner’s insight led naturally to the 2-8-18 sequence for the table, although he was still thinking in terms of the quantum numbers of the core model of the atoms.

This is when Pauli put 2 plus 2 together: he realized that the states of the atom could be indexed by a set of four quantum numbers: n, the principal quantum number; k₁, the angular momentum; m₁, the space-quantization number; and a new fourth quantum number m₂ that he introduced but that had, as yet, no mechanistic explanation.  With these four quantum numbers enumerated, he then made the major step:

It should be forbidden for more than one electron to have the same set of values of the quantum numbers.  When an electron takes on a set of values for the four quantum numbers, that state is occupied.

This is the Exclusion Principle:  No two electrons can have the same set of quantum numbers.  Or equivalently, no electron state can be occupied by two electrons.
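
To see how the counting works, here is a minimal sketch in Python (my own illustration, not part of the original argument) that enumerates the allowed combinations of quantum numbers using the modern labels n, l, m_l and m_s in place of Pauli’s n, k₁, m₁ and m₂; once each combination can hold only one electron, the shell capacities 2n² fall out automatically.

```python
# A minimal sketch (mine, not from the article): count the electron states
# allowed by the exclusion principle, using the modern quantum numbers
# n, l, m_l, m_s in place of Pauli's n, k1, m1, m2.

def states_in_shell(n):
    """List every allowed (n, l, m_l, m_s) combination for principal number n."""
    states = []
    for l in range(n):                    # l = 0, 1, ..., n-1
        for m_l in range(-l, l + 1):      # 2l + 1 spatial orientations
            for m_s in (-0.5, +0.5):      # Pauli's two-foldness (spin up or down)
                states.append((n, l, m_l, m_s))
    return states

for n in (1, 2, 3, 4):
    print(n, len(states_in_shell(n)), 2 * n * n)   # counts match 2n^2: 2, 8, 18, 32
```

Running it prints 2, 8, 18 and 32, which is exactly Sommerfeld’s “cabalistic” sequence.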

Fig. 6 Level filling for Krypton using the Pauli Exclusion Principle

Today, we know that Pauli’s Zweideutigkeit is electron spin, a concept first put forward in 1925 by the American physicist Ralph Kronig and later that year by George Uhlenbeck and Samuel Goudsmit.



And Pauli’s Exclusion Principle is a consequence of the antisymmetry of electron wavefunctions first described by Paul Dirac in 1926 after the introduction of wavefunctions into quantum theory by Erwin Schrödinger earlier that year.

Fig. 7 The periodic table today.

Timeline:

1845 – Faraday effect (rotation of light polarization in a magnetic field)

1896 – Zeeman effect (splitting of optical transition in a magnetic field)

1897 – Anomalous Zeeman effect (half-integer splittings)

1902 – Lorentz and Zeeman awarded Nobel prize (for electron theory)

1921 – Paschen-Back effect (strong-field Zeeman effect)

1922 – Stern-Gerlach (space quantization)

1924 – de Broglie matter waves

1924 – Bose statistics of photons

1924 – Stoner (conservation of number of states)

1924 – Pauli Exclusion Principle

References:

E. C. Stoner, Philosophical Magazine 48 (issue 286), 719 (October 1924).

M. Jammer, The conceptual development of quantum mechanics (Los Angeles, Calif.: Tomash Publishers, Woodbury, N.Y. : American Institute of Physics, 1989).

M. Massimi, Pauli’s exclusion principle: The origin and validation of a scientific principle (Cambridge University Press, 2005).

Pauli, W. Über den Einfluß der Geschwindigkeitsabhängigkeit der Elektronenmasse auf den Zeemaneffekt. Z. Physik 31, 373–385 (1925). https://doi.org/10.1007/BF02980592

Pauli, W. (1925). “Über den Zusammenhang des Abschlusses der Elektronengruppen im Atom mit der Komplexstruktur der Spektren”. Zeitschrift für Physik. 31 (1): 765–783

Read more in Books by David Nolte at Oxford University Press

Science Underground: Neutrino Physics and Deep Gold Mines

“By rights, we shouldn’t even be here,” says Samwise Gamgee to Frodo Baggins in the Peter Jackson movie The Lord of the Rings: The Two Towers.

But we are!

We, our world, our Galaxy, our Universe of matter, should not exist.  The laws of physics, as we currently know them, say that all the matter created at the instant of the Big Bang should have annihilated with all the anti-matter there too.  The great flash of creation should have been followed by a great flash of destruction, and all that should be left now is a faint glow of light without matter.

Except that we are here, and so is our world, and our Galaxy and our Universe … against the laws of physics as we know them.

So, there must be more that we have yet to know.  We are not done yet with the laws of physics.

Which is why the scientists of the Sanford Underground Research Facility (SURF), a kilometer deep under the Black Hills of South Dakota, are probing the deep questions of the universe near the bottom of a century-old gold mine.

Homestake Mine

>>> Twenty of us are plunging vertically at one meter per second into the depths of the earth, packed into a steel cage, seven to a row, dressed in hard hats and fluorescent safety vests and personal protective gear plus a gas filter that will keep us alive for a mere 60 minutes if something goes very wrong.  It is dark, except for periodic fast glimpses of LED-lit mine drifts flying skyward, then rock again, repeating over and over for ten minutes.  Drops of water laced with carbonate drip from the cage ceiling and, when they dry, leave little white stalagmites on our clothing.  A loud bang tells everyone inside that a falling boulder has crashed into the top of the cage, and we all instinctively press our hard hats more tightly onto our heads.  Finally, the cage slows, eventually to a crawl, as it settles to the 4100 level of the Homestake mine. <<<

The Homestake mine was founded in 1877 on land that had been deeded for all time to the Lakota Sioux by the United States Government in the Treaty of Fort Laramie in 1868—that is, before George Custer, twice cursed, found gold in the rolling forests of Ȟe Sápa—the Black Hills, South Dakota.  The prospectors rushed in, and the Lakota were pushed out.

Gold was found washed down in the streams around the town of Deadwood, but the source of the gold was traced a year later to the high Homestake site by prospectors.  The stake was too large for them to operate themselves, so they sold it to a California consortium headed by George Hearst, who moved into town and bought or stole all the land around it.  By 1890, the mine was producing the bulk of gold and silver in the US.  When George Hearst died in 1891, his wife Phoebe donated part of the fortune to building projects at the University of California at Berkeley, including the Hearst Mining Building, which was the largest building devoted to the science of mining engineering in the world.  Their son, William Randolph Hearst, became a famous newspaper magnate and a possible inspiration for Orson Welles’s Citizen Kane.

The interior of Hearst Mining Building, UC Berkeley campus.

By the end of the twentieth century, the mining company had excavated over 300 miles of tunnels and extracted nearly 40 million ounces of gold (equivalent to $100B today).  Over the years, the mine had gone deeper and deeper, eventually reaching the 8000-foot level (about 3000 feet below sea level).

This deep structure presented a unique opportunity for a nuclear chemist, Ray Davis, at Brookhaven National Laboratory, who was interested in the physics of neutrinos, the elementary particles that Enrico Fermi had named the “little neutral ones” and that accompany radioactive decay.

Neutrinos are unlike any other fundamental particles, passing through miles of solid rock as if it were transparent, except for exceedingly rare instances when a neutrino might collide with a nucleus.  However, neutrino detectors on the surface of the Earth were overwhelmed by signals from cosmic rays.  What was needed was a thick shield to protect the neutrino detector, and what better shield than thousands of feet of rock? 

Davis approached the Homestake mining company to request space in one of their tunnels for his detector.  While a mining company would not usually be receptive to requests like this, one of its senior advisors had previously had an academic career at Harvard, and he tipped the scales in favor of Davis.  The experiment would proceed.

The Solar Neutrino Problem

>>> After we disembark onto the 4100 level (4100 feet below the surface) from the Ross Shaft, we load onto the rail cars of a toy train, the track little more than a foot wide.  The diminutive engine clunks and clangs and jerks itself forward, gathering speed as it pushes and pulls us, disappearing into a dark hole (called a drift) on a mile-long trek to our experimental site.  Twice we get stuck, the engine wheels spinning without purchase, and it is not clear if the engineers can get it going again.

At this point we have been on the track for a quarter of an hour and the prospect of walking back to the Ross is daunting.  The only other way out, the Yates Shaft, is down for repairs.  The drift is unlit except by us with our battery-powered headlamps sweeping across the rock face, and who knows how long the batteries will last?  The ground is broken and uneven, punctuated with small pools of black water.  There would be a lot of stumbling and falls if we had to walk our way out.  I guess this is why I had to initial and sign in twenty different places on six pages, filled with legal jargon nearly as dense as the rock around us, before they let me come down here. <<<

In 1965, the Homestake mining crews carved out a side cavern for Davis near the Yates shaft at the 4850 level of the mine.  He constructed a large vat to hold cleaning fluid that contained lots of chlorine atoms.  When a rare neutrino interacted with a chlorine nucleus, the nucleus would convert to a radioactive argon atom.  Periodically, the handful of argon atoms was flushed from the vat and tallied by counting their decays.  From that tally, and by calculating how likely it was for a neutrino to interact with a nucleus, the total flux of neutrinos through the vat could be back-calculated.

The main source of neutrinos in our neck of the solar system is the sun.  As hydrogen fuses into helium, it gives off neutrinos.  These stream out through the overlying layers of the sun, through the Earth, and through Davis’s vat—except in those rare cases when a chlorine nucleus converts to argon.  The rate at which solar neutrinos should be detected in the vat was calculated very accurately by John Bahcall at Caltech.

By the early 1970s, there were enough data that the measured neutrino flux could be compared to the theoretical value based on the fusion reactions in the sun—and they didn’t match.  Worse, they disagreed by a factor of three!  There were three times fewer neutrino events detected than there should have been.  Where were all the missing neutrinos?

Origins and fluxes of solar neutrinos.

This came to be called the “solar neutrino problem”.  At first, everyone assumed that the experiment was wrong, but Davis knew he was right.  Then others said the theoretical values were wrong, but Bahcall knew he was right.  The problem was that they couldn’t both be right.  Or could they?

Enter neutrino oscillations

The neutrinos coming from the sun originate mostly as what are known as electron neutrinos.  These interact with a neutron in a chlorine nucleus to convert it to a proton plus an ejected electron.  But if the neutrino is of a different kind, perhaps a muon neutrino, then there isn’t enough energy to produce a muon, and the reaction doesn’t take place.

Hydrogen fusion in the sun.

This became the leading explanation for the missing solar neutrinos.  If many of them converted to muon neutrinos on their way to the Earth, then the Davis experiment wouldn’t detect them—hence the missing events.

Neutrinos can only oscillate from electron type to muon type if they have a very small but finite mass.  This, then, was the solution to the solar neutrino problem: neutrinos have mass.  Ray Davis was awarded a share of the Nobel Prize in Physics in 2002 for his detection of solar neutrinos.
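
In the simplest two-flavor picture, the probability that a neutrino of energy E has changed flavor after traveling a distance L is P = sin²(2θ)·sin²(1.27·Δm²·L/E), where θ is the mixing angle, Δm² is the difference of the squared masses in eV², L is in kilometers and E is in GeV.  If neutrinos were massless, Δm² would be zero and the flavor could never change, which is why the missing solar neutrinos imply that neutrinos have mass.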

But one solution begets another problem: the Standard Model of elementary particles says that neutrinos are massless.  So what is going on with the Standard Model?

Once again, the answer may be found deep underground.

Sanford Underground Research Facility (SURF)

>>> The rock of the Homestake is one of the hardest and densest rocks you will find, black as night yet shot through with white streaks of calcite like the tails of comets.  It is impermeable, and despite being so deep, the rock is surprisingly dry—most of the fractures are too tight to allow a trickle through. 

As our toy train picks up speed, the veins flash by in our headlamps, sometimes sparkling with pin pricks of reflected light.  A gold fleck perhaps?  Yet the drift as a whole (or as a hole) is a shabby thing, rusty wedges half buried in the ceiling to keep slabs from falling, bent and battered galvanized metal pinned to the walls by rock bolts to hold them back, flimsy metal webbing strung across the ceiling to keep boulders from crushing our heads.  It’s dirty and dark and damp and hewn haphazardly from the compressed crust.  There is no art, no sense of place.  These shafts were dynamited through, at three-to-five feet per detonation, driven by money and the need for the gold, so nobody had any sense of aesthetics. <<<

The Homestake mine closed operations in 2001 due to the low grade of ore and the sagging price of gold.  They continued pumping water from the mine for two more years in anticipation of handing the extensive underground facility over to the National Science Foundation for use as a deep underground science lab.  However, delays in the transfer and the cost of pumping forced them to turn off the pumps and the water slowly began rising through the levels, taking a year or more to rise and flood the famous 4850 level while negotiations continued. 

The surface buildings of the Sanford Underground Research Facility (SURF).
The open pit at Homestake.

Finally, the NSF took over the facility to house the Deep Underground Science and Engineering Laboratory (DUSEL) that would operate at the deepest levels, but these had already been flooded.  After a large donation from South Dakota banker T. Denny Sanford and support from Governor Mike Rounds, the facility became the Sanford Underground Research Facility (SURF).  The 4850 level was “dewatered”, and the lab was dedicated in 2009.  But things were still not settled.  NSF had second thoughts, and in 2011 the plans for DUSEL (still under water) were terminated and the lab was transferred to the Department of Energy (DOE), administered through the Lawrence Berkeley National Laboratory, to host experiments at the 4850 level and higher.

Layout of the mine levels at SURF.

Two early experiments at SURF were the Majorana Demonstrator and LUX. 

The Majorana Demonstrator was an experiment designed to look for neutrinoless double-beta decay.  In ordinary double-beta decay, two neutrons in a nucleus decay simultaneously, each emitting an electron and a neutrino.  A theory of neutrinos proposed by the Italian physicist Ettore Majorana in 1937, which goes beyond the Standard Model, says that the neutrino is its own antiparticle.  If this were the case, then the two neutrinos emitted in the double-beta decay could annihilate each other, hence a “neutrinoless” double-beta decay.  The Demonstrator was too small to actually see such an event, but it tested the concept and laid the groundwork for later, larger experiments.  It operated between 2016 and 2021.

Neutrinoless double-beta decay.

The Large Underground Xenon (LUX) experiment was a prototype for the search for dark matter.  Dark matter particles are expected to interact very weakly with ordinary matter (sort of like neutrinos, but even less interactive).  Such weakly interacting massive particles (WIMPs) might scatter off a nucleus in an atom of xenon, nudging the nucleus enough that it emits electrons and light.  These would be captured by detectors at the caps of the liquid-xenon container.

Once again, cosmic rays at the surface of the Earth would make the experiment unworkable, but deep underground there is much less background within which to look for the “needle in the haystack”.  LUX operated from 2009 to 2016 and was not big enough to hope to see a WIMP, but like the Demonstrator, it was a proof of principle showing that the idea worked and could be expanded into a much larger 7-ton experiment called LUX-ZEPLIN, which began in 2020 and is ongoing, looking for the biggest portion of mass in our universe.  (About a quarter of the energy of the universe is composed of dark matter.  The usual stuff we see around us only makes up about 4% of the energy of the universe.)

LUX-ZEPLIN Experiment

Deep Underground Neutrino Experiment (DUNE)

>>> “Always keep a sense of where you are,” Bill the geologist tells us, in case we must hike our way out.  But what sense is there?  I have a natural built-in compass that has served me well over the years, but it seems to run on the heavens.  When I visited South Africa, I had an eerie sense of disorientation the whole time I was there.  When you are a kilometer underground, the heavens are about as far away as Heaven.  There is no sense of orientation, only the sense of lefts and rights. 

We were told there would be signs directing us towards the Ross or Yates Shafts.  But once we are down here, it turns out that these “signs” are crudely spray-painted marks on the black rock, like bad graffiti.  When you see them, your first thought is of kids with spray cans making a mess—until you suddenly recognize an R or an O or two S’s along with an indistinct arrow that points slightly more one way than the other. <<<

Deep Underground Neutrino Experiment (DUNE).

One of the most ambitious high-energy experiments ever devised is the Long-Baseline Neutrino Facility (LBNF), which is 800 miles long.  It begins in Batavia, Illinois, at the Fermilab accelerator, which launches a beam of neutrinos that travel 800 miles through the Earth to detectors of the Deep Underground Neutrino Experiment (DUNE) at SURF in Lead, South Dakota.  The neutrinos are expected to oscillate in flavor, just like solar neutrinos, and the detection rates at DUNE could finally answer one of the biggest outstanding questions of physics: Why is our universe made of matter?

At the instant of the Big Bang, equal amounts of matter and antimatter should have been generated, and these should have annihilated in equal manner, and the universe should be filled with nothing but photons. But it’s not. Matter is everywhere. Why?

In the Standard Model there are many symmetries, each connected to a conserved property.  One powerful symmetry is known as CPT symmetry, where C is the symmetry of exchanging particles with their antiparticles, P is a reflection that exchanges left-handed and right-handed particles, and T is time-reversal symmetry.  The combination CPT is believed to be exact, so if time reversal were a true symmetry on its own, then CP would have to be a true symmetry as well.  But it’s not!

There is a strange meson called the kaon that does not decay the same way as its antiparticle, violating CP symmetry.  This was discovered in 1964 by James Cronin and Val Fitch, who won the 1980 Nobel Prize in Physics for it.  The discovery shocked the physics world.  Since then, additional violations of CP symmetry have been observed in the quark sector.  Such a broken symmetry is allowed in the Standard Model of particles, but the effect is so exceedingly small (CP is so extremely close to being a true symmetry) that it cannot explain the size of the matter-antimatter asymmetry in the universe.

Neutrino oscillations can also violate CP symmetry, but the effects have been hard to measure, and thus the need for DUNE.  By creating large numbers of neutrinos, beaming them 800 miles through the Earth, and detecting them in the vast liquid-argon vats in the underground caverns of SURF, the parameters of neutrino oscillation can be measured directly, possibly explaining the matter asymmetry of the universe, and answering Samwise’s question of why we are here.
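
As a rough illustration of the numbers involved, here is a small Python sketch (my own, not DUNE’s analysis) that plugs the Fermilab-to-SURF distance into the two-flavor oscillation formula quoted earlier; the mixing amplitude and mass-squared splitting below are assumed, order-of-magnitude values, not official DUNE parameters.

```python
import math

# Two-flavor neutrino oscillation probability (illustration only).
# P = sin^2(2*theta) * sin^2(1.27 * dm2[eV^2] * L[km] / E[GeV])
def oscillation_probability(L_km, E_GeV, sin2_2theta, dm2_eV2):
    phase = 1.27 * dm2_eV2 * L_km / E_GeV
    return sin2_2theta * math.sin(phase) ** 2

# Assumed inputs: an 800-mile (about 1300 km) baseline, a few-GeV beam,
# a near-maximal mixing amplitude, and an atmospheric-scale mass splitting.
L = 1300.0    # km, roughly Fermilab to SURF
E = 2.5       # GeV, a typical beam energy
print(oscillation_probability(L, E, sin2_2theta=1.0, dm2_eV2=2.5e-3))
# prints ~0.99: with these assumed values the beam sits near an oscillation maximum
```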

Center for Understanding Subsurface Signals and Permeability (CUSSP)

>>> Finally, in the distance, as we rush down the dark drift, we see a bright glow that grows to envelop us: a string of white LED lights.  The drift is not so shabby here, with fresh pipes and electrical cables laid neatly by the side.  We have arrived at the CUSSP experimental site.  It turns out to be just a few steps away from the inactive Yates Shaft which, had it been operating, would have removed the need for the crazy train ride through black rock along broken tunnels.  But that is OK, because we are here, and this is what brought us down into the Earth: questions that are decidedly down-to-Earth, about our future existence on this planet and about how to generate the power for our high-tech society without making our planet unlivable.  <<<

Not all the science at SURF is so ethereal. For instance, research on Enhanced Geothermal Systems (EGS) is funded by the DOE Office of Basic Energy Sciences.  Geothermal systems can generate power by extracting superheated water from underground to run turbines. However, superheated water is nasty stuff, very corrosive and full of minerals that tend to block up the fractures that the water flows through. The idea of enhanced geothermal systems is to drill boreholes and use “fracking” to create fractures in the hard rock, possibly refracturing older fractures that had become blocked. If this could be done reliably, then geothermal sites could be kept operating.

The Center for Understanding Subsurface Signals and Permeability (CUSSP) was recently funded by the DOE to use the facilities at SURF to study how well fracks can be controlled. The team is led by Pacific Northwest National Lab (PNNL) with collaborators from Lawrence Berkeley Lab, Maryland, Illinois and Purdue, among others. We are installing seismic equipment as well as electrical-resistivity instrumentation to monitor the induced fractures.

The CUSSP installation on the 4100 level was the destination of our underground journey, to see the boreholes in person and to get a sense of the fracture orientations at the drift wall. During the half hour at the site, rocks were examined, questions were answered, tall tales were told, and it was time to return.

Shooting to the Stars

>>> At the end of the tour, we pack again into the Ross cage and are thrust skyward at 2 meters per second—twice the speed as coming down because of the asymmetry of slack cables that could snag and snap.  Ears pop, and pop again, until the cage slows, and we settle to the exit level, relieved and tired and ready to see the sky. Thinking back, as we were shooting up the shaft, I imagined that the cage would never stop, flying up past the massive hoist, up and onward into the sky and to the stars.  <<<

In a video we had been shown about SURF, Jace DeCory, a scholar of the Lakota Sioux, spoke of the sacred ground of Ȟe Sápa—the Black Hills.  Are we taking again what is not ours?  This time it seems not.  The scientists of SURF are linking us to the stars, bringing knowledge instead of taking gold.  Jace quoted Carl Sagan: “We are made of star-stuff.”  Then she reminded us, the Lakota Sioux have known that all along.

Edward Purcell:  From Radiation to Resonance

As the days of winter darkened in 1945, several young physicists huddled in the basement of Harvard’s Research Laboratory of Physics, nursing a high field magnet to keep it from overheating and dumping its field.  They were working with bootstrapped equipment—begged, borrowed or “stolen” from various labs across the Harvard campus.  The physicist leading the experiment, Edward Mills Purcell, didn’t even work at Harvard—he was still on the payroll of the Radiation Laboratory at MIT, winding down from its war effort on radar research for the military in WWII, so the Harvard experiment was being done on nights and weekends.

Just before Christmas, 1945, as college students were fleeing campus for the first holiday in years without war, the signal generator, borrowed from a psychology lab, launched an electromagnetic signal into simple paraffin—and the signal disappeared!  It had been absorbed by the nuclear spins of the copious hydrogen nuclei (protons) in the wax.
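
In modern terms, the absorption happens only when the radio frequency matches the Larmor precession frequency of the spins in the applied magnetic field, f = γB/2π, which for protons works out to about 42.6 MHz per tesla, so the borrowed signal generator and the temperamental magnet had to be brought into step before anything could be seen.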

The experiment was simple, unfunded, bootstrapped—and it launched a new field of physics that ultimately led to magnetic resonance imaging (MRI) that is now the workhorse of 3D medical imaging.

This is the story, in Purcell’s own words, of how he came to the discovery of nuclear magnetic resonance in solids, for which he was awarded the Nobel Prize in Physics in 1952.

Early Days

Edward Mills Purcell (1912 – 1997) was born in a small town in Illinois, the son of a telephone businessman, and some of his earliest memories were of rummaging around in piles of telephone equipment—wires and transformers and capacitors. He especially liked the generators:

“You could always get plenty of the bell-ringing generators that were in the old telephones, which consisted of a series of horseshoe magnets making the stator field and an armature that was wound with what must have been a mile of number 39 wire or something like that… These made good shocking machines if nothing else.”

His science education in the small town was modest, mostly chemistry, but he had a physics teacher, a rare woman at that time, who was open to searching minds. When she told the students that you couldn’t pull yourself up using a single pulley, Purcell disagreed and got together with a friend:

“So we went into the barn after school and rigged this thing up with a seat and hooked the spring scales to the upgoing rope and then pulled on the downcoming rope.”

The experiment worked, of course, with the scale reading half the weight of the boy. When they rushed back to tell the physics teacher, she accepted their results immediately—demonstration trumped mere thought, and Purcell had just done his first physics experiment.
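
In modern terms, the result is just a force balance: the rope tension T is the same on both sides of the pulley, and two segments of rope (the one holding the seat and the one in the boy’s hands) support his weight W, so 2T = W and the scale reads T = W/2, half his weight.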

However, physics was not a profession in the early 1920’s.

“In the ’20s the idea of chemistry as a science was extremely well publicized and popular, so the young scientist of shall we say 1928 — you’d think of him as a chemist holding up his test tube and sighting through it or something…there was no idea of what it would mean to be a physicist.

The name Steinmetz was more familiar and exciting than the name Einstein, because Steinmetz was the famous electrical engineer at General Electric and was this hunchback with a cigar who was said to know the four-place logarithm table by heart.”

Purdue University and Prof. Lark-Horovitz

Purcell entered Purdue University in the Fall of 1929. The University had only 4500 students who paid $50 a year to attend. He chose a major in electrical engineering, because

“Being a physicist…I don’t remember considering that at that time as something you could be…you couldn’t major in physics. You see, Purdue had electrical, civil, mechanical and chemical engineering. It had something called the School of Science, and you could graduate, having majored in science.”

But he was drawn to physics. The Physics Department at Purdue was going through a renaissance under the leadership of its new department head, Prof. Karl Lark-Horovitz:

“His [Lark-Horovitz] coming to Purdue was really quite important for American physics in many ways…  It was he who subsequently over the years brought many important and productive European physicists to this country; they came to Purdue, passed through. And he began teaching; he began having graduate students and teaching really modern physics as of 1930, in his classes.”

Purcell attended Purdue during the early years of the Depression, when some students didn’t have enough money to find a home:

“People were also living down there in the cellar, sleeping on cots in the research rooms, because it was the Depression and some of the graduate students had nowhere else to live. I’d come in in the morning and find them shaving.”

Lark-Horovitz was a demanding department chair, but he was bringing the department out of the dark ages and into the modern research world.

“Lark-Horovitz ran the physics department on the European style: a pyramid with the professor at the top and everybody down below taking orders and doing what the professor thought ought to be done. This made working for him rather difficult. I was insulated by one layer from that because it was people like Yearian, for whom I was working, who had to deal with the Lark.”

Hubert Yearian had built a 20-kilovolt electron diffraction camera, a Debye-Scherrer transmission camera, just a few years after Davisson and Germer had performed the Nobel Prize-winning experiment at Bell Labs that proved the wavelike nature of electrons. Purcell helped Yearian build his own diffraction system, and recalled:

“When I turned on the light in the dark room, I had Debye-Scherrer rings on it from electron diffraction — and that was only five years after electron diffraction had been discovered. So it really was right in the forefront. And as just an undergraduate, to be able to do that at that time was fantastic.”

Purcell graduated from Purdue in 1933, and through contacts of Lark-Horovitz he was able to spend a year in the physics department at Karlsruhe in Germany. He returned to the US in 1934 to enter graduate school in physics at Harvard, working under Kenneth Bainbridge. His thesis topic was a bit of a bust, a dusty old problem in classical electrostatics far older than the electron diffraction he had worked on at Purdue. But it was enough to get him his degree in 1938, and he stayed on at Harvard as a faculty instructor until the war broke out.

Radiation Laboratory, MIT

In the fall of 1940 the Radiation Laboratory at MIT was launched and began vacuuming up all the unattached physicists in the United States, and Purcell was one of them. The Rad Lab also vacuumed up some of the top physicists in the country, like Isidor Rabi from Columbia, to supervise the growing army of scientists committed to the war effort—even before the US was in the war.

“Our mission was to make a radar for a British night fighter using 10-centimeter magnetron that had been discovered at Birmingham.”

This research turned Purcell and his cohort into experts in radio-frequency electronics and measurement. He worked closely with Rabi (Nobel Prize 1944), Norman Ramsey (Nobel Prize 1989) and Jerrold Zacharias, who were in the midst of measuring resonances in molecular beams. The roster of names at the Rad Lab read like a Who’s Who of physics at that time:

“And then there was the theoretical group, which was also under Rabi. Most of their theory was concerned with electromagnetic fields and signal to noise, things of that sort. George Uhlenbeck was in charge of it for quite a long time, and Bethe was in it for a while; Schwinger was in it; Frank Carlson; David Saxon, now president of the University of California; Goudsmit also.”

Nuclear Magnetic Resonance

The research by Rabi had established the physics of resonances in molecular beams, but there were serious doubts that such phenomena could be observed in solids. This became one of the Holy Grails of physics, with only a few physicists across the country having the skill and understanding to attempt to observe it in the solid state.

Many of the physicists at the Rad Lab were wondering what they should do next, after the war was over.

“Came the end of the war and we were all thinking about what shall we do when we go back and start doing physics. In the course of knocking around with these people, I had learned enough about what they had done in molecular beams to begin thinking about what can we do in the way of resonance with what we’ve learned. And it was out of that kind of talk that I was struck with the idea for what turned into nuclear magnetic resonance.”

“Well, that’s how NMR started, with that idea which, as I say, I can trace back to all those indirect influences of talking with Rabi, Ramsey and Zacharias, thinking about what we should do next.

“We actually did the first NMR experiment here [Harvard], not at MIT. But I wasn’t officially back. In fact, I went around MIT trying to borrow a magnet from somebody, a big magnet, get access to a big magnet so we could try it there and I didn’t have any luck. So I came back and talked to Curry Street, and he invited us to use his big old cosmic ray magnet which was out in the shed. So I didn’t ask anybody else’s permission. I came back and got the shop to make us some new pole pieces, and we borrowed some stuff here and there. We borrowed our signal generator from the Psycho Acoustic Lab that Smitty Stevens had. I don’t know that it ever got back to him. And some of the apparatus was made in the Radiation Lab shops. Bob Pound got the cavity made down there. They didn’t have much to do — things were kind of closing up — and so we bootlegged a cavity down there. And we did the experiment right here on nights and week-ends.”

This was in December, 1945.

“Our first experiment was done on paraffin, which I bought up the street at the First National store between here and our house. For paraffin we thought we might have to deal with a relaxation time as long as several hours, and we were prepared to detect it with a signal which was sufficiently weak so that we would not upset the spin temperature while applying the r-f field. And, in fact, in the final time when the experiment was successful, I had been over here all night … nursing the magnet generator along so as to keep the field on for many hours, that being in our view a possible prerequisite for seeing the resonances. Now, it turned out later that in paraffin the relaxation time is actually 10⁻⁴ seconds. So I had the magnet on exactly 10⁸ times longer than necessary!”

The experiment was completed just before Christmas, 1945.


E. M. Purcell, H. C. Torrey, and R. V. Pound, “RESONANCE ABSORPTION BY NUCLEAR MAGNETIC MOMENTS IN A SOLID,” Physical Review 69, 37-38 (1946).

“But the thing that we did not understand, and it gradually dawned on us later, was really the basic message in the paper that was part of Bloembergen’s thesis … came to be known as BPP (Bloembergen, Purcell and Pound). [This] was the important, dominant role of molecular motion in nuclear spin relaxation, and also its role in line narrowing. So that after that was cleared up, then one understood the physics of spin relaxation and understood why we were getting lines that were really very narrow.”

Diagram of the microwave cavity filled with paraffin.

This was the discovery of nuclear magnetic resonance (NMR) for which Purcell shared the 1952 Nobel Prize in physics with Felix Bloch.

David D. Nolte is the Edward M. Purcell Distinguished Professor of Physics and Astronomy, Purdue University. Sept. 25, 2024

References and Notes

• The quotes from E. M. Purcell are from the “Living Histories” interview in 1977 by the AIP.

• K. Lark-Horovitz, J. D. Howe, and E. M. Purcell, “A new method of making extremely thin films,” Review of Scientific Instruments 6, 401-403 (1935).

• E. M. Purcell, H. C. Torrey, and R. V. Pound, “RESONANCE ABSORPTION BY NUCLEAR MAGNETIC MOMENTS IN A SOLID,” Physical Review 69, 37-38 (1946).

• National Academy of Sciences Biographies: Edward Mills Purcell

Read more in Books by David Nolte at Oxford University Press