100 Years of Quantum Physics: Rise of the Matrix (1925)

Niels Bohr’s atom, by late 1925, was a series of kludges cobbled together into a Rube Goldberg type of construction.  On the one hand, there were the Bohr-Sommerfeld quantization conditions that let Bohr’s originally circular orbits morph into ellipses like planets in the solar system.  On the other hand, there was Pauli’s exclusion principle that partially explained the building up of electron orbits in many-electron atoms, but to many it seemed like an ad hoc rule.

The time was ripe for a new perspective. Enter the wunderkind, Werner Heisenberg.

Heisenberg’s Trajectory

Werner Heisenberg (1901 – 1976) was the golden boy—smart, dashing, ambitious. He excelled at everything he did and was a natural leader among his group of young friends. He entered the University of Munich in 1920 at the age of 19 to begin work toward a doctorate in mathematics, but he quickly became entranced with an advanced seminar course given by Arnold Sommerfeld (1868 – 1951) on quantum theory. His studies under Sommerfeld advanced quickly, and he was proficient enough to be “lent out” to the group of Max Born and David Hilbert at the University of Göttingen for the 1922–1923 winter semester, when Sommerfeld was on sabbatical at the University of Wisconsin, Madison, in the United States. Born was impressed with the young student and promised him a post-doc position upon his graduation with a doctoral degree in theoretical physics the next year (when Heisenberg would be just 22 years old).

Unfortunately, his brilliantly ascending career ran headlong into “Willy” Wien, who had won the Nobel Prize in 1911 for his displacement law of black-body radiation. Wien was a hard-boiled experimentalist who had little patience with the speculative flights of theoretical physics. Heisenberg, in contrast, had little patience with the mundane details of experimental science. The two were heading for an impasse.

The collision came during the oral examination for Heisenberg’s doctoral degree. Wien, determined to put Heisenberg in his place, opened with a difficult question about experimental methods. Heisenberg could not answer, so Wien asked a slightly less difficult but still detailed question that Heisenberg also could not answer. The examination went on like this until finally Wien asked Heisenberg to derive the resolving power of a simple microscope. Heisenberg was so flustered by this time that he could not do even that. Wien, in disgust, turned to Sommerfeld and pronounced a failing grade for Heisenberg. After Heisenberg stepped out of the room, the professors wrangled over the single committee grade that would need to be recorded. Sommerfeld’s top grade for Heisenberg’s mathematical performance and Wien’s bottom grade for his experimental performance led to the compromise grade of a “C” for the exam—the minimum grade sufficient to pass.

Heisenberg was mortified. Accustomed always to excelling and being lauded for his talents, Heisenberg left town that night, taking the late train to Göttingen where a surprised Born found him outside his office early the next morning—fully two months ahead of schedule. Heisenberg told him everything and asked if Born would still have him. After learning more about Wien’s “ambush”, Born assured Heisenberg that he still had a place for him.

Heisenberg was so successful at Göttingen that when Born planned to spend a sabbatical year at MIT in the United States during 1924–1925, Heisenberg was “lent out” to Niels Bohr in Copenhagen. While there, Heisenberg, Bohr, Pauli and Kramers had intense discussions about the impending crisis in quantum theory. Bohr was fully aware of the precarious patches that made up the quantum theory of the many-electron atom, and the four physicists attempted to patch it yet again with a theoretical effort, led by Kramers, to reconcile the optical transitions in atomic spectra. But no one was satisfied, and the theory had serious internal inconsistencies, not the least of which was a need to sacrifice the sacrosanct principle of conservation of energy.

Through it all, Heisenberg was thrilled by his deep involvement in the most fundamental questions of physics of the day and was even more thrilled by his interactions with the great minds he found in Copenhagen. When he returned to Göttingen on April 27, 1925, the arguments and inconsistencies were ringing in his head, and he infected the group at Göttingen—especially Max Born and Pascual Jordan—with the challenge.

Little headway could be made until Heisenberg had a serious attack of hay fever that sent him for respite on June 7 to the remote island of Helgoland in the North Sea, far off the coast near Bremerhaven. The trip cleared Heisenberg’s head—literally and figuratively—as he had time to come to grips with the core difficulties of quantum theory.

Trajectory’s End

The Mythology of Physics recounts the tale of Heisenberg’s epiphany as he watched from the beach while the sun rose over the sea. The repeated retelling has solidified the moment into revealed “truth”, but the origins are probably more prosaic. Strip a problem bare of all its superficial coverings and what remains must be the minimal set of what can be known. Yet to do so requires courage, for many of the superficial coverings are established dogma, embedded so deeply in the thought of the day that angry reactions must be expected.

Fig. 1 Heligoland, Germany. (From Google Maps and Wikipedia)

At some moment, Heisenberg realized that the superficial covering of atomic theory was the slavish devotion to the electron trajectory—to the Bohr-Sommerfeld electron orbits. Ever since Kepler, the mental image of masses in orbit around their force center had dominated physical theory. Quantum theory likewise was obsessed with images of trajectories—it persists to this day in the universal logo of atomic energy. Heisenberg now rejected this image as unknowable and hence not relevant for a successful theory. But if electron orbits were out, what was the minimal set of what can be known to be true? Heisenberg decided that it was simply the wavelengths and intensities of light absorbed and emitted by atoms. But what then? How do you create a theory constructed on transition energies and intensities alone? The epiphany was the answer—construct a dynamics by which the quantum system proceeds step-by-step, transition-by-transition, while retaining the sacrosanct energy conservation that had been discarded by Kramers’s unsuccessful theory.

The result, after he returned to Göttingen, was Heisenberg’s paper, submitted on July 29, 1925, to Zeitschrift für Physik and titled Über quantentheoretische Umdeutung kinematischer und mechanischer Beziehungen (On the quantum-theoretical reinterpretation of kinematic and mechanical relations).

Fig. 2 The heading of Heisenberg’s 1925 Zeitschrift für Physik article. The abstract reads: “This work seeks to find fundamental principles for a quantum theoretical mechanics that is based exclusively on relationships among principal observable magnitudes.” [1]

Heisenberg begins with the fundamental energy relationship between the frequency of light and the energy difference in a transition
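In Heisenberg’s notation, with term energies W(n), this is

$$\nu(n, n-\alpha) = \frac{1}{h}\left[\,W(n) - W(n-\alpha)\,\right]$$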

Fig. 3 Dynamics emerges from the transitions among different energy states in the atom [1].

His goal is to remove the electron orbit from the theory, yet positions cannot be removed entirely, so he takes the step of transforming position into a superposition of amplitudes with frequencies related to the optical transitions.
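Schematically, each Fourier component of the classical orbit is reinterpreted as a transition amplitude between two states:

$$a_\alpha(n)\,e^{i\omega(n)\alpha t} \;\longrightarrow\; a(n,\,n-\alpha)\,e^{i\omega(n,\,n-\alpha)\,t}$$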

Fig. 4 Replace the electron orbits with their Fourier coefficients based on transition frequencies [1].

Armed with the amplitude coefficients and the transition frequencies, he constructs sums of transitions that proceed step by step between initial and final states.

Fig. 5 Consider all the possible paths between initial and final states that obey energy conservation [1].

After introducing the electric field, Heisenberg calculates the polarizability of the atom—the induced moment—using Kramers’s dispersion formula combined with his new superposition.

Fig. 6 Transition amplitude between initial and final states based on a series of energy-conserving transition steps [1].

Heisenberg applied his new theoretical approach to one-dimensional quantum systems, using as an explicit example the anharmonic oscillator, and it worked! Heisenberg had invented a new theoretical approach to quantum physics that relied only on transition frequencies and amplitudes—only what could be measured without any need to speculate on what types of motions electrons might be executing. Heisenberg published his new theory on his own, as sole author befitting his individual breakthrough. Yet it was done under the guidance of his supervisor Max Born, who recognized something within Heisenberg’s mathematics.

The Matrix

Heisenberg’s derivations involved numerous summations as amplitudes multiplied amplitudes in complicated sequences. The mathematical steps themselves were straightforward—just products and sums—but the numbers of permutations were daunting, and their sequential order mattered, requiring skill and care not to miss terms or to get minus signs wrong.

Yet Born recognized within Heisenberg’s mathematics the operations of matrix multiplication. The sums over products of amplitudes, proceeding through all possible intermediate states, were exactly what one obtains from multiplying matrices, and it was well known that the order of matrix multiplication matters, where a*b ≠ b*a. With his assistant Pascual Jordan, Born reworked Heisenberg’s paper in the mathematical language of matrices, and the two submitted their “mirror” paper to Zeitschrift on Sept. 27, 1925. Their title was prophetic: Zur Quantenmechanik (On Quantum Mechanics). This was the first time that the phrase “quantum mechanics” was used to encompass all of the widely varying aspects of quantum systems.
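The rule Born recognized is the now-familiar product of two arrays, summed over all intermediate states:

$$(ab)(n,\,n-\beta) = \sum_\alpha a(n,\,n-\alpha)\;b(n-\alpha,\,n-\beta)$$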

Fig. 7 The header for Born and Jordan’s reworking of Heisenberg’s paper into matrix mathematics [2].

In the abstract, they state:

The approaches recently put forward by Heisenberg (initially for systems with one degree of freedom) are developed into a systematic theory of quantum mechanics. The mathematical tool is matrix calculus. After this is briefly outlined, the mechanical equations of motion are derived from a variational principle, and the proof is carried out that, on the basis of Heisenberg’s quantum condition, the energy theorem and Bohr’s frequency condition follow from the mechanical equations.  Using the example of the anharmonic oscillator, the question of the uniqueness of the solution and the significance of the phases in the partial oscillations are discussed.  The conclusion describes an attempt to incorporate the laws of the electromagnetic field into the new theory.

Born and Jordan begin by creating a matrix form for the Hamiltonian subject to Hamilton’s dynamical equations

Fig. 8 Defining the Hamiltonian with matrix operators [2].

Armed with matrix quantities for position and momentum, Born and Jordan construct the commutator of p with q to arrive at one of the most fundamental quantum relationships: the non-zero difference in the permuted products related to Planck’s constant. This commutation relationship would become the foundation for many quantum theories to come.
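In modern notation, their result reads

$$\mathbf{p}\mathbf{q} - \mathbf{q}\mathbf{p} = \frac{h}{2\pi i}\,\mathbf{1}$$

where 1 is the unit matrix.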

Fig. 9 Page 871 of Born and Jordan’s 1925 Zeitschrift article that introduces the commutation relationship between p and q [2].

As Heisenberg had done in his paper, Born and Jordan introduce the electric field of light to derive the dispersion of an atomic gas.

Fig. 10 Expression for the dispersion of light in an atomic gas [2].

The Born and Jordan paper appeared in the November issue of Zeitschrift für Physik, although a pre-print was picked up in England by Paul Dirac, who was working towards his doctoral degree under the mentorship of Ralph Fowler (1889 – 1944) at Cambridge. Dirac was deeply knowledgeable in classical mechanics, and he recognized as soon as he saw it that the new quantum commutator was intimately connected to a quantity in classical mechanics known as a Poisson bracket. The Poisson bracket is part of Hamiltonian mechanics that defines how two variables, known as conjugate variables, are connected. For instance, the Poisson bracket of x with p_x is non-zero, meaning that these are conjugate variables, while the Poisson bracket of x with p_y is zero, meaning that these variables are fully independent. Conjugate variables are not “dependent” in an algebraic sense, but are linked through the structure of Hamilton’s equations—they are the “p’s and q’s” of phase space.
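For coordinates q_k and momenta p_k, the Poisson bracket of two quantities is

$$\{u, v\} = \sum_k \left(\frac{\partial u}{\partial q_k}\frac{\partial v}{\partial p_k} - \frac{\partial u}{\partial p_k}\frac{\partial v}{\partial q_k}\right)$$

so that {x, p_x} = 1 while {x, p_y} = 0.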

Fig. 11 The Poisson bracket in Dirac’s paper submitted on Nov. 7, 1925 [3].

Dirac submitted a paper on Nov. 7, 1925 to the Proceedings of the Royal Society of London in which he showed that the Heisenberg commutator (a quantum quantity) is directly proportional to the Poisson bracket (the classical quantity), with a proportionality factor that depends on Planck’s constant.
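In Dirac’s formulation,

$$uv - vu = \frac{ih}{2\pi}\,\{u, v\}$$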

Fig. 12 Dirac relating the quantum commutator to the classical Poisson (Jacobi) bracket [3].

The Drei-Männer Quantum Mechanics Paper: Born, Heisenberg, and Jordan

Meanwhile, back in Göttingen, the three quantum physicists Born, Heisenberg and Jordan now combined forces to write a third foundational paper that established the full range of the new matrix mechanics. Heisenberg’s first paper had been the insight. Born and Jordan’s following paper had re-expressed Heisenberg’s formulas into matrix algebra. But both papers had used simple one-dimensional problems as test examples. Working together, they extended the new quantum mechanics to systems with many degrees of freedom.

Fig. 13 Header for the completed new theory on quantum mechanics by Born, Heisenberg and Jordan [4].

With this paper, the matrix properties of dynamical variables are defined and used in their full form.

Fig. 14 An explicit form for a dynamical matrix in the “three-man” paper [4].

With the theory out in the open, Pauli in Hamburg and Dirac at Cambridge used the new quantum mechanics to derive the transition energies of hydrogen, while Lucy Mensing and J. Robert Oppenheimer in Göttingen extended it to the spectra of more complicated molecules.

Open Issues

Heisenberg’s matrix mechanics might have taken exclusive hold of the quantum theory community, and we would all be using matrices today to perform our calculations. But within months of the success of matrix mechanics, an alternative quantum theory would be proposed by Erwin Schrödinger based on waves, a theory that came to be called wave mechanics. There was a minor battle fought over matrix mechanics versus wave mechanics, but in the end, Bohr compromised with his complementarity principle, allowing each to stand as equivalent viewpoints of quantum phenomena (but more about Schrödinger and his waves in my next Blog).

Further Reading

For more stories about the early days of quantum physics read Chapter 8, “On the Quantum Footpath” in D. D. Nolte, “Galileo Unbound: A Path Across Life, the Universe and Everything” (Oxford University Press, 2018)

For definitive accounts of Heisenberg’s life see D. Cassidy, “Beyond Uncertainty: Heisenberg, Quantum Physics and the Bomb” (Bellevue Literary Press, 2009).

References

[1] Heisenberg, W. (1925). “Über quantentheoretische Umdeutung kinematischer und mechanischer Beziehungen”. Zeitschrift für Physik, 33(1), 879–893.

[2] Born, M., & Jordan, P. (1925). “Zur Quantenmechanik”. Zeitschrift für Physik, 34(1), 858–888.

[3] Dirac, P. A. M. (1925). “The fundamental equations of quantum mechanics”. Proceedings of the Royal Society of London, Series A, 109(752), 642–653.

[4] Born, M., Heisenberg, W., & Jordan, P. (1926). “Zur Quantenmechanik. II”. Zeitschrift für Physik, 35(8–9), 557–615.

[5] Dirac, P. A. M. (1926). “Quantum mechanics and a preliminary investigation of the hydrogen atom”. Proceedings of the Royal Society of London, Series A, 110(755), 561–579.

[6] Pauli, W. (1926). “The hydrogen spectrum from the viewpoint of the new quantum mechanics”. Zeitschrift für Physik, 36(5), 336–363.

[7] Mensing, L. (1926). “Die Rotations-Schwingungsbanden nach der Quantenmechanik”. Zeitschrift für Physik, 36(11), 814–823.

[8] Born, M., & Oppenheimer, J. R. (1927). “Zur Quantentheorie der Molekeln”. Annalen der Physik, 389(20), 457–484.

Holonomy, Parallel Transport and Foucault's Pendulum

Whole World Holonomy

Once upon a time the World was flat, and sailors feared falling off the edge if they sailed too far…at least that is how the fairy tale is told.

Flat World was a simple World.  Its inhabitants, the Flatworlders, carried vectors about with them, vectors that were thick bundles of arrows, bound in a sheaf, that tended to point in a fixed direction.  In Flat World, once a vector had been oriented one way, it kept that orientation, no matter what path the Flatworlder took, always being exactly the same when it returned to its starting point.

Then came Cristobal Colon, and Carl Gauss and Bernhard Riemann, and Flat World became Round World, and the carrying around of vectors was no longer such a simple thing.  The surprising part was, when vectors returned to their starting point, even if they were carried with the greatest of care never to turn them or twist them, carrying them always parallel to themselves, they were never the same.  At the end of every different journey, they pointed in a different direction. 

How can this be?  How can a vector, that was transported always parallel to itself, end up pointing in a different direction when it was carried around a closed loop?

The answer is Holonomy!

Great Circle Routes

We were taught in Euclidean Geometry that the shortest path between two points is a straight line.  This lesson was fine for Flat World.  But now that we live in Round Riemann World, we know that the shortest distance between two points on the surface of the Earth is along a great circle route.  All great circles are Earth circumferences.  They are defined by three points: the starting point, the ending point and the center of the Earth.  The three points define a plane that intersects the surface of the Earth along a great circle.

Fig. 1. Great circle route from Rio de Janeiro, Brazil, to Seoul, Korea on a Winkel Tripel projection of the Earth.

The practical demonstration that great circles are shortest paths can be done with a string and a globe.  Pick any two points on the surface of the globe and stretch the string tightly between them so that the string lies taut on the surface of the globe.  If the two points are far enough apart (but not so far that they are nearly antipodal), then the string takes on the arc of a circle centered on the center of the globe.

Shortest paths on a curved surface, like a globe, are also known as geodesics.  And now that we have our geodesic paths for the Earth, we can start playing around with the parallel transportation of vectors.

Parallel Transport

The game is simple.  Start at the Equator with a vector that is pointing due North.  Now, move the vector due North along a line of longitude (a geodesic) until you hit the North Pole.  All the while, as you carry it along, the vector continues pointing due North on the surface of the Earth, never deviating.  When you reach the North Pole, take a sharp turn right by 90 degrees, being careful not to twist or turn the vector in any way.  Now, carry the vector with you due South along a new line of longitude until you hit the Equator, where again you take a sharp right turn by 90 degrees, and once again careful not to twist or turn the vector in any way.  Then return to your starting point. 

What you find, when you return home, is that the vector has turned through an angle of 90 degrees.   Despite how careful you were never to twist or turn it—it has turned nonetheless. This is holonomy.

Fig. 2. Parallel transport around a geodesic triangle. Start at the equator with the vector pointing due North. Transport it without twisting it to the North Pole. Take a right turn, careful not to twist the vector, and proceed to the equator where you take another right turn and return to the starting point. Without ever twisting the vector, it has nonetheless rotated by 90 degrees over the closed path.

Holonomy

Holonomy, or more specifically Riemann holonomy, is the intrinsic twisting of vectors as they are transported parallel to themselves around a closed loop on a curved surface.  The twist is something “outside” of the local transport of the vector.  In the case of the Earth, this “outside” element is the curvature of the Earth’s surface.  Locally, the Earth looks flat, and the vector is moved so that it always points in the same direction.  But globally, the vector can slowly tilt as it moves along a geodesic path.

For the example of parallel transport on the Earth, look at the vector at its starting point and the vector when it reaches the North Pole.  Clearly the vector has rotated by 90 degrees, despite, or actually because of, its perfect Northward orientation along the line of longitude.

In this specific example, the solid angle of the closed path is a perfect eighth part of the total 4π solid angle of the surface, or 4π/8 = π/2.  The angle by which the vector rotated on this path is precisely the same as the subtended solid angle.  This is no coincidence—it is the consequence of the Gauss-Bonnet Theorem.
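For a surface with Gaussian curvature K, the Gauss-Bonnet theorem gives the rotation angle as the curvature integrated over the enclosed region; on a sphere of radius R, where K = 1/R², this is exactly the enclosed solid angle:

$$\Delta\alpha = \iint K\,dA = \frac{1}{R^2}\,\big(R^2\,\Omega\big) = \Omega$$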

This theorem holds for any arbitrary closed path because any path can be viewed as if it were made up of lots of little segments of great circles.  You can even pick a path that crosses itself, taking care to keep track of minus signs as solid angles add and subtract.  For example, a perfect figure eight, if followed around smoothly, has no holonomy, because the two halves cancel.

Here, alas, we must leave our simple geometric games with great circles on the globe. To delve deeper into the origins of holonomy, it is time to turn to differential geometry.

Differential Geometry

Differential geometry is the application of differential calculus to geometry, in particular to geometric subspaces, also known as manifolds.  A good example is the surface of a sphere embedded in three-dimensional space.  The surface of a sphere has intrinsic curvature, where the curvature is the inverse of the radius of the sphere (and the Gaussian curvature is the inverse of the radius squared).

One of the cornerstones of differential geometry is the operator known as the covariant derivative.  It is defined as
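$$\nabla_b V^a = \partial_b V^a + \Gamma^a_{bc}\,V^c$$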

This covariant derivative is a master at bookkeeping.  Notice how the item on the left has an a-up and a b-down, as does the first term on the right.  And if we think of the c-up and c-down as canceling in the last term, then again we have a-up and b-down.  The ∂ on the right is the usual partial derivative of V^a with respect to x^b.  The second term on the right takes care of the intrinsic “twist” of the coordinates caused by curvature.  As a bookkeeping device, covariant derivatives take care of the variation of a function as well as of the underlying variation of the coordinate frame.  (The covariant derivative was crucial for Einstein when he was developing his General Theory of Relativity applied to gravity.)

One of the most important discoveries in differential geometry was made by Tullio Levi-Civita in 1917. Years before, Levi-Civita had helped develop tensor calculus with his advisor Gregorio Ricci-Curbastro at the University of Padua, and they published a seminal review paper in 1901 that Marcel Grossmann brought to Einstein’s attention when he was struggling to reconcile special relativity with gravity. Einstein and Levi-Civita corresponded in a series of famous letters in early 1915 as Einstein zeroed in on the final theory of General Relativity. Interestingly, that correspondence had as profound an effect on Levi-Civita as it had on Einstein. Once Einstein’s new theory was published in late 1915, Levi-Civita returned to tensor calculus to answer a critical question: What was the geometric meaning of the covariant derivative that was so crucial to the new theory of gravity?

To answer this question, Levi-Civita defined a new form of parallelism that held for vector fields on curved manifolds. The new definition stated that during the parallel transport of a vector along a path, its covariant derivative along that path vanishes. This definition is contained in the expression
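$$u^b\,\nabla_b V^a = 0$$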

where u^b on the left is the tangent vector to the path, u^b = dx^b/ds. Expanding this expression gives
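$$u^b\,\partial_b V^a + \Gamma^a_{bc}\,u^b V^c = 0$$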

and simplifying the first term yields the equation for parallel transport
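$$\frac{dV^a}{ds} + \Gamma^a_{bc}\,u^b V^c = 0$$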

For the surface of the sphere, with two variables θ and φ, these lead to two coupled ODEs
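$$\frac{dV^\theta}{ds} + \Gamma^\theta_{\varphi\varphi}\,\frac{d\varphi}{ds}\,V^\varphi = 0, \qquad \frac{dV^\varphi}{ds} + \Gamma^\varphi_{\theta\varphi}\left(\frac{d\theta}{ds}\,V^\varphi + \frac{d\varphi}{ds}\,V^\theta\right) = 0$$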

The Christoffel symbols for a spherical surface are
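$$\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta, \qquad \Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta$$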

Yielding the equations for Parallel Transport of a vector on the Earth
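$$\frac{dV^\theta}{ds} = \sin\theta\cos\theta\,\frac{d\varphi}{ds}\,V^\varphi, \qquad \frac{dV^\varphi}{ds} = -\cot\theta\left(\frac{d\theta}{ds}\,V^\varphi + \frac{d\varphi}{ds}\,V^\theta\right)$$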

These equations are all you need to calculate how much a vector rotates for any path taken across the face of the Earth.
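To see the holonomy emerge numerically, here is a minimal Python sketch (the function names are my own, for illustration) that integrates the transport equations around a line of latitude with a plain fourth-order Runge-Kutta stepper:

```python
# Minimal sketch: parallel transport of a vector around a line of latitude.
# theta is the fixed polar angle (measured from the North Pole), phi is the
# independent variable, and V = [V_theta, V_phi] are coordinate components.
import numpy as np

def transport_rhs(V, theta):
    """Right-hand side of dV/dphi along a latitude circle (theta fixed)."""
    Vth, Vph = V
    dVth = np.sin(theta) * np.cos(theta) * Vph       # from Gamma^theta_phiphi
    dVph = -(np.cos(theta) / np.sin(theta)) * Vth    # from Gamma^phi_thetaphi
    return np.array([dVth, dVph])

def parallel_transport(theta, V0, dphi_total=2*np.pi, steps=10000):
    """RK4 integration of the transport equations over a total angle dphi_total."""
    V = np.array(V0, dtype=float)
    h = dphi_total / steps
    for _ in range(steps):
        k1 = transport_rhs(V, theta)
        k2 = transport_rhs(V + 0.5*h*k1, theta)
        k3 = transport_rhs(V + 0.5*h*k2, theta)
        k4 = transport_rhs(V + h*k3, theta)
        V += (h/6) * (k1 + 2*k2 + 2*k3 + k4)
    return V

theta = np.pi/3          # colatitude 60 degrees = latitude 30 degrees North
V0 = [0.0, 1.0]          # initially pointing due East
print(parallel_transport(theta, V0))   # approximately [0, -1]: due West
```

Running it at 30° North returns approximately [0, −1]: the eastward vector comes back pointing due West, rotated by 180 degrees, exactly the holonomy worked out analytically below.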

Example: Parallel Transport Around a line of Latitude

One particularly simple path is a line of latitude. Lines of latitude are not geodesics (except for the Equator) and hence there will be a contribution to the twist of the vector caused by the curvature of the Earth. For lines of latitude, the equations of Parallel Transport become
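$$\frac{dV^\theta}{ds} = \sin\theta\cos\theta\,\frac{d\varphi}{ds}\,V^\varphi, \qquad \frac{dV^\varphi}{ds} = -\cot\theta\,\frac{d\varphi}{ds}\,V^\theta$$

since dθ/ds = 0 along the path.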

The line element is
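$$ds^2 = R^2\,d\theta^2 + R^2\sin^2\theta\,d\varphi^2$$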

leading to the flow equations
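$$\frac{dV^\theta}{d\varphi} = \sin\theta\cos\theta\;V^\varphi, \qquad \frac{dV^\varphi}{d\varphi} = -\cot\theta\;V^\theta$$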

where θ is fixed (in this case the angle relative to the North Pole) and φ is the independent variable.

Fig. 3. A vector that originally points due East on the 30th Parallel is rotated by 90 degrees after half a circuit of parallel transport, at which point it points due South.

These coupled ODEs are recognized as simple oscillations with the flow
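$$\frac{d}{d\varphi}\begin{pmatrix} V^\theta \\ V^\varphi \end{pmatrix} = \begin{pmatrix} 0 & \sin\theta\cos\theta \\ -\cot\theta & 0 \end{pmatrix}\begin{pmatrix} V^\theta \\ V^\varphi \end{pmatrix}$$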

and the initial value problem has the solution
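$$V^\theta(\varphi) = \sin\theta\,\sin(\omega\varphi), \qquad V^\varphi(\varphi) = \cos(\omega\varphi)$$

(for an initial vector [0, 1])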

where the angular frequency is the geometric mean of the coefficients
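$$\omega = \sqrt{\big(\sin\theta\cos\theta\big)\big(\cot\theta\big)} = \cos\theta = \sin\lambda$$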

where λ is the latitude.

For a full rotation around a closed line of latitude, the parallel-transported vector components are
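$$V^\theta(2\pi) = \sin\theta\,\sin\big(2\pi\sin\lambda\big), \qquad V^\varphi(2\pi) = \cos\big(2\pi\sin\lambda\big)$$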

As an example, take an initial vector [0, 1] pointing due East along the line of latitude, and parallel-transport it around latitude 30° North, for which sinλ = 1/2. Over a full circuit the phase advances by 2π sinλ = π, so the final vector is [0, −1], pointing due West. The vector has been rotated by 180 degrees even though it was transported always parallel to itself on the surface of the Earth, and this rotation equals the solid angle subtended by the cap above the 30th parallel, 2π(1 − sin 30°) = π, just as the Gauss-Bonnet theorem demands. Halfway around the circuit, where the phase is π/2, the vector points due South with components [cosλ, 0].

One final bookkeeping step is needed. At that halfway point it looks like the magnitude of the vector has changed during the transport: the initial φ-component is 1, but the θ-component is cosλ. However, cosλ = sinθ, which is exactly the metric coefficient on the dφ term of the line element, so the vector retains its magnitude in spherical coordinates.

Foucault’s Pendulum

Holonomy is a subtle effect, and it’s hard to find good examples in real life where it matters. But there is one demonstration that almost anyone has seen, at least anyone with an interest in science: Foucault’s pendulum. This is the very long pendulum that is often found in science museums around the world. As the Earth turns, the plane of oscillation of the pendulum slowly turns, and the pendulum bob often knocks down blocks that the museum staff set up in the morning to track the precessing plane.

In a classical mechanics class, the precession of Foucault’s pendulum is usually derived through the effect of the Coriolis force on the moving pendulum bob. The answer for the precession frequency, after difficult integrations between non-inertial frames, is
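$$\omega_F = \Omega_E\,\sin\lambda$$

where Ω_E is the angular frequency of the Earth’s rotation and λ is the latitude.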

Alternatively, the normal vector to the plane of oscillation can be viewed as parallel-transported around a closed loop at constant latitude. The amount of precession per day is
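$$\Delta\Phi = 2\pi\,\sin\lambda$$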

which is exactly the same thing. Therefore, Foucault’s pendulum is a striking and exact physical demonstration of Whole World Holonomy.

Additional Reference: Physics Stack Exchange


A Story of Singlets and Triplets: Relativistic Biology

The tortoise and the hare.  The snail and the falcon.  The sloth and the cheetah.

Comparing the slowest animals to the fastest, the peregrine falcon wins in the air at 200 mph diving on a dove.  The cheetah wins on the ground at 75 mph running down dinner.  That’s fast!

Einstein’s theory of relativity says that fast things behave differently than slow things.  So how fast do things need to move to see these effects?

The measure is the ratio of the speed of the object to the speed of light, which is 670 million miles per hour. This ratio is called beta:
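$$\beta = \frac{v}{c}$$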

The cheetah has β = 1×10⁻⁷, and the peregrine falcon has β = 4×10⁻⁷—both less than one part in a million.

And what about that snail?  At a speed of 0.003 mph it has β = 4×10⁻¹², just a few parts per trillion.

Yet relativistic physics is present in biology, despite these slow speeds, and it can help keep us alive.  How?

The Boon and Bane of Oxygen

All animal life on Earth needs oxygen to survive.  Oxygen is the energy storehouse that fuels mitochondria—the tiny batteries inside our cells that generate the energetic molecules that make us run.

Of all the elements, oxygen is second only to fluorine as the most chemically reactive.  But fluorine is too one-dimensional to help much with life.  It needs only one extra electron to complete its outer-shell octet, leaving nothing in reserve to attach to much else.  Oxygen, on the other hand, needs two electrons to complete its outer shell, making it an easy bridge between two other atoms, usually carbons or hydrogens, and there you have it—organic chemistry is off and running.

Yet the same chemical reactivity that makes oxygen central to life also makes it corrosive to life.  When oxygen is not doing its part keeping things alive, it is tearing things apart.

The problem is reactive oxygen species, or ROS.  These are activated forms of oxygen that damage biological molecules, including DNA, putting wear and tear on (aging) the cellular machinery and introducing mutations into the genetic codes.  And one of the ROS is an active form of simple diatomic oxygen O2 — the very air we breathe.

The Saga of the Singlet and the Triplet

Diatomic oxygen is deceptively simple: two oxygen atoms, each two electrons short of a full shell, coming together to share two of each other’s electrons in a double bond, satisfying the closed shell octet for the outer valence electrons for this element.

Fig. 1 The Lewis diagram of the oxygen double bond.  Each oxygen shares two electrons with the other, filling the valence shell.

Bonds are based on energy levels, and the individual energy levels of the two separate oxygen atoms combine and shift as they form molecular orbitals (MO).  These orbitals are occupied by the 6 + 6 = 12 electrons of the diatomic molecule.  The first 10 electrons fill up the lower MOs until the last two go into the next highest state.  Here, for interesting reasons associated with how electrons interact with each other in confined spaces (Hund’s rule), the last two electrons both have the same orientation of their spins.  The total electron spin of the final full ground-state configuration of 12 electrons has S = 1.

Fig. 2 Molecular orbital diagram for diatomic oxygen (dioxygen).  The 2s and 2p levels of the separated oxygen atoms hybridize into new orbitals that are filled with 6 + 6 = 12 electrons.  The last two electron spins are unpaired, with a total spin S = 1.  This is the unreactive triplet ground state.

When a many-electron system has a spin S = 1, quantum mechanics prescribes that it has 2S+1 spin projections, so that the ground state of diatomic oxygen has three possible spin projections—known as a triplet.  But this is just for the lowest energy configuration of O2.

It is also possible to put both electrons into the final levels with opposite spins, giving S = 0.  This is known as the singlet, and there are two different ways these spins can be anti-aligned, creating two excited states.  The first excited state is about 1 eV in energy above the ground state, and the second excited state is about 0.7 eV higher than the first. 

Fig. 3 Ground and excited states of O2.  The two singlet excited states have paired electron spins with a total spin of S = 0.  The ground state has S = 1.  The transitions between ground and excited states are “spin forbidden”.

Now we come to the heart of our story.  It hinges on how reactive the different forms of O2 are, and how easily the singlet and the triplet can transform into each other.

Don’t Mix Your Spins

What happens if you mix hydrogen and oxygen in a 2:1 ratio in a plastic bag?

Nothing!  Unless you touch a match to it, and then it “pops” and creates water H2O.

Despite the thermodynamic instability of the oxygen-hydrogen mixture, it is kinetically stable because it takes an additional input of energy to make the reaction go. This is because oxygen at room temperature is almost exclusively in its ground state, the triplet state. The triplet oxygen accepts electrons from other molecules, or atoms like hydrogen, but the organic molecules in its local environment are almost exclusively in S = 0 states because of their stability. To accept two electrons from organic molecules requires spins to flip, which costs energy and creates an energy barrier. This is known as the spin conservation rule of organic chemistry. Therefore, the reaction with the triplet is unlikely to go unless you give it a hand with extra energy or a catalyst to get over the barrier.

However, this “spin blockade” that makes triplet (S = 1) oxygen unreactive (kinetically stable) is lifted in the case of singlet (S = 0) oxygen.  The singlet still accepts two electrons from other molecules, but now it can take them up easily from the S = 0 organic molecules around it while conserving spin.  This makes singlet oxygen extremely reactive with organic molecules, damaging them, and hence is why it is an ROS.

Despite the deleterious effects of singlet oxygen, it is produced as a side effect of lipids (fats) immersed in an oxygen environment. In a sense, fat slowly “burns”, creating reactive organic species that further react with lipids (primarily polyunsaturated fatty acids (PUFAs)), generating more ROS and creating a chain reaction. The chain reaction is terminated by the Russell mechanism that generates singlet oxygen as a byproduct.

Singlet oxygen, even though it is an excited state, cannot convert directly back to the benign triplet ground state for the same spin conservation reasons as organic chemistry. The singlet (S = 0) cannot emit a photon to get to its ground state triplet (S = 1) because photons cannot change the spin of the electrons in a transition. So once the singlet oxygen is formed, it can stick around and react chemically, eating up an organic molecule. Therefore, the oxygen environment, so necessary for our survival, is slowly killing us with oxidative stress. In response, mechanisms to de-excite singlet oxygen evolved, using organic molecules like carotenoids that are excellent physical quenchers of the singlet state.

But let’s flip the problem and ask what it takes to selectively harness singlet oxygen production to act as a targeted therapy that kills cancer cells—Enter Einstein’s theory of special relativity.

Relativistic Origins of Spin-Orbit Coupling

When I was a sophomore at Cornell University in the late 1970s, I took the introductory class in electricity and magnetism. The text was an oddball thing from the Berkeley Physics Series authored by the Nobel laureate Edward Purcell of Harvard.

Fig. 4 The cover and front page of my 1978 copy of Edward Purcell’s Berkeley Physics volume on Electricity and Magnetism.

(The Berkeley Series was a set of 5 volumes to accompany a 5-semester introduction to physics. It was the brainchild of Charles Kittel from Berkeley and Philip Morrison from Cornell who, in 1961, were responding to the Sputnik crisis and the need to improve the teaching of physics in the West.)

Purcell’s book had the quirky misfortune to use Gaussian units based on centimeters and grams (cgs) instead of meters and kilograms (MKS). Physicists tend to like cgs units, especially in the teaching of electromagnetism, because it places electric fields and magnetic fields on equal footing. Unfortunately, it creates a weird set of units like “statvolts” and “statcoulombs” that are a nightmare to calculate with.

Nonetheless, Purcell’s book was revolutionary as a second-semester intro physics book, in that it used Special Relativity to explain the transformations between electric and magnetic fields. For instance, starting with the static electric field of a stationary parallel plate capacitor in one frame, Purcell showed how an observer in a frame moving relative to the capacitor detects the electric field as expected, but also detects a slight magnetic field. As the speed of the observer increases, the strength of the magnetic field increases, becoming comparable to the electric field in strength as the relative frame speed approaches the speed of light.
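For a frame moving with velocity v parallel to the plates (perpendicular to the field E), the standard field transformations give

$$E'_\perp = \gamma E_\perp, \qquad \mathbf{B}' = -\gamma\,\frac{\mathbf{v}\times\mathbf{E}}{c^2}$$

so the magnetic field seen by the moving observer grows as γβE/c.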

In this way, there is one frame in which there is no magnetic field at all, and another frame in which there is a strong magnetic field. The only difference is the point of view—yet the consequences are profound, especially when the quantum spin of an electron is involved.

Fig. 5 Purcell’s use of a parallel plate capacitor in his demonstration of the origin of magnetic fields from electric fields in relatively moving frames.

The spin of the electron is like a tiny magnet, and if the spin is in an external magnetic field, it feels a torque as well as an interaction energy. The torque makes it precess, and the interaction energy shifts its energy levels depending on the orientation of the spin to the field.

When an electron is in a quantum orbital around a stationary nucleus, attracted by the electric field of the nucleus, it would seem that there is no magnetic field for the spin to interact with, which indeed is true if the electron has no angular momentum. But if the electron orbital does have angular momentum with a non-zero expectation value for its velocity, then this moving electron does experience a magnetic field—the electron is moving relative to the electric field of the nucleus, and special relativity dictates that it experiences a magnetic field. The resulting magnetic field interacts with the magnetic moment of the electron and shifts its energy a tiny amount.
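In its standard form, the interaction energy for an electron in a central potential V(r) is

$$H_{SO} = \frac{1}{2m^2c^2}\,\frac{1}{r}\frac{dV}{dr}\;\mathbf{L}\cdot\mathbf{S}$$

where the factor of 1/2 is the Thomas factor discussed in the Postscript below.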

Fig. 6 Transitions to the singlet excited state are allowed, but transitions to the triplet state are spin-forbidden. The spin-orbit coupling mixes the spin states, giving a little triplet character to the singlet and vice versa.

This is called the spin-orbit interaction and leads to the fine structure of the electron energy levels in atoms and molecules. More importantly for our story of Singlets and Triplets, the slight shift in energy also mixes the spin states. Quantum mechanical superposition of states mixes in a little S = 0 into the triplet excited state and a little S = 1 into the singlet ground state, and the transition is no longer strictly spin forbidden. The spin-orbit effect is relatively small in oxygen, contributing little to the quenching of singlet oxygen. But spin-orbit in other molecules, especially optical dyes that absorb light, can be used to generate singlet oxygen as a potent way to kill cancer cells.

Spin-Orbit and Photodynamic Cancer Therapy

The physics of light propagation through living tissue is a fascinating topic. Light is easily transported through tissue by multiple scattering. This is why your whole hand glows red when you cover a flashlight at night. The photons bounce around but are not easily absorbed. This surprising translucence of tissue can be used to help treat cancer.

Photodynamic cancer therapy uses photosensitizer molecules, typically organic dyes, that absorb light, transforming the molecule from a singlet ground state to a singlet excited state (a spin-allowed transition). Although singlet-to-triplet conversion is spin-forbidden, the spin-orbit coupling slightly mixes the spin states, which allows a transformation known as intersystem crossing (ISC). The excited singlet crosses over (usually through a vibrational state associated with thermal energy) to an excited triplet state of the photosensitizer molecule. Oxygen triplet molecules, which are always prevalent in tissue, collide with the triplet photosensitizer—and they swap their energies in a spin-allowed transfer. The photosensitizer triplet returns to its singlet ground state, while the triplet oxygen converts to highly reactive singlet oxygen. The swap doesn’t change total spin, so it is allowed and fast, generating large amounts of reactive singlet oxygen.
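Writing PS for the photosensitizer (a generic label), the spin-conserving energy swap is

$${}^3\mathrm{PS}^* + {}^3\mathrm{O}_2 \;\longrightarrow\; {}^1\mathrm{PS} + {}^1\mathrm{O}_2^*$$

with the total spin the same on both sides.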

Fig. 7 Intersystem crossing diagram. A photosensitizer molecule in a singlet ground state absorbs a photon that creates a singlet excited state of the molecule. The spin-orbit mixing of spin states allows intersystem crossing (ISC) to generate a triplet excited state of the photosensitizer. This triplet can then exchange its energy with oxygen to create highly-reactive singlet oxygen.

In photodynamic therapy, the photosensitizer is taken up by the patient’s body, but the doctor only shines light around the tumor area, letting the light multiply-scatter through the tissue, exciting the photosensitizer and locally producing singlet oxygens that kill the local cancer cells. Other parts of the body remain in the dark and have none of the ill-effects that are so common for the conventional systemic cytotoxic anti-cancer drugs. This local treatment of the tumor using localized light can be much more benign to overall patient health while still delivering effective treatment to the tumor.

Photodynamic therapy has been approved by the FDA for some early stage cancers, and continuing research is expanding its applications. Because light transport through tissue is limited to about a centimeter, most of these applications are for “shallow” tumors that can be accessed by light through the skin or through internal surfaces (like the esophagus or colon).

Postscript: Relativistic Spin

I can’t leave this story of relativistic biology without going into a final historical detail from the early days of relativistic quantum theory. When Uhlenbeck and Goudsmit first proposed in 1925 that the electron has spin, there were two immediate objections. First, Lorentz showed that in a semiclassical model of the spinning electron, the surface of the electron would be moving at 300 times the speed of light. The solution to this problem came three years later in 1928 with the relativistic quantum theory of Paul Dirac, which took an entirely different view of quantum versus classical physics. In quantum theory, there is no “radius of the electron” as there was in semiclassical theory.

The second objection was more serious. The original predictions from electron spin and the spin-orbit interaction in Bohr’s theory of the atom predicted a fine-structure splitting that was a factor of 2 times larger than observed experimentally. In 1926 Llewellyn Thomas showed that a relativistically precessing spin was in a non-inertial frame, requiring a continuous set of transformations from one instantaneously inertial frame to another. These continuously shifting transformations introduced a factor of 1/2 into the spin precession, exactly matching the experimental values. This effect is known as Thomas precession. Interestingly, the fully relativistic theory of Dirac in 1928 automatically incorporated this precession within the theory, so once again Dirac’s equation superseded the old quantum theory.


The Craft of Conformal Maps

Somewhere, within the craft of conformal maps, lie the answers to dark, difficult problems of physics.

For instance, can you map out the electric field lines around one of Benjamin Franklin’s pointed lightning rods?

Can you calculate the fluid velocities in a channel making a sharp right angle?

Now take it up a notch in difficulty: Can you find the distribution of the sizes of magnetic domains within a flat magnet that is about to depolarize?

Or take it to the max: Can you find the vibration frequencies of the cosmic strings of string theory?

The answer to all these questions starts with a simple physics solution within simple boundaries—sometimes a problem so simple even a freshman physics student can solve it—which is then mapped, point by point, onto the geometry of the desired problem.

Once the right mapping function is found, you can solve some of the stickiest, ugliest, crankiest problems of physics like a pro.

The Earliest Conformal Maps

What is a conformal map?  It is a transformation that takes one picture into another, keeping all local angles unchanged, no matter how distorted the overall transformation is.

This property was understood by the very first mathematicians, wrangling their sexagesimal numbers by the waters of Babylon as they mapped the heavens onto charts to foretell the coming of the seasons.

Fig. 1 Geometry of stereographic projection of the celestial sphere onto a plane.

Hipparchus of Rhodes, around 150 BCE during the Hellenistic transition from Alexander to Caesar, was the first to describe the stereographic projection: the locations of the stars on the celestial sphere are mapped onto a plane by tracing a line from the bottom of the sphere through each star and plotting where the line intersects the mid-plane.  Why this method should be conformal, preserving the angles, was probably beyond his mathematical powers, but he likely knew intuitively that it did.

Fig. 2 An astrolabe for the location of Brussels, Belgium, generated by a stereographic projection.

Ptolemy of Alexandria, around 150 CE, expanded on Hipparchus’s star charts and then introduced his own conic-like projection to map all the known world of his day.  The Ptolemaic projection is almost conformal, but not quite: he was more interested in keeping areas faithful than angles.

Fig. 3 A Renaissance rendering of Ptolemy’s map of the known world in pseudo-conic projection.

Mercator’s Rules of Rhumb

The first conformal mapping of the Earth’s surface onto the plane was constructed by Gerard Mercator in 1569.  His goal, as a map maker, was to construct a map that traced out a ship’s course of constant magnetic bearing as a straight line, known as a rhumb line.  This mapping property had important utility for navigators, especially on long voyages at sea beyond the sight of land, and was a hallmark of navigation maps of the Mediterranean, known as Portolan Charts.  Rhumb lines were easy to draw on the small scales of the middle ocean, but on the scale of the Earth, no one knew how to do it.

Fig. 4 Mercator’s North Atlantic with compass roses and rhumb lines. (Note the fictitious islands south of Iceland and the large arctic continent north of Greenland.)

Though Mercator’s life and career have been put under the biographer’s microscope numerous times, the exact moment when he realized how to make his map—the now-famous Mercator Projection—is not known.  It is possible that he struck a compromise between a cylindrical point projection, that stretched the arctic regions, and a cylindrical line projection that compressed the arctic regions.  He also was a maker of large globes on which rhumb lines (actually curves) could be measured and transferred to a flat map.  Either way, he knew that he had invented something entirely new, and he promoted his map as an aid for the Age of Exploration.  There is some evidence that Frobisher took Mercator’s map with him during his three famous arctic expeditions seeking the Northwest Passage.

Fig. 5 Modern Mercator conformal projection of the Earth.

Mercator never explained nor described the mathematical function behind his projection.  This was first discovered by the English mathematician Thomas Harriot in 1589, 20 years after Mercator published his map, as Harriot was helping Sir Walter Raleigh with his New World projects.  Like most of what Harriot did during his lifetime, he was years (sometimes decades) ahead of anyone else, but no one ever knew because he never published.  His genius remained buried in his personal notes until they were uncovered in the late 1800s, long after others had claimed credit for things he did first.

The rhumb lines of Mercator’s map maintain constant angle relative to all lines of longitude and hence the Mercator projection is a conformal map.  The mathematical proof of this fact was first given by James Gregory in 1668 (almost a century after Mercator’s feat) followed by a clearer proof by Isaac Barrow in 1670.  It was 25 years later that Edmund Halley (of Halley’s Comet fame) proved that the stereographic projection was also conformal.
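In modern notation, the Mercator projection sends longitude λ and latitude φ to

$$x = \lambda, \qquad y = \ln\tan\left(\frac{\pi}{4} + \frac{\varphi}{2}\right)$$

and the conformality follows because the vertical stretching dy/dφ = sec φ exactly matches the horizontal stretching of the parallels.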

A hundred years passed after Halley before anyone again looked into the conformal properties of mapping—and then the field exploded.

The Rube in Frederick’s Berlin

In 1761, the Swiss contingent of the Prussian Academy of Sciences nominated a little-known self-taught Swiss mathematician to the membership of the Academy.  The process was pro forma, but everyone nominated was interviewed personally by Frederick the Great, who had restructured the Academy years before from a backwater society into a leading scientific society of Europe.  When Frederick met Johann Lambert, he thought it must be a practical joke.  Lambert looked strange, dressed strangely, and his manners were even stranger.  He was born poor, had never gone to school, and he looked it and he talked it.

Fig. 6 Portrait of Johann Heinrich Lambert.

Frederick rejected the nomination.

But the Swiss contingent, led by Leonhard Euler himself, persisted, because they knew what Frederick did not—Lambert was a genius.  He was an autodidact who had pulled himself up so thoroughly that he had self-published some of the greatest works of philosophy and science of his generation.  One of these, on the science of optics, established standards of luminance that we still use today.  (In my own laboratory, my students and I routinely refer to Lambertian surfaces in our research on laser speckle.  And we use the Lambert-Beer law of optical attenuation every day in our experiments.)

Frederick finally relented after a delay of two years, and admitted Lambert to his Academy, where Lambert went on a writing rampage, publishing a paper a month over the next ten years, like a dam letting loose.

One of Lambert’s many papers was on projection maps of the Earth.  He not only picked up where Halley had left off a hundred years earlier, but he invented 7 new projections, three of which were conformal and four of which were equal area.  Three of Lambert’s projections are in standard use today in cartography.

Fig. 7 The Lambert conformal conic projection centered on the 36th parallel.

Although Lambert worked at the time of Euler, Euler’s advances in complex-valued mathematics were still young and not well known, so Lambert worked out his projections using conventional calculus.  It would be another 100 years before the power of complex analysis was brought fully to bear on the problem of conformal mappings.

Riemann’s Sphere

It seems like the history of geometry can be divided into two periods: the time before Bernhard Riemann and the time after Bernhard Riemann.

Fig. 8 Bernhard Riemann.

Bernhard Riemann was a gentle giant, a shy and unimposing figure with a Herculean mind.  He transformed how everyone thought about geometry, both real and complex.  His doctoral thesis was the most complete exposition to date on the power of complex analysis, and his Habilitation Lecture on the foundations of geometry shook those very foundations to their core.

In the hands of Riemann, the stereographic projection became a complex transform of the simplest type
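$$\zeta = \frac{x + iy}{1 - z}$$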

where x, y and z are the spherical coordinates of a point on the sphere.

Fig. 9 Conformal mapping of the surface of the Riemann sphere onto the complex plane.

The projection in Fig. 9 is from the North Pole, which represents Antarctica faithfully but distorts the lands of the Northern Hemisphere. Any projection can be centered on a chosen point of the Earth by projecting from the opposite point, called the antipode. For instance, the stereographic projection centered on Chicago is shown in Fig. 10.

Fig. 10 Stereographic projection centered on Chicago, Illinois. Note the size of Greenland relative to its size in the Mercator projection of Fig. 5.

Building on the work of Euler and Cauchy, Riemann dove into conformal maps and emerged in 1851 with one of the most powerful theorems in complex analysis, known as the Riemann Mapping Theorem:

Any non-empty, simply connected open subset of the complex plane (which is not the entire plane itself) can be conformally mapped to the open unit disk. 

An immediate consequence of this is that all non-empty, simply connected open subsets of the complex plane are equivalent, because any domain can be mapped onto the unit disk, and then the unit disk can be mapped to any domain.

The consequences of this are astounding:  Solve a simple physics problem in a simple domain and then use the Riemann mapping theorem to transform it into the most complex, ugly, convoluted, twisted problem you can think of (as long as it is simply connected) and then you have the answer.

The reason that conformal maps (which are purely mathematical) allow the transformation of physics problems (which are “real”) is that physics is based on orthogonal sets of fields and potentials that govern how physical systems behave.  In other words, a solution of the Laplacian operator on one domain can be transformed into a solution of the Laplacian operator on a different domain.
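Concretely, if w = f(z) is analytic and Φ is a solution of Laplace’s equation in the w-plane, then

$$\nabla_z^2\,\Phi\big(f(z)\big) = \left|f'(z)\right|^2 \nabla_w^2 \Phi = 0$$

so the transformed potential automatically solves Laplace’s equation on the new domain.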

Powerful! Great!  But how do you do it?  Riemann’s theorem was an existence proof—not a solution manual.  The mapping transformations still needed to be found.

Schwarz-Christoffel

On the heels of Bernhard Riemann, who had altered the course of geometry, Hermann Schwarz at the University of Halle, Germany, and Elwin Bruno Christoffel at the Technical University in Zürich, Switzerland, took up Riemann’s mapping theorem to search for the actual mappings that would turn the theorem from “nice to know” to an actual formula.

Working independently, Christoffel in 1867 drew on his expertise in differential geometry while Schwarz in 1869 drew on his expertise in the calculus of variations, both with a solid background in geometry and complex analysis.  They focused on conformal maps of polygons because general domains on the complex plane can be described with polygonal boundaries.  The conformal map they sought would take simple portions of the complex plane and map them to the interior angle of a polygonal vertex.  With sufficient constraints, the goal was to map all the vertices and hence the entire domain.

The surprisingly simple result is known as the Schwarz-Christoffel equation

$$f(z) \;=\; A\int^{z} (\zeta - a)^{\frac{\alpha}{\pi}-1}\,(\zeta - b)^{\frac{\beta}{\pi}-1}\,(\zeta - c)^{\frac{\gamma}{\pi}-1}\cdots \, d\zeta \;+\; B$$

where a, b, c … are the points on the real axis that map to the vertices, and α, β, γ … are the interior angles at those vertices.  The integral needs to be carried out on the complex plane, but it has closed-form solutions for many common cases.

This equation answers the question of “how,” allowing a physics solution on one domain to be mapped to any other (simply connected) domain.
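As a sanity check on the formula, here is a minimal Python sketch; the single-vertex setup (a = 0, interior angle α = π/2, constants A = 1 and B fixed by f(1)) is an illustrative assumption chosen so the closed form is f(z) = 2√z.

```python
import numpy as np
from scipy.integrate import quad

# Schwarz-Christoffel integrand for a single vertex at a = 0 with interior
# angle alpha = pi/2:  f'(zeta) = zeta**(alpha/pi - 1) = zeta**(-1/2),
# whose closed form (with A = 1, B = 0) is f(z) = 2*sqrt(z).
def sc_map(z, alpha=np.pi/2):
    """Integrate f'(zeta) from 1 to z along a straight segment."""
    expo = alpha/np.pi - 1.0
    def integrand(t, part):
        zeta = 1.0 + t*(z - 1.0)        # straight path from 1 to z
        val = zeta**expo * (z - 1.0)    # includes d(zeta)/dt
        return val.real if part == 're' else val.imag
    re, _ = quad(lambda t: integrand(t, 're'), 0.0, 1.0)
    im, _ = quad(lambda t: integrand(t, 'im'), 0.0, 1.0)
    return complex(re, im) + 2.0        # add f(1) = 2*sqrt(1)

z = 2.0 + 1.0j                          # a point in the upper half plane
print(sc_map(z), 2.0*np.sqrt(z))        # the two agree closely
```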

Conformal Maps

The list of possible conformal maps is limitless, yet a few are so common that they deserve to be explored in some detail here.

One conformal map is so famous that it carries a name: the Joukowski map, which takes the upper half plane (through an open strip) onto the full complex plane.  The field lines and potentials are shown in Fig. 11 as simple transforms of straight lines.  To calculate these fields and potentials directly would require the numerical solution of a partial differential equation (PDE).

Fig. 11 Field lines and potentials around a gap in a charged conductor by the Joukowski conformal map.
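For readers who want to reproduce the flavor of Fig. 11, here is a minimal plotting sketch: it pushes a rectangular grid of straight field and potential lines through the Joukowski map w = z + 1/z and draws their images (the grid spacing is arbitrary).

```python
import numpy as np
import matplotlib.pyplot as plt

# The Joukowski map w = z + 1/z sends straight grid lines in the z-plane
# into the curvilinear field/potential pattern around the slit [-2, 2].
def joukowski(z):
    return z + 1.0/z

for c in np.linspace(0.2, 2.0, 8):            # horizontal lines y = c
    z = np.linspace(-4.0, 4.0, 800) + 1j*c
    w = joukowski(z)
    plt.plot(w.real, w.imag, 'b', lw=0.7)

for c in np.linspace(-4.0, 4.0, 17):          # vertical lines x = c
    z = c + 1j*np.linspace(0.05, 3.0, 400)
    w = joukowski(z)
    plt.plot(w.real, w.imag, 'r', lw=0.7)

plt.gca().set_aspect('equal')
plt.show()
```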

Other common conformal maps are power-law transformations, taking the upper half plane into wedges of various opening angles.  Fig. 12 shows three of these: the first an inner corner, the second an outer corner, and the third taking the upper half plane onto the full plane.  All three show the field lines and the potentials near charged conducting plates.

Fig. 12 Maps from the half-plane to the full plane: a) Inner corner, b) outer corner, c) charged thin plate.
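The power-law maps themselves are one-liners. As a hedged sketch: w = z^a opens the upper half plane into a wedge of angle aπ, so a = 1/2 gives the inner corner of Fig. 12a, a = 3/2 the outer corner of 12b, and a = 2 the full plane of 12c.

```python
import numpy as np

# Power-law conformal maps w = z**a open the upper half plane into a
# wedge of angle a*pi: a = 1/2 is the inner right-angle corner, a = 3/2
# the outer corner, and a = 2 the full plane slit by a thin plate.
theta = np.linspace(0.1, np.pi - 0.1, 5)      # directions in the half plane
z = np.exp(1j*theta)
for a in (0.5, 1.5, 2.0):
    print(a, np.degrees(np.angle(z**a)).round(1))  # arguments scaled by a (wrapped to +/-180)
```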

Conformal maps can also be “daisy-chained”. For instance, in Fig. 13, the unit circle is transformed into the upper half plane, providing the field lines and equipotentials of a point charge near a conducting plate. The fields are those of a point charge and its image charge, creating a dipole potential. This charge and its image are transformed again into the fields and potentials of a point charge near a conducting corner.

Fig. 13 Point charge fields and potentials near a conducting corner by a compound conformal map: a) Unit circle to the half-plane, b) Half-plane to the outside corner.
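Here is a minimal sketch of the daisy-chain idea, assuming the standard Möbius map from the disk to the half plane followed by a power map to the 270-degree region outside a right-angle corner, like the one in Fig. 13; the sample circle is arbitrary.

```python
import numpy as np

# Step 1: the Mobius transform sends the unit disk to the upper half plane.
def disk_to_half_plane(z):
    return 1j*(1.0 - z)/(1.0 + z)

# Step 2: a power map opens the half plane (angle pi) into the wedge of
# angle 3*pi/2 that lies outside a right-angle conducting corner.
def half_plane_to_corner(w):
    return w**1.5

z = 0.5*np.exp(1j*np.linspace(0.0, 2*np.pi, 8, endpoint=False))  # circle in the disk
print(half_plane_to_corner(disk_to_half_plane(z)).round(3))
```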

But we are not quite done with conformal maps. They have reappeared in recent years in exciting new areas of physics in the form of conformal field theory.

Conformal Field Theory

The importance of being conformal extends far beyond solutions to Laplace’s equation.  Physics is physics, regardless of how it comes about and how it is described, and transformations cannot change the physics.  As an example, when a many-body system is at a critical point, its description is scale invariant.  In this case, changing scale is one type of transformation that keeps the physics the same.  Conformal maps also keep the physics the same by preserving angles.  Taking this idea into the quantum realm, a quantum field theory of a scale-invariant system can be conformally mapped onto other, more complex systems for which answers are not readily derived.

This is why conformal field theory (CFT) has become an important new field of physics, with applications ranging from quantum phase transitions to quantum strings.


Read more in Books by David D. Nolte at Oxford University Press.

Anant K. Ramdas in the Golden Age of Physics

The physicist, as a gentleman and a scholar, who, in his leisure, pursues physics as both vocation and hobby, is an endangered species, though they once were endemic.  Classic examples come from the turn of the last century, as Rayleigh and de Broglie and Raman built their own laboratories to follow their own ideas.  These were giants in their fields. But there are also many quiet geniuses, enthralled with the life of ideas and the society of scientists, working into the late hours, following the paths that lead them forward through complex concepts and abstract mathematics as a labor of love.

One of these quiet geniuses was a colleague and friend of mine, Anant K. Ramdas.  He was the last PhD student of the Nobel Laureate C. V. Raman, and he may have been the last of his kind as a gentleman-and-scholar physicist.

Anant K. Ramdas

Anant Ramdas was born in May, 1930, in Pune, India, not far from the megalopolis of Mumbai, which then had just over a million inhabitants (the number is over 22 million today, nearly a hundred years later).  His father, Lakshminarayanapuram A. Ramdas, was a scientist, a meteorologist who had studied under C. V. Raman at the University of Calcutta.  Raman won the Nobel Prize in Physics the same year that Anant Ramdas was born.

Ramdas received his BS in Physics from the University of Pune in 1950, then followed in his father’s footsteps by studying for his MS (1953) and PhD (1956) degrees in Physics under Raman, who had established the Raman Research Institute in Bangalore, India.

While deciding, after his graduation, what to do and where to go, Ramdas read a review article by Prof. H. Y. Fan of Purdue University on the infrared spectroscopy of semiconductors.  After corresponding with Fan, and with the Purdue Physics department head, Prof. Karl Lark-Horovitz, Ramdas decided to accept the offer of a research associateship (a post-doc position), and he prepared to leave India.

Within only a few months, he met and married his wife, Vasanti, and they hopped on a propeller plane that stopped in Cairo, Beirut, and Paris before arriving in London.  From there, they caught a cargo ship making a two-week passage across the Atlantic, after stops at ports in France and Portugal.  From New York City, they took a train toward Chicago, getting off during a brief stop in the little corn-town of Lafayette, Indiana, home of Purdue University.  It was 1956, and Anant and Vasanti were, ironically, the first Indians that some people in the Indiana town had ever seen.

Semiconductor Physics at Purdue

Semiconductors became the ascendant electronic materials during the Second World War, when it was discovered that their electrical properties were ideal for military radar applications.  Many of the top physicists of the time worked at the “Rad Lab”, the Radiation Laboratory of MIT, and collaborations spread out across the US, including to the Physics Department at Purdue University.  Researchers at Purdue were especially good at growing the semiconductor Germanium, which was used in radar rectifiers.  The research was overseen by Lark-Horovitz.

After the war, semiconductor research continued to be a top priority in the Purdue Physics department as groups around the world competed to find ways to use semiconductors instead of vacuum tubes for information and control.  Friendly competition often meant the exchange of materials and samples, and sometime in early 1947, several Germanium samples were shipped to the group of Bardeen and Brattain at Bell Labs, where, several months later, they succeeded in making the first point-contact transistor using Germanium (with some speculation today that it may have been with the samples sent from Purdue).  It was a close thing.  Ralph Bray, then a graduate student at Purdue, had seen nonlinear current dependences in the Purdue-grown Germanium samples that were precursors of transistor action, but Bell made the announcement before Bray had a chance to take the next step.  Lark-Horovitz (and Bray) never forgot how close Purdue had come to making the invention themselves [1].

In 1948, Lark-Horovitz hired H. Y. Fan, who had received his PhD at MIT in 1937 and had been teaching at Tsinghua University in China.  Fan was an experimental physicist specializing in the infrared properties of semiconductors, and when Ramdas arrived at Purdue in 1956, he worked directly under Fan.  They published their definitive work on the infrared absorption of irradiated silicon in 1959 [2].

Absorption spectrum of “effective-mass” shallow defect levels in irradiated silicon.

One day, while Ramdas was working in Fan’s lab, Lark-Horovitz stopped by, as he was accustomed to do, and casually asked if Ramdas would be interested in becoming a professor at Purdue.  Ramdas of course said “Yes”, and Lark-Horovitz gave him the job on the spot.  Ramdas was appointed as an assistant professor in 1960.  These things were less formal in those days, and it was only later that Ramdas learned that Fan had already made a strong case for him.

The Golden Age of Physics

The period from 1960 to 2015, which spanned Ramdas’ career, start to finish, might be called “The Golden Age of Physics”. 

This time span saw the completion of the Standard Model of particle physics with the muon neutrino (1962), the theory of quarks (1964), electro-weak unification (1968), quantum chromodynamics (1970s), the tau lepton (1975), the bottom quark (1977), the W and Z bosons (1983), the top quark (1995), the tau neutrino (2000), neutrino mass oscillations (1998), and of course capping it off with the detection of the Higgs boson (2012).

This was the period in solid state physics that saw the invention of the laser (1960), the quantum Hall effect (1980), the fractional quantum Hall effect (1982), scanning tunneling microscopy (1981), quasi-crystals (1982), high-temperature superconductors (1986), and graphene (2005).

This was also the period when astrophysics witnessed the discovery of the Cosmic Background Radiation (1964), the first black hole (1964), pulsars (1967), confirmation of dark matter (1970s), inflationary cosmology (1980s), Baryon Acoustic Oscillations (2005), and capping the era off with the detection of gravitational waves (2015).

The period from 1960 – 2015 stands out relative to the “first” Golden Age of Physics from 1900 – 1930 because this later phase is when the grand programs from early in the century were brought largely to completion.

But these are the macro-events of physics from 1960-2015.  This era was also a Golden Age in the micro-events of the everyday lives of the physicists.  It is this personal aspect where this later era surpassed the earlier era (when only a handful of physicists were making progress).  In the later part of the century, small armies of physicists were advancing rapidly along all the frontiers at the same time, and doing it with the greatest focus.

This was when a single NSF grant could support a single physicist with several grad students and an undergraduate or two.  The grants could be renewed with near certainty, as long as progress was made and papers were published.  Renewal applications, in those days, were three pages.  Contrast that to today when 25 pages need to be honed to perfection—and then the renewal rate is only about 10% (soon to be even lower with the recent budget cuts to science in the USA).  In those earlier days, the certainty of success, and the absence of the burden of writing multiple long grant proposals, bred confidence to dispose of the conventional, to try anything new.  In other words, the vast amount of time spent by physicists during this Golden Age was in the pursuit of physics, in the classroom and in the laboratory.

And this was the time when Anant Ramdas and his cohort—Sergio Rodriguez, Peter Fisher, Jacek Furdyna, Eugene Haller, the Chandrasekhars, Manuel Cardona, and the Dresselhauses—rode the wave of semiconductor physics when money was easy, good students were plentiful, and a vibrant intellectual community rallied around important problems.

Selected Topics of Research from Anant Ramdas

It is impossible to do justice to the breadth and depth of research performed by Anant over his career.  So here is a selection of some of my favorite examples of his work:

Diamond

Anant had a life-long fascination with diamonds.  As a rock and gem collector, he was fond of telling stories about the famous Cullinan diamond (which weighed 1.3 pounds, over 3000 carats, as a raw stone) and the blue Hope diamond (discovered in India).  One of his earliest and most cited papers was on the Raman spectrum of diamond [3], and he published several papers on his favorite color for diamonds—blue [4]!

Raman Spectrum of Diamond.

His work on diamond helped endear Anant to the husband-and-wife team of Milly Dresselhaus and Gene Dresselhaus at MIT.  Milly was the “Queen” of carbon, known for her work on graphite, carbon nanotubes and fullerenes.  Purdue had made an offer of an assistant professorship to Gene Dresselhaus when the two were looking for faculty positions after their post-docs at the University of Chicago, but Purdue would not give Milly a position (she was viewed as a “trailing” spouse).  Anant was already at Purdue at that time and got to know both of them, maintaining a life-long friendship.  Milly went on to become the president of the APS and was elected a member of the National Academy of Sciences, the National Academy of Engineering and the American Academy of Arts and Sciences.

Magneto-Optics

Purdue was a hotbed of II-VI semiconductor research in the 1980s, spearheaded by Jacek Furdyna.  The substitution of the magnetic ion Mn for Zn, Cd or Hg created a unique class of highly magnetic semiconductors.  Anant was the resident expert on the optical properties of these materials and recorded one of the best examples of the giant Faraday rotation [5].

Giant Faraday Effect in CdMnTe

Anant and the Purdue team were the world leaders in the physics and materials science of diluted magnetic semiconductors.

Shallow Defects in Semiconductors

My own introduction to Anant was through his work on shallow effective-mass defect states in semiconductors.  I was working towards my PhD with Eugene “Gene” Haller at Lawrence Berkeley Lab (LBL) in the early 1980s, and Gene was an expert on the spectroscopy of the shallow levels in Germanium.  My fellow physics graduate student was Joe Kahn, and the two of us were tasked with studying the review article that Anant had written with his long-time theoretical collaborator Sergio Rodriguez on the physics of effective-mass shallow defects in semiconductors [6].  We called it “The Bible” and spent months studying it.  Gene Haller’s principal technique was photothermal ionization spectroscopy (PTIS), and Joe was building the world’s finest PTIS instrument.  Joe met Anant for dinner one night at the March Meeting of the APS in 1986, and when he got back to the room, he waxed poetic about Anant for an hour.  It was as if he had met his hero.  I don’t remember how I missed that dinner, so my personal introduction to Anant Ramdas would have to wait.

PTIS spectra of donors in GaAs

My own research went into deep-level transient spectroscopy (DLTS), working with Gene and his group theorist, Wladek Walukiewicz, with whom we discovered a universal pressure derivative in III-V semiconductors.  This research led me to a post-doc position at Bell Labs under Alastair Glass and later to a faculty position at Purdue, where I did finally meet Anant, who became my long-time champion and mentor.  But Joe had stayed with the shallow defects, and in particular defects that showed interesting dynamical properties, known as tunneling defects.

Dynamic Defects in Semiconductors

Dynamic defects in semiconductors are multicomponent defects (often involving vacancies or interstitials) in which one of the components tunnels quantum mechanically, or hops, on a time scale short compared to the measurement interaction time (an electric dipole transition), so that the measurement sees a higher symmetry than the instantaneous low-symmetry configuration of the defect.

Eugene Haller and his theorist collaborator, Leo Falicov, were pioneers in tunneling defects related to hydrogen, building on earlier work by George Watkins, who studied dynamical defects using EPR measurements.  In my early days doing research under Eugene, we thought we had discovered a dynamical effect in FeB defects in silicon, and I spent two very interesting weeks at Lehigh University, visiting Watkins, to test out our idea, but it turned out to be a static effect.  Later, Joe Kahn found that some of the early hydrogen defects in Germanium that Gene and Leo had proposed as dynamical defects were also, in fact, static.  So the class of dynamical defects in semiconductors was actually shrinking over time rather than expanding.  Joe did go on to find clear proof of a hydrogen-related dynamical defect in Germanium, saving the Haller-Falicov theory from the dust bin of physics history.

In 2006 and again in 2008, Ramdas was working on oxygen-related defect complexes in CdSe when his student, G. Chen, discovered a temperature-induced symmetry raising [7-8].  The data showed clear evidence for a lower-symmetry defect that converged into a higher-symmetry mode at high temperatures, very much in agreement with the Haller-Falicov theory of dynamical symmetry raising.

At that time, I was developing my course notes for my textbook Introduction to Modern Dynamics, where some of the textbook problems in synchronization looked just like Anant’s data. Using a temperature-dependent coupling in a model of nonlinear (anharmonic) oscillators, I obtained the following fits (solid curves) to the Ramdas data (data points):

Quantum synchronization in CdSe and CdTe.

The fit looks too good to be a coincidence, and Anant and I debated whether the Haller-Falicov theory or a theory based on nonlinear synchronization would be the better description of the obviously dynamical properties of these defects.  Alas, Anant is now gone, and so are Gene and Leo, so I am the last one left thinking about these things.

Beyond the Golden Age?

Anant Ramdas was fortunate to have spent his career during the Golden Age of Physics, when the focus was on the science and on the physics, and healthy communities helped support one another in friendly competition.  He was a gentleman scholar, an avid reader of books on history and philosophy, much of it (but not all) on the history and philosophy of physics.  His “Coffee Club” at 9:30 AM every day in the Physics Department at Purdue was a must-not-miss event, attended by all of the Old Guard as well as by me, where the topics of conversation ran the gamut, presided over by Anant.  He had his NSF grant, year after year (along with a few others), and that was all he needed to delve into the mysteries of the physics of semiconductors.

Is that age over?  Was Anant one of the last of that era?  I can only imagine what he would say about the current war against science and against rationality raging across the USA right now, and the impending budget cuts to all the science institutes.  He spent his career and life upholding the torch of enlightenment.  Today, I fear, he would be holding it in the dark.  He passed away on Thanksgiving, 2024.

Vasanti and Anant, 2022.

References

[1] Ralph Bray, “A Case Study in Serendipity,” The Electrochemical Society Interface, Spring 1997.

[2] H. Y. Fan and A. K. Ramdas, “Infrared absorption and photoconductivity in irradiated silicon,” Journal of Applied Physics, vol. 30, no. 8, pp. 1127-1134, 1959, doi: 10.1063/1.1735282.

[3] S. A. Solin and A. K. Ramdas, “Raman spectrum of diamond,” Physical Review B, vol. 1, no. 4, p. 1687, 1970, doi: 10.1103/PhysRevB.1.1687.

[4] H. J. Kim, Z. Barticevic, A. K. Ramdas, S. Rodriguez, M. Grimsditch, and T. R. Anthony, “Zeeman effect of electronic Raman lines of acceptors in elemental semiconductors: Boron in blue diamond,” Physical Review B, vol. 62, no. 12, pp. 8038-8052, Sep 2000, doi: 10.1103/PhysRevB.62.8038.

[5] D. U. Bartholomew, J. K. Furdyna, and A. K. Ramdas, “Interband Faraday rotation in diluted magnetic semiconductors: Zn1-xMnxTe and Cd1-xMnxTe,” Physical Review B, vol. 34, no. 10, pp. 6943-6950, Nov 1986, doi: 10.1103/PhysRevB.34.6943.

[6] A. K. Ramdas and S. Rodriguez, “Spectroscopy of the solid-state analogs of the hydrogen atom: donors and acceptors in semiconductors,” Reports on Progress in Physics, vol. 44, no. 12, pp. 1297-1387, 1981, doi: 10.1088/0034-4885/44/12/002.

[7] G. Chen, I. Miotkowski, S. Rodriguez, and A. K. Ramdas, “Stoichiometry driven impurity configurations in compound semiconductors,” Physical Review Letters, vol. 96, no. 3, Art. no. 035508, Jan 2006, doi: 10.1103/PhysRevLett.96.035508.

[8] G. Chen, J. S. Bhosale, I. Miotkowski, and A. K. Ramdas, “Spectroscopic Signatures of Novel Oxygen-Defect Complexes in Stoichiometrically Controlled CdSe,” Physical Review Letters, vol. 101, no. 19, Art. no. 195502, Nov 2008, doi: 10.1103/PhysRevLett.101.195502.

Other Notable Papers:

[9] E. S. Oh, R. G. Alonso, I. Miotkowski, and A. K. Ramdas, “Raman scattering from vibrational and electronic excitations in a II-VI quaternary compound: Cd1-x-yZnxMnyTe,” Physical Review B, vol. 45, no. 19, pp. 10934-10941, May 1992, doi: 10.1103/PhysRevB.45.10934.

[10] R. Vogelgesang, A. K. Ramdas, S. Rodriguez, M. Grimsditch, and T. R. Anthony, “Brillouin and Raman scattering in natural and isotopically controlled diamond,” Physical Review B, vol. 54, no. 6, pp. 3989-3999, Aug 1996, doi: 10.1103/PhysRevB.54.3989.

[11] M. H. Grimsditch and A. K. Ramdas, “Brillouin scattering in diamond,” Physical Review B, vol. 11, no. 8, pp. 3139-3148, 1975, doi: 10.1103/PhysRevB.11.3139.

[12] E. S. Zouboulis, M. Grimsditch, A. K. Ramdas, and S. Rodriguez, “Temperature dependence of the elastic moduli of diamond: A Brillouin-scattering study,” Physical Review B, vol. 57, no. 5, pp. 2889-2896, Feb 1998, doi: 10.1103/PhysRevB.57.2889.

[13] A. K. Ramdas, S. Rodriguez, M. Grimsditch, T. R. Anthony, and W. F. Banholzer, “Effect of isotopic constitution of diamond on its elastic constants: 13C diamond, the hardest known material,” Physical Review Letters, vol. 71, no. 1, pp. 189-192, Jul 1993, doi: 10.1103/PhysRevLett.71.189.


How to Lose Weight by Supporting PBS and NPR

What is the point of education?  Why do we learn facts we never use in our jobs?  Why do we worry over tiny details in arcane classes that have no utility?  Isn’t it all a waste of time?

Let me ask it a different way.  Why not train for a specific job?  Can’t we jettison all those irrelevant facts and details and just spend our time on the activities we will be performing when we are employed?  Why bother with education in the first place?  Why not just get on with the job?

The answer is simple:  To adapt and to survive. Or even more simply: To live and to live well—which is the function of reason.

With a broad education, we learn how to learn, and we learn how to think.  We learn how to adapt, to be agile, to think differently.  We learn to recognize approaching pitfalls and opportunities.  We learn not to be afraid of the unknown.  We learn to be savvy and to know what’s what.

The world is changing faster and faster, and the worst thing we can do now is to stand still, hunkering down in our fox holes, waiting in vain for a lull in the barrage.  The lull never comes.  To live and to live well, we need the tools to shift, to pivot, to ride the wave of the new. 

That is what education allows us to do.

But even that is not enough.  We need to keep learning as the world changes.  Education never ends, and that is why we need the Public Broadcasting Service (PBS) and National Public Radio (NPR). 

These services are the fastest and easiest and cheapest ways to keep learning, to continue our education.  They expose us to the latest developments on topics, and in areas, we would never seek out for ourselves.  The volume and the value and the treasures and the tools they teach us are priceless.  They are our lifelines as we struggle not to go under as the waves of change crash down upon us.

Some of the topics suck.  No doubt.  And the news trends woke.  Clearly.  There are times when I regretted watching a disturbing PBS segment, and other times when I rushed to the radio to turn NPR off.  And that is the point—I am free to turn it off.  But it is still there when I choose to turn it on again. 

Governments have the responsibility to help their citizens live and to live well.  Continuing education is one simple and cheap way to do that.  The $1B that was cut by Congress yesterday from current funding of PBS and NPR costs about $7 per year per taxpayer.  That is a single Venti Mocha Frappuccino at Starbucks in one year.

Wouldn’t you give up one Venti Mocha Frappuccino per year just to have the option to turn on PBS or NPR?  You don’t even need to exercise it; just have the option.  And you might even lose weight in the process.

The Light in Einstein’s Elevator

Gravity bends light!

Of all the audacious proposals made by Einstein, and there were many, this one takes the cake because it should be impossible.

There can be no force of gravity on light because light has no mass.  Without mass, there is no gravitational “interaction”.  We all know Newton’s Law of gravity … it was one of the first equations of physics we ever learned

$$F = \frac{GMm}{r^2}$$

which shows the interaction between the masses M and m through their product.  For light, this is strictly zero. 

How, then, did Einstein conclude, in 1907, only two years after he proposed his theory of special relativity, that gravity bends light? If it were us, we might take Newton’s other famous equation and equate the two

$$F = ma$$

and guess that somehow the little mass m (though it equals zero) cancels out to give

$$a = \frac{GM}{r^2}$$

so that light would fall in gravity with the same acceleration as anything else, massive or not. 

But this is not how Einstein arrived at his proposal, because this derivation is wrong!  To do it right, you have to think like an Einstein.

“My Happiest Thought”

Towards the end of 1907, Einstein was asked by Johannes Stark to contribute a review article on the state of the relativity theory to the Jahrbuch of Radioactivity and Electronics. There had been a flurry of activity in the field in the two years since Einstein had published his groundbreaking paper in Annalen der Physik in September of 1905 [1]. Einstein himself had written several additional papers on the topic, along with others, so Stark felt it was time to put things into perspective.

Fig. 1 Einstein around 1905, during his Annus Mirabilis.

Einstein was still working at the Patent Office in Bern, Switzerland, which must not have been too taxing, because it gave him plenty of time to think. It was while he was sitting in his armchair in his office in 1907 that he had what he later described as the happiest thought of his life. He had been struggling with the details of how to apply relativity theory to accelerating reference frames, a topic that is fraught with conceptual traps, when he had a simplifying flash of insight:

“Then there occurred to me the ‘glücklichste Gedanke meines Lebens,’ the happiest thought of my life, in the following form. The gravitational field has only a relative existence in a way similar to the electric field generated by magnetoelectric induction. Because for an observer falling freely from the roof of a house there exists —at least in his immediate surroundings— no gravitational field. Indeed, if the observer drops some bodies then these remain relative to him in a state of rest or of uniform motion… The observer therefore has the right to interpret his state as ‘at rest.'”[2]

In other words, the freely falling observer believes he is in an inertial frame rather than an accelerating one, and by the principle of relativity, this means that all the laws of physics in the accelerating frame must be the same as for an inertial frame. Hence, his great insight was that there must be complete equivalence between a mechanically accelerating frame and a gravitational field. This is the very first conception of his Equivalence Principle.

Fig. 2 Front page of the 1907 volume of the Jahrbuch. The editor list reads like a “Who’s Who” of early modern physics.

Fig. 3 Title page to Einstein’s 1907 Jahrbuch review article “On the Relativity Principle and its Consequences” [3]

After completing his review of the consequences of special relativity in his Jahrbuch article, Einstein took the opportunity to launch into his speculations on the role of the relativity principle in gravitation. He is almost apologetic at the start, saying:

“This is not the place for a detailed discussion of this question.  But as it will occur to anybody who has been following the applications of the principle of relativity, I will not refrain from taking a stand on this question here.”

But he then launches into his first foray into general relativity with keen insights.

Fig. 4 The beginning of the section where Einstein first discusses the effects of accelerating frames and effects of gravity.

He states early in his exposition:

“… in the discussion that follows, we shall therefore assume the complete physical equivalence of a gravitational field and a corresponding accelerated reference system.”

Here is his equivalence principle. And using it, in 1907, he derives the effect of acceleration (and gravity) on ticking clocks, on the energy density of electromagnetic radiation (photons) in a gravitational potential, and on the deflection of light by gravity.

Over the next several years, Einstein was distracted by other things, such as obtaining his first university position and his continuing work on the early quantum theory. But by 1910 he was ready to tackle the general theory of relativity once again. He soon discovered that his equivalence principle was missing a key element: the effects of spatial curvature. That realization launched him on a 5-year program into the world of tensors and metric spaces that culminated with his completed general theory of relativity, published in November of 1915 [4].

The Observer in the Chest: There is no Elevator

Einstein was never a stone to gather moss. Shortly after delivering his triumphal exposition on the General Theory of Relativity, he wrote up a popular account of his Special and now General Theories to be published as a book in 1916, first in German [5] and then in English [6]. What passed for a “popular exposition” in 1916 is far from what is considered popular today. Einstein’s little book is full of equations that would be somewhat challenging even for specialists. But the book also showcases Einstein’s special talent to create simple analogies, like the falling observer, that can make difficult concepts of physics appear crystal clear.

In 1916, Einstein was not yet thinking in terms of an elevator. His mental image at this time, for a sequestered observer, was someone inside a spacious chest filled with measurement apparatus that the observer could use at will. This observer in his chest was either floating off in space far from any gravitating bodies, or the chest was being pulled by a rope hooked to the ceiling such that it accelerated constantly. Based on the measurements he makes, the observer cannot distinguish between gravitational fields and acceleration, and hence they are equivalent. A bit later in the book, Einstein describes what a ray of light would do in an accelerating frame, but he does not have his observer attempt any such measurement, even in principle, because the deflection of the ray of light from a linear path would be far too small to measure.

But Einstein does go on to say that any curvature of the path of the light ray requires that the speed of light change with position. This is a shocking admission, because his fundamental postulate of relativity from 1905 was the invariance of the speed of light in all inertial frames. It was from this simple assertion that he was eventually able to derive E = mc². On the one hand, he was ready to posit the invariance of the speed of light; on the other hand, as soon as he understood the effects of gravity on light, Einstein did not hesitate to cast this postulate adrift.


Fig. 5 Einstein’s argument for the speed of light depending on position in a gravitational field.

(Einstein can be forgiven for taking so long to speak in terms of an elevator that could accelerate at a rate of one g, because it was not until 1946 that the rocket plane Bell X-1 achieved linear acceleration exceeding 1 g, and jet planes did not achieve 1 g linear acceleration until the F-15 Eagle in 1972.)

Fig. 6 Aircraft with greater than 1:1 thrust to weight ratios.

The Evolution of Physics: Enter Einstein’s Elevator

Years passed, and Einstein fled an increasingly autocratic and belligerent Germany for a position at Princeton’s Institute for Advanced Study. In 1938, at the instigation of his friend Leopold Infeld, the two decided to write a general-interest book on the new physics of relativity and quanta that had evolved so rapidly over the past 30 years.

Fig. 7 Title page of “Evolution of Physics” 1938 written with his friend Leopold Infeld at Princeton’s Institute for Advanced Study.

Here, in this book that few remember today, we find Einstein’s elevator for the first time, and the exposition talks very explicitly about a small window that lets in a light ray, and what the observer sees (in principle) for the path of the ray.

Fig. 8 One of the only figures in the Einstein and Infeld book: The origin of “Einstein’s Elevator”!

By the equivalence principle, the observer cannot tell whether they are far out in space, being accelerated at the rate g, or whether they are stationary on the surface of the Earth subject to a gravitational field. In the first instance of the accelerating elevator, a photon moving in a straight line through space would appear to deflect downward in the elevator, as shown in Fig. 9, because the elevator is accelerating upwards as the photon transits the elevator. However, by the equivalence principle, the same physics should occur in the gravitational field. Hence, gravity must bend light. Furthermore, light falls inside the elevator with an acceleration g, just as any other object would.

Fig. 9 The accelerating elevator and what an observer inside sees. (From “Galileo Unbound” (Oxford, 2018) [7])

Light Deflection in the Equivalence Principle

A photon enters an elevator at right angles to its acceleration vector g.  Use the geodesic equation and the elevator (Equivalence Principle) metric [8], which to first order in gy/c² is

$$ds^2 = -\left(1 + \frac{2gy}{c^2}\right)c^2\,dt^2 + dx^2 + dy^2$$

to show that the trajectory is parabolic. (This is a classic HW problem from Introduction to Modern Dynamics.)

The geodesic equation, with coordinate time as the curve parameter, is

$$\frac{d^2x^a}{dt^2} = -\Gamma^{a}_{\;bc}\,\frac{dx^b}{dt}\frac{dx^c}{dt}$$

This gives two coordinate equations, one for the x-motion and one for the y-motion.  Note that x0 = ct and x1 ≈ ct are both large relative to the y-motion of the photon.  The metric component that is relevant here is

$$g_{00} = -\left(1 + \frac{2gy}{c^2}\right)$$

and the others are unity.  The geodesic becomes (assuming dy/dt = 0)

$$\frac{d^2y}{dt^2} = -\Gamma^{y}_{\;00}\left(\frac{dx^0}{dt}\right)^2 = -c^2\,\Gamma^{y}_{\;00}$$

The Christoffel symbols are

$$\Gamma^{y}_{\;00} = -\frac{1}{2}\,g^{yy}\,\frac{\partial g_{00}}{\partial y}$$

which give

$$\Gamma^{y}_{\;00} = \frac{g}{c^2}$$

Therefore

$$\frac{d^2y}{dt^2} = -c^2\cdot\frac{g}{c^2}$$

or

$$\frac{d^2y}{dt^2} = -g$$

where the photon falls with acceleration g, as anticipated.
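Integrating twice, and using x ≈ ct for the photon transiting the elevator, closes the problem with the promised parabola

$$y = -\frac{1}{2}\,g\,t^2 = -\frac{g}{2c^2}\,x^2$$

which is exactly the trajectory of a massive projectile fired horizontally at speed c.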

Light Deflection in the Schwarzschild Metric

Do the same problem of the light ray in Einstein’s Elevator, but now using the full Schwarzschild solution to the Einstein Field equations.

$$ds^2 = -\left(1 - \frac{2GM}{c^2 r}\right)c^2\,dt^2 + \frac{dr^2}{1 - \dfrac{2GM}{c^2 r}} + r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right)$$

Einstein’s elevator is the classic test of virtually all heuristic questions related to the deflection of light by gravity.  In the previous Example, the deflection was attributed to the Equivalence Principle, in which the observer in the elevator cannot discern whether they are in an accelerating rocket ship or standing stationary on Earth.  In that case, the time-like metric component is the sole cause of the free-fall of light in gravity.  In the Schwarzschild metric, on the other hand, the curvature of the field near a spherical gravitating body also contributes.  In this case, the geodesic equation, assuming that dr/dt = 0 for the incoming photon (moving tangentially in the equatorial plane), is

$$\frac{d^2r}{dt^2} = -\Gamma^{r}_{\;00}\left(\frac{dx^0}{dt}\right)^2 - \Gamma^{r}_{\;\varphi\varphi}\left(\frac{d\varphi}{dt}\right)^2$$

where, as before, the Christoffel symbols for the radial displacements are

$$\Gamma^{r}_{\;00} = -\frac{1}{2}\,g^{rr}\,\frac{\partial g_{00}}{\partial r} \qquad\qquad \Gamma^{r}_{\;\varphi\varphi} = -\frac{1}{2}\,g^{rr}\,\frac{\partial g_{\varphi\varphi}}{\partial r}$$

Evaluating one of these

$$\Gamma^{r}_{\;00} = \frac{GM}{c^2 r^2}\left(1 - \frac{2GM}{c^2 r}\right)$$

The other Christoffel symbol that contributes to the radial motion is

$$\Gamma^{r}_{\;\varphi\varphi} = -r\left(1 - \frac{2GM}{c^2 r}\right)$$

and the geodesic equation becomes

$$\frac{d^2r}{dt^2} = -\frac{GM}{r^2}\left(1 - \frac{2GM}{c^2 r}\right) + r\left(1 - \frac{2GM}{c^2 r}\right)\left(\frac{d\varphi}{dt}\right)^2$$

with the null condition for the tangentially moving photon

$$\left(\frac{d\varphi}{dt}\right)^2 = \frac{c^2}{r^2}\left(1 - \frac{2GM}{c^2 r}\right)$$

The radial acceleration of the light ray in the elevator is thus

$$\frac{d^2r}{dt^2} = -\frac{GM}{r^2} + \frac{2G^2M^2}{c^2 r^3} + \frac{c^2}{r}\left(1 - \frac{2GM}{c^2 r}\right)^2$$

The first term on the right is free-fall in gravity, just as was obtained from the Equivalence Principle.  The second term is a higher-order correction caused by curvature of spacetime.  The third term is the motion of the light ray relative to the curved ceiling of the elevator in this spherical geometry and hence is a kinematic (or geometric) artefact.  (It is interesting that the GR correction on the curved-ceiling correction is of the same order as the free-fall term, so one would need to be very careful doing such an experiment … if it were at all measurable.)  Therefore, the second and third terms are curved-geometry effects while the first term is the free fall of the light ray.
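To get a feel for the sizes of the three terms, here is a quick back-of-the-envelope evaluation in Python for a light ray grazing the surface of the Earth (constants in SI units):

```python
# Rough magnitudes of the three terms in the radial acceleration of a
# light ray grazing the Earth (SI units).
G, M, c, r = 6.674e-11, 5.972e24, 2.998e8, 6.371e6

free_fall = G*M/r**2                                # ~9.8 m/s^2
gr_correction = 2*(G*M)**2 / (c**2 * r**3)          # ~1e-8 m/s^2
curved_ceiling = (c**2/r)*(1 - 2*G*M/(c**2*r))**2   # ~1.4e10 m/s^2

print(free_fall, gr_correction, curved_ceiling)
```

The kinematic term dwarfs everything else, which is one reason the bending of light had to be sought in starlight grazing the Sun rather than in any laboratory elevator.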


Post-Script: The Importance of Library Collections

I was amused to see the library card of the scanned Internet Archive version of Einstein’s Jahrbuch article, shown below. The volume was checked out in August of 1981 from the UC Berkeley Physics Library. It was checked out again 7 years later in September of 1988. These dates coincide with when I arrived at Berkeley to start grad school in physics, and when I graduated from Berkeley to start my post-doc position at Bell Labs. Hence this library card serves as the bookends to my time in Berkeley, a truly exhilarating place that was the top-ranked physics department at that time, with 7 active Nobel Prize winners on its faculty.

During my years at Berkeley, I scoured the stacks of the Physics Library looking for books and journals of historical importance, and was amazed to find the original volumes of Annalen der Physik from 1905 where Einstein published his famous works. This was the same library where, ten years before me, John Clauser was browsing the stacks and found the obscure paper by John Bell on his inequalities that led to Clauser’s experiment on entanglement that won him the Nobel Prize of 2022.

That library at UC Berkeley was recently closed, as was the Physics Library in my department at Purdue University (see my recent Blog), where I also scoured the stacks for rare gems. Some ancient books that I used to be able to check out on a whim, just to soak up their vintage ambience and to get a tactile feel for the real thing held in my hands, are now not even available through Interlibrary Loan. I may be able to get scans from Internet Archive online, but the palpable magic of the moment of discovery is lost.

References:

[1] Einstein, A. (1905). Zur Elektrodynamik bewegter Körper. Annalen der Physik, 17(10), 891–921.

[2] Pais, A (2005). Subtle is the Lord: The Science and Life of Albert Einstein (Oxford University Press). pg. 178

[3] Einstein, A. (1907). Über das Relativitätsprinzip und die aus demselben gezogenen Folgerungen. Jahrbuch der Radioaktivität und Elektronik, 4, 411–462.

[4] A. Einstein (1915), “On the general theory of relativity,” Sitzungsberichte Der Koniglich Preussischen Akademie Der Wissenschaften, pp. 778-786, Nov.

[5] Einstein, A. (1916). Über die spezielle und die allgemeine Relativitätstheorie (Gemeinverständlich). Braunschweig: Friedr. Vieweg & Sohn.

[6] Einstein, A. (1920). Relativity: The Special and the General Theory (A Popular Exposition) (R. W. Lawson, Trans.). London: Methuen & Co. Ltd.

[7] Nolte, D. D. (2018). Galileo Unbound. A Path Across Life, the Universe and Everything. (Oxford University Press)

[8] Nolte, D. D. (2019). Introduction to Modern Dynamics: Chaos, Networks, Space and Time (Oxford University Press).

Read more in Books by David Nolte at Oxford University Press.

Magister Mercator Maps the World (1569)

Gerardus Mercator was born in no-man’s land, in Flanders’ fields, caught in the middle between the Protestant Reformation and the Holy Roman Empire.  In his lifetime, armies washed back and forth over the countryside, sacking cities and obliterating the inhabitants.  At age 32 he was imprisoned by the Inquisition for heresy, though he had committed none, and languished for months as the authorities searched for the slimmest evidence against him.  They found none and he was released, though several of his fellow captives—elite academicians of their day—met their ends burned at the stake or beheaded or buried alive. It was not an easy time to be a scholar, with you and your work under persistent attack by political zealots.

Yet in the midst of this turmoil and destruction, Mercator created marvels.  Originally trained for the Church, he was bewitched by cartography at a time when the known world was expanding rapidly after the discoveries of the New World.  Though the cognoscenti had known since the ancient Greeks that the Earth is spherical, everyone saw it as flat, including cartographers, who in practice had to render it on flat maps.  When the world was local, flat maps worked well.  But as the world became global, new cartographic methods were needed to capture the sphere, and Mercator entered the profession at just the moment when cartography was poised for a revolution.

Gerardus Mercator

The life of Gerardus Mercator (1512 – 1594) spanned nearly the full 16th century.  He was born 20 years after Columbus’s first voyage, and he died as Galileo began to study the law of fall, as Kepler began his study of planetary motion, and as Shakespeare began writing Romeo and Juliet.  Mercator was born in the town of Rupelmonde, Flanders, outside of Antwerp in the southern part of the Netherlands ruled by Hapsburg Spain.  His father was a poor shoemaker, but his uncle was an influential member of the clergy who paid for his nephew to attend a famous school in ‘s-Hertogenbosch, the same school the humanist philosopher Erasmus (1466 – 1536) had attended several decades earlier.

Mercator entered the University of Leuven in 1530 in the humanities where his friends included Andreas Vesalius (the future famous anatomist) and Antoine Granvelle (who would become one of the most powerful Cardinals of the era).  Mercator received the degree of Magister, the degree in medieval universities that is equivalent to a Doctor of Philosophy, in 1532, and then took what today we would call a “gap year” to “find himself” because he was having doubts about his faith and his future in the clergy.  It was during his gap year that he was introduced to cartography by the Franciscan friar Franciscus Monachus (1490 – 1565) at the Mechelen monastery situated between Antwerp and Brussels.

Returning to the University of Leuven in 1534, he launched himself into the physical sciences of geography and mathematics, for which he had no training, but he quickly mastered them under the tutelage of the Dutch mapmaker Gemma Frisius (1508 – 1555) at the university.  In 1537 Mercator completed his first map, a map of Palestine that received wide acclaim for its accuracy and artistry, and (more importantly) it sold well.  He had found his vocation.

Early Cartography

Maps are among the oldest man-made textual artefacts, dating to nearly 7000 BCE, several millennia before the invention of writing itself.  Knowing where things are, and where you are in relation to them, is probably the most important thing to remember in daily life.  Texts are memory devices, and maps are the oldest texts. 

The Alexandrian mathematician Claudius Ptolemy, around 150 CE, compiled a list of all the known world in his Geographia and drew up a map to accompany it.  It survived through Arabic translation and became a fixture in early medieval Europe, where it remained a record of virtually all that was known until Christopher Columbus ran into the Caribbean islands in 1492 on his way to China.  Maps needed to be redrawn.

Fig. 1. A 1482 reproduction of Ptolemy’s map (a pseudo-conic projection) from 150 CE. The known world had not expanded much in more than a millennium. There is no bottom to Africa (the voyage of Bartolomeu Dias around the Cape of Good Hope came 6 years later) and no New World (Columbus’s first voyage was 10 years off).

The first map to show the New World was printed in 1500 by the Castilian navigator Juan de la Cosa, who had sailed with Columbus three times.  His map included the explorations of John Cabot along the northern coasts.

Fig. 2. Juan de la Cosa’s 1500 portolan-style map showing the New World as a single landmass (dark green on the left). Europe, Africa and Asia are outlined in light lettering in the center and right.

De la Cosa’s map was followed shortly by the world map of Martin Waldseemüller who named a small part of Brazil “America” in honor of Amerigo Vespucci who had just published an account of his adventures along the coasts of the new lands. 

Fig. 3. The Waldseemüller map of 1507 using “America” to name a part of current-day Brazil.

Leonardo da Vinci went further and created an eight-octant map of the globe around 1514, calling the entire new landmass “America”, expanding on Waldseemüller’s use of the name beyond merely Brazil.

Fig. 4. The eight-octant globe gores found in the Leonardo codex in England. The globe is likely not by Leonardo’s own hand, but by one of his followers, created sometime after 1507. The detail is far less than on the Waldseemüller map, but it is notable because it calls all of the New World “America”.

In 1538, just a year after his success with his Palestine map, Mercator created a map of the world that showed for the first time the separation of the Americas into two continents, the North and the South, expanding the name “America” to its full modern extent.

Fig. 5. Mercator’s 1538 World Map showing North America and South America as separate continents. This is a “double cordiform” projection, which is a modified conical projection onto an internal cone with the apex at the Poles and the base at the Equator. The cone is split along the international date line (long before that was created). The Arctic is shown as an ocean while the Antarctic is shown as a continent (long before either of these facts were known).

These maps by the early cartographers were not functional maps for navigation, but were large, sometimes many feet across, meant to be displayed to advantage on the spacious walls in the rooms of the rich and famous.  On the other hand, since the late Middle Ages, there had been a long-standing tradition of map making among navigators whose lives depended on the utility and accuracy of their maps.  These navigational charts were called Portolan Charts, meaning literally charts of ports or harbors.  They carried sheaves of straight lines representing courses of constant magnetic bearing, meaning that the angle between the compass needle and the direction of the boat stayed constant. These are called rhumb lines, and they allowed ships to navigate between two known points beyond the sight of land.  The importance of rhumb lines far surpassed the use of decorative maps.  Mercator knew this, and for his next world map, he decided to give it rhumb lines that spanned the globe.  The problem was how to do it.

Fig. 6. A Portolan Chart of the Mediterranean with Italy and Greece at the center, outlined by light lettering by the names of ports and bays. The straight lines are rhumb lines for constant-bearing navigation.

A Conformal Projection

Around the time that Mercator was bursting upon the cartographic scene, a Portuguese mathematician, Pedro Nunes, was studying courses of constant bearing upon a spherical globe.  These are mathematical paths on the sphere that were later called loxodromes, but over short distances, they corresponded to the rhumb line. 

Thirty years later, Mercator had become a master cartographer, creating globes along with scientific instruments and maps.  His globes were among the most precise instruments of their day, and he learned how to draw accurate loxodromes, following the work of Nunes.  On a globe, these lines became curlicues as they approached a Pole of the sphere, circling around the Pole in ever tighter circles that defied mathematical description (until many years later, when Thomas Harriot showed they were logarithmic spirals).  Yet Mercator was a master draftsman, and he translated the curved loxodromes on the globe into straight lines on a world map.  What he discovered was a projection in which all lines of longitude and latitude were straight lines, as were all courses of constant bearing.  He completed his map in 1569, explicitly hawking its utility as a map that could be used on a global scale just as Portolan charts had been used in the Mediterranean.

Fig. 7. A portion of Mercator’s 1569 World Map. The island just south of Thule (Iceland) is purely fictitious. Mercator has also filled in the Arctic Ocean with a new continent.
Fig. 8. The Atlantic Ocean on Mercator’s 1569 map. Rhumb lines run true at all latitudes.

Mercator in 1569 was already established and famous and an old hand at making maps, yet even he was impressed by the surprising unity of his discovery.  Today, the Mercator projection is called a conformal map, meaning that all angles among intersecting lines on the globe are conserved in the planar projection, explaining the linear longitudes, latitudes and rhumbs.

The Geometry of Gerardus Mercator

Mercator’s new projection is a convenient exercise in differential geometry. Begin with the transformation from spherical coordinates to Cartesian coordinates

$$x = R\cos\varphi\cos\lambda \qquad y = R\cos\varphi\sin\lambda \qquad z = R\sin\varphi$$

where λ is the longitude and φ is the latitude. The Jacobian matrix is

$$J = \frac{\partial(x,y,z)}{\partial(\lambda,\varphi)} = R\begin{pmatrix} -\cos\varphi\sin\lambda & -\sin\varphi\cos\lambda \\ \cos\varphi\cos\lambda & -\sin\varphi\sin\lambda \\ 0 & \cos\varphi \end{pmatrix}$$

Taking the transpose, and viewing each (normalized) row as a new vector

$$\hat{e}_\lambda = (-\sin\lambda,\ \cos\lambda,\ 0) \qquad \hat{e}_\varphi = (-\sin\varphi\cos\lambda,\ -\sin\varphi\sin\lambda,\ \cos\varphi)$$

creates the basis vectors of the spherical surface.

A unit vector with constant heading at angle β (measured from due north) is expressed in the new basis vectors as

$$\hat{u} = \sin\beta\,\hat{e}_\lambda + \cos\beta\,\hat{e}_\varphi$$

and the path element and arc length along a constant-bearing path are related as

$$d\mathbf{r} = R\cos\varphi\,d\lambda\,\hat{e}_\lambda + R\,d\varphi\,\hat{e}_\varphi = \hat{u}\,ds$$

Equating common coefficients of the basis vectors gives

$$R\cos\varphi\,d\lambda = \sin\beta\,ds \qquad\qquad R\,d\varphi = \cos\beta\,ds$$

which is solved to yield the ordinary differential equation

$$\frac{d\lambda}{d\varphi} = \frac{\tan\beta}{\cos\varphi}$$

This is integrated as

$$\lambda - \lambda_0 = \tan\beta\int_0^{\varphi}\frac{d\varphi'}{\cos\varphi'} = \tan\beta\ \mathrm{gd}^{-1}(\varphi)$$

which is a logarithmic spiral.  The special function gd⁻¹ is called “the inverse Gudermannian”,

$$\mathrm{gd}^{-1}(\varphi) = \ln\tan\left(\frac{\pi}{4} + \frac{\varphi}{2}\right)$$

The longitude λ as a function of the latitude φ is solved as

$$\lambda(\varphi) = \lambda_0 + \tan\beta\,\ln\tan\left(\frac{\pi}{4} + \frac{\varphi}{2}\right)$$

To generate a Mercator rhumb, we only need to go back to a new set of Cartesian coordinates on a flat map, on which the loxodrome unrolls into a straight line.

It is interesting to compare the Mercator projection to a central projection onto a cylinder touching the sphere at its equator. The Mercator projection is

$$x = R\lambda \qquad y = R\,\mathrm{gd}^{-1}(\varphi) = R\ln\tan\left(\frac{\pi}{4} + \frac{\varphi}{2}\right)$$

while the central projection onto the cylinder is

$$x = R\lambda \qquad y = R\tan\varphi$$

Clearly, the two projections are essentially the same near the Equator but deviate strongly approaching the Poles.
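As a quick numerical check, here is a minimal Python sketch comparing the two ordinates; the Earth radius in kilometers and the sample latitudes are illustrative choices.

```python
import numpy as np

R = 6371.0   # Earth radius in km

def mercator_y(phi):
    """Mercator ordinate: R times the inverse Gudermannian of the latitude."""
    return R * np.log(np.tan(np.pi/4 + phi/2))

def central_cylindrical_y(phi):
    """Central projection onto a cylinder tangent at the equator."""
    return R * np.tan(phi)

for deg in (10, 30, 60, 80):
    phi = np.radians(deg)
    print(deg, round(mercator_y(phi)), round(central_cylindrical_y(phi)))
# Near the equator the two ordinates nearly coincide; by 60 degrees of
# latitude they differ by thousands of kilometers, and both blow up at the pole.
```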

The Mercator projection has the conformal advantage, but it also has the disadvantage that landmasses at increasing latitude are inflated relative to their physical size on the globe.  Therefore, Greenland looks as big as Africa on a Mercator projection, while Africa is in fact about fourteen times larger.  The exaggerated sizes of countries in the upper latitudes (like the USA and Europe) relative to tropical countries near the equator has been viewed as creating an unfair psychological bias of first-world countries over third-world countries.  For this reason, the Mercator projection has largely fallen out of favor for wall maps and atlases (though a Web Mercator variant persists in online mapping), with projections that retain relative sizes now the most common.


Read more in Books by David D. Nolte at Oxford University Press

Purge and Precipice: The Fall of American Science?

Let’s ask a really crazy question. As a pure intellectual exercise—not that it would ever happen—but just asking: What would it take to destroy American science? I know this is a silly question. After all, no one in their right mind would want to take down American science. It has been the guiding light of the world for the last 100 years, ushering in such technological marvels of modern life as transistors and computers and lasers and solar panels and vaccines and immunotherapy and disease-resistant crops. So of course American science is a National Treasure, more valuable than all the national treasures in Washington, and no one would ever dream of attacking those.

But for the sake of argument, just to play Devil’s Advocate, what if someone with some power, someone who could make otherwise sensible people do his will, wanted to wash away the last 100 years of American leadership in Science? How would he do it?

The answer is obvious: Use science … and maybe even physics.

The laws of physics are really pretty simple: Cause and effect, action and reaction … those kinds of things. And modern physics is no longer about rocks thrown from cliffs, but is about the laws governing complex systems, like networks of people.

Can we really put equations to people? This was the grand vision of Isaac Asimov in his Foundation trilogy. In that story, the population of the galaxy became so large that its behavior as a whole could be predicted by the mathematician Hari Seldon using laws akin to statistical mechanics. Asimov called it psychohistory.

It turns out we are not that far off today, and we don’t need a galaxy full of people to make it valid. But the name of the theory turns out to be a bit more prosaic than psychohistory: it’s called Network theory.

Network Theory

Network theory, at its core, is simply about nodes and links. It asks simple questions, like: What defines a community? What kind of synergy makes communities work? And when do things fall apart?

Science is a community.

In the United States, there are approximately a million scientists, 70% of whom work in industry, with 20% in academia and 10% in government (at least, prior to 2025). Despite the low fraction employed in academia, nearly all scientists and engineers received their degrees from universities and colleges, and many received post-graduate training at those universities and at national labs like Los Alamos and the NIH labs in Washington. These are the backbone of the American scientific community; these are the hubs from which the vast network of scientists connects out across the full range of industrial and manufacturing activities that drive 70% of the GDP of the United States. The universities and colleges are also reservoirs of long-term scientific knowledge that can be tapped at a moment’s notice by industry when it pivots to new materials or new business models.

In network theory, hubs hold the key to the performance of the network. In technical terms, hubs have anomalously high degree, which means that a hub connects directly to a large fraction of the total network. This is why hubs are central to network health and efficiency. Hubs are also the main cause of the “Small World Effect”, which states that everyone on a network is only a few links away from anyone else. This is also known as “Six Degrees of Separation”, because even in vast networks that span the country, it only takes about 6 friends of friends of friends of friends of friends of friends before you connect to any given person. The world is small because you know someone who is a hub, and they know everyone else. This is a fundamental result of network theory, whether the network is of people, or servers, or computer chips, as the sketch below illustrates.
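As a sanity check on the small-world claim, here is a minimal Python sketch (using the networkx library) that builds a scale-free network of 100,000 nodes and estimates the average separation by breadth-first search from a sample of sources; the parameters are illustrative.

```python
import random
import networkx as nx

# The small-world effect: in a scale-free network of 100,000 people
# with only a handful of links each, hubs keep everyone a few hops apart.
G = nx.barabasi_albert_graph(100_000, 3, seed=7)

sources = random.Random(0).sample(list(G.nodes), 20)
dists = []
for s in sources:                       # BFS from a sample of source nodes
    lengths = nx.single_source_shortest_path_length(G, s)
    dists.extend(lengths.values())
print(sum(dists) / len(dists))          # average separation: roughly 4 to 5
```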

Having established how important hubs are to network connectivity, it is clear that the disproportionate importance of hubs makes them a disproportionate target for network disruption. For instance, in the power grid, take down a large central switching station and you can take down the grid over vast portions of the country. The same is true for science and the science community. Take down a few of the key hubs, and the whole network can collapse—a topic of percolation theory.

Percolation and Collapse

Percolation theory does what its name suggests: it tells when a path on a network is likely to “percolate” across it—like water percolating through coffee grounds. For a given number of nodes N, there need to be enough links that most of the nodes belong to the largest connected cluster. Then most starting paths can percolate across the whole network. On the other hand, if enough links are broken, the network breaks apart into many disconnected clusters, and you cannot get from one to another.

Much of percolation theory concerns the transition that occurs at the percolation threshold—how the likelihood of having a percolating path across the network rises and falls as the number of links increases or decreases. It turns out that for large networks, this transition from percolating to non-percolating is abrupt. When there are just barely enough links to keep the network connected, removing even one more link can separate it into disconnected clusters. In other words, the network collapses.
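The abruptness of this transition is easy to see numerically. The sketch below removes a growing fraction of links from a random network and tracks how much of the network remains in the giant connected cluster; the network model and parameters are illustrative assumptions, not a model of any real system.

```python
# Bond percolation demo: remove links at random and watch the giant
# connected cluster abruptly disintegrate near the percolation threshold.
import random
import networkx as nx

random.seed(1)
N = 2_000
G0 = nx.erdos_renyi_graph(n=N, p=3.0 / N)  # random network, mean degree ~3

for frac in [0.0, 0.2, 0.4, 0.6, 0.7, 0.8]:
    G = G0.copy()
    edges = list(G.edges())
    G.remove_edges_from(random.sample(edges, int(frac * len(edges))))
    giant = max(nx.connected_components(G), key=len)
    print(f"links removed: {frac:4.0%}   giant cluster: {len(giant) / N:4.0%}")

# For mean degree ~3, the threshold sits near 2/3 of links removed: a
# weakened giant cluster survives 60% removal but shatters by 70-80%.
```

The cluster sizes drift down gently at first, then crash over a narrow window of link removal—the percolation transition in miniature.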

Therefore, network collapse can be sudden and severe. It is even possible to be near the critical percolation condition and not know it. All can seem fine, with plenty of paths to choose from to get across the network—then lose just a few links—and suddenly the network collapses into a bunch of islands. This is sometimes known as a tipping point—also as a bifurcation or as a catastrophe. Tipping points, bifurcations and percolation transitions get a lot of attention in network theory, because they are sudden and large events that can occur with little forewarning.

So the big question for this blog is: What would it take to have the scientific network of the United States collapse?

Department of Governmental Exterminations (DOGE)

The head of DOGE is a charismatic fellow, and like the villain of Jane Austen’s Pride and Prejudice, he was initially a likable character. But he turned out to be an agent of chaos and a cad. No one would want to be him in the end. The same is true in our own Austenesque story of Purge and Precipice: as DOGE purges, we approach the precipice.

Falling off a cliff is easy. If a network has hubs, and those hubs have a disproportionate importance in keeping the network together, then an excellent strategy for destroying the network is to deliberately take out the most important hubs.

If the hubs of the scientific network across the US are the universities and colleges and government labs, then by attacking those—even though they hold only 20% to 30% of the scientists in the country—you can bring science to a standstill in the US, breaking the network apart into isolated islands. Alternatively, when talking about individuals in a network, the most important hubs are the scientists who are the repositories of the most knowledge—the elder statesmen of their fields—the ones you can induce to take a buyout and retire.

Networks with strongly connected hubs are the most vulnerable to percolation collapse when the hubs are attacked specifically.

Science Network Evolving under Reduction in Force through Natural Attrition

Fig. 1 Healthy network evolving under a 15% reduction in force (RIF) through natural retirement and attrition.

This simulation looks at a reduction in force (RIF) of 15% and its effect on a healthy interaction network. It uses a scale-free network that evolves in time as individuals retire naturally or move to new jobs. When a node is removed from the network, it becomes a disconnected dot in the video. Other nodes that were “orphaned” by the retirement are reassigned to existing nodes. Links represent scientific interactions or lines of command. A few links randomly shift as interests change. A random retirement might hit a high-degree node (a hub), but the event is rare enough that the natural rearrangement of links keeps the network connected and healthy as it adapts to the loss of key opinion leaders.
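For readers who want to experiment, here is a minimal sketch of this kind of attrition simulation. The rewiring rule and parameters below are illustrative assumptions, not the code behind the figure.

```python
# Sketch of a 15% RIF by *random* retirements on a scale-free network,
# with orphaned nodes reassigned to surviving nodes (illustrative rules).
import random
import networkx as nx

random.seed(7)
N = 1_000
G = nx.barabasi_albert_graph(N, m=2, seed=7)

for _ in range(int(0.15 * N)):                  # 15% reduction in force
    retiree = random.choice(list(G.nodes()))    # natural attrition is random
    orphans = list(G.neighbors(retiree))
    G.remove_node(retiree)
    for node in orphans:                        # reassign any orphaned nodes
        if G.degree(node) == 0:
            partner = random.choice([n for n in G.nodes() if n != node])
            G.add_edge(node, partner)

giant = max(nx.connected_components(G), key=len)
print(f"giant cluster holds {len(giant) / G.number_of_nodes():.0%} of survivors")
```

Because random retirements only rarely strike a hub, the giant cluster survives nearly intact, just as in the video.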

Science Network under DOGE Attack

Fig. 2 An attack on the high-degree nodes (the hubs) of the network, leading to the same 15% RIF as Fig. 1. The network becomes fragmented and dysfunctional.

Universities and government laboratories are high-degree nodes that have a disproportionate importance to the Science Network. By targeting these nodes, the network rapidly disintegrates. The effect is too drastic for the rearrangement of some links to fix it.

The percolation probability of an interaction network, like the Science Network, is a fair measure of scientific productivity. The more interconnected a network is, the more ideas flow across the web, eliciting new ideas and discoveries that often lead to new products and growth in the national GDP. But a disrupted network has low productivity. The scientific productivity is plotted in Fig. 3 as a function of the reduction in force up to 15%. Natural attrition can attain this RIF with minimal impact on the productivity of the network, measured through its percolation probability. However, targeted attacks on the most influential scientific hubs rapidly degrade the network, breaking it apart into many disconnected clusters. The free flow of ideas stops, opportunities for new products are lost, and the national GDP eventually erodes.

Fig. 3 Scientific productivity, measured by the percolation probability across the network, as a function of the reduction in force up to 15%. Natural attrition keeps most of the productivity high. Targeted attacks on the most influential science institutions decimate the Science Network.
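The contrast in Fig. 3 is straightforward to reproduce in miniature. The sketch below compares random attrition against a targeted attack on the highest-degree hubs, using the size of the giant connected cluster as a crude proxy for percolation probability; all parameters are illustrative.

```python
# Random attrition vs. targeted hub attack at the same reduction in force.
import random
import networkx as nx

def giant_fraction(G):
    """Fraction of remaining nodes in the largest connected cluster."""
    return len(max(nx.connected_components(G), key=len)) / G.number_of_nodes()

def apply_rif(G, rif, targeted):
    """Remove a fraction `rif` of nodes, either at random or hubs-first."""
    G = G.copy()
    k = int(rif * G.number_of_nodes())
    if targeted:  # strike the highest-degree nodes first
        victims = sorted(G.nodes(), key=G.degree, reverse=True)[:k]
    else:         # natural attrition: uniformly random retirements
        victims = random.sample(list(G.nodes()), k)
    G.remove_nodes_from(victims)
    return G

random.seed(3)
G0 = nx.barabasi_albert_graph(2_000, m=2, seed=3)
for rif in [0.05, 0.10, 0.15]:
    print(f"RIF {rif:.0%}:  random -> {giant_fraction(apply_rif(G0, rif, False)):.0%}"
          f"   targeted -> {giant_fraction(apply_rif(G0, rif, True)):.0%}")
```

Run this and the asymmetry jumps out: random removals barely dent the giant cluster, while striking the hubs at the same RIF fragments it dramatically—the scale-free network’s famous robustness to failure and fragility to attack.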

It takes about 15 years for scientific discoveries to establish new products in the marketplace. Therefore, a collapse of American science over the next few years won’t be fully felt until around the year 2040. All the politicians in office today will be long gone by then (let’s hope!), so they will never get the blame. But our country will be poorer and weaker, and our lives will be poorer and sicker—the victims of posturing and grandstanding for no real benefit other than the fleeting joy of wrecking what was built over the past century. When I watch the glee of the Perp in Chief and his henchmen as they wreak their havoc, I am reminded of “griefers” in Minecraft.

The Upshot

One of the problems with being a physicist is that sometimes you see the train wreck coming.

I see a train wreck coming.

PostScript

It is important not to take these simulations too literally as if they were an accurate numerical model of the Science Network in the US. The point of doing physics is not to fit all the parameters—that’s for the engineers. The point of doing physics is to recognize the possibilities and to see the phenomena—as well as the dangers.

Take heed of the precipice. It is real. Are we about to go over it? It’s hard to tell. But should we even take the chance?

Maxwellian Steampunk and the Origins of Maxwell’s Equations

Physicists of the nineteenth century were obsessed with mechanical models.  They must have dreamed, in their sleep, of spinning flywheels connected by criss-crossing drive belts turning enmeshed gears.  For them, Newton’s clockwork universe was more than a metaphor—they believed that mechanical description of a phenomenon could unlock further secrets and act as a tool of discovery. 

It is no wonder they thought this way—the mid-nineteenth century was at the peak of the industrial revolution, dominated by the steam engine and the profusion of mechanical power and gears across broad swaths of society.

Steampunk

The Victorian obsession with steam and power is captured beautifully in the literary and animé genre known as Steampunk.  The genre is alternative historical fiction that portrays steam technology progressing into grand and wild new forms as electrical and gasoline technology fail to develop.  An early classic in the genre is the 1986 animé film Castle in the Sky by Hayao Miyazaki, about a world where all mechanical devices, including airships, are driven by steam.  A later archetype is the 2004 animé film Steamboy by Katsuhiro Otomo, about the discovery of a superwater that generates unlimited steam power.  As international powers vie to possess it, mad scientists strive to exploit it for society, but they create a terrible weapon instead.   One of the classics that helped launch the genre is the novel The Difference Engine (1990) by William Gibson and Bruce Sterling, which envisioned an alternative history of computers developed by Charles Babbage and Ada Lovelace.

Scenes from Miyazaki's Castle in the Sky.

Steampunk is an apt, if excessively exaggerated, caricature of the Victorian mindset and approach to science.  Confidence in microscopic mechanical models among natural philosophers was encouraged by the success of molecular models of ideal gases as the foundation for macroscopic thermodynamics.  Pictures of small perfect spheres colliding with each other in simple billiard-ball-like interactions could be used to build up to overarching concepts like heat and entropy and temperature.  Kinetic theory was proposed in 1857 by the German physicist Rudolf Clausius and was quickly placed on a firm physical foundation using principles of Hamiltonian dynamics by the British physicist James Clerk Maxwell.

DVD cover of Steamboy by Otomo.

James Clerk Maxwell

James Clerk Maxwell (1831 – 1879) was one of three titans out of Cambridge who served as the intellectual leaders in mid-nineteenth-century Britain. The two others were George Stokes and William Thomson (Lord Kelvin).  All three were Wranglers, the top finishers on the Tripos exam at Cambridge, the grueling eight-day examination across all fields of mathematics.  The winner of the Tripos, known as first Wrangler, was announced with great fanfare in the local papers, and the lucky student was acclaimed like a sports hero is today.  Stokes in 1841 was first Wrangler, while Thomson (Lord Kelvin) in 1845 and Maxwell in 1854 were each second Wranglers.  They were also each winners of the Smith’s Prize, the top examination at Cambridge for mathematical originality.  When Maxwell sat for the Smith’s Prize in 1854, one of the exam problems was to prove a theorem posed by Stokes on a suggestion by Thomson.  Maxwell failed to achieve the proof, though he did win the Prize.  The problem became known as Stokes’ Theorem, one of the fundamental theorems of vector calculus, and the proof was eventually provided by Hermann Hankel in 1861.

James Clerk Maxwell.

After graduation from Cambridge, Maxwell took the chair of natural philosophy at Marischal College in the city of Aberdeen in Scotland.  He was only 25 years old when he began, fifteen years younger than any of the other professors.  He split his time between the university and his family home at Glenlair in the south of Scotland, which he inherited from his father the same year he began his chair at Aberdeen.  His research interests spanned from the perception of color to the rings of Saturn.  He improved on Thomas Young’s three-color theory by correctly identifying red, green and blue as the primary receptors of the eye and invented a scheme for adding colors that is close to the HSV (hue-saturation-value) system used today in computer graphics.  In his work on the rings of Saturn, he developed a statistical mechanical approach to explain how the large-scale structure emerged from the interactions among the small grains.  He applied these same techniques several years later to the problem of ideal gases when he derived the speed distribution known today as the Maxwell-Boltzmann distribution.

Maxwell’s career at Aberdeen held great promise until he was suddenly fired from his post in 1860 when Marischal College merged with nearby King’s College to form the University of Aberdeen.  After the merger, the university had two professors of Natural Philosophy while needing only one, and Maxwell was the junior.  With his new wife, Maxwell retired to Glenlair and buried himself in writing the first drafts of a paper titled “On Physical Lines of Force” [2].  The paper explored the mathematical and mechanical aspects of the curious lines of magnetic force that Michael Faraday had first proposed in 1831 and which Thomson had developed mathematically around 1845 as the first field theory in physics.

As Maxwell explored the interrelationships among electric and magnetic phenomena, he derived a wave equation for the electric and magnetic fields and was astounded to find that the speed of electromagnetic waves was essentially the same as the speed of light.  The importance of this coincidence did not escape him, and he concluded that light—that rarefied, enigmatic and quintessential fifth element—must be electromagnetic in origin. Ever since François Arago and Augustin Fresnel had shown that light was a wave phenomenon, scientists had been searching for other physical signs of the medium that supported the waves—a medium known as the luminiferous aether (or ether). With Maxwell’s new finding, it meant that the luminiferous ether must be related to electric and magnetic fields.  In the Steampunk tradition of his day, Maxwell began a search for a mechanical model.  He did not need to look far, because his friend Thomson had already built a theory on a foundation provided by the Irish mathematician James MacCullagh (1809 – 1847).

The Luminiferous Ether

The late 1830s were a busy time for the luminiferous ether.  Augustin-Louis Cauchy published his extensive theory of the ether in 1836, and the self-taught George Green published his highly influential mathematical theory in 1838, which contained many new ideas, such as the emphasis on potentials and his derivation of what came to be called Green’s theorem.

In 1839 MacCullagh took an approach that established a core property of the ether that later inspired both Thomson and Maxwell in their development of electromagnetic field theory.  What MacCullagh realized was that the energy of the ether could be treated as if it had both kinetic energy and potential energy (ideas and nomenclature that would only arrive decades later).  Most insightful was the fact that the potential energy of the field depended on pure rotation, like a vortex.  This rotationally elastic ether was a mathematical invention without any mechanical analog, but it successfully described reflection and refraction as well as the polarization of light in crystalline optics.

In 1856 Thomson put Faraday’s famous magneto-optic rotation of light (the Faraday Effect, discovered in 1845) into mathematical form and began putting Faraday’s initially abstract ideas of the theory of fields into concrete equations.  He drew from MacCullagh’s rotational ether as well as an idea from William Rankine about the molecular vortex model of atoms to develop a mechanical vortex model of the ether.  Thomson explained how the magnetic field rotated the linear polarization of light through the action of a multiplicity of molecular vortices.  Inspired by Thomson, Maxwell took up the idea of molecular vortices as well as Faraday’s magnetic induction in free space and transferred the vortices from being a property exclusively of matter to being a property of the luminiferous ether that supported the electric and magnetic fields.

Maxwellian Cogwheels

Maxwell’s model of the electromagnetic fields in the ether is the apex of Victorian mechanistic philosophy—too explicit to be a true model of reality—yet it was amazingly fruitful as a tool of discovery, helping Maxwell develop his theory of electrodynamics. The model consisted of an array of elastic vortex cells separated by layers of small particles that acted as “idle wheels” to transfer spin from one vortex to another.  The magnetic field was represented by the rotation of the vortices, and the electric current was represented by the displacement of the idle wheels.

Fig. 1 Maxwell’s vortex model of the electromagnetic ether.  The molecular vortices rotate according to the direction of the magnetic field, supported by idle wheels.  The physical displacement of the idle wheels became an analogy for Maxwell’s displacement current [2].

Two predictions of this outright mechanical model were to change the physics of electromagnetism forever.  First, any change in strain in the electric field would cause the idle wheels to shift, creating a transient current that he called the “displacement current”.  This displacement current was one of the last pieces in the electromagnetic puzzle that became Maxwell’s equations.
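In modern notation (not Maxwell’s original symbols), the displacement current enters as the extra term Maxwell added to Ampère’s law:

```latex
% Ampère's law with Maxwell's displacement-current term (modern SI notation)
\nabla \times \mathbf{B} = \mu_0 \mathbf{J}
  + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}
```

The second term is the displacement current; without it, the equations are inconsistent with charge conservation and admit no wave solutions.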

Fig. 2 In “On Physical Lines of Force” (1861), Maxwell introduces the idea of a displacement current [2].

In this description, E is not the electric field but a coefficient of elasticity of the medium, related to the dielectric permittivity.

Maxwell went further to prove his Proposition XIV on the contribution of the displacement current to conventional electric currents.

Fig. 3 Maxwell’s Proposition XIV on adding the displacement current to the conventional electric current [2].

Second, Maxwell calculated that this elastic vortex ether propagated waves at a speed that was close to the known speed of light, measured a decade earlier by the French physicist Hippolyte Fizeau.  He remarked, “we can scarcely avoid the inference that light consists in the transverse undulations of the same medium which is the cause of electric and magnetic phenomena.” [1]  This was the first direct prediction that light, previously viewed as a physical process separate from electric and magnetic fields, was an electromagnetic phenomenon.
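In modern terms (again, not Maxwell’s own notation), the wave speed that falls out of the theory is fixed entirely by the electric and magnetic constants of the medium:

```latex
% Speed of electromagnetic waves in vacuum from the field constants (SI)
c = \frac{1}{\sqrt{\mu_0 \varepsilon_0}}
  \approx \frac{1}{\sqrt{(1.257\times 10^{-6})(8.854\times 10^{-12})}}
  \approx 3.00\times 10^{8}\ \mathrm{m/s}
```

That two constants measured in tabletop experiments on charges and currents should combine to give the speed of light was the coincidence Maxwell could not ignore.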

Fig. 4 Maxwell’s calculation of the speed of light in his mechanical ether. It matched closely the measured speed of light [2].

These two predictions—of the displacement current and the electromagnetic origin of light—have stood the test of time and are centerpieces of Maxwell’s legacy.  How strange that they arose from a mechanical model of vortices and idle wheels, like so many cogs and gears in the machinery powering the Victorian age—yet such is the power of physical visualization.


[1] pg. 12, The Maxwellians, Bruce Hunt (Cornell University Press, 1991)

[2] Maxwell, J. C. (1861). “On physical lines of force”. Philosophical Magazine. 90: 11–23.

Books by David Nolte at Oxford University Press