How Number Theory Protects You from the Chaos of the Cosmos

We are exceedingly fortunate that the Earth lies in the Goldilocks zone. This zone is the range of orbital radii of a planet around its sun for which water can exist in a liquid state. Water is the universal solvent, and it may be a prerequisite for the evolution of life. If we were too close to the sun, our water would evaporate as steam. And if we were too far, it would be locked in perpetual ice. As it is, the Earth has had wild swings in its surface temperature. There was once a time, more than 650 million years ago, when the entire Earth's surface froze over. Fortunately, the oceans beneath the ice remained liquid, and the life that already existed on Earth was able to persist long enough to reach the Cambrian explosion. Conversely, Venus may once have had liquid oceans and maybe even nascent life, but too much carbon dioxide turned the planet into an oven and boiled away its water (a fate that may await our own Earth if we aren't careful). What has saved us so far is the stability of our orbit, our steady distance from the Sun that keeps our water liquid and life flourishing. Yet it did not have to be this way.

The regions of regular motion associated with irrational numbers act as if they were a barrier, restricting the range of chaotic orbits and protecting other nearby orbits from the chaos.

Our solar system is a many-body problem. It consists of three large gravitating bodies (the Sun, Jupiter and Saturn) and several minor ones (such as Earth). Jupiter does influence our orbit, and if it were only a few times more massive than it actually is, then our orbit would become chaotic, varying in distance from the sun in unpredictable ways. And if Jupiter were only about 20 times bigger than it actually is, there is a possibility that it would perturb the Earth's orbit so strongly that it could eject the Earth from the solar system entirely, sending us flying through interstellar space, where we would slowly cool until we became a permanent ice ball. What can protect us from this terrifying fate? What keeps our orbit stable despite the fact that we inhabit a many-body solar system? The answer is number theory!

The Most Irrational Number

What is the most irrational number you can think of? 

Is it: pi = 3.1415926535897932384626433 ? 

Or Euler's number: e = 2.7182818284590452353602874 ?

How about: sqrt(3) = 1.73205080756887729352744634 ?

These are all perfectly good irrational numbers. But how do you choose the "most irrational" number? The answer is fairly simple. The most irrational number is the one that is least well approximated by a ratio of integers. For instance, it is possible to get close to pi through the ratio 22/7 = 3.1428, which differs from pi by only 4 parts in ten thousand. Euler's number is approximated by 87/32 = 2.7188, which differs from e by only 2 parts in ten thousand. Yet 87 and 32 are much bigger than 22 and 7, so it may be said that e is more irrational than pi, because it takes ratios of larger integers to get a comparably good approximation. So is there a "most irrational" number? The answer is yes. The Golden Ratio.

The Golden Ratio can be defined in many ways, but its most common expression is given by

phi = (1 + sqrt(5))/2 = 1.6180339887…

It is the hardest number to approximate with a ratio of small integers. For instance, to get a number that is as close as one part in ten thousand to the golden mean takes the ratio 89/55. This result may seem obscure, but there is a systematic way to find the ratios of integers that approximate an irrational number. These ratios are known as the convergents of its continued fraction.
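
There is a simple algorithm for generating these convergents: repeatedly strip off the integer part and invert the remainder. As a concrete illustration (my own sketch, not part of the original discussion, with hypothetical function names), a few lines of Python reproduce the approximations quoted above, including 22/7 for pi, 87/32 for e, and the Fibonacci ratios 5/3, 8/5, …, 89/55 for the golden mean:

from fractions import Fraction
import math

def cf_coefficients(x, n):
    """Return the first n coefficients of the continued fraction of x."""
    coeffs = []
    for _ in range(n):
        a = math.floor(x)
        coeffs.append(a)
        frac = x - a
        if frac == 0:
            break
        x = 1.0/frac
    return coeffs

def convergents(coeffs):
    """Collapse the leading coefficients into the successive rational approximations."""
    approx = []
    for k in range(1, len(coeffs) + 1):
        value = Fraction(coeffs[k-1])
        for a in reversed(coeffs[:k-1]):
            value = a + 1/value
        approx.append(value)
    return approx

phi = (1 + 5**0.5)/2
for name, x in [('pi', math.pi), ('e', math.e), ('phi', phi)]:
    ratios = convergents(cf_coefficients(x, 10))
    print(name, [f"{r.numerator}/{r.denominator}" for r in ratios])

The output shows why the golden mean is the hardest case: its convergents, the ratios of successive Fibonacci numbers, creep toward 1.618… far more slowly than 22/7 closes in on pi or 87/32 closes in on e.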

The term "continued fraction" was introduced by John Wallis in his book Opera Mathematica (1695), although the technique has older roots. The continued fraction for pi is

pi = 3 + 1/(7 + 1/(15 + 1/(1 + 1/(292 + …))))

An alternate form of displaying this continued fraction is with the bracket notation

pi = [3; 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, 1, …]

The irrational character of pi is captured by the seemingly random integers in this string. However, there can be regular structure in irrational numbers. For instance, a different (generalized) continued fraction involving pi is

4/pi = 1 + 1^2/(2 + 3^2/(2 + 5^2/(2 + 7^2/(2 + …))))

that has a surprisingly simple repeating pattern.

The continued fraction for the golden mean has an especially simple repeating form

phi = 1 + 1/(1 + 1/(1 + 1/(1 + …)))

or, in the bracket notation,

phi = [1; 1, 1, 1, 1, …]

Because every coefficient in this continued fraction is a 1, its convergents approach their limit more slowly than those of any other number. Hence, the Golden Ratio can be considered, using this criterion, to be the most irrational number.

If the Golden Ratio is the most irrational number, how does that save us from the chaos of the cosmos? The answer to this question is KAM!

Kolmogorov, Arnold and Moser (KAM) Theory

KAM is an acronym made from the first initials of three towering mathematicians of the 20th century: Andrey Kolmogorov (1903 – 1987), his student Vladimir Arnold (1937 – 2010), and Jürgen Moser (1928 – 1999).

In 1954, Kolmogorov, considered to be the greatest living mathematician at that time, was invited to give the closing plenary lecture at the International Congress of Mathematicians in Amsterdam. To the surprise of the conference organizers, he chose to talk on what seemed like a very mundane topic: the question of the stability of the solar system. This had been the topic that Poincaré had attempted to solve in 1890 when he first stumbled on chaotic dynamics. The question had remained open, but the general consensus was that the many-body nature of the solar system made it intrinsically unstable, even for only three bodies.

Against all expectations, Kolmogorov proposed that despite the general chaotic behavior of the three–body problem, there could be “islands of stability” which were protected from chaos, allowing some orbits to remain regular even while other nearby orbits were highly chaotic. He even outlined an approach to a proof of his conjecture, though he had not carried it through to completion.

The proof of Kolmogorov’s conjecture was supplied over the next 10 years through the work of the German mathematician Jürgen Moser and by Kolmogorov’s former student Vladimir Arnold. The proof hinged on the successive ratios of integers that approximate irrational numbers. With this work KAM showed that indeed some orbits are actually protected from neighboring chaos by relying on the irrationality of the ratio of orbital periods.

Resonant Ratios

Let's go back to the simple model of our solar system that consists of only three bodies: the Sun, Jupiter and Earth. The period of Jupiter's orbit is 11.86 years; if it were exactly 12 years, its period would be in a 12:1 ratio with the Earth's period. This ratio of integers is called a "resonance", although in this case it is fairly mismatched. But if this ratio were a ratio of small integers like 5:3, it would mean that Jupiter traveled around the sun 3 times in 15 years while the Earth went around 5 times. And every 15 years, the two planets would realign. This kind of resonance with ratios of small integers creates a strong gravitational perturbation that alters the orbit of the smaller planet. If the perturbation is strong enough, it could disrupt the Earth's orbit, creating a chaotic path that might ultimately eject the Earth completely from the solar system.
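
To put a number on how often the perturbation repeats, the time between successive alignments (conjunctions) of two planets follows from their periods as 1/T_align = 1/T_inner − 1/T_outer. A small back-of-the-envelope sketch (my own, not part of the original argument):

def alignment_period(T_inner, T_outer):
    """Time between successive alignments (conjunctions) of two coplanar planets."""
    return 1.0/(1.0/T_inner - 1.0/T_outer)

# Actual Earth-Jupiter case: alignments recur about every 1.09 years, but because
# 11.86 is far from a ratio of small integers, the alignment point drifts around
# the orbit rather than hammering the same spot over and over.
print(alignment_period(1.0, 11.86))   # ~1.09 years

# Hypothetical exact 5:3 resonance (periods 3 and 5 in arbitrary units): the
# planets align every 7.5 units, and after 15 units (5 inner orbits, 3 outer
# orbits) the alignment repeats at exactly the same place, so the tugs add up.
print(alignment_period(3.0, 5.0))     # 7.5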

What KAM discovered is that as the resonance ratio becomes a ratio of large integers, like 87:32, then the planets have a hard time aligning, and the perturbation remains small. A surprising part of this theory is that a nearby orbital ratio might be 5:2 = 2.5, which is not very different from 87:32 ≈ 2.72. Yet the 5:2 resonance can produce strong chaos, while the 87:32 resonance is almost immune. This way, it is possible to have both chaotic orbits and regular orbits coexisting in the same dynamical system. An irrational orbital ratio protects the regular orbits from chaos. The next question is, how irrational does the orbital ratio need to be to guarantee safety?

You probably already guessed the answer to this question: it must be the Golden Ratio. Because the Golden Ratio is the most irrational number, it cannot be approximated well by ratios of small integers. In a three-body system, the most stable orbital ratio would be 1.618034. But the more general question of what is "irrational enough" for an orbit to be stable against a given perturbation is much harder to answer. This is the field of Diophantine Analysis, which addresses other questions as well, such as Fermat's Last Theorem.

KAM Twist Map

The dynamics of three-body systems are hard to visualize directly, so there are tricks that help bring the problem into perspective. The first trick, invented by Henri Poincaré, is called the first return map (or the Poincaré section). This is a way of reducing the dimensionality of the problem by one dimension. But for three bodies, even if they are all in a plane, this can still be complicated. Another trick, called the restricted three-body problem, is to assume that there are two large masses and a third small mass. This way, the dynamics of the two-body system are unaffected by the small mass, so all we need to do is focus on the dynamics of the small body. This brings the dynamics down to two dimensions (the position and momentum of the third body), which is very convenient for visualization, but the dynamics still need solutions to differential equations. So the final trick is to replace the differential equations with simple difference equations that are solved iteratively.

A simple discrete iterative map that captures the essential behavior of the three-body problem begins with action-angle variables that are coupled through a perturbation. Variations on this model have several names: the Twist Map, the Chirikov Map and the Standard Map. The essential mapping is

J_{n+1} = J_n + ε sin(θ_n)
θ_{n+1} = θ_n + J_{n+1}  (mod 2π)

where J is an action variable (like angular momentum) paired with the angle variable θ. Initial conditions for the action and the angle are selected, and then all later values are obtained by iteration. The perturbation parameter is given by ε. If ε = 0 then all orbits are perfectly regular and circular. But as the perturbation increases, the open orbits split up into chains of closed (periodic) orbits. As the perturbation increases further, chaotic behavior emerges. The situation for ε = 0.9 is shown in the figure below. There are many regular periodic orbits as well as open orbits. Yet there are simultaneously regions of chaotic behavior. This figure shows an intermediate case where regular orbits can coexist with chaotic ones. The key is the orbital period ratio. For orbital ratios that are sufficiently irrational, the orbits remain open and regular. But for orbital ratios that are ratios of small integers, the perturbation is strong enough to drive the dynamics into chaos.

Arnold Twist Map (also known as a Chirikov map) for ε = 0.9 showing the chaos that has emerged at the hyperbolic point, but there are still open orbits that are surprisingly circular (unperturbed) despite the presence of strongly chaotic orbits nearby.

Python Code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Wed Oct. 2, 2019
@author: nolte
"""
import numpy as np
from matplotlib import pyplot as plt

plt.close('all')

eps = 0.9          # perturbation strength of the twist map

np.random.seed(2)
plt.figure(1)
for eloop in range(0,50):

    # random initial action (radius-like variable) and angle for each trajectory
    rlast = np.pi*(1.5*np.random.random()-0.5)
    thlast = 2*np.pi*np.random.random()

    # iterate longer for trajectories that start at larger action
    orbit = int(200*(rlast+np.pi/2))
    rplot = np.zeros(shape=(orbit,))
    thetaplot = np.zeros(shape=(orbit,))
    x = np.zeros(shape=(orbit,))
    y = np.zeros(shape=(orbit,))
    for loop in range(0,orbit):
        # twist-map iteration: J' = J + eps*sin(theta), theta' = theta + J'
        rnew = rlast + eps*np.sin(thlast)
        thnew = np.mod(thlast+rnew,2*np.pi)

        rplot[loop] = rnew
        thetaplot[loop] = np.mod(thnew-np.pi,2*np.pi) - np.pi

        rlast = rnew
        thlast = thnew

        # map (action, angle) to the plane for plotting, offset to keep radii positive
        x[loop] = (rnew+np.pi+0.25)*np.cos(thnew)
        y[loop] = (rnew+np.pi+0.25)*np.sin(thnew)

    plt.plot(x,y,'o',ms=1)

plt.savefig('StandMapTwist')
plt.show()

The twist map for three values of ε is shown in the figure below. For ε = 0.2, most orbits are open, with one elliptic point and its associated hyperbolic point. At ε = 0.9 the periodic elliptic point is still stable, but the hyperbolic point has generated a region of chaotic orbits. There is still a remnant open orbit that is associated with an orbital period ratio at the Golden Ratio. However, by ε = 0.97, even this most stable orbit has broken up into a chain of closed orbits as the chaotic regions expand.

Twist map for three levels of perturbation.
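
To reproduce a multi-panel comparison like the one above, the same iteration can be wrapped in a loop over the perturbation strength. The sketch below follows the structure of the script in the previous section but is my own variation, not the exact code used for the figure:

import numpy as np
from matplotlib import pyplot as plt

def twist_map_points(eps, n_trajectories=50, seed=2):
    """Iterate the twist map for many random initial conditions and return plot points."""
    rng = np.random.default_rng(seed)
    xs, ys = [], []
    for _ in range(n_trajectories):
        r = np.pi*(1.5*rng.random() - 0.5)      # initial action
        th = 2*np.pi*rng.random()               # initial angle
        for _ in range(int(200*(r + np.pi/2))):
            r = r + eps*np.sin(th)              # J' = J + eps*sin(theta)
            th = np.mod(th + r, 2*np.pi)        # theta' = theta + J'
            xs.append((r + np.pi + 0.25)*np.cos(th))
            ys.append((r + np.pi + 0.25)*np.sin(th))
    return xs, ys

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, eps in zip(axes, [0.2, 0.9, 0.97]):
    x, y = twist_map_points(eps)
    ax.plot(x, y, 'o', ms=0.5)
    ax.set_title(f'eps = {eps}')
    ax.set_aspect('equal')
plt.savefig('TwistMapPanels')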

Safety in Numbers

In our solar system, governed by gravitational attractions, the square of the orbital period increases as the cube of the average radius (Kepler's third law). Consider the restricted three-body problem of the Sun and Jupiter with the Earth as the third body. If we analyze the stability of the Earth's orbit as a function of distance from the Sun, the orbital ratio relative to Jupiter would change smoothly. Near our current position, it would be close to a 12:1 resonance, but as we moved farther from the Sun, this ratio would decrease. When the orbital period ratio is sufficiently irrational, the orbit is immune to Jupiter's pull. But as the orbital ratio approaches a ratio of small integers, the effect gets larger. Close enough to Jupiter there would be a succession of radii with regular motion separated by regions of chaotic motion. The regions of regular motion associated with irrational numbers act as if they were a barrier, restricting the range of chaotic orbits and protecting more distant orbits from the chaos. In this way numbers, rational versus irrational, protect us from the chaos of our own solar system.
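
As a quick check of that scaling (a modern aside, not part of the original text), Kepler's third law in convenient units (periods in years, distances in astronomical units) gives T = a^(3/2), which recovers Jupiter's 11.86-year period and the nearly 12:1 ratio mentioned above:

a_jupiter = 5.204     # Jupiter's semi-major axis in AU
a_earth = 1.000       # Earth's semi-major axis in AU

# Kepler's third law: T^2 is proportional to a^3, so T = a**1.5 in these units
T_jupiter = a_jupiter**1.5
T_earth = a_earth**1.5
print(T_jupiter)               # ~11.87 years
print(T_jupiter/T_earth)       # ~11.9, close to (but not exactly) a 12:1 resonance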

A dramatic demonstration of the orbital resonance effect can be seen with the asteroid belt. The many small bodies act as probes of the orbital resonances. The repetitive tug of Jupiter opens gaps in the distribution of asteroid radii, with major gaps, called Kirkwood Gaps, opening at orbital ratios of 3:1, 5:2, 7:3 and 2:1. These gaps are the radii where chaotic behavior occurs, while the regions in between are stable. Most asteroids spend most of their time in the stable regions, because chaotic motion tends to sweep them out of the regions of resonance. This mechanism for the Kirkwood gaps is the same physics that produces gaps in the rings of Saturn at resonances with the many moons of Saturn.

The gaps in the asteroid distributions caused by orbital resonances with Jupiter. Ref. Wikipedia
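
Running the same law in reverse locates the gaps. An asteroid in a p:q resonance completes p orbits while Jupiter completes q, so its period is T_Jupiter·q/p and its semi-major axis follows from Kepler's third law. This short sketch (mine, using standard values) lands on the observed Kirkwood gap radii:

a_jupiter = 5.204             # AU
T_jupiter = a_jupiter**1.5    # ~11.87 years

for p, q in [(3, 1), (5, 2), (7, 3), (2, 1)]:
    T_asteroid = T_jupiter*q/p        # resonant asteroid period in years
    a_asteroid = T_asteroid**(2.0/3)  # invert Kepler's third law
    print(f"{p}:{q} resonance -> a = {a_asteroid:.2f} AU")

# prints roughly 2.50, 2.82, 2.96 and 3.28 AU, matching the major gaps
# in the asteroid distribution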

Further Reading

For a detailed history of the development of KAM theory, see Chapter 9 Butterflies to Hurricanes in Galileo Unbound (Oxford University Press, 2018).

For a more detailed mathematical description of the KAM theory, see Chapter 5, Hamiltonian Chaos, in Introduction to Modern Dynamics, 2nd edition (Oxford University Press, 2019).

See also:

Dumas, H. S., The KAM Story: A friendly introduction to the content, history and significance of Classical Kolmogorov-Arnold-Moser Theory. World Scientific: 2014.

Arnold, V. I., From superpositions to KAM theory. Vladimir Igorevich Arnold. Selected Papers 1997, PHASIS, 60, 727–740.

Science 1916: A Hundred-year Time Capsule

In one of my previous blog posts, as I was searching for Schwarzschild's original papers on Einstein's field equations and quantum theory, I obtained a copy of the January 1916 – June 1916 volume of the Proceedings of the Royal Prussian Academy of Sciences through interlibrary loan. The extremely thick volume arrived at Purdue about a week after I ordered it online. It came from Oberlin College in Ohio, which had received it as a gift in 1928 from the library of Professor Friedrich Loofs of the University of Halle in Germany. Loofs had been the Haskell Lecturer at Oberlin for the 1911–1912 academic year.

As I browsed through the volume looking for Schwarzschild's papers, I was amused to find a cornucopia of turn-of-the-century science topics recorded in its pages. There were papers on the overbite and lips of marsupials. There were papers on forgotten languages. There were papers on ancient Greek texts. On the origins of religion. On the philosophy of abstraction. Histories of Indian dramas. Reflections on cancer. But what I found most amazing was a snapshot of the field of physics and mathematics in 1916, with historic papers by historic scientists who changed how we view the world. Here is a snapshot in time and in space, a period of only six months from a single journal, containing papers from a list of authors that reads like a who's who of physics.

In 1916 there were three major centers of science in the world with leading science publications: London with the Philosophical Magazine and Proceedings of the Royal Society; Paris with the Comptes Rendus of the Académie des Sciences; and Berlin with the Proceedings of the Royal Prussian Academy of Sciences and Annalen der Physik. In Russia, there were the scientific Journals of St. Petersburg, but the Bolshevik Revolution was brewing that would overwhelm that country for decades.  And in 1916 the academic life of the United States was barely worth noticing except for a few points of light at Yale and Johns Hopkins. 

Berlin in 1916 was embroiled in war, but science proceeded relatively unmolested. The six-month volume of the Proceedings of the Royal Prussian Academy of Sciences contains a number of gems. Schwarzschild was one of the most prolific contributors, publishing three papers in just this half-year volume, plus his obituary written by Einstein. But joining Schwarzschild in this volume were Einstein, Planck, Born, Warburg, Frobenius, and Rubens among others—a pantheon of German scientists mostly cut off from the rest of the world at that time, but single-mindedly following their individual threads woven deep into the fabric of the physical world.

Karl Schwarzschild (1873 – 1916)

Schwarzschild had the unenviable yet effective motivation of his impending death to spur him to complete several projects that he must have known would make his name immortal. In this six-month volume he published his three most important papers. The first (pg. 189) was on the exact solution of Einstein's field equations of general relativity. The solution was for the restricted case of a point mass, yet the derivation yielded the Schwarzschild radius that later became known as the event horizon of a non-rotating black hole. The second paper (pg. 424) expanded the general relativity solutions to a spherically symmetric incompressible liquid mass.

Schwarzschild’s solution to Einstein’s field equations for a point mass.


Schwarzschild’s extension of the field equation solutions to a finite incompressible fluid.

The subject, content and success of these two papers were wholly unexpected from this observational astronomer stationed on the Russian Front during WWI, calculating trajectories for German bombardments. He would not have been considered a theoretical physicist but for the importance of his results and the sophistication of his methods. Within only a year after Einstein published his general theory, based as it was on the complicated tensor calculus of Levi-Civita, Christoffel and Ricci-Curbastro that had taken him years to master, Schwarzschild found a solution that had evaded even Einstein.

Schwarzschild’s third and final paper (pg. 548) was on an entirely different topic, still not in his official field of astronomy, that positioned all future theoretical work in quantum physics to be phrased in the language of Hamiltonian dynamics and phase space.  He proved that action-angle coordinates were the only acceptable canonical coordinates to be used when quantizing dynamical systems.  This paper answered a central question that had been nagging Bohr and Einstein and Ehrenfest for years—how to quantize dynamical coordinates.  Despite the simple way that Bohr’s quantized hydrogen atom is taught in modern physics, there was an ambiguity in the quantization conditions even for this simple single-electron atom.  The ambiguity arose from the numerous possible canonical coordinate transformations that were admissible, yet which led to different forms of quantized motion. 

Schwarzschild’s proposal of action-angle variables for quantization of dynamical systems.

Schwarzschild's doctoral thesis had been a theoretical topic in astrophysics that applied the celestial mechanics theories of Henri Poincaré to binary star systems. Within Poincaré's theory were integral invariants that were conserved quantities of the motion. When a dynamical system has as many conserved quantities as degrees of freedom, every coordinate has an integral invariant. In this unexpected last paper, Schwarzschild showed how a canonical transformation to action-angle coordinates produces a unique representation in terms of action variables (whose dimensions are the same as Planck's constant). These action coordinates, with their associated cyclic angle variables, are the only unambiguous representations that can be quantized. The important points of this paper were amplified a few months later in a publication by Schwarzschild's friend Paul Epstein (1871 – 1939), solidifying this approach to quantum mechanics. Paul Ehrenfest (1880 – 1933) continued this work later in 1916 by defining adiabatic invariants whose quantum numbers remain unchanged under slowly varying conditions, and the program started by Schwarzschild was definitively completed by Paul Dirac (1902 – 1984) at the dawn of quantum mechanics in 1925.

Albert Einstein (1879 – 1955)

In 1916 Einstein was mopping up after publishing his definitive field equations of general relativity the year before.  His interests were still cast wide, not restricted only to this latest project.  In the 1916 Jan. to June volume of the Prussian Academy Einstein published two papers.  Each is remarkably short relative to the other papers in the volume, yet the importance of the papers may stand in inverse proportion to their length.

The first paper (pg. 184) is placed right before Schwarzschild's first paper on February 3. The subject of the paper is the expression of Maxwell's equations in four-dimensional spacetime. It is notable and ironic that Einstein mentions Hermann Minkowski (1864 – 1909) in the first sentence of the paper. When Minkowski proposed his bold structure of spacetime in 1908, Einstein had been one of his harshest critics, complaining about the absurdity of thinking of space and time as a single interchangeable coordinate system. This is ironic, because Einstein today is perhaps best known for the special relativity properties of spacetime, yet he was slow to adopt the spacetime viewpoint. Einstein only came around to spacetime when he realized around 1910 that a general approach to relativity required the mathematical structure of tensor manifolds, and Minkowski had provided just such a manifold—the pseudo-Riemannian manifold of spacetime. Einstein subsequently adopted spacetime with a passion and became its greatest champion, calling out Minkowski where possible to give him his due, although Minkowski had already died tragically of a burst appendix in 1909.

Relativistic energy density of electromagnetic fields.

The importance of Einstein’s paper hinges on his derivation of the electromagnetic field energy density using electromagnetic four vectors.  The energy density is part of the source term for his general relativity field equations.  Any form of energy density can warp spacetime, including electromagnetic field energy.  Furthermore, the Einstein field equations of general relativity are nonlinear as gravitational fields modify space and space modifies electromagnetic fields, producing a coupling between gravity and electromagnetism.  This coupling is implicit in the case of the bending of light by gravity, but Einstein’s paper from 1916 makes the connection explicit. 

Einstein's second paper (pg. 688) is even shorter, yet it is one of the most daring publications of his career. Because the field equations of general relativity are nonlinear, they are not easy to solve exactly, and Einstein was exploring approximate solutions under conditions of slow speeds and weak fields. In this "non-relativistic" limit the metric tensor separates into a Minkowski background metric plus a small metric perturbation. This small perturbation obeys a wave equation for a disturbance of the gravitational field that propagates at the speed of light. Hence, in the June 22 issue of the Prussian Academy in 1916, Einstein predicted the existence and the properties of gravitational waves. Almost exactly one hundred years later, in 2016, the LIGO collaboration announced the detection of gravitational waves generated by the merger of two black holes.

Einstein’s weak-field low-velocity approximation solutions of his field equations.
Einstein’s prediction of gravitational waves.

Max Planck (1858 – 1947)

Max Planck was active as the secretary of the Prussian Academy in 1916 yet was still fully active in his research. Although he had launched the quantum revolution with his quantum hypothesis of 1900, he was not a major proponent of quantum theory even as late as 1916. His primary interests lay in thermodynamics and the origins of entropy, following the theoretical approaches of Ludwig Boltzmann (1844 – 1906). In 1916 he was interested in how best to partition phase space as a way to count states and calculate entropy from first principles. His paper in the 1916 volume (pg. 653) calculated the absolute entropy of monatomic bodies.

Counting microstates by Planck.

Max Born (1882 – 1970)

Max Born was to be one of the leading champions of the quantum mechanical revolution based at the University of Göttingen in the 1920’s. But in 1916 he was on leave from the University of Berlin working on ranging for artillery.  Yet he still pursued his academic interests, like Schwarzschild.  On pg. 614 in the Proceedings of the Prussian Academy, Born published a paper on anisotropic liquids, such as liquid crystals and the effect of electric fields on them.  It is astonishing to think that so many of the flat-panel displays we have today, whether on our watches or smart phones, are technological descendants of work by Born at the beginning of his career.

Born on liquid crystals.

Ferdinand Frobenius (1849 – 1917)

Like Schwarzschild, Frobenius was at the end of his career in 1916 and would pass away one year later, but unlike Schwarzschild, his career had been a long one. He received his doctorate under Weierstrass and explored elliptic functions, differential equations, number theory and group theory. One of his papers on group theory appears in the May 4th issue on page 542, where he explores the composition series of a group.

Frobenius on groups.

Heinrich Rubens (1865 – 1922)

Max Planck owed his quantum breakthrough in part to the exquisitely accurate experimental measurements made by Heinrich Rubens on black body radiation. It was only because of the precise shape of what came to be called the Planck spectrum that Planck could say with such confidence that his theory of quantized radiation interactions fit Rubens' spectrum so perfectly. In 1916 Rubens was at the University of Berlin, having taken the position vacated by Paul Drude in 1906. He was a specialist in infrared spectroscopy, and on page 167 of the Proceedings he describes the long-wavelength spectrum of water vapor (steam) and its consequences for the quantum theory.

Rubens and the infrared spectrum of steam.

Emil Warburg (1846 – 1931)

Emil Warburg's fame today rests primarily on his being the father of Otto Warburg, who won the 1931 Nobel Prize in Physiology or Medicine. On page 314 Warburg reports on photochemical processes in hydrogen bromide gas. In an obscure and very indirect way, I am an academic descendant of Emil Warburg. One of his students was Robert Pohl, a famous early researcher in solid state physics, sometimes called the "father of solid state physics". Pohl was in the physics department at Göttingen in the 1920's along with Born and Franck during the golden age of quantum mechanics. Robert Pohl's son, Robert Otto Pohl, was my professor for introductory electromagnetism when I was a sophomore at Cornell University in 1978. The course used a textbook by the Nobel laureate Edward Purcell, a quirky volume of the Berkeley Series of physics textbooks. This makes Emil Warburg my professor's father's professor.

Warburg on photochemistry.

Papers in the 1916 Vol. 1 of the Prussian Academy of Sciences

Schulze,  Alt– und Neuindisches

Orth,  Zur Frage nach den Beziehungen des Alkoholismus zur Tuberkulose

Schulze,  Die Erhabunen auf der Lippin- und Wangenschleimhaut der Säugetiere

von Wilamowitz-Moellendorff,  Die Samia des Menandros

Engler,  Bericht über das »Pflanzenreich«

von Harnack,  Bericht über die Ausgabe der griechischen Kirchenväter der drei ersten Jahrhunderte

Meinecke,  Germanischer und romanischer Geist im Wandel der deutschen Geschichtsauffassung

Rubens und Hettner,  Das langwellige Wasserdampfspektrum und seine Deutung durch die Quantentheorie

Einstein,  Eine neue formale Deutung der Maxwellschen Feldgleichungen der Elektrodynamik

Schwarzschild,  Über das Gravitationsfeld eines Massenpunktes nach der Einsteinschen Theorie

Helmreich,  Handschriftliche Verbesserungen zu dem Hippokratesglossar des Galen

Prager,  Über die Periode des veränderlichen Sterns RR Lyrae

Holl,  Die Zeitfolge des ersten origenistischen Streits

Lüders,  Zu den Upanisads. I. Die Samvargavidya

Warburg,  Über den Energieumsatz bei photochemischen Vorgängen in Gasen. VI.

Hellman,  Über die ägyptischen Witterungsangaben im Kalender von Claudius Ptolemaeus

Meyer-Lübke,  Die Diphthonge im Provenzalischen

Diels,  Über die Schrift Antipocras des Nikolaus von Polen

Müller und Sieg,  Maitrisimit und »Tocharisch«

Meyer,  Ein altirischer Heilsegen

Schwarzschild,  Über das Gravitationsfeld einer Kugel aus inkompressibler Flüssigkeit nach der Einsteinschen Theorie

Brauer,  Die Verbreitung der Hyracoiden

Correns,  Untersuchungen über Geschlechtsbestimmung bei Distelarten

Brahn,  Weitere Untersuchungen über Fermente in der Leber von Krebskranken

Erdmann,  Methodologische Konsequenzen aus der Theorie der Abstraktion

Bang,  Studien zur vergleichenden Grammatik der Türksprachen. I.

Frobenius,  Über die Kompositionsreihe einer Gruppe

Schwarzschild,  Zur Quantenhypothese

Fischer und Bergmann,  Über neue Galloylderivate des Traubenzuckers und ihren Vergleich mit der Chebulinsäure

Schuchhardt,  Der starke Wall und die breite, zuweilen erhöhte Berme bei frühgeschichtlichen Burgen in Norddeutschland

Born,  Über anisotrope Flüssigkeiten

Planck,  Über die absolute Entropie einatomiger Körper

Haberlandt,  Blattepidermis und Lichtperzeption

Einstein,  Näherungsweise Integration der Feldgleichungen der Gravitation

Lüders,  Die Saubhikas.  Ein Beitrag zur Geschichte des indischen Dramas

Karl Schwarzschild’s Radius: How Fame Eclipsed a Physicist’s own Legacy

In an ironic twist of the history of physics, Karl Schwarzschild's fame has eclipsed his own legacy. If asked who Karl Schwarzschild (1873 – 1916) was, you would probably say he's the guy who solved Einstein's field equations of general relativity and discovered the radius of black holes. You may also know that he accomplished this Herculean feat while dying slowly behind the German lines on the Eastern Front in WWI. But ask what else he did, and you would probably come up blank. Yet Schwarzschild was one of the most wide-ranging physicists at the turn of the 20th century, which is saying something, because it places him in the same pantheon as Planck, Lorentz, Poincaré and Einstein. Let's take a look at the part of his career that hides in the shadow of his own radius.

A Radius of Interest

Karl Schwarzschild was born in Frankfurt, Germany, shortly after the Franco-Prussian War thrust Prussia onto the world stage as a major political force in Europe. His family were Jewish merchants of longstanding reputation in the city, and Schwarzschild's childhood was spent in the vibrant Jewish community. One of his father's friends was a professor at a university in Frankfurt, whose son, Paul Epstein (1871 – 1939), became a close friend of Karl's at the Gymnasium. Schwarzschild and Epstein would partially shadow each other's careers, despite the fact that Schwarzschild became an astronomer while Epstein became a famous mathematician and number theorist. This was in part because Schwarzschild had a large radius of interests that spanned the breadth of current mathematics and science, practicing both experiment and theory.

Schwarzschild’s application of the Hamiltonian formalism for quantum systems set the stage for the later adoption of Hamiltonian methods in quantum mechanics. He came dangerously close to stating the uncertainty principle that catapulted Heisenberg to fame.

By the time Schwarzschild was sixteen, he had taught himself the mathematics of celestial mechanics to such depth that he published two papers on the orbits of binary stars. He also became fascinated by astronomy and purchased lenses and other materials to construct his own telescope. His interests were helped along by Epstein, two years older, whose father had his own private observatory. When Epstein went to study at the University of Strasbourg (then part of the German Empire) Schwarzschild followed him. But Schwarzschild's main interest in astronomy diverged from Epstein's main interest in mathematics, and Schwarzschild transferred to the University of Munich, where he studied under Hugo von Seeliger (1849 – 1924), the premier German astronomer of his day. Epstein remained at Strasbourg, where he studied under Elwin Bruno Christoffel (1829 – 1900) and eventually became a professor, but he was forced to relinquish the post when Strasbourg was ceded to France after WWI.

The Birth of Stellar Interferometry

Until the Hubble Space Telescope was launched in 1990, no star had ever been resolved as a direct image. In the mid-1990s, using its spectacular resolving power, the Hubble optics resolved—just barely—the red supergiant Betelgeuse. No other star (other than the Sun) is close enough or big enough to image the stellar disk, even for the Hubble far above our atmosphere. The reason is that the 2.4 meter diameter of the Hubble's primary mirror—as big as it is—still produces a diffraction pattern that smears the image so that stars cannot be resolved. Yet information on the size of a distant object is encoded as phase in the light waves that are emitted from the object, and this phase information is accessible to interferometry.
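
The resolution argument can be made quantitative with the Rayleigh criterion, θ ≈ 1.22 λ/D. A back-of-the-envelope check (my own numbers, not from the original post) shows why Betelgeuse, with an angular diameter of roughly 0.05 arcseconds, sits right at the edge of what a 2.4-meter mirror can do:

import numpy as np

wavelength = 550e-9     # visible light, meters
D = 2.4                 # Hubble primary mirror diameter, meters

theta = 1.22*wavelength/D                 # Rayleigh diffraction limit, radians
theta_arcsec = np.degrees(theta)*3600
print(theta_arcsec)     # ~0.058 arcsec, comparable to Betelgeuse's ~0.05 arcsec disk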

The first physicist who truly grasped the power of optical interferometry and who understood how to design the first interferometric metrology systems was the French physicist Armand Hippolyte Louis Fizeau (1819 – 1896). Fizeau became interested in the properties of light when he collaborated with his friend Léon Foucault (1819–1868) on early uses of photography. The two then embarked on a measurement of the speed of light but had a falling out before the experiment could be finished, and both continued the pursuit independently. Fizeau achieved the first measurement using a toothed wheel rotating rapidly [1], while Foucault came in second using a more versatile system with a spinning mirror [2]. Yet Fizeau surpassed Foucault in optical design and became an expert in interference effects. Interference apparatus had been developed earlier by Augustin Fresnel (the Fresnel bi-prism 1819), Humphrey Lloyd (Lloyd's mirror 1834) and Jules Jamin (Jamin's interferential refractor 1856). They had found ways of redirecting light using refraction and reflection to cause interference fringes. But Fizeau was one of the first to recognize that each emitting region of a light source is coherent with itself, and he used this insight, together with lenses, to design the first interferometer.

Fizeau's interferometer used a lens with a tight focal spot masked off by an opaque screen with two open slits. When the masked lens device was focused on an intense light source it produced two parallel pencils of light that were mutually coherent but spatially separated. Fizeau used this apparatus to measure the speed of light in moving water in 1859 [3].

Fig. 1  Optical configuration of the source element of the Fizeau refractometer.

The working principle of the Fizeau refractometer is shown in Fig. 1.  The light source is at the bottom, and it is reflected by the partially-silvered beam splitter to pass through the lens and the mask containing two slits.  (Only the light paths that pass through the double-slit mask on the lens are shown in the figure.)  The slits produce two pencils of mutually coherent light that pass through a system (in the famous Fizeau ether drag experiment it was along two tubes of moving water) and are returned through the same slits, and they intersect at the view port where they produce interference fringes.  The fringe spacing is set by the separation of the two slits in the mask.  The Rayleigh region of the lens defines a region of spatial coherence even for a so-called “incoherent” source.  Therefore, this apparatus, by use of the lens, could convert an incoherent light source into a coherent probe to test the refractive index of test materials, which is why it was called a refractometer. 

Fizeau became adept at thinking of alternative optical designs of his refractometer and alternative applications.  In an address to the French Physical Society in 1868 he suggested that the double-slit mask could be used on a telescope to determine sizes of distant astronomical objects [4].  There were several subsequent attempts to use Fizeau’s configuration in astronomical observations, but none were conclusive and hence were not widely known.

An optical configuration and astronomical application that was very similar to Fizeau’s idea was proposed by Albert Michelson in 1890 [5].  He built the apparatus and used it to successfully measure the size of several moons of Jupiter [6].  The configuration of the Michelson stellar interferometer is shown in Fig. 2.  Light from a distant star passes through two slits in the mask in front of the collecting optics of a telescope.  When the two pencils of light intersect at the view port, they produce interference fringes.  Because of the finite size of the stellar source, the fringes are partially washed out.  By adjusting the slit separation, a certain separation can be found where the fringes completely wash out.  The size of the star is then related to the separation of the slits for which the fringe visibility vanishes.  This simple principle allows this type of stellar interferometry to measure the size of stars that are large and relatively close to Earth.  However, if stars are too far away even this approach cannot be used to measure their sizes because telescopes aren’t big enough.  This limitation is currently being bypassed by the use of long-baseline optical interferometers.

Fig. 2  Optical configuration of the Michelson stellar interferometer.  Fringes at the view port are partially washed out by the finite size of the star.  By adjusting the slit separation, the fringes can be made to vanish entirely, yielding an equation that can be solved for the size of the star.
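
The wash-out of the fringes can be written down explicitly. For a uniformly bright stellar disk of angular diameter θ, the fringe visibility falls off as |2 J1(x)/x| with x = πBθ/λ for slit separation B, and it first vanishes when B = 1.22 λ/θ. The sketch below is my own, with Betelgeuse's roughly 0.047-arcsecond diameter plugged in purely as an illustration:

import numpy as np
from scipy.special import j1

def visibility(B, theta, wavelength):
    """Fringe visibility for a uniform stellar disk seen through two slits separated by B."""
    x = np.pi*B*theta/wavelength
    return np.abs(2*j1(x)/x)

wavelength = 570e-9                      # meters
theta = 0.047*(np.pi/180)/3600           # ~0.047 arcsec converted to radians
B_null = 1.22*wavelength/theta           # slit separation where the fringes vanish
print(B_null)                            # ~3 meters
print(visibility(B_null, theta, wavelength))   # ~0 at the first null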

One of the open questions in the history of interferometry is whether Michelson was aware of Fizeau’s proposal for the stellar interferometer made in 1868.  Michelson was well aware of Fizeau’s published research and acknowledged him as a direct inspiration of his own work in interference effects.  But Michelson also was unaware of the undercurrents in the French school of optical interference.  When he visited Paris in 1881, he met with many of the leading figures in this school (including Lippmann and Cornu), but there is no mention or any evidence that he met with Fizeau.  By this time Fizeau’s wife had passed away, and Fizeau spent most of his time in seclusion at his home outside Paris.  Therefore, it is unlikely that he would have been present during Michelson’s visit.  Because Michelson viewed Fizeau with such awe and respect, if he had met him, he most certainly would have mentioned it.  Therefore, Michelson’s invention of the stellar interferometer can be considered with some confidence to be a case of independent discovery.  It is perhaps not surprising that he hit on the same idea that Fizeau had in 1868, because Michelson was one of the few physicists who understood coherence and interference at the same depth as Fizeau.

Schwarzschild’s Stellar Interferometer

The physics of the Michelson stellar interferometer is very similar to the physics of Young’s double slit experiment.  The two slits in the aperture mask of the telescope objective act to produce a simple sinusoidal interference pattern at the image plane of the optical system.  The size of the stellar diameter is determined by using the wash-out effect of the fringes caused by the finite stellar size.  However, it is well known to physicists who work with diffraction gratings that a multiple-slit interference pattern has a much greater resolving power than a simple double slit. 

This realization must have hit von Seeliger and Schwarzschild, working together at Munich, when they saw the publication of Michelson’s theoretical analysis of his stellar interferometer in 1890, followed by his use of the apparatus to measure the size of Jupiter’s moons.  Schwarzschild and von Seeliger realized that by replacing the double-slit mask with a multiple-slit mask, the widths of the interference maxima would be much narrower.  Such a diffraction mask on a telescope would cause a star to produce a multiple set of images on the image plane of the telescope associated with the multiple diffraction orders.  More interestingly, if the target were a binary star, the diffraction would produce two sets of diffraction maxima—a double image!  If the “finesse” of the grating is high enough, the binary star separation could be resolved as a doublet in the diffraction pattern at the image, and the separation could be measured, giving the angular separation of the two stars of the binary system.  Such an approach to the binary separation would be a direct measurement, which was a distinct and clever improvement over the indirect Michelson configuration that required finding the extinction of the fringe visibility. 
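
The gain from extra slits can be seen from the standard N-slit interference factor, [sin(Nφ/2)/sin(φ/2)]², whose principal maxima narrow roughly as 1/N. This little plot (an illustration of the general principle, not a model of Schwarzschild's actual mask) compares 2, 4 and 8 slits:

import numpy as np
from matplotlib import pyplot as plt

def n_slit_intensity(phi, N):
    """Normalized N-slit interference pattern versus phase difference between adjacent slits."""
    num = np.sin(N*phi/2)
    den = np.sin(phi/2)
    with np.errstate(divide='ignore', invalid='ignore'):
        amplitude = np.where(np.abs(den) < 1e-12, float(N), num/den)
    return (amplitude/N)**2

phi = np.linspace(-np.pi, np.pi, 2001)
for N in (2, 4, 8):
    plt.plot(phi, n_slit_intensity(phi, N), label=f'N = {N}')
plt.xlabel('phase difference between adjacent slits')
plt.ylabel('normalized intensity')
plt.legend()
plt.savefig('MultiSlit')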

Schwarzschild enlisted the help of a fine German instrument maker to create a multiple-slit system that had an adjustable slit separation. The device is shown in Fig. 3 from Schwarzschild's 1896 publication on the use of the stellar interferometer to measure the separation of binary stars [7]. The device is ingenious. By rotating the chain around the gear on the right-hand side of the apparatus, the two metal plates with four slits could be raised or lowered, causing the projection onto the objective plane to have variable slit spacings. In the operation of the telescope, the changing height of the slits does not matter, because they are near a conjugate optical plane (the entrance pupil) of the optical system. Using this adjustable multiple-slit system, Schwarzschild (and two colleagues he enlisted) made multiple observations of well-known binary star systems, and they calculated the star separations. Several of their published results are shown in Fig. 4.

Fig. 3  Illustration from Schwarzschild’s 1896 paper describing an improvement of the Michelson interferometer for measuring the separation of binary star systems Ref. [7].
Fig. 4  Data page from Schwarzschild's 1896 paper measuring the angular separation of two well-known binary star systems: gamma Leonis and xi Ursae Majoris. Ref. [7]

Schwarzschild’s publication demonstrated one of the very first uses of stellar interferometry—well before Michelson himself used his own configuration to measure the diameter of Betelgeuse in 1920.  Schwarzschild’s major achievement was performed before he had received his doctorate, on a topic orthogonal to his dissertation topic.  Yet this fact is virtually unknown to the broader physics community outside of astronomy.  If he had not become so famous later for his solution of Einstein’s field equations, Schwarzschild nonetheless might have been famous for his early contributions to stellar interferometry.  But even this was not the end of his unique contributions to physics.

Adiabatic Physics

As Schwarzschild worked toward his doctorate under von Seeliger, his dissertation topic was on the new theories of Henri Poincaré (1854 – 1912) on celestial mechanics. Poincaré had made a big splash on the international stage with the publication of his prize-winning memoire in 1890 on the three-body problem. This is the publication where Poincaré first described what would later become known as chaos theory. The memoire was followed by his volumes on "New Methods in Celestial Mechanics" published between 1892 and 1899. Poincaré's work on celestial mechanics was based on his earlier work on the theory of dynamical systems, where he established important invariant theorems, such as his integral invariants that generalize Liouville's theorem on the conservation of phase-space volume. Schwarzschild applied Poincaré's theorems to problems in celestial orbits. He took his doctorate in 1896 and received a post at an astronomical observatory outside Vienna.

While at Vienna, Schwarzschild performed his most important sustained contributions to the science of astronomy. Astronomical observations had been dominated for centuries by the human eye, but photographic techniques had been making steady inroads since the time of Hermann Carl Vogel (1841 – 1907) in the 1880's at the Potsdam observatory. Photographic plates were used primarily to record star positions but were known to be unreliable for recording stellar intensities. Schwarzschild developed an "out-of-focus" technique that blurred a star's image, making it larger and easier to measure the density of the exposed and developed photographic emulsion. In this way, Schwarzschild measured the magnitudes of 367 stars. Two of these stars had variable magnitudes that he was able to record and track. Schwarzschild correctly explained the intensity variation as caused by steady oscillations in heating and cooling of the stellar atmosphere. This work established the properties of these Cepheid variables, which would become some of the most important "standard candles" for the measurement of cosmological distances. Based on the importance of this work, Schwarzschild returned to Munich as a teacher in 1899 and subsequently was appointed in 1901 as the director of the observatory at Göttingen established by Gauss eighty years earlier.

Schwarzschild’s years at Göttingen brought him into contact with some of the greatest mathematicians and physicists of that era.  The mathematicians included Felix Klein, David Hilbert and Hermann Minkowski.  The physicists included von Laue, a student of Woldemar Voigt.  This period was one of several “golden ages” of Göttingen.  The first golden age was the time of Gauss and Riemann in the mid-1800’s.  The second golden age, when Schwarzschild was present, began when Felix Klein arrived at Göttingen and attracted the top mathematicians of the time.  The third golden age of Göttingen was the time of Born and Jordan and Heisenberg at the birth of quantum mechanics in the mid 1920’s.

In 1906, the Austrian physicist Paul Ehrenfest, freshly out of his PhD under the supervision of Boltzmann, arrived at Göttingen only weeks before Boltzmann took his own life. Felix Klein at Göttingen had been relying on Boltzmann to provide a comprehensive review of statistical mechanics for the Mathematical Encyclopedia, so he now entrusted this project to the young Ehrenfest. It was a monumental task, which was to take him and his physicist wife Tatyana nearly five years to complete. Part of the delay was the desire by the Ehrenfests to close some open problems that remained in Boltzmann's work. One of these was a mechanical theorem of Boltzmann's that identified properties of statistical mechanical systems that remain unaltered through a very slow change in system parameters. These properties would later be called adiabatic invariants by Einstein.

Ehrenfest recognized that Wien's displacement law, which had been a guiding light for Planck and his theory of black body radiation, had originally been derived by Wien using classical principles related to slow changes in the volume of a cavity. Ehrenfest was struck by the fact that such slow changes would not induce changes in the quantum numbers of the quantized states, and hence that the quantum numbers must be adiabatic invariants of the black body system. This not only explained why Wien's displacement law continued to hold under quantum as well as classical considerations, but it also explained why Planck's quantization of the energy of his simple oscillators was the only possible choice. For a classical harmonic oscillator, the ratio of the energy of oscillation to the frequency of oscillation is an adiabatic invariant, which is immediately recognized as Planck's quantum condition E = nhν.

Ehrenfest published his observations in 1913 [8], the same year that Bohr published his theory of the hydrogen atom, so Ehrenfest immediately applied the theory of adiabatic invariants to Bohr's model. He discovered that the quantum condition for the quantized energy levels was again given by the adiabatic invariants of the electron orbits, and not merely by integer multiples of angular momentum, which had seemed somewhat ad hoc.

After eight exciting years at Göttingen, Schwarzschild was offered the directorship of the Potsdam Observatory in 1909, succeeding the famous German astronomer Carl Vogel, who had made the first confirmed measurements of the optical Doppler effect in stellar spectra. Schwarzschild accepted and moved to Potsdam with his young family. His son Martin Schwarzschild would follow him into his profession, becoming a famous astronomer at Princeton University and a theorist on stellar structure. At the outbreak of WWI, Schwarzschild joined the German army out of a sense of patriotism. Because of his advanced education he was made an officer of artillery with the job of calculating artillery trajectories, and after a short time on the Western Front in Belgium he was transferred to the Eastern Front in Russia. Though he was not in the trenches, he was in the midst of the chaos to the rear of the front. Despite this situation, he found time to pursue his science through the year 1915.

Schwarzschild was intrigued by Ehrenfest's paper on adiabatic invariants and their similarity to several of the invariant theorems of Poincaré that he had studied for his doctorate. Up until this time, mechanics had mostly been pursued through the Lagrangian formalism, which could easily handle generalized forces associated with dissipation. But celestial mechanics deals with conservative systems, for which the Hamiltonian formalism is a more natural approach. In particular, the canonical transformations of Hamilton-Jacobi theory made it easy to find pairs of generalized coordinates that had simple periodic behavior. In his published paper [9], Schwarzschild called these "action-angle" coordinates, because one was the action integral that was well known from the principle of least action, and the other was an angle variable that changed steadily in time (see Fig. 5). Action-angle coordinates have come to form the foundation of many of the properties of Hamiltonian chaos, Hamiltonian maps, and Hamiltonian tapestries.

Fig. 5  Description of the canonical transformation to action-angle coordinates (Ref. [9] pg. 549). Schwarzschild names the new coordinates “Wirkungsvariable” and “Winkelvariable”.
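
For a single oscillating degree of freedom the action is the loop integral of the momentum over one period, J = ∮ p dq, and for a harmonic oscillator it works out to E/ν, exactly the adiabatic invariant Ehrenfest had identified. A quick numerical check (my own, not Schwarzschild's calculation):

import numpy as np
from scipy.integrate import quad

m, omega, E = 1.0, 2.0, 3.0                 # arbitrary oscillator parameters
nu = omega/(2*np.pi)                        # frequency in cycles per unit time
q_turn = np.sqrt(2*E/(m*omega**2))          # classical turning point

# momentum along the upper half of the phase-space ellipse for energy E
p = lambda q: np.sqrt(2*m*(E - 0.5*m*omega**2*q**2))

half_loop, _ = quad(p, -q_turn, q_turn)
action = 2*half_loop                        # full loop integral of p dq
print(action, E/nu)                         # both ~9.42: the action equals E/nu

Quantizing this action in units of h recovers Planck's condition E = nhν mentioned above.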

During lulls in the bombardments, Schwarzschild translated the Hamilton-Jacobi methods of celestial mechanics to apply them to the new quantum mechanics of the Bohr orbits. The phrase "quantum mechanics" had not yet been coined (that would come ten years later in a paper by Max Born), but it was clear that the Bohr quantization conditions were a new type of mechanics. The periodicities that were inherent in the quantum systems were natural properties that could be mapped onto the periodicities of the angle variables, while Ehrenfest's adiabatic invariants could be mapped onto the slowly varying action integrals. Schwarzschild showed that action-angle coordinates were the only allowed choice of coordinates, because they enabled the separation of the Hamilton-Jacobi equations and hence provided the correct quantization conditions for the Bohr electron orbits. Later, when Sommerfeld published his quantized elliptical orbits in 1916, the multiplicity of quantum conditions and orbits caused concern, but Ehrenfest came to the rescue, showing that each of Sommerfeld's quantum conditions was precisely one of Schwarzschild's action-integral invariants of the classical electron dynamics [10].

The works by Schwarzschild, and a closely-related paper that amplified his ideas published by his friend Paul Epstein several months later [11], were the first to show the power of the Hamiltonian formulation of dynamics for quantum systems, foreshadowing the future importance of Hamiltonians for quantum theory. An essential part of the Hamiltonian formalism is the concept of phase space. In his paper, Schwarzschild showed that the phase space of quantum systems was divided into small but finite elementary regions whose areas were equal to Planck's constant h-bar (see Fig. 6). The areas were products of a small change in momentum coordinate Δp and a corresponding small change in position coordinate Δx. Therefore, the product Δx·Δp = ħ. This observation, made in 1915 by Schwarzschild, was only one step away from Heisenberg's uncertainty relation, twelve years before Heisenberg discovered it. However, in 1915 Born's probabilistic interpretation of quantum mechanics had not yet been made, nor the idea of measurement uncertainty, so Schwarzschild did not have the appropriate context in which to make the leap to the uncertainty principle. However, by introducing the action-angle coordinates as well as the Hamiltonian formalism applied to quantum systems, with the natural structure of phase space, Schwarzschild laid the foundation for future developments in quantum theory made by the next generation.

Fig. 6  Expression of the division of phase space into elemental areas of action equal to h-bar (Ref. [9] pg. 550).

All Quiet on the Eastern Front

Towards the end of his second stay in Munich in 1900, prior to joining the Göttingen faculty, Schwarzschild presented a paper at a meeting of the German Astronomical Society held in Heidelberg in August. The topic was unlike anything he had tackled before. It considered the highly theoretical question of whether the universe is non-Euclidean, and more specifically whether it has curvature. He concluded from observation that if the universe were curved, the radius of curvature must be larger than a bound of between 50 and 2000 light years, depending on whether the geometry was hyperbolic or elliptical. Schwarzschild was working out ideas of differential geometry and applying them to the universe at large at a time when Einstein was just graduating from the ETH, where he skipped his math classes and had his friend Marcel Grossmann take notes for him.

The topic of Schwarzschild's talk tells an important story about the warping of historical perspective by the "great man" syndrome. In this case the great man is Einstein, who is today given all the credit for discovering the warping of space. His development of General Relativity is often portrayed as the work of a lone genius in the wilderness performing a blazing act of creation out of the void. In fact, non-Euclidean geometry had been around for some time by 1900—five years before Einstein's Special Theory and more than a decade before his first publications on the General Theory. Gauss had developed the idea of intrinsic curvature of a manifold decades earlier, amplified by Riemann. By the turn of the century alternative geometries were all the rage, and Schwarzschild considered whether there were sufficient astronomical observations to set limits on the curvature of the universe. But revisionist history is just as prevalent in physics as in any field, and when someone like Einstein becomes so big in the mind's eye, his shadow makes it difficult to see all the people standing behind him.

This is not meant to take away from the feat that Einstein accomplished. The General Theory of Relativity, published by Einstein in its full form in 1915, was spectacular [12]. Einstein had taken vague notions about curved spaces and made them specific, mathematically rigorous and intimately connected with physics through the mass-energy source term in his field equations. His mathematics had gone beyond even what his mathematician friend and former collaborator Grossmann could achieve. Yet Einstein's field equations were nonlinear tensor differential equations, in which the warping of space depends on the strength of the energy fields, while the configuration of those energy fields depends on the warping of space. This type of nonlinear equation is difficult to solve in general terms, and Einstein was not immediately aware of how to find solutions to his own equations.

Therefore, it was no small surprise to him when he received a letter from the Eastern Front from an astronomer he barely knew who had found a solution—a simple solution (see Fig. 7)—to his field equations.  Einstein probably wondered how he could have missed it, but he was generous and communicated the paper to the Prussian Academy of Sciences, in whose proceedings it was published in 1916 [13].

Fig. 7  Schwarzschild’s solution of the Einstein Field Equations (Ref. [13] pg. 194).

In the same paper, Schwarzschild used his exact solution to find the exact equation that described the precession of the perihelion of Mercury that Einstein had only calculated approximately. The dynamical equations for Mercury are shown in Fig. 8.

Fig. 8  Explanation for the precession of the perihelion of Mercury ( Ref. [13]  pg. 195)

Schwarzschild’s solution to Einstein’s field equations of General Relativity was not a general solution, even for a point mass. It contained constants of integration that could take arbitrary values, such as the characteristic length scale that Schwarzschild called “alpha”. It was David Hilbert who later expanded upon Schwarzschild’s work, giving the general solution and naming the characteristic length scale (where the metric diverges) after Schwarzschild. This is where the term “Schwarzschild radius” comes from, and it stuck. In fact, it stuck so well that the Schwarzschild radius has now eclipsed much of the rest of Schwarzschild’s considerable accomplishments.

Unfortunately, Schwarzschild’s accomplishments were cut short when he contracted an autoimmune disease that may have had a hereditary component. It is ironic that, amid the carnage of the Eastern Front, it was disease rather than combat that caused his death at the age of 42. He was already suffering from its effects as he worked on his last publications. He was sent home from the front to his family in Potsdam, where he passed away several months later, having shepherded his final two papers through the publication process. His last paper, on action-angle variables in quantum systems, was published on the day that he died.

Schwarzschild’s Legacy

Schwarzschild’s legacy was assured when he solved Einstein’s field equations and Einstein communicated it to the world. But his hidden legacy is no less important.

Schwarzschild’s application of the Hamiltonian formalism of canonical transformations and phase space to quantum systems set the stage for the later adoption of Hamiltonian methods in quantum mechanics. He came remarkably close to stating the uncertainty principle that later catapulted Heisenberg to fame, although he could not express it in probabilistic terms because the probabilistic interpretation of quantum mechanics had not yet been developed.

Schwarzschild is considered to be the greatest German astronomer of the last hundred years. This is based in part on his work at the birth of stellar interferometry and in part on his development of stellar photometry and the calibration of the Cepheid variable stars that went on to revolutionize our view of our place in the universe. Solving Einstein’s field equations was just a sideline for him, a hobby to occupy his active and curious mind.


[1] Fizeau, H. L. (1849). “Sur une expérience relative à la vitesse de propagation de la lumière.” Comptes rendus de l’Académie des sciences 29: 90–92, 132.

[2] Foucault, J. L. (1862). “Détermination expérimentale de la vitesse de la lumière: parallaxe du Soleil.” Comptes rendus de l’Académie des sciences 55: 501–503, 792–596.

[3] Fizeau, H. (1859). “Sur les hypothèses relatives à l’éther lumineux.” Ann. Chim. Phys.  Ser. 4 57: 385–404.

[4] Fizeau, H. (1868). “Prix Bordin: Rapport sur le concours de l’annee 1867.” C. R. Acad. Sci. 66: 932.

[5] Michelson, A. A. (1890). “I. On the application of interference methods to astronomical measurements.” The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 30(182): 1-21.

[6] Michelson, A. A. (1891). “Measurement of Jupiter’s Satellites by Interference.” Nature 45(1155): 160-161.

[7] Schwarzschild, K. (1896). “Über messung von doppelsternen durch interferenzen.” Astron. Nachr. 3335: 139.

[8] Ehrenfest, P. (1913). “Een mechanische theorema van Boltzmann en zijne betrekking tot de quanta theorie (A mechanical theorem of Boltzmann and its relation to the theory of energy quanta).” Verslag van de Gewone Vergaderingen der Wis- en Natuurkundige Afdeeling 22: 586-593.

[9] Schwarzschild, K. (1916). “Quantum hypothesis.” Sitzungsberichte der Königlich Preussischen Akademie der Wissenschaften: 548-568.

[10] Ehrenfest, P. (1916). “Adiabatic invariants and quantum theory.” Annalen der Physik 51: 327-352.

[11] Epstein, P. S. (1916). “The quantum theory.” Annalen der Physik 51(18): 168-188.

[12] Einstein, A. (1915). “On the general theory of relativity.” Sitzungsberichte der Königlich Preussischen Akademie der Wissenschaften: 778-786.

[13] Schwarzschild, K. (1916). “Über das Gravitationsfeld eines Massenpunktes nach der Einstein’schen Theorie.” Sitzungsberichte der Königlich-Preussischen Akademie der Wissenschaften: 189.

The Fast and the Slow of Grandfather Clocks

Imagine in your mind the stately grandfather clock.  The long slow pendulum swinging back and forth so purposefully, with such majesty.  It harks back to slower, simpler times—seemingly Victorian in character, although its origins go back to Christiaan Huygens in 1656.  In introductory physics classes the dynamics of the pendulum is taught as one of the simplest simple harmonic oscillators, only a bit more complicated than a mass on a spring.

But don’t be fooled!  This simplicity is an illusion, for the pendulum clock lies at the heart of modern dynamics.  It is a nonlinear autonomous oscillator with system gain that balances dissipation to maintain a dynamic equilibrium, one that ticks on resolutely as long as some energy source (like the heavy clock weights) continues to supply it.

This analysis has converted the two-dimensional dynamics of the autonomous oscillator to a simple one-dimensional dynamics with a stable fixed point.

The dynamic equilibrium of the grandfather clock is known as a limit cycle, and limit cycles are the central feature of autonomous oscillators.  Autonomous oscillators are among the building blocks of complex systems, providing the fundamental elements for biological oscillators, neural networks, business cycles, population dynamics, viral epidemics, and even the rings of Saturn.  The most famous autonomous oscillator (after the pendulum clock) is named for a Dutch physicist, Balthasar van der Pol (1889 – 1959), who discovered the laws that govern how electrons oscillate in vacuum tubes.  This highly specialized physics problem has since expanded to become a guiding paradigm for the fundamental oscillating element of modern dynamics—the van der Pol oscillator.

The van der Pol Oscillator

The van der Pol (vdP) oscillator begins as a simple harmonic oscillator (SHO) in which the dissipation (loss of energy) is flipped to become gain of energy.  This is as simple as flipping the sign of the damping term in the SHO

where β is positive.  This 2nd-order ODE is re-written into a dynamical flow as

where γ = β/m is the system gain.  Clearly, the dynamics of this SHO with gain would lead to run-away as the oscillator grows without bound.             

But no real-world system can grow indefinitely.  It has to eventually be limited by things such as inelasticity.  One of the simplest ways to include such a limiting process in the mathematical model is to make the gain get smaller at larger amplitudes.  This can be accomplished by making the gain a function of the amplitude x as

When the amplitude x gets large, the gain decreases, becoming zero and changing sign when x = 1.  Putting this amplitude-dependent gain into the SHO equation yields

This is the van der Pol equation.  It is the quintessential example of a nonlinear autonomous oscillator.            
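
In standard form, writing ω0 for the natural frequency and ε for the gain parameter (the labels used in the figure captions below), the van der Pol equation reads

\[
\ddot{x} \;-\; \varepsilon\,(1 - x^{2})\,\dot{x} \;+\; \omega_0^{2}\,x \;=\; 0 ,
\]

which can equivalently be written as the two-dimensional flow

\[
\dot{x} = y , \qquad \dot{y} = -\,\omega_0^{2}\,x \;+\; \varepsilon\,(1 - x^{2})\,y .
\]

This is the flow integrated by the Python code below, where the parameter alpha multiplies the restoring term and beta plays the role of the gain ε.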

When the parameter ε is large, the vdP oscillator can behave in strongly nonlinear ways, with nonharmonic oscillations.  An example is shown in Figs. 1 and 2 for ω0 = 5 and ε = 2.5.  The oscillation is clearly non-harmonic.

Fig. 1 Time trace of the position and velocity of the vdP oscillator with w0 = 5 and ε = 2.5.
Fig. 2 State-space portrait of the vdP flow lines for w0 = 5 and ε = 2.5.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Mon Apr 16 07:38:57 2018

@author: David Nolte
"""
import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt

plt.close('all')

def solve_flow(param,lim = [-3,3,-3,3],max_time=10.0):
# van der pol 2D flow 
    def flow_deriv(x_y, t0, alpha,beta):
        x, y = x_y
        return [y,-alpha*x+beta*(1-x**2)*y]
    
    plt.figure()
    xmin = lim[0]
    xmax = lim[1]
    ymin = lim[2]
    ymax = lim[3]
    plt.axis([xmin, xmax, ymin, ymax])

    N=144
    colors = plt.cm.prism(np.linspace(0, 1, N))
    
    x0 = np.zeros(shape=(N,2))
    ind = -1
    for i in range(0,12):
        for j in range(0,12):
            ind = ind + 1;
            x0[ind,0] = ymin-1 + (ymax-ymin+2)*i/11
            x0[ind,1] = xmin-1 + (xmax-xmin+2)*j/11
             
    # Solve for the trajectories
    t = np.linspace(0, max_time, int(250*max_time))
    x_t = np.asarray([integrate.odeint(flow_deriv, x0i, t, param)
                      for x0i in x0])

    for i in range(N):
        x, y = x_t[i,:,:].T
        lines = plt.plot(x, y, '-', c=colors[i])
        plt.setp(lines, linewidth=1)

    plt.title('van der Pol Oscillator')   # set the title before saving and showing
    plt.savefig('Flow2D')
    plt.show()
    
    return t, x_t

def solve_flow2(param,max_time=20.0):
# van der pol 2D flow 
    def flow_deriv(x_y, t0, alpha,beta):
        #"""Compute the time-derivative of a Medio system."""
        x, y = x_y
        return [y,-alpha*x+beta*(1-x**2)*y]
    model_title = 'van der Pol Oscillator'
    x0 = np.zeros(shape=(2,))
    x0[0] = 0
    x0[1] = 4.5
    
    # Solve for the trajectories
    t = np.linspace(0, max_time, int(250*max_time))
    x_t = integrate.odeint(flow_deriv, x0, t, param)
 
    return t, x_t

param = (5, 2.5)             # van der Pol
lim = (-7,7,-10,10)

t, x_t = solve_flow(param,lim)

t, x_t = solve_flow2(param)
plt.figure(2)
lines = plt.plot(t,x_t[:,0],t,x_t[:,1],'-')
plt.show()

Separation of Time Scales

Nonlinear systems can have very complicated behavior that may be difficult to address analytically.  This is why the numerical ODE solver is a central tool of modern dynamics.  But there is a very neat analytical trick that can be applied to tame the nonlinearities (if they are not too large) and simplify the autonomous oscillator.  This trick is called separation of time scales (also known as secular perturbation theory)—it looks for simultaneous fast and slow behavior within the dynamics.  An example of fast and slow time scales in a well-known dynamical system is found in the simple spinning top in which nutation (fast oscillations) are superposed on precession (slow oscillations).             

For the autonomous van der Pol oscillator the fast time scale is the natural oscillation frequency, while the slow time scale is the approach to the limit cycle.  Let’s assign t0 = t and t1 = εt, where ε is a small parameter.  Then t0 is the fast time scale (the natural oscillation) and t1 is the slow time scale (the gradual approach to the limit cycle).  The solution in terms of these time scales is

where x0 is the lowest-order response, carrying the fast oscillation with an envelope that varies on the slow time scale, and x1 is the first-order correction. The total differential is

Similarly, to obtain a second derivative

Therefore, the vdP equation in terms of x0 and x1 is

to lowest order. Now separate the orders to zeroth and first orders in ε, respectively,

Solve the first equation (a simple harmonic oscillator)

and plug that solution into the right-hand side of the second equation to give

The key to secular perturbation theory is to confine dynamics to their own time scales.  In other words, the slow dynamics provide the envelope that modulates the fast carrier frequency.  The envelope dynamics are contained in the time dependence of the coefficients A and B.  Furthermore, the dynamics of x1 should be a homogeneous function of time, which requires each term in the last equation to be zero.  Therefore, the dynamical equations for the envelope functions are

These can be transformed into polar coordinates. Because the envelope functions do not depend on the fast time scale, the time derivatives are

With these expressions, the slow dynamics become

where the slow evolution of the phase is zero, leaving only the angular velocity of the unperturbed oscillator. (This is analogous to the rotating wave approximation (RWA) in optics, and is also equivalent to studying the dynamics in the rotating frame of the unperturbed oscillator.)

Making a final substitution ρ = R/2 gives a very simple set of dynamical equations
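
Written out (a sketch of the standard weak-gain result, in terms of the scaled amplitude ρ and the total phase θ), the slow dynamics take the form

\[
\frac{d\rho}{dt} \;=\; \frac{\varepsilon}{2}\,\rho\left(1 - \rho^{2}\right) , \qquad
\frac{d\theta}{dt} \;=\; \omega_0 .
\]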

These final equations capture the essential properties of the relaxation of the dynamics to the limit cycle. To lowest order (when the gain is weak) the angular frequency is unaffected, and the system oscillates at the natural frequency. The amplitude of the limit cycle equals 1. A deviation in the amplitude from 1 decays slowly back to the limit cycle making it a stable fixed point in the radial dynamics. This analysis has converted the two-dimensional dynamics of the autonomous oscillator to a simple one-dimensional dynamics with a stable fixed point on the radius variable. The phase-space portrait of this simplified autonomous oscillator is shown in Fig. 3. What could be simpler? This simplified autonomous oscillator can be found as a fundamental element of many complex systems.

Fig. 3 The state-space diagram of the simplified autonomous oscillator. Initial conditions relax onto the limit cycle. (Reprinted from Introduction to Modern Dynamics (Oxford, 2019) on pg. 8)

Further Reading

D. D. Nolte, Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd edition (Oxford University Press, 2019)

Pikovsky, A. S., M. G. Rosenblum and J. Kurths (2003). Synchronization: A Universal concept in nonlinear science. Cambridge, Cambridge University Press.

Orbiting Photons around a Black Hole

The physics of a path of light passing a gravitating body is one of the hardest concepts to understand in General Relativity, but it is also one of the easiest.  It is hard because there can be no force of gravity on light even though the path of a photon bends as it passes a gravitating body.  It is easy, because the photon is following the simplest possible path—a geodesic equation for force-free motion.

This blog picks up where my last blog left off, which defined the geodesic equation and presented the Schwarzschild metric.  With those two equations in hand, we could simply solve for the null geodesics (a null geodesic is the path of a light beam through a manifold).  But there turns out to be a simpler approach that Einstein came up with himself (he never did like doing things the hard way).  He just had to sacrifice the fundamental postulate that he had used to explain everything about Special Relativity.

Throwing Special Relativity Under the Bus

The fundamental postulate of Special Relativity states that the speed of light is the same for all observers.  Einstein posed this postulate, then used it to derive some of the most astonishing consequences of Special Relativity—like E = mc2.  This postulate is at the rock core of his theory of relativity and can be viewed as one of the simplest “truths” of our reality—or at least of our spacetime. 

            Yet as soon as Einstein began thinking how to extend SR to a more general situation, he realized almost immediately that he would have to throw this postulate out.   While the speed of light measured locally is always equal to c, the apparent speed of light observed by a distant observer (far from the gravitating body) is modified by gravitational time dilation and length contraction.  This means that the apparent speed of light, as observed at a distance, varies as a function of position.  From this simple conclusion Einstein derived a first estimate of the deflection of light by the Sun, though he initially was off by a factor of 2.  (The full story of Einstein’s derivation of the deflection of light by the Sun and the confirmation by Eddington is in Chapter 7 of Galileo Unbound (Oxford University Press, 2018).)

The “Optics” of Gravity

The invariant element for a light path moving radially in the Schwarzschild geometry is

The apparent speed of light is then

where c(r) is always less than c when observed from flat space far from the mass.  The “refractive index” of space is defined, as for any optical material, as the ratio of the constant vacuum speed to the observed speed

Because the Schwarzschild metric has the property

the effective refractive index of warped space-time is

with a divergence at the Schwarzschild radius.
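
Concretely, writing R_S for the Schwarzschild radius, the effective index described here takes the form

\[
n(r) \;=\; \frac{c}{c(r)} \;=\; \frac{1}{1 - \dfrac{R_S}{r}} ,
\]

which is also the form implemented in the Python code further below, where the constant A plays the role of the Schwarzschild radius.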

            The refractive index of warped space-time in the limit of weak gravity can be used in the ray equation (also known as the Eikonal equation described in an earlier blog)

where the gradient of the refractive index of space is

The ray equation is then a four-variable flow
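
In the form used by the flow_deriv routine in the Python code below, with z = n dx/ds and w = n dy/ds acting as the optical momenta along the ray parameter s, this flow is

\[
\frac{dx}{ds} = \frac{z}{n} , \qquad
\frac{dy}{ds} = \frac{w}{n} , \qquad
\frac{dz}{ds} = \frac{\partial n}{\partial x} , \qquad
\frac{dw}{ds} = \frac{\partial n}{\partial y} .
\]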

These equations represent a 4-dimensional flow for a light ray confined to a plane.  The trajectory of any light path is found by using an ODE solver subject to the initial conditions for the direction of the light ray.  This is simple to do today with Python or Matlab, but it is also something that could be done long before the advent of computers by early theorists of relativity like Max von Laue (1879 – 1960).

The Relativity of Max von Laue

In the Fall of 1905 in Berlin, a young German physicist by the name of Max Laue was sitting in the physics colloquium at the university, listening to another Max, his doctoral supervisor Max Planck, deliver a seminar on Einstein’s new theory of relativity.  Laue was struck by the simplicity of the theory, so simple that it seemed almost too good to be true, but its beauty stuck with him, and he began to think through the consequences for experiments like the Fizeau experiment on partial ether drag.

         Armand Hippolyte Louis Fizeau (1819 – 1896) in 1851 built one of the world’s first optical interferometers and used it to measure the speed of light inside moving fluids.  At that time the speed of light was believed to be a property of the luminiferous ether, and there were several opposing theories on how light would travel inside moving matter.  One theory would have the ether fully stationary, unaffected by moving matter, and hence the speed of light would be unaffected by motion.  An opposite theory would have the ether fully entrained by matter and hence the speed of light in moving matter would be a simple sum of speeds.  A middle theory considered that only part of the ether was dragged along with the moving matter.  This was Fresnel’s partial ether drag hypothesis that he had arrived at to explain why his friend Francois Arago had not observed any contribution to stellar aberration from the motion of the Earth through the ether.  When Fizeau performed his experiment, the results agreed closely with Fresnel’s drag coefficient, which seemed to settle the matter.  Yet when Michelson and Morley performed their experiments of 1887, there was no evidence for partial drag.

Even after Einstein’s exposition of relativity in 1905, the disagreement between the Michelson-Morley results and Fizeau’s results was not fully reconciled until Laue showed in 1907 that the velocity addition theorem of special relativity gave complete agreement with the Fizeau experiment.  The velocity observed in the lab frame is found using that velocity addition theorem.  For the Fizeau experiment, water with a refractive index n moves with a speed v, and hence the speed of light in the lab frame is

The difference in the speed of light between the stationary and the moving water is the difference
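
Spelled out, and keeping terms only to first order in v/c, the relativistic velocity addition gives

\[
u \;=\; \frac{\dfrac{c}{n} + v}{1 + \dfrac{v}{nc}}
\;\approx\; \frac{c}{n} \;+\; v\left(1 - \frac{1}{n^{2}}\right) ,
\qquad\text{so that}\qquad
u - \frac{c}{n} \;\approx\; v\left(1 - \frac{1}{n^{2}}\right) ,
\]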

where the factor (1 − 1/n²) is precisely the Fresnel drag coefficient.  This was one of the first definitive “proofs” of the validity of Einstein’s theory of relativity, and it made Laue one of relativity’s staunchest proponents.  Spurred on by his success explaining the Fresnel drag coefficient, Laue wrote the first monograph on relativity theory, publishing it in 1910.

Fig. 1  Front page of von Laue’s textbook on Special Relativity, first published in 1910 (this is the 4th edition, published in 1921).

A Nobel Prize for Crystal X-ray Diffraction

In 1909 Laue became a Privatdozent under Arnold Sommerfeld (1868 – 1951) at the university in Munich.  In the Spring of 1912 he was walking in the Englischer Garten on the northern edge of the city talking with Paul Ewald (1888 – 1985), who was finishing his doctorate with Sommerfeld studying the structure of crystals.  Ewald was considering the interaction of optical wavelengths with the periodic lattice when it struck Laue that x-rays would have the kind of short wavelengths that would allow the crystal to act as a diffraction grating to produce multiple diffraction orders.  Within a few weeks of that discussion, two of Sommerfeld’s students (Friedrich and Knipping) used an x-ray source and photographic film to look for the predicted diffraction spots from a copper sulfate crystal.  When the film was developed, it showed a constellation of dark spots for each of the diffraction orders of the x-rays scattered from the multiple periodicities of the crystal lattice.  Two years later, in 1914, Laue was awarded the Nobel prize in physics for the discovery.  That same year his father was elevated to the hereditary nobility in the Prussian empire and Max Laue became Max von Laue.

Von Laue was not one to take risks, and he remained conservative in many of his interests.  He was immensely respected and played important roles in the administration of German science, but his scientific contributions after receiving the Nobel Prize were only modest.  Yet as the Nazis came to power in the early 1930s, he was one of the few physicists to stand up and resist the Nazi take-over of German physics.  He was especially disturbed by the plight of the Jewish physicists.  In 1933 he was invited to give the keynote address at the conference of the German Physical Society in Würzburg, where he spoke out against the Nazi rejection of relativity, which they branded “Jewish science”.  In his speech he likened Einstein, the target of much of the propaganda, to Galileo.  He said, “No matter how great the repression, the representative of science can stand erect in the triumphant certainty that is expressed in the simple phrase: And yet it moves.”  Von Laue believed that truth would hold out in the face of the proscription against relativity theory by the Nazi regime.  The quote “And yet it moves” is supposed to have been muttered by Galileo just after his abjuration before the Inquisition, referring to the Earth moving around the Sun.  Although the quote is famous, it is believed to be a myth.

            In an odd side-note of history, von Laue sent his gold Nobel prize medal to Denmark for its safe keeping with Niels Bohr so that it would not be paraded about by the Nazi regime.  Yet when the Nazis invaded Denmark, to avoid having the medals fall into the hands of the Nazis, the medal was dissolved in aqua regia by a member of Bohr’s team, George de Hevesy.  The gold completely dissolved into an orange liquid that was stored in a beaker high on a shelf through the war.  When Denmark was finally freed, the dissolved gold was precipitated out and a new medal was struck by the Nobel committee and re-presented to von Laue in a ceremony in 1951. 

The Orbits of Light Rays

Von Laue’s interests always stayed close to the properties of light and electromagnetic radiation, ever since he was introduced to the field when he studied with Woldemar Voigt at Göttingen in 1899.  This interest included the theory of relativity, and only a few years after Einstein published his theory of General Relativity and Gravitation, von Laue added to his earlier textbook on relativity by writing a second volume on the general theory.  The new volume was published in 1920 and included the theory of the deflection of light by gravity.

One of the very few illustrations in his second volume is of light coming into interaction with a supermassive gravitational field characterized by a Schwarzschild radius.  (No one at the time called it a “black hole”, nor even mentioned Schwarzschild.  That terminology came much later.)  He shows in the drawing how light, if incident at just the right impact parameter, would actually loop around the object.  This is the first time such a diagram appeared in print, showing the trajectory of light so strongly affected by gravity.

Fig. 2 A page from von Laue’s second volume on relativity (first published in 1920) showing the orbit of a photon around a compact mass with “gravitational cutoff” (later known as a “black hole”). The figure is drawn semi-quantitatively, but the phenomenon was clearly understood by von Laue.

Python Code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue May 28 11:50:24 2019

@author: nolte
"""

import numpy as np
import matplotlib as mpl
from mpl_toolkits.mplot3d import Axes3D
from scipy import integrate
from matplotlib import pyplot as plt
from matplotlib import cm
import time
import os

plt.close('all')

def create_circle():
	circle = plt.Circle((0,0), radius= 10, color = 'black')
	return circle

def show_shape(patch):
	ax=plt.gca()
	ax.add_patch(patch)
	plt.axis('scaled')
	plt.show()
    
def refindex(x,y):
    
    A = 10
    eps = 1e-6
    
    rp0 = np.sqrt(x**2 + y**2);
        
    n = 1/(1 - A/(rp0+eps))
    fac = np.abs((1-9*(A/rp0)**2/8))   # approx correction to Eikonal
    nx = -fac*n**2*A*x/(rp0+eps)**3
    ny = -fac*n**2*A*y/(rp0+eps)**3
     
    return [n,nx,ny]

def flow_deriv(x_y_z,tspan):
    x, y, z, w = x_y_z
    
    [n,nx,ny] = refindex(x,y)
        
    yp = np.zeros(shape=(4,))
    yp[0] = z/n
    yp[1] = w/n
    yp[2] = nx
    yp[3] = ny
    
    return yp
                
for loop in range(-5,30):
    
    xstart = -100
    ystart = -2.245 + 4*loop
    print(ystart)
    
    [n,nx,ny] = refindex(xstart,ystart)


    y0 = [xstart, ystart, n, 0]

    tspan = np.linspace(1,400,2000)

    y = integrate.odeint(flow_deriv, y0, tspan)

    xx = y[1:2000,0]
    yy = y[1:2000,1]


    plt.figure(1)
    lines = plt.plot(xx,yy)
    plt.setp(lines, linewidth=1)
    plt.show()
    plt.title('Photon Orbits')
    
c = create_circle()
show_shape(c)
axes = plt.gca()
axes.set_xlim([-100,100])
axes.set_ylim([-100,100])

# Now set up a circular photon orbit
xstart = 0
ystart = 15

[n,nx,ny] = refindex(xstart,ystart)

y0 = [xstart, ystart, n, 0]

tspan = np.linspace(1,94,1000)

y = integrate.odeint(flow_deriv, y0, tspan)

xx = y[1:1000,0]
yy = y[1:1000,1]

plt.figure(1)
lines = plt.plot(xx,yy)
plt.setp(lines, linewidth=2, color = 'black')
plt.show()

One of the most striking effects of gravity on photon trajectories is the possibility for a photon to orbit a black hole in a circular orbit. This is shown in Fig. 3 as the black circular ring for a photon at a radius equal to 1.5 times the Schwarzschild radius. This radius defines what is known as the photon sphere. However, the orbit is not stable. Slight deviations will send the photon spiraling outward or inward.

The Eikonal approximation does not strictly hold under strong gravity, but the Eikonal equations with the effective refractive index of space still yield semi-quantitative behavior. In the Python code, a correction factor is used to match the theory to the circular photon orbits, while still agreeing with trajectories far from the black hole. The results of the calculation are shown in Fig. 3. For large impact parameters, the rays are deflected through a finite angle. At a critical impact parameter, near 3 times the Schwarzschild radius, the ray loops around the black hole. For smaller impact parameters, the rays are captured by the black hole.

Fig. 3 Photon orbits near a black hole calculated using the Eikonal equation and the effective refractive index of warped space. One ray, near the critical impact parameter, loops around the black hole as predicted by von Laue. The central black circle is the black hole with a Schwarzschild radius of 10 units. The black ring is the circular photon orbit at a radius 1.5 times the Schwarzschild radius.

Photons pile up around the black hole at the photon sphere. The first image ever of the photon sphere of a black hole was made earlier this year (announced April 10, 2019). The image shows the shadow of the supermassive black hole at the center of Messier 87 (M87), an elliptical galaxy 55 million light-years from Earth. This black hole is 6.5 billion times the mass of the Sun. Imaging the photon sphere required eight ground-based radio telescopes placed around the globe, operating together to form a single telescope with an effective aperture the size of our planet.  The resolution of such a large telescope would allow one to image a half-dollar coin on the surface of the Moon, although this telescope operates in the radio frequency range rather than the optical.

Fig. 4 Scientists have obtained the first image of a black hole, using Event Horizon Telescope observations of the center of the galaxy M87. The image shows a bright ring formed as light bends in the intense gravity around a black hole that is 6.5 billion times more massive than the Sun.

Further Reading

Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd Ed. (Oxford University Press, 2019)

B. Lavenda, The Optical Properties of Gravity, J. Mod. Phys, 8 8-3-838 (2017)

Getting Armstrong, Aldrin and Collins Home from the Moon: Apollo 11 and the Three-Body Problem

Fifty years ago on the 20th of July at nearly 11 o’clock at night, my brothers and I were peering through the screen door of a very small 1960s Shasta compact travel trailer, watching the TV set on the picnic table outside the trailer door.  Our family was at a campground in southern Michigan and the mosquitos were fierce (which is why we were inside the trailer looking out through the screen).  Neil Armstrong was about to become the first human to set foot on the Moon.  The image on the TV was a fuzzy black and white, with barely recognizable shapes clouded even more by the dirt and dead bugs on the screen, but it is a memory etched in my mind.  I was 10 years old and I was convinced that when I grew up I would visit the Moon myself, because by then Moon travel would be like flying to Europe.  It didn’t turn out that way, and fifty years later it’s a struggle even to get back there.

The dangers could have become life-threatening for the crew of Apollo 11. If they had miscalculated their trajectory home and bounced off the Earth’s atmosphere, they would have become a tragic demonstration of the chaos of three-body orbits.

So maybe I won’t get to the Moon, but maybe my grandchildren will.  And if they do, I hope they know something about the three-body problem in physics, because getting to and from the Moon isn’t as easy as it sounds.  Apollo 11 faced real danger at several critical points on its flight plan, but all went perfectly (except overshooting their landing site and that last boulder field right before Armstrong landed). Some of those dangers became life-threatening for the crew of Apollo 13, and if they had miscalculated their trajectory home and had bounced off the Earth’s atmosphere, they would have become a tragic demonstration of the chaos of three-body orbits.  In fact, their lifeless spaceship might have returned to the Moon and back to Earth over and over again, caught in an infinite chaotic web.

The complexities of trajectories in the three-body problem arise because there are too few constants of motion and too many degrees of freedom.  To get an intuitive picture of how the trajectory behaves, it is best to start with a problem known as the restricted three-body problem.

The Saturn V Booster, perhaps the pinnacle of “muscle and grit” space exploration.

The Restricted Three-Body Problem

The restricted three-body problem was first considered by Leonhard Euler in 1762 (for a further discussion of the history of the three-body problem, see my Blog from July 5).  For the special case of circular orbits of constant angular frequency, the motion of the third mass is described by the Lagrangian

where the potential is time dependent because of the motion of the two larger masses.  Lagrange approached the problem by adopting a rotating reference frame in which the two larger masses m1 and m2 lie at rest along the line defined by their centers.  The new angle variable is theta-prime.  The Lagrangian in the rotating frame is

where the effective potential is now time independent.  The first term in the effective potential is the Coriolis term and the second is the centrifugal term.  The dynamical flow in the plane is four dimensional, and it is given by

where the position vectors are in the center-of-mass frame

relative to the positions of the Earth and Moon (x1 and x2) in the rotating frame in which they are at rest along the x-axis.
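
As a sketch of the construction just described (assuming Newtonian point-mass potentials, with m the small third mass, ω the constant angular frequency of the circular orbit of m1 and m2, and the primed angle measured in the rotating frame), the rotating-frame Lagrangian has the form

\[
L \;=\; \frac{1}{2}\,m\left(\dot{r}^{2} + r^{2}\bigl(\dot{\theta}' + \omega\bigr)^{2}\right)
\;+\; \frac{G\,m\,m_{1}}{r_{1}} \;+\; \frac{G\,m\,m_{2}}{r_{2}} ,
\]

where expanding the squared term produces a piece linear in the angular velocity (the Coriolis term) and a piece proportional to ω²r² (the centrifugal term), both independent of time.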

A single trajectory solved for this flow is shown in Fig. 1 for a tiny object passing back and forth chaotically between the Earth and the Moon. The object is considered to be massless, or at least so small it does not perturb the Earth-Moon system. The energy of the object was selected to allow it to pass over the potential barrier of the Lagrange-Point L1 between the Earth and the Moon. The object spends most of its time around the Earth, but now and then will get into a transfer orbit that brings it around the Moon. This would have been the fate of Apollo 11 if their last thruster burn had failed.

Fig. 1 The trajectory of a tiny object in the planar three-body problem interacting with a large mass (Earth on the left) and a small mass (Moon on the right). The energy of the trajectory allows it to pass back and forth chaotically between proximity to the Earth and proximity to the Moon. The time duration of the simulation is approximately one decade. The envelope of the trajectories is called the “Hill region”, named after George William Hill (1838-1914), one of the first US astrophysicists, who studied the three-body problem of the Moon.

Contrast the orbit of Fig. 1 with the simple flight plan of Apollo 11 on the banner figure. The chaotic character of the three-body problem emerges for a “random” initial condition. You can play with different initial conditions in the following Python code to explore the properties of this dynamical problem. Note that in this simulation, the mass of the Moon was chosen about 8 times larger than in nature to exaggerate the effect of the Moon.

Python Code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue May 28 11:50:24 2019

@author: nolte
"""

import numpy as np
import matplotlib as mpl
from mpl_toolkits.mplot3d import Axes3D
from scipy import integrate
from matplotlib import pyplot as plt
from matplotlib import cm
import time
import os

plt.close('all')

womega = 1
R = 1
eps = 1e-6

M1 = 1                # Mass of the Earth
M2 = 1/10             # Mass of the Moon (exaggerated; in nature M2/M1 is about 1/81)
chsi = M2/M1

x1 = -M2*R/(M1+M2)    # Earth location in the rotating frame
x2 = x1 + R           # Moon location in the rotating frame

def poten(y,c):
    
    rp0 = np.sqrt(y**2 + c**2);
    thetap0 = np.arctan(y/c);
        
    rp1 = np.sqrt(x1**2 + rp0**2 - 2*np.abs(rp0*x1)*np.cos(np.pi-thetap0));
    rp2 = np.sqrt(x2**2 + rp0**2 - 2*np.abs(rp0*x2)*np.cos(thetap0));
    V = -M1/rp1 -M2/rp2 - E;
     
    return [V]

def flow_deriv(x_y_z,tspan):
    x, y, z, w = x_y_z
    
    r1 = np.sqrt(x1**2 + x**2 - 2*np.abs(x*x1)*np.cos(np.pi-z));
    r2 = np.sqrt(x2**2 + x**2 - 2*np.abs(x*x2)*np.cos(z));
        
    yp = np.zeros(shape=(4,))
    yp[0] = y
    yp[1] = -womega**2*R**3*(np.abs(x)-np.abs(x1)*np.cos(np.pi-z))/(r1**3+eps) - womega**2*R**3*chsi*(np.abs(x)-abs(x2)*np.cos(z))/(r2**3+eps) + x*(w-womega)**2
    yp[2] = w
    yp[3] = 2*y*(womega-w)/x - womega**2*R**3*chsi*abs(x2)*np.sin(z)/(x*(r2**3+eps)) + womega**2*R**3*np.abs(x1)*np.sin(np.pi-z)/(x*(r1**3+eps))
    
    return yp
                
r0 = 0.64      # initial radius
v0 = 0.3       # initial radial speed
theta0 = 0     # initial angle
vrfrac = 1     # fraction of speed in radial versus angular directions

rp1 = np.sqrt(x1**2 + r0**2 - 2*np.abs(r0*x1)*np.cos(np.pi-theta0))
rp2 = np.sqrt(x2**2 + r0**2 - 2*np.abs(r0*x2)*np.cos(theta0))
V = -M1/rp1 - M2/rp2
T = 0.5*v0**2
E = T + V

vr = vrfrac*v0
W = (2*T - v0**2)/r0

y0 = [r0, vr, theta0, W]   # This is where you set the initial conditions

tspan = np.linspace(1,2000,20000)

y = integrate.odeint(flow_deriv, y0, tspan)

xx = y[1:20000,0]*np.cos(y[1:20000,2]);
yy = y[1:20000,0]*np.sin(y[1:20000,2]);

plt.figure(1)
lines = plt.plot(xx,yy)
plt.setp(lines, linewidth=0.5)
plt.show()

In the code, set the position and speed of the Apollo command module where r0, v0, theta0 and vrfrac are defined, and put in the initial conditions where y0 is set. The mass of the Moon in nature is 1/81 of the mass of the Earth, which shrinks the L1 “bottleneck” to a much smaller region that you can explore to see what the fate of the Apollo missions could have been.

Further Reading

The Three-body Problem, Longitude at Sea, and Lagrange’s Points

Introduction to Modern Dynamics: Chaos, Networks, Space and Time, 2nd Ed. (Oxford University Press, 2019)

How to Teach General Relativity to Undergraduate Physics Majors

As a graduate student in physics at Berkeley in the 1980s, I took General Relativity (aka GR) from Bruno Zumino, a world-famous physicist known as one of the originators of supersymmetry and supergravity (not to be confused with the super-asymmetry of Cooper-Fowler Big Bang Theory fame).  The class textbook was Gravitation and Cosmology: Principles and Applications of the General Theory of Relativity, by Steven Weinberg, another world-famous physicist, in this case known for unifying the weak force with electromagnetism in the electroweak theory.  With so much expertise at hand, how could I help but absorb the simple essence of general relativity?

The answer is that I failed miserably.  Somehow, I managed to pass the course, but I walked away with nothing!  And it bugged me for years.  What was so hard about GR?  It took me almost a decade of teaching undergraduate physics classes at Purdue in the 90s before I realized that my biggest obstacle had been language:  I kept mistaking the words and terms of GR as if they were English.  Words like “general covariance” and “contravariant” and “contraction” and “covariant derivative”.  They sounded like English, with lots of “co” prefixes that were hard to keep straight, but they actually are part of a very different language that I call Physics-ese.

Physics-ese is a language that has lots of words that sound like English, so you think you know what the words mean, but the words sometimes have opposite meanings from what you would guess.  And the meanings in Physics-ese are precisely defined, not something that can be left to interpretation.  I learned this while teaching the intro courses to non-majors, because so many times when the students were confused, it turned out to be because they had mistaken a textbook jargon term for English.  If you told them that the word wasn’t English, but just a token standing for a well-defined object or process, it would unshackle them from their misconceptions.

Then, in the early 00’s when I started to explore the physics of generalized trajectories related to some of my own research interests, I realized that the primary obstacle to my learning anything in the Gravitation course was Physics-ese.   So this raised the question in my mind: what would it take to teach GR to undergraduate physics majors in a relatively painless manner?  This is my answer. 

More on this topic can be found in Chapter 11 of the textbook IMD2: Introduction to Modern Dynamics, 2nd Edition, Oxford University Press, 2019

Trajectories as Flows

One of the culprits for my mind block learning GR was Newton himself.  His ubiquitous second law, taught as F = ma, is surprisingly misleading if one wants to have a more general understanding of what a trajectory is.  This is particularly the case for light paths, which can be bent by gravity, yet clearly cannot have any forces acting on them. 

The way to fix this is subtle yet simple.  First, express Newton’s second law as

which is actually closer to the way that Newton expressed the law in his Principia.  In three dimensions for a single particle, these equations represent a 6-dimensional dynamical space called phase space: three coordinate dimensions and three momentum dimensions.  Then generalize the vector quantities, like the position vector, to be expressed as xa for the six dynamical variables: x, y, z, px, py, and pz.

Now, as part of Physics-ese, putting the index as a superscript instead as a subscript turns out to be a useful notation when working in higher-dimensional spaces.  This superscript is called a “contravariant index” which sounds like English but is uninterpretable without a Physics-ese-to-English dictionary.  All “contravariant index” means is “column vector component”.  In other words, xa is just the position vector expressed as a column vector

This superscripted index is called a “contravariant” index, but seriously dude, just forget that “contravariant” word from Physics-ese and just think “index”.  You already know it’s a column vector.

Then Newton’s second law becomes

where the index a runs from 1 to 6, and the function Fa is a vector function of the dynamic variables.  To spell it out, this is

so it’s a lot easier to write it in the one-line form with the index notation. 
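
For concreteness, here are both forms just described, written with x^a for the six phase-space variables listed above. The one-line version is

\[
\frac{dx^{a}}{dt} \;=\; F^{a}(x) , \qquad a = 1,\dots,6 ,
\]

which, spelled out component by component, is

\[
\frac{dx}{dt} = \frac{p_{x}}{m} , \quad
\frac{dy}{dt} = \frac{p_{y}}{m} , \quad
\frac{dz}{dt} = \frac{p_{z}}{m} , \qquad
\frac{dp_{x}}{dt} = F_{x} , \quad
\frac{dp_{y}}{dt} = F_{y} , \quad
\frac{dp_{z}}{dt} = F_{z} .
\]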

The simple index notation equation is in the standard form for what is called, in Physics-ese, a “mathematical flow”.  It is an ODE that can be solved for any set of initial conditions for a given trajectory.  Or a whole field of solutions can be considered in a phase-space portrait that looks like the flow lines of hydrodynamics.  The phase-space portrait captures the essential physics of the system, whether it is a rock thrown off a cliff, or a photon orbiting a black hole.  But to get to that second problem, it is necessary to look deeper into the way that space is described by any set of coordinates, especially if those coordinates are changing from location to location.

What’s so Fictitious about Fictitious Forces?

Freshman physics students are routinely admonished for talking about “centrifugal” forces (rather than centripetal) when describing circular motion, usually with the statement that centrifugal forces are fictitious—only appearing to be forces when the observer is in the rotating frame.  The same is said for the Coriolis force.  Yet for being such a “fictitious” force, the Coriolis effect is what drives hurricanes and the colossal devastation they cause.  Try telling a hurricane victim that they were wiped out by a fictitious force!  Looking closer at the Coriolis force is a good way of understanding how taking derivatives of vectors leads to effects often called “fictitious”, and it opens the door to some of the simpler techniques of differential geometry.

To start, consider a vector in a uniformly rotating frame.  Such a frame is called “non-inertial” because of the centripetal acceleration associated with the uniform rotation.  For an observer in the rotating frame, vectors are attached to the frame, like pinning them down to the coordinate axes, but the axes themselves are changing in time (when viewed by an external observer in a fixed frame).  If the primed frame is the external fixed frame, then a position in the rotating frame is

where R is the position vector of the origin of the rotating frame and r is the position in the rotating frame relative to the origin.  The funny notation on the last term is called in Physics-ese a “contraction”, but it is just a simple inner product, or dot product, between the components of the position vector and the basis vectors.  A basis vector is like the old-fashioned i, j, k of vector calculus indicating unit basis vectors pointing along the x, y and z axes.  The format with one index up and one down in the product means to do a summation.  This is known as the Einstein summation convention, so it’s just

Taking the time derivative of the position vector gives

and by the chain rule this must be

where the last term has a time derivative of a basis vector.  This is non-zero because in the rotating frame the basis vector is changing orientation in time.  This term is non-inertial and can be shown fairly easily (see IMD2 Chapter 1) to be

which is where the centrifugal force comes from.  This shows how a so-called fictitious force arises from a derivative of a basis vector.  The fascinating point of this is that in GR, the force of gravity arises in almost the same way, making it tempting to call gravity a fictitious force, despite the fact that it can kill you if you fall out a window.  The question is, how does gravity arise from simple derivatives of basis vectors?
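
For reference, a sketch of the standard result being described, with ω the angular velocity vector of the rotating frame: the basis vectors rotate as

\[
\frac{d\hat{\mathbf{e}}_{a}}{dt} \;=\; \boldsymbol{\omega}\times\hat{\mathbf{e}}_{a} ,
\]

so that a second time derivative of the position picks up the extra terms

\[
2\,\boldsymbol{\omega}\times\mathbf{v} \;+\; \boldsymbol{\omega}\times\left(\boldsymbol{\omega}\times\mathbf{r}\right) ,
\]

the Coriolis and centrifugal accelerations, respectively (with v and r measured in the rotating frame).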

The Geodesic Equation

To teach GR to undergraduates, you cannot expect them to have taken a course in differential geometry, because most of them just don’t have the time in their schedule to take such an advanced mathematics course.  In addition, there is far more taught in differential geometry than is needed to make progress in GR.  So the simple approach is to teach what they need to understand GR with as little differential geometry as possible, expressed with clear English-to-Physics-ese translations. 

For example, consider the partial derivative of a vector expressed in index notation as

Taking the partial derivative, using the always-necessary chain rule, is

where the second term is just like the extra time-derivative term that showed up in the derivation of the Coriolis force.  The basis vector of a general coordinate system may change size and orientation as a function of position, so this derivative is not in general zero.  Because the derivative of a basis vector is so central to the ideas of GR, these derivatives are given their own symbol.  It is

where the new “Gamma” symbol is called a Christoffel symbol.  It has lots of indexes, both up and down, which looks daunting, but it can be interpreted as the beta-th derivative of the alpha-th component of the mu-th basis vector.  The partial derivative is now

For those of you who noticed that some of the indexes flipped from alpha to mu and vice versa, you’re right!  Swapping repeated indexes in these “contractions” is allowed and helps make derivations a lot easier, which is probably why Einstein invented this notation in the first place.

The last step in taking a partial derivative of a vector is to isolate a single vector component Va as

where a new symbol, the del-operator, has been introduced.  This del-operator is known as the “covariant derivative” of the vector component.  Again, forget the “covariant” part and just think “gradient”.  Namely, taking the gradient of a vector in general includes changes in the vector component as well as changes in the basis vector.
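
Putting the pieces together (a sketch in the notation just introduced), the chain rule for the partial derivative of a vector reads

\[
\frac{\partial \mathbf{V}}{\partial x^{\beta}}
\;=\; \frac{\partial V^{a}}{\partial x^{\beta}}\,\mathbf{e}_{a}
\;+\; V^{\mu}\,\frac{\partial \mathbf{e}_{\mu}}{\partial x^{\beta}} ,
\qquad
\frac{\partial \mathbf{e}_{\mu}}{\partial x^{\beta}} \;=\; \Gamma^{a}_{\;\beta\mu}\,\mathbf{e}_{a} ,
\]

so that the component form defines the covariant derivative

\[
\nabla_{\beta} V^{a} \;=\; \frac{\partial V^{a}}{\partial x^{\beta}} \;+\; \Gamma^{a}_{\;\beta\mu}\,V^{\mu} .
\]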

Now that you know how to take the partial derivative of a vector using Christoffel symbols, you are ready to generate the central equation of General Relativity:  The geodesic equation. 

Everyone knows that a geodesic is the shortest path between two points, like a great circle route on the globe.  But it also turns out to be the straightest path, which can be derived using an idea known as “parallel transport”.  To start, consider transporting a vector along a curve in a flat metric.  The equation describing this process is

Because the Christoffel symbols are zero in a flat space, the covariant derivative and the partial derivative are equal, giving

If the vector is transported parallel to itself, then there is no change in V along the curve, so that

Finally, recognizing

and substituting this in gives

This is the geodesic equation! 
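
In the index notation used above, the geodesic equation takes the familiar form

\[
\frac{d^{2}x^{a}}{ds^{2}} \;+\; \Gamma^{a}_{\;bc}\,\frac{dx^{b}}{ds}\,\frac{dx^{c}}{ds} \;=\; 0 .
\]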

Fig. 1 The geodesic equation of motion is for force-free motion through a metric space. The curvature of the trajectory is analogous to acceleration, and the generalized gradient is analogous to a force. The geodesic equation is the “F = ma” of GR.

Putting this in the standard form of a flow gives the geodesic flow equations
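
Introducing v^a = dx^a/ds for the tangent vector (a notational choice made here), the flow form is

\[
\frac{dx^{a}}{ds} \;=\; v^{a} , \qquad
\frac{dv^{a}}{ds} \;=\; -\,\Gamma^{a}_{\;bc}\,v^{b}\,v^{c} .
\]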

The flow is an ordinary differential equation that defines a curve which carries its own tangent vector onto itself.  The curve is parameterized by a parameter s that can be identified with path length.  It is the central equation of GR, because it describes how an object follows a force-free trajectory, like free fall, in any general coordinate system.  It can be applied to simple problems like the Coriolis effect, or it can be applied to seemingly difficult problems, like the trajectory of a light path past a black hole.

The Metric Connection

Arriving at the geodesic equation is a major accomplishment, and you have done it in just a few pages of this blog.  But there is still an important missing piece before we are doing the General Relativity of gravitation.  We need to connect the Christoffel symbol in the geodesic equation to the warping of space-time around a gravitating object.

The warping of space-time by matter and energy is another central piece of GR and is often the central focus of a graduate-level course on the subject.  This part of GR does have its challenges leading up to Einstein’s Field Equations that explain how matter makes space bend.  But at an undergraduate level, it is sufficient to just describe the bent coordinates as a starting point, then use the geodesic equation to solve for so many of the cool effects of black holes.

So, stating the way that matter bends space-time is as simple as writing down the length element for the Schwarzschild metric of a spherical gravitating mass as

where RS = 2GM/c2 is the Schwarzschild radius.  (The connection between the metric tensor gab and the Christoffel symbol can be found in Chapter 11 of IMD2.)  It takes only a little work to find that

This means that if we have the Schwarzschild metric, all we have to do is take first partial derivatives and we will arrive at the Christoffel symbols that go into the geodesic equation.  Solving for any type of force-free trajectory is then just a matter of solving ODEs with initial conditions (performed routinely with numerical ODE solvers in Python, Matlab, Mathematica, etc.).
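
For reference, the Schwarzschild line element in the usual spherical coordinates (with the −+++ sign convention) is

\[
ds^{2} \;=\; -\left(1 - \frac{R_{S}}{r}\right)c^{2}\,dt^{2}
\;+\; \left(1 - \frac{R_{S}}{r}\right)^{-1} dr^{2}
\;+\; r^{2}\left(d\theta^{2} + \sin^{2}\theta\, d\varphi^{2}\right) ,
\]

and the Christoffel symbols follow from first partial derivatives of the metric through the standard relation

\[
\Gamma^{a}_{\;bc} \;=\; \tfrac{1}{2}\,g^{ad}\left(\partial_{b} g_{dc} + \partial_{c} g_{bd} - \partial_{d} g_{bc}\right) .
\]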

The first problem we will tackle using the geodesic equation is the deflection of light by gravity.  This is the quintessential problem of GR because there cannot be any gravitational force on a photon, yet the path of the photon surely must bend in the presence of gravity.  This is possible through the geodesic motion of the photon through warped space time.  I’ll take up this problem in my next Blog.