The triumvirate of Cambridge University in the mid-1800’s consisted of three towering figures of mathematics and physics: George Stokes (1819 – 1903), William Thomson (1824 – 1907) (Lord Kelvin), and James Clerk Maxwell (1831 – 1879). Their discoveries and methodology changed the nature of natural philosophy, turning it into the subject that today we call physics. Stokes was the eldest, establishing himself as the predominant expert in British mathematical physics and setting the tone for his close friend Thomson (close in age and temperament) as well as the younger Maxwell and many other key figures of 19th-century British physics.
George Gabriel Stokes was born in County Sligo in Ireland as the youngest son of the rector of Skreen parish of the Church of Ireland. No miraculous stories of his intellectual acumen seem to come from his childhood, as they did for the likes of William Hamilton (1805 – 1865) or George Green (1793 – 1841). Stokes was a good student, attending school in Skreen, then Dublin and Bristol before entering Pembroke College Cambridge in 1837. It was towards the end of his time at Cambridge that he emerged as a top mathematics student and as a candidate for Senior Wrangler.
Since 1748, the mathematics course at Cambridge University has held a yearly contest to identify the top graduating mathematics student. The winner of the contest is called the Senior Wrangler, and in the 1800’s the Senior Wrangler received a level of public fame and admiration for intellectual achievement that is somewhat like the fame reserved today for star athletes. In 1824 the mathematics course was reorganized into the Mathematical Tripos, and the contest became known as the Tripos Exam. The depth and length of the exam were legendary. For instance, in 1854 when Edward Routh (1831 – 1907) beat out Maxwell for Senior Wrangler, the Tripos consisted of 16 papers spread over 8 days, totaling over 40 hours for a total number of 211 questions. The winner typically scored less than 50%. Famous Senior Wranglers include George Airy, John Herschel, Arthur Cayley, Lord Rayleigh, Arthur Eddington, J. E. Littlewood, Peter Guthrie Tait and Joseph Larmor.
In his second year at Cambridge, Stokes had begun studying under William Hopkins (1793 – 1866). It was common for mathematics students to have private tutors to prep for the Tripos exam, and Tripos tutors were sometimes as famous as the Senior Wranglers themselves, especially if a tutor (like Hopkins) was to have several students win the exam. George Stokes became Senior Wrangler in 1841, and the same year he won the Smith’s Prize in mathematics. The Tripos tested primarily on bookwork, while the Smith’s Prize tested on originality. To achieve top scores on both designated the student as the most capable and creative mathematician of his class. Stokes was immediately offered a fellowship at Pembroke College allowing him to teach and study whatever he willed.
After Stokes graduated, Hopkins suggested that Stokes study hydrodynamics. This may have been in part motivated by Hopkins’ own interest in hydraulic problems in geology, but it was also a prescient suggestion, because hydrodynamics was poised for a revolution.
Early History of Hydrodynamics
Leonardo da Vinci (1452 – 1519) believed that an artist, to capture the essence of a subject, needed to understand its fundamental nature. Therefore, when he was captivated by the idea of portraying the flow of water, he filled his notebooks with charcoal studies of the whorls and vortices of turbulent torrents and waterfalls. He was a budding experimental physicist, recording data on the complex phenomenon of hydrodynamics. Yet Leonardo was no mathematician, and although his understanding of turbulent flow was deep, he did not have the theoretical tools to explain what he saw. Two centuries later, Daniel Bernoulli (1700 – 1782) provided the first mathematical description of water flowing smoothly in his Hydrodynamica (1738). However, the modern language of calculus was only beginning to be used at that time, preventing Daniel from providing a rigorous derivation.
As for nearly all nascent mathematical theories of the mid 1700’s, whether they be Newtonian dynamics or the calculus of variations or number and graph theory or population dynamics or almost anything, the person who placed the theory on firm mathematical foundations, using modern notions and notations, was Leonhard Euler (1707 – 1783). In 1752 Euler published a treatise that described the mathematical theory of inviscid flow, meaning flow without viscosity. Euler’s chief result is
$$\rho\frac{\partial\phi}{\partial t} + \frac{1}{2}\rho v^{2} + p + \rho g z = f(t)$$
where ρ is the density, v is the velocity, p is pressure, z is the height of the fluid, g is the acceleration due to gravity and φ is a velocity potential, while f(t) is a stream function that depends only on time. If the flow is in steady state, the time derivative vanishes, and the stream function is a constant. The key to the inviscid approximation is the dominance of momentum in fast flow, as opposed to creeping flow in which viscosity dominates. Euler’s equation, which expresses the well-known Bernoulli principle, works well under fast laminar conditions, but under slower flow conditions, internal friction ruins the inviscid approximation.
The violation of the inviscid flow approximation became one of the important outstanding problems in theoretical physics in the early 1800’s. For instance, the flow of water around ships’ hulls was a key technological problem in the strategic need for speed under sail. In addition, understanding the creation and propagation of water waves was critical for the safety of ships at sea. For the growing empire of the British islands, built on the power of their navy, the physics of hydrodynamics was more than an academic pursuit, and their archenemy, the French, were leading the way.
In 1713, when Newton won his priority dispute with Leibniz over the invention of calculus, it had the unintended consequence of setting back British mathematics and physics for over a hundred years. Perhaps lulled into complacency by their perceived superiority, Cambridge and Oxford continued teaching classical mathematics, and natural philosophy became dogmatic as Newton’s Principia became canon. Meanwhile Continental mathematical analysis went through a fundamental transformation. Inspired by Newton’s Principia rather than revering it, mathematicians such as the Swiss-German Leonhard Euler, the Frenchwoman Émilie du Châtelet and the Italian Joseph Lagrange pushed mathematical physics far beyond Newton by developing Leibniz’ methods and notations for calculus.
By the early 1800’s, the leading mathematicians of Europe were in the French school led by Pierre-Simon Laplace along with Joseph Fourier, Siméon Denis Poisson and Augustin-Louis Cauchy. In their hands, functional analysis was going through rapid development, both theoretically and applied, far surpassing British mathematics. It was by reading the French analysts in the 1820’s that the Englishman George Green finally helped bring British mathematics back up to speed.
One member of the French school was the French engineer Claude-Louis Navier (1785 – 1836). He was educated at the Ecole Polytechnique and the School for Roads and Bridges where he became one of the leading architects for bridges in France. In addition to his engineering tasks, he also was involved in developing principles of work and kinetic energy that aided the later work of Coriolis, who was one of the first physicists to recognize the explicit interplay between kinetic energy and potential energy. One of Navier’s specialties was hydraulic engineering, and he edited a new edition of a classic work on hydraulics. In the process, he became aware of serious deficiencies in the theoretical treatment of creeping flow, especially with regard to dissipation. By adopting a molecular approach championed by Poisson, including appropriate boundary conditions, he derived a correction to the Euler flow equations that included a new term with a new material property: viscosity.
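For reference, here is a sketch of the incompressible flow equation in modern vector notation, which is not Navier's original molecular formulation. The symbols are assumptions introduced here: ρ is the density, \mathbf{v} the velocity field, p the pressure, μ the dynamic viscosity and \mathbf{g} the gravitational acceleration.

```latex
\[
\rho\left(\frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v}\right)
  = -\nabla p + \mu \nabla^{2}\mathbf{v} + \rho\,\mathbf{g},
\qquad
\nabla\cdot\mathbf{v} = 0
\]
```

The term proportional to μ is the viscous correction; setting μ = 0 recovers the inviscid Euler equations.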
Navier published his new flow equation in 1823, but the publication was followed by years of nasty in-fighting as his assumptions were assaulted by Poisson and others. This acrimony is partly to blame for why Navier was not hailed alone as the discoverer of this equation, which today bears the name “Navier-Stokes Equation”.
Despite the lead of the French mathematicians over the British in mathematical rigor, they were also bogged down by their insistence on mechanistic models that operated on the microscale action-reaction forces. This was true for their theories of elasticity, hydrodynamics as well as the luminiferous ether. George Green in England would change this. While Green was inspired by French mathematics, he made an important shift in thinking in which the fields became the quantities of interest rather than molecular forces. Differential equations describing macroscale phenomena could be “true” independently of any microscale mechanics. His theories on elasticity and light propagation relied on no underlying structure of matter or ether. Underlying models could change, but the differential equations remained true. Maxwell’s equations, a pinnacle of 19th-century British mathematical physics, were field equations that required no microscopic models, although Maxwell and others later tried to devise models of the ether.
George Stokes admired Green and
adopted his mathematics and outlook on natural philosophy. When he turned his attention to hydrodynamic
flow, he adopted a continuum approach that initially did not rely on molecular
interactions to explain viscosity and drag.
He replicated Navier’s results, but this time without relying on any
underlying microscale physics. Yet this
only took him so far. To explain some of
the essential features of fluid pressures he had to revert to microscopic
arguments of isotropy to explain why displacements were linear and why flow at
a boundary ceased. However, once these
functional dependences were set, the remainder of the problem was pure
continuum mechanics, establishing the Navier-Stokes equation for incompressible
flow. Stokes went on to apply no-slip
boundary conditions for fluids flowing through pipes of different geometric
cross sections to calculate flow rates as well as pressure drops along the pipe
caused by viscous drag.
Stokes then turned to experimental results to explain why a pendulum slowly oscillating in air lost amplitude due to dissipation. He reasoned that when the flow of air around the pendulum bob and stiff rod was slow enough the inertial effects would be negligible, simplifying the Navier-Stokes equation. He calculated the drag force on a spherical object moving slowly through a viscous liquid and obtained the now famous law known as Stokes’ Law of Drag
$$F = 6\pi\mu R v$$
in which the drag force F on a sphere of radius R increases linearly with speed v and is proportional to viscosity μ. With dramatic flair, he used his new law to explain why water droplets in clouds float buoyantly until they become big enough to fall as rain.
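To make the cloud-droplet argument concrete, here is a minimal sketch that balances Stokes drag, F = 6πμRv, against the net weight of a small water sphere falling in air. The function names and the numerical values (air viscosity of about 1.8e-5 Pa·s, the densities, the droplet radii) are illustrative assumptions, not figures from Stokes' paper, and the formula only holds at low speeds where inertial effects are negligible.

```python
import math

def stokes_drag(viscosity, radius, speed):
    """Stokes drag force (N) on a slowly moving sphere: F = 6*pi*mu*R*v."""
    return 6.0 * math.pi * viscosity * radius * speed

def terminal_speed(radius, rho_sphere=1000.0, rho_fluid=1.2,
                   viscosity=1.8e-5, g=9.81):
    """Speed (m/s) at which Stokes drag balances weight minus buoyancy.

    Default values are assumed round numbers for water droplets in air.
    """
    return 2.0 * (rho_sphere - rho_fluid) * g * radius**2 / (9.0 * viscosity)

# Compare a very fine cloud droplet with a larger one (radii are assumed examples).
for radius_um in (1.0, 10.0):
    r = radius_um * 1e-6
    v = terminal_speed(r)
    print(f"radius {radius_um:5.1f} um -> terminal speed {v * 1000:.3f} mm/s")
```

Because the terminal speed scales with the square of the radius, halving the droplet size cuts the fall speed by a factor of four, which is why fine droplets are easily held aloft by gentle updrafts.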
Lucasian Chair of Mathematics
There are rare individuals who become especially respected for the breadth and depth of their knowledge. In our time, already somewhat past, Stephen Hawking embodied the ideal of the eminent (almost clairvoyant) scientist pushing his field to the extremes with the deepest understanding, while also being one of the most famous science popularizers of his day as well as an important chronicler of the history of physics. In his own time, Stokes was held in virtually the same level of esteem.
Just as Stephen Hawking and Isaac Newton held the Lucasian Chair of Mathematics at Cambridge, Stokes became the Lucasian professor in 1849 and held the chair until his death in 1903. He was offered the chair in part because of the prestige he held as first wrangler and Smith’s prize winner, but also because of his imposing grasp of the central fields of his time. The Lucasian Chair of Mathematics at Cambridge is one of the most famous academic chairs in the world. It was established by Charles II in 1664, and the first Lucasian professor was Isaac Barrow followed by Isaac Newton who held the post for 33 years. Other famous Lucasian professors were Airy, Babbage, Larmor, Dirac as well as Hawking. During his tenure, Stokes made central contributions to hydrodynamics (as we have seen), but also the elasticity of solids, the behavior of waves in elastic solids, the diffraction of light, problems in light, gravity, sound, heat, meteorology, solar physics, and chemistry. Perhaps his most famous contribution was his explanation of fluorescence, for which he won the Rumford Medal. Certainly, if the Nobel Prize had existed in his time, he would have been a Nobel Laureate.
Derivation of Stokes’ Law
The flow field of an incompressible fluid around a smooth spherical object has zero divergence and satisfies Laplace’s equation. This allows the stream velocities to take the form in spherical coordinates
where the velocity components are defined in terms of the stream function ψ. The partial derivatives of pressure satisfy the equations
where the second-order operator is
The vanishing of the Laplacian of the stream function
allows the function to take the form
The no-slip boundary condition on the surface of the sphere, as well as the asymptotic velocity field far from the sphere taking the form v•cosθ gives the solution
Using this expression in the first equations yields the velocities, pressure and shear
The force on the sphere is obtained by integrating the pressure and the shear stress over the surface of the sphere. The two contributions are
Adding these together gives the final expression for Stokes’ Law
where two thirds of the force is caused by the shear stress and one third by the pressure.
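The steps above can be written out in the standard textbook form. This is a sketch under assumed notation, not a transcription of Stokes' own working: a is the sphere radius, v the far-field speed, μ the viscosity, p_∞ the pressure far from the sphere, θ is measured from the downstream direction, and the vanishing condition on the stream function is taken to be the operator E² applied twice.

```latex
% Velocity components from the stream function (axisymmetric flow)
\[ v_r = \frac{1}{r^{2}\sin\theta}\frac{\partial\psi}{\partial\theta},
   \qquad
   v_\theta = -\frac{1}{r\sin\theta}\frac{\partial\psi}{\partial r} \]

% Pressure gradients in creeping flow
\[ \frac{\partial p}{\partial r} = \frac{\mu}{r^{2}\sin\theta}\frac{\partial}{\partial\theta}\left(E^{2}\psi\right),
   \qquad
   \frac{\partial p}{\partial\theta} = -\frac{\mu}{\sin\theta}\frac{\partial}{\partial r}\left(E^{2}\psi\right) \]

% Second-order operator
\[ E^{2} = \frac{\partial^{2}}{\partial r^{2}}
   + \frac{\sin\theta}{r^{2}}\frac{\partial}{\partial\theta}\left(\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\right) \]

% Governing equation and trial form
\[ E^{2}\left(E^{2}\psi\right) = 0,
   \qquad
   \psi = \sin^{2}\theta\left(\frac{A}{r} + B r + C r^{2} + D r^{4}\right) \]

% Solution with no slip at r = a and uniform flow v cos(theta) far away
\[ \psi = \frac{v}{2}\, r^{2}\sin^{2}\theta\left(1 - \frac{3a}{2r} + \frac{a^{3}}{2r^{3}}\right) \]

% Velocities, pressure and surface shear stress
\[ v_r = v\cos\theta\left(1 - \frac{3a}{2r} + \frac{a^{3}}{2r^{3}}\right),
   \qquad
   v_\theta = -v\sin\theta\left(1 - \frac{3a}{4r} - \frac{a^{3}}{4r^{3}}\right) \]
\[ p = p_\infty - \frac{3\mu a v}{2r^{2}}\cos\theta,
   \qquad
   \tau_{r\theta}\Big|_{r=a} = -\frac{3\mu v}{2a}\sin\theta \]

% Force contributions along the flow direction, integrated over the sphere
\[ F_{\text{pressure}} = -\int_{0}^{\pi} p\Big|_{r=a}\cos\theta \; 2\pi a^{2}\sin\theta \, d\theta = 2\pi\mu a v \]
\[ F_{\text{shear}} = -\int_{0}^{\pi} \tau_{r\theta}\Big|_{r=a}\sin\theta \; 2\pi a^{2}\sin\theta \, d\theta = 4\pi\mu a v \]

% Stokes' Law
\[ F = F_{\text{pressure}} + F_{\text{shear}} = 6\pi\mu a v \]
```

The constant p_∞ drops out of the pressure integral, and the integrals of cos²θ sinθ and sin³θ over the sphere give 2/3 and 4/3, which is where the one-third and two-thirds split comes from.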
1819 – Born County Sligo Parish of Skreen
1837 – Entered Pembroke College Cambridge
1841 – Graduation, Senior Wrangler, Smith’s Prize, Fellow of Pembroke
1845 – Viscosity
1845 – Viscoelastic solid and the luminiferous ether
1845 – Ether drag
1846 – Review of hydrodynamics (including French references)
1847 – Water waves
1847 – Expansion in periodic series (Fourier)
1848 – Jelly theory of the ether
1849 – Lucasian Professorship Cambridge
1849 – Geodesy and Clairaut’s theorem
1849 – Dynamical theory of diffraction
1850 – Damped pendulum, explanation of clouds (water droplets)
1850 – Haidinger’s brushes
1850 – Letter from Kelvin (Thomson) to Stokes on a theorem in vector calculus
1852 – Stokes’ 4 polarization parameters
1852 – Fluorescence and Rumford medal
1854 – Stokes sets “Stokes theorem” for the Smith’s Prize Exam
1857 – Marries
1857 – Effect of wind on sound intensity
1861 – Hankel publishes “Stokes theorem”
1880 – Form of highest waves
1885 – President of Royal Society
1887 – Member of Parliament
1889 – Created a baronet by Queen Victoria
1893 – Copley Medal
1903 – Dies
1945 – Cartan establishes modern form of Stokes’ theorem using differential forms
In one of my previous blog posts, as I was searching for Schwarzschild’s original papers on Einstein’s field equations and quantum theory, I obtained a copy of the January 1916 – June 1916 volume of the Proceedings of the Royal Prussian Academy of Sciences through interlibrary loan. The extremely thick volume arrived at Purdue about a week after I ordered it online. It arrived from Oberlin College in Ohio that had received it as a gift in 1928 from the library of Professor Friedrich Loofs of the University of Halle in Germany. Loofs had been the Haskell Lecturer at Oberlin for the 1911-1912 semesters.
As I browsed through the volume looking for Schwarzschild’s papers, I was amused to find a cornucopia of turn-of-the-century science topics recorded in its pages. There were papers on the overbite and lips of marsupials. There were papers on forgotten languages. There were papers on ancient Greek texts. On the origins of religion. On the philosophy of abstraction. Histories of Indian dramas. Reflections on cancer. But what I found most amazing was a snapshot of the field of physics and mathematics in 1916, with historic papers by historic scientists who changed how we view the world. Here is a snapshot in time and in space, a period of only six months from a single journal, whose list of authors reads like a who’s who of physics.
In 1916 there were three major centers of science in the world with leading science publications: London with the Philosophical Magazine and Proceedings of the Royal Society; Paris with the Comptes Rendus of the Académie des Sciences; and Berlin with the Proceedings of the Royal Prussian Academy of Sciences and Annalen der Physik. In Russia, there were the scientific Journals of St. Petersburg, but the Bolshevik Revolution was brewing that would overwhelm that country for decades. And in 1916 the academic life of the United States was barely worth noticing except for a few points of light at Yale and Johns Hopkins.
Berlin in 1916 was embroiled in war, but science proceeded relatively unmolested. The six-month volume of the Proceedings of the Royal Prussian Academy of Sciences contains a number of gems. Schwarzschild was one of the most prolific contributors, publishing three papers in just this half-year volume, plus his obituary written by Einstein. But joining Schwarzschild in this volume were Einstein, Planck, Born, Warburg, Frobenius, and Rubens among others, a pantheon of German scientists mostly cut off from the rest of the world at that time, but single-mindedly following their individual threads woven deep into the fabric of the physical world.
Karl Schwarzschild (1873 – 1916)
Schwarzschild had the unenviable yet effective motivation of his impending death to spur him to complete several projects that he must have known would make his name immortal. In this six-month volume he published his three most important papers. The first (pg. 189) was on the exact solution to Einstein’s field equations of general relativity. The solution was for the restricted case of a point mass, yet the derivation yielded the Schwarzschild radius that later became known as the event horizon of a non-rotating black hole. The second paper (pg. 424) expanded the general relativity solutions to a spherically symmetric incompressible liquid mass.
The subject, content and success of these two papers were wholly unexpected from this observational astronomer stationed on the Russian Front during WWI calculating trajectories for German bombardments. He would not have been considered a theoretical physicist but for the importance of his results and the sophistication of his methods. Within only a year after Einstein published his general theory, based as it was on the complicated tensor calculus of Levi-Civita, Christoffel and Ricci-Curbastro that had taken him years to master, Schwarzschild found a solution that evaded even Einstein.
Schwarzschild’s third and final paper (pg. 548) was on an entirely different topic, still not in his official field of astronomy, that positioned all future theoretical work in quantum physics to be phrased in the language of Hamiltonian dynamics and phase space. He proved that action-angle coordinates were the only acceptable canonical coordinates to be used when quantizing dynamical systems. This paper answered a central question that had been nagging Bohr and Einstein and Ehrenfest for years—how to quantize dynamical coordinates. Despite the simple way that Bohr’s quantized hydrogen atom is taught in modern physics, there was an ambiguity in the quantization conditions even for this simple single-electron atom. The ambiguity arose from the numerous possible canonical coordinate transformations that were admissible, yet which led to different forms of quantized motion.
Schwarzschild’s doctoral thesis had been a theoretical topic in astrophysics that applied the celestial mechanics theories of Henri Poincaré to binary star systems. Within Poincaré’s theory were integral invariants that were conserved quantities of the motion. When a dynamical system had as many constraints as degrees of freedom, then every coordinate had an integral invariant. In this unexpected last paper from Schwarzschild, he showed how canonical transformation to action-angle coordinates produced a unique representation in terms of action variables (whose dimensions are the same as Planck’s constant). These action coordinates, with their associated cyclical angle variables, are the only unambiguous representations that can be quantized. The important points of this paper were amplified a few months later in a publication by Schwarzschild’s friend Paul Epstein (1871 – 1939), solidifying this approach to quantum mechanics. Paul Ehrenfest (1880 – 1933) continued this work later in 1916 by defining adiabatic invariants whose quantum numbers remain unchanged under slowly varying conditions, and the program started by Schwarzschild was definitively completed by Paul Dirac (1902 – 1984) at the dawn of quantum mechanics in Göttingen in 1925.
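In modern notation, the quantization rule that action-angle coordinates make unambiguous is usually written as below. This is a sketch in assumed notation (q_k and p_k for a coordinate and its conjugate momentum, J_k for the action variable, n_k for an integer quantum number, h for Planck's constant), not a formula quoted from Schwarzschild's paper.

```latex
\[ J_k = \oint p_k \, dq_k = n_k h, \qquad n_k = 0, 1, 2, \ldots \]
```

Each J_k has dimensions of momentum times length, the same as h, and the integral is taken over one full cycle of the corresponding angle variable.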
Albert Einstein (1879 – 1955)
In 1916 Einstein was mopping up after publishing his definitive field equations of general relativity the year before. His interests were still cast wide, not restricted only to this latest project. In the 1916 Jan. to June volume of the Prussian Academy Einstein published two papers. Each is remarkably short relative to the other papers in the volume, yet the importance of the papers may stand in inverse proportion to their length.
The first paper (pg. 184) is placed right before Schwarzschild’s first paper on February 3. The subject of the paper is the expression of Maxwell’s equations in four-dimensional spacetime. It is notable and ironic that Einstein mentions Hermann Minkowski (1864 – 1909) in the first sentence of the paper. When Minkowski proposed his bold structure of spacetime in 1908, Einstein had been one of his harshest critics, writing letters to the editor about the absurdity of thinking of space and time as a single interchangeable coordinate system. This is ironic, because Einstein today is perhaps best known for the special relativity properties of spacetime, yet he was slow to adopt the spacetime viewpoint. Einstein only came around to spacetime when he realized around 1910 that a general approach to relativity required the mathematical structure of tensor manifolds, and Minkowski had provided just such a manifold, the pseudo-Riemannian manifold of spacetime. Einstein subsequently adopted spacetime with a passion and became its greatest champion, calling out Minkowski where possible to give him his due, although Minkowski had already died tragically of a burst appendix in 1909.
The importance of Einstein’s paper hinges on his derivation of the electromagnetic field energy density using electromagnetic four vectors. The energy density is part of the source term for his general relativity field equations. Any form of energy density can warp spacetime, including electromagnetic field energy. Furthermore, the Einstein field equations of general relativity are nonlinear as gravitational fields modify space and space modifies electromagnetic fields, producing a coupling between gravity and electromagnetism. This coupling is implicit in the case of the bending of light by gravity, but Einstein’s paper from 1916 makes the connection explicit.
Einstein’s second paper (pg. 688) is even shorter and hence one of the most daring publications of his career. Because the field equations of general relativity are nonlinear, they are not easy to solve exactly, and Einstein was exploring approximate solutions under conditions of slow speeds and weak fields. In this “non-relativistic” limit the metric tensor separates into a Minkowski metric as a background on which a small metric perturbation remains. This small perturbation has the properties of a wave equation for a disturbance of the gravitational field that propagates at the speed of light. Hence, in the June 22 issue of the Prussian Academy in 1916, Einstein predicts the existence and the properties of gravitational waves. Exactly one hundred years later in 2016, the LIGO collaboration announced the detection of gravitational waves generated by the merger of two black holes.
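In modern notation, the weak-field expansion described above is commonly written as follows. This is a sketch under stated assumptions rather than Einstein's 1916 notation: η_μν is the Minkowski metric with signature (−,+,+,+), h_μν is the small perturbation, \bar{h}_μν is its trace-reversed form, T_μν is the stress-energy tensor, and the harmonic gauge condition is imposed.

```latex
\[ g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \qquad |h_{\mu\nu}| \ll 1 \]
\[ \bar{h}_{\mu\nu} = h_{\mu\nu} - \tfrac{1}{2}\,\eta_{\mu\nu}\,h,
   \qquad \partial^{\mu}\bar{h}_{\mu\nu} = 0 \]
\[ \Box\,\bar{h}_{\mu\nu} = -\frac{16\pi G}{c^{4}}\,T_{\mu\nu},
   \qquad \Box = -\frac{1}{c^{2}}\frac{\partial^{2}}{\partial t^{2}} + \nabla^{2} \]
```

In vacuum the right-hand side vanishes, leaving an ordinary wave equation whose solutions travel at the speed of light c.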
Max Planck (1858 – 1947)
Max Planck served as the secretary of the Prussian Academy in 1916 yet was still fully active in his research. Although he had launched the quantum revolution with his quantum hypothesis of 1900, he was not a major proponent of quantum theory even as late as 1916. His primary interests lay in thermodynamics and the origins of entropy, following the theoretical approaches of Ludwig Boltzmann (1844 – 1906). In 1916 he was interested in how to best partition phase space as a way to count states and calculate entropy from first principles. His paper in the 1916 volume (pg. 653) calculated the entropy for single-atom solids.
Max Born (1882 – 1970)
Max Born was to be one of the leading champions of the quantum mechanical revolution based at the University of Göttingen in the 1920’s. But in 1916 he was on leave from the University of Berlin working on ranging for artillery. Yet he still pursued his academic interests, like Schwarzschild. On pg. 614 in the Proceedings of the Prussian Academy, Born published a paper on anisotropic liquids, such as liquid crystals and the effect of electric fields on them. It is astonishing to think that so many of the flat-panel displays we have today, whether on our watches or smart phones, are technological descendants of work by Born at the beginning of his career.
Ferdinand Frobenius (1849 – 1917)
Like Schwarzschild, Frobenius was at the end of his career in 1916 and would pass away one year later, but unlike Schwarzschild, his career had been a long one, receiving his doctorate under Weierstrass and exploring elliptic functions, differential equations, number theory and group theory. One of the papers that established him in group theory appears in the May 4th issue on page 542 where he explores the composition series of a group.
Heinrich Rubens (1865 – 1922)
Max Planck owed his quantum breakthrough in part to the exquisitely accurate experimental measurements made by Heinrich Rubens on black body radiation. It was only by the precise shape of what came to be called the Planck spectrum that Planck could say with such confidence that his theory of quantized radiation interactions fit Rubens’ spectrum so perfectly. In 1916 Rubens was at the University of Berlin, having taken the position vacated by Paul Drude in 1906. He was a specialist in infrared spectroscopy, and on page 167 of the Proceedings he describes the spectrum of steam and its consequences for the quantum theory.
Emil Warburg (1846 – 1931)
Emil Warburg’s fame is primarily as the father of Otto Warburg who won the 1931 Nobel prize in physiology. On page 314 Warburg reports on photochemical processes in BrH gases. In an obscure and very indirect way, I am an academic descendant of Emil Warburg. One of his students was Robert Pohl who was a famous early researcher in solid state physics, sometimes called the “father of solid state physics”. Pohl was at the physics department in Göttingen in the 1920’s along with Born and Franck during the golden age of quantum mechanics. Robert Pohl’s son, Robert Otto Pohl, was my professor when I was a sophomore at Cornell University in 1978 for the course on introductory electromagnetism using a textbook by the Nobel laureate Edward Purcell, a quirky volume of the Berkeley Series of physics textbooks. This makes Emil Warburg my professor’s father’s professor.
Papers in the 1916 Vol. 1 of the Prussian Academy of Sciences
Schulze, Alt– und Neuindisches
Orth, Zur Frage nach den Beziehungen des Alkoholismus zur Tuberkulose
Schulze, Die Erhebungen auf der Lippen- und Wangenschleimhaut der Säugetiere
von Wilamowitz-Moellendorff, Die Samia des Menandros
Engler, Bericht über das >>Pflanzenreich<<
von Harnack, Bericht über die Ausgabe der griechischen Kirchenväter der drei ersten Jahrhunderte
Meinecke, Germanischer und romanischer Geist im Wandel der deutschen Geschichtsauffassung
Rubens und Hettner, Das langwellige Wasserdampfspektrum und seine Deutung durch die Quantentheorie
Einstein, Eine neue formale Deutung der Maxwellschen Feldgleichungen der Elektrodynamik
Schwarzschild, Über das Gravitationsfeld eines Massenpunktes nach der Einsteinschen Theorie
Helmreich, Handschriftliche Verbesserungen zu dem Hippokratesglossar des Galen
Prager, Über die Periode des veränderlichen Sterns RR Lyrae
Holl, Die Zeitfolge des ersten origenistischen Streits
Lüders, Zu den Upanisads. I. Die Samvargavidya
Warburg, Über den Energieumsatz bei photochemischen Vorgängen in Gasen. VI.
Hellmann, Über die ägyptischen Witterungsangaben im Kalender von Claudius Ptolemaeus
Meyer-Lübke, Die Diphthonge im Provenzalischen
Diels, Über die Schrift Antipocras des Nikolaus von Polen
Müller und Sieg, Maitrisimit und >>Tocharisch<<
Meyer, Ein altirischer Heilsegen
Schwarzschild, Über das Gravitationsfeld einer Kugel aus inkompressibler Flüssigkeit nach der Einsteinschen Theorie
Brauer, Die Verbreitung der Hyracoiden
Correns, Untersuchungen über Geschlechtsbestimmung bei Distelarten
Brahn, Weitere Untersuchungen über Fermente in der Leber von Krebskranken
Erdmann, Methodologische Konsequenzen aus der Theorie der Abstraktion
Bang, Studien zur vergleichenden Grammatik der Türksprachen. I.
Frobenius, Über die Kompositionsreihe einer Gruppe
Schwarzschild, Zur Quantenhypothese
Fischer und Bergmann, Über neue Galloylderivate des Traubenzuckers und ihren Vergleich mit der Chebulinsäure
Schuchhardt, Der starke Wall und die breite, zuweilen erhöhte Berme bei frühgeschichtlichen Burgen in Norddeutschland
Born, Über anisotrope Flüssigkeiten
Planck, Über die absolute Entropie einatomiger Körper
Haberlandt, Blattepidermis und Lichtperzeption
Einstein, Näherungsweise Integration der Feldgleichungen der Gravitation
Lüders, Die Saubhikas. Ein Beitrag zur Geschichte des indischen Dramas
In an ironic twist of the history of physics, Karl Schwarzschild’s fame has eclipsed his own legacy. When asked who was Karl Schwarzschild (1873 – 1916), you would probably say he’s the guy who solved Einstein’s Field Equations of General Relativity and discovered the radius of black holes. You may also know that he accomplished this Herculean feat while dying slowly behind the German lines on the Eastern Front in WWI. But asked what else he did, and you would probably come up blank. Yet Schwarzschild was one of the most wide-ranging physicists at the turn of the 20th century, which is saying something, because it places him into the same pantheon as Planck, Lorentz, Poincaré and Einstein. Let’s take a look at the part of his career that hides in the shadow of his own radius.
Radius of Interest
Karl Schwarzschild was born in Frankfurt, Germany, shortly after the Franco-Prussian war thrust Prussia onto the world stage as a major political force in Europe. His family were Jewish merchants of longstanding reputation in the city, and Schwarzschild’s childhood was spent in the vibrant Jewish community. One of his father’s friends was a professor at a university in Frankfurt, whose son, Paul Epstein (1871 – 1939), became a close friend of Karl’s at the Gymnasium. Schwarzschild and Epstein would partially shadow each other’s careers despite the fact that Schwarzschild became an astronomer while Epstein became a famous mathematician and number theorist. This was in part because Schwarzschild had a large radius of interests that spanned the breadth of current mathematics and science, practicing both experiments and theory.
Schwarzschild’s application of the Hamiltonian formalism for quantum systems set the stage for the later adoption of Hamiltonian methods in quantum mechanics. He came dangerously close to stating the uncertainty principle that catapulted Heisenberg to fame.
By the time Schwarzschild was sixteen, he had taught himself the mathematics of celestial mechanics to such depth that he published two papers on the orbits of binary stars. He also became fascinated by astronomy and purchased lenses and other materials to construct his own telescope. His interests were helped along by Epstein, who was two years older and whose father had his own private observatory. When Epstein went to study at the University of Strasbourg (then part of the German Federation) Schwarzschild followed him. But Schwarzschild’s main interest in astronomy diverged from Epstein’s main interest in mathematics, and Schwarzschild transferred to the University of Munich where he studied under Hugo von Seeliger (1849 – 1924), the premier German astronomer of his day. Epstein remained at Strasbourg where he studied under Bruno Christoffel (1829 – 1900) and eventually became a professor, but he was forced to relinquish the post when Strasbourg was ceded to France after WWI.
Birth of Stellar Interferometry
Until the Hubble Space Telescope was launched in 1990, no star had ever been resolved as a direct image. Within a year of its launch, using its spectacular resolving power, the Hubble optics resolved, just barely, the red supergiant Betelgeuse. No other star (other than the Sun) is close enough or big enough to image the stellar disk, even for the Hubble far above our atmosphere. The reason is that the optical lenses and mirrors of the Hubble, as big as they are at 2.4 meters in diameter, still produce a diffraction pattern that smears the image so that stars cannot be resolved. Yet information on the size of a distant object is encoded as phase in the light waves that are emitted from the object, and this phase information is accessible to interferometry.
The first physicist who truly grasped the power of optical interferometry and who understood how to design the first interferometric metrology systems was the French physicist Armand Hippolyte Louis Fizeau (1819 – 1896). Fizeau became interested in the properties of light when he collaborated with his friend Léon Foucault (1819 – 1868) on early uses of photography. The two then embarked on a measurement of the speed of light but had a falling out before the experiment could be finished, and both continued the pursuit independently. Fizeau achieved the first measurement using a toothed wheel rotating rapidly, while Foucault came in second using a more versatile system with a spinning mirror. Yet Fizeau surpassed Foucault in optical design and became an expert in interference effects. Interference apparatus had been developed earlier by Augustin Fresnel (the Fresnel bi-prism, 1819), Humphrey Lloyd (Lloyd’s mirror, 1834) and Jules Jamin (Jamin’s interferential refractor, 1856). They had found ways of redirecting light using refraction and reflection to cause interference fringes. But Fizeau was one of the first to recognize that each emitting region of a light source was coherent with itself, and he used this insight and the use of lenses to design the first interferometer.
Fizeau’s interferometer used a lens with a tight focal spot masked off by an opaque screen with two open slits. When the masked lens device was focused on an intense light source it produced two parallel pencils of light that were mutually coherent but spatially separated. Fizeau used this apparatus to measure the speed of light in moving water in 1859.
The working principle of the Fizeau refractometer is shown in Fig. 1. The light source is at the bottom, and it is reflected by the partially-silvered beam splitter to pass through the lens and the mask containing two slits. (Only the light paths that pass through the double-slit mask on the lens are shown in the figure.) The slits produce two pencils of mutually coherent light that pass through a system (in the famous Fizeau ether drag experiment it was along two tubes of moving water) and are returned through the same slits, and they intersect at the view port where they produce interference fringes. The fringe spacing is set by the separation of the two slits in the mask. The Rayleigh region of the lens defines a region of spatial coherence even for a so-called “incoherent” source. Therefore, this apparatus, by use of the lens, could convert an incoherent light source into a coherent probe to test the refractive index of test materials, which is why it was called a refractometer.
Fizeau became adept at thinking of alternative optical designs of his refractometer and alternative applications. In an address to the French Physical Society in 1868 he suggested that the double-slit mask could be used on a telescope to determine sizes of distant astronomical objects. There were several subsequent attempts to use Fizeau’s configuration in astronomical observations, but none were conclusive, and hence they did not become widely known.
An optical configuration and astronomical application that was very similar to Fizeau’s idea was proposed by Albert Michelson in 1890. He built the apparatus and used it to successfully measure the size of several moons of Jupiter. The configuration of the Michelson stellar interferometer is shown in Fig. 2. Light from a distant star passes through two slits in the mask in front of the collecting optics of a telescope. When the two pencils of light intersect at the view port, they produce interference fringes. Because of the finite size of the stellar source, the fringes are partially washed out. By adjusting the slit separation, a certain separation can be found where the fringes completely wash out. The size of the star is then related to the separation of the slits for which the fringe visibility vanishes. This simple principle allows this type of stellar interferometry to measure the size of stars that are large and relatively close to Earth. However, if stars are too far away even this approach cannot be used to measure their sizes because telescopes aren’t big enough. This limitation is currently being bypassed by the use of long-baseline optical interferometers.
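The link between the star's size and the slit separation at which the fringes vanish can be made concrete with a small numerical sketch. It assumes the star is a uniformly bright disk (an idealization that is not part of the description above), for which the fringe visibility is |2 J1(x)/x| with x = π d θ / λ, so the first null falls near d ≈ 1.22 λ/θ. The function names, the 575 nm wavelength and the 0.047 arcsecond diameter are illustrative choices, not values taken from the measurements discussed in this post.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # one arcsecond in radians


def bessel_j1(x, panels=200):
    """Bessel function J1 from its integral representation (trapezoid rule)."""
    h = math.pi / panels
    total = 0.0
    for i in range(panels + 1):
        tau = i * h
        weight = 0.5 if i in (0, panels) else 1.0
        total += weight * math.cos(tau - x * math.sin(tau))
    return total * h / math.pi


def disk_visibility(baseline, angular_diameter, wavelength):
    """Fringe visibility for a uniform disk (assumed source model)."""
    x = math.pi * baseline * angular_diameter / wavelength
    if x == 0.0:
        return 1.0
    return abs(2.0 * bessel_j1(x) / x)


def first_null_baseline(angular_diameter, wavelength):
    """Slit separation at which the fringes first wash out completely."""
    lo, hi = 3.0, 4.5  # J1 changes sign once between these arguments
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if bessel_j1(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    x_null = 0.5 * (lo + hi)  # approximately 3.8317
    return x_null * wavelength / (math.pi * angular_diameter)


if __name__ == "__main__":
    theta = 0.047 * ARCSEC   # illustrative angular diameter
    lam = 575e-9             # illustrative wavelength in meters
    d_null = first_null_baseline(theta, lam)
    print(f"fringes vanish at a slit separation of about {d_null:.2f} m")
```

Read in reverse, the same relation is how a measured null separation is turned into an angular diameter: θ ≈ 1.22 λ / d.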
One of the open questions in the history of interferometry is whether Michelson was aware of Fizeau’s proposal for the stellar interferometer made in 1868. Michelson was well aware of Fizeau’s published research and acknowledged him as a direct inspiration of his own work in interference effects. But Michelson also was unaware of the undercurrents in the French school of optical interference. When he visited Paris in 1881, he met with many of the leading figures in this school (including Lippmann and Cornu), but there is no mention or any evidence that he met with Fizeau. By this time Fizeau’s wife had passed away, and Fizeau spent most of his time in seclusion at his home outside Paris. Therefore, it is unlikely that he would have been present during Michelson’s visit. Because Michelson viewed Fizeau with such awe and respect, if he had met him, he most certainly would have mentioned it. Therefore, Michelson’s invention of the stellar interferometer can be considered with some confidence to be a case of independent discovery. It is perhaps not surprising that he hit on the same idea that Fizeau had in 1868, because Michelson was one of the few physicists who understood coherence and interference at the same depth as Fizeau.
The physics of the Michelson stellar interferometer is very similar to the physics of Young’s double slit experiment. The two slits in the aperture mask of the telescope objective act to produce a simple sinusoidal interference pattern at the image plane of the optical system. The size of the stellar diameter is determined by using the wash-out effect of the fringes caused by the finite stellar size. However, it is well known to physicists who work with diffraction gratings that a multiple-slit interference pattern has a much greater resolving power than a simple double slit.
This realization must have hit von Seeliger and Schwarzschild, working together at Munich, when they saw the publication of Michelson’s theoretical analysis of his stellar interferometer in 1890, followed by his use of the apparatus to measure the size of Jupiter’s moons. Schwarzschild and von Seeliger realized that by replacing the double-slit mask with a multiple-slit mask, the widths of the interference maxima would be much narrower. Such a diffraction mask on a telescope would cause a star to produce a multiple set of images on the image plane of the telescope associated with the multiple diffraction orders. More interestingly, if the target were a binary star, the diffraction would produce two sets of diffraction maxima—a double image! If the “finesse” of the grating is high enough, the binary star separation could be resolved as a doublet in the diffraction pattern at the image, and the separation could be measured, giving the angular separation of the two stars of the binary system. Such an approach to the binary separation would be a direct measurement, which was a distinct and clever improvement over the indirect Michelson configuration that required finding the extinction of the fringe visibility.
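The narrowing of the interference maxima with the number of slits can be checked with the standard grating formula. The sketch below is only an illustration under textbook assumptions (identical, equally spaced, infinitely narrow slits); the function names are made up for this example, and `phi` denotes the phase difference between light from adjacent slits.

```python
import math


def grating_intensity(phi, n_slits):
    """Normalized N-slit intensity: (sin(N*phi/2) / (N*sin(phi/2)))**2."""
    denom = math.sin(0.5 * phi)
    if abs(denom) < 1e-12:
        return 1.0  # limiting value at a principal maximum
    return (math.sin(0.5 * n_slits * phi) / (n_slits * denom)) ** 2


def principal_maximum_fwhm(n_slits, iterations=60):
    """Full width at half maximum (in phase) of the central maximum."""
    lo, hi = 0.0, 2.0 * math.pi / n_slits  # from the peak to the first zero
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if grating_intensity(mid, n_slits) > 0.5:
            lo = mid
        else:
            hi = mid
    return 2.0 * 0.5 * (lo + hi)


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, "slits: FWHM =", round(principal_maximum_fwhm(n), 4), "rad")
```

For two slits the pattern is the plain cos² fringe with a full width of π radians of phase, while each doubling of the slit count roughly halves the width, which is what lets a close doublet show up as two separate maxima.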
Schwarzschild enlisted the help of a fine German instrument maker to create a multiple slit system that had an adjustable slit separation. The device is shown in Fig. 3 from Schwarzschild’s 1896 publication on the use of the stellar interferometer to measure the separation of binary stars. The device is ingenious. By rotating the chain around the gear on the right-hand side of the apparatus, the two metal plates with four slits could be raised or lowered, causing the projection onto the objective plane to have variable slit spacings. In the operation of the telescope, the changing height of the slits does not matter, because they are near a conjugate optical plane (the entrance pupil) of the optical system. Using this adjustable multiple slit system, Schwarzschild (and two colleagues he enlisted) made multiple observations of well-known binary star systems, and they calculated the star separations. Several of their published results are shown in Fig. 4.
Schwarzschild’s publication demonstrated one of the very first uses of stellar interferometry—well before Michelson himself used his own configuration to measure the diameter of Betelgeuse in 1920. Schwarzschild’s major achievement was performed before he had received his doctorate, on a topic orthogonal to his dissertation topic. Yet this fact is virtually unknown to the broader physics community outside of astronomy. If he had not become so famous later for his solution of Einstein’s field equations, Schwarzschild nonetheless might have been famous for his early contributions to stellar interferometry. But even this was not the end of his unique contributions to physics.
As Schwarzschild worked for his doctorate under von Seeliger, his dissertation topic was on new theories by Henri Poincaré (1854 – 1912) on celestial mechanics. Poincaré had made a big splash on the international stage with the publication of his prize-winning memoir in 1890 on the three-body problem. This is the publication where Poincaré first described what would later become known as chaos theory. The memoir was followed by his volumes on “New Methods in Celestial Mechanics” published between 1892 and 1899. Poincaré’s work on celestial mechanics was based on his earlier work on the theory of dynamical systems where he discovered important invariant theorems, such as Liouville’s theorem on the conservation of phase space volume. Schwarzschild applied Poincaré’s theorems to problems in celestial orbits. He took his doctorate in 1896 and received a post at an astronomical observatory outside Vienna.
While at Vienna, Schwarzschild performed his most important sustained contributions to the science of astronomy. Astronomical observations had been dominated for centuries by the human eye, but photographic techniques had been making steady inroads since the time of Hermann Carl Vogel (1841 – 1907) in the 1880’s at the Potsdam observatory. Photographic plates were used primarily to record star positions but were known to be unreliable for recording stellar intensities. Schwarzschild developed an “out-of-focus” technique that blurred the star’s image, making it larger and making it easier to measure the density of the exposed and developed photographic emulsions. In this way, Schwarzschild measured the magnitudes of 367 stars. Two of these stars had variable magnitudes that he was able to record and track. Schwarzschild correctly explained the intensity variation as caused by steady oscillations in heating and cooling of the stellar atmosphere. This work established the properties of these Cepheid variables which would become some of the most important “standard candles” for the measurement of cosmological distances. Based on the importance of this work, Schwarzschild returned to Munich as a teacher in 1899 and subsequently was appointed in 1901 as the director of the observatory at Göttingen established by Gauss eighty years earlier.
Schwarzschild’s years at Göttingen brought him into contact with some of the greatest mathematicians and physicists of that era. The mathematicians included Felix Klein, David Hilbert and Hermann Minkowski. The physicists included von Laue, a student of Woldemar Voigt. This period was one of several “golden ages” of Göttingen. The first golden age was the time of Gauss and Riemann in the mid-1800’s. The second golden age, when Schwarzschild was present, began when Felix Klein arrived at Göttingen and attracted the top mathematicians of the time. The third golden age of Göttingen was the time of Born and Jordan and Heisenberg at the birth of quantum mechanics in the mid 1920’s.
In 1906, the Austrian physicist Paul Ehrenfest, freshly out of his PhD under the supervision of Boltzmann, arrived at Göttingen only weeks before Boltzmann took his own life. Felix Klein at Göttingen had been relying on Boltzmann to provide a comprehensive review of statistical mechanics for the Mathematical Encyclopedia, so he now entrusted this project to the young Ehrenfest. It was a monumental task, which was to take him and his physicist wife Tatyana nearly five years to complete. Part of the delay was the desire by the Ehrenfests to close some open problems that remained in Boltzmann’s work. One of these was a mechanical theorem of Boltzmann’s that identified properties of statistical mechanical systems that remained unaltered through a very slow change in system parameters. These properties would later be called adiabatic invariants by Einstein.
Ehrenfest recognized that Wien’s displacement law, which had been a guiding light for Planck and his theory of black body radiation, had originally been derived by Wien using classical principles related to slow changes in the volume of a cavity. Ehrenfest was struck by the fact that such slow changes would not induce changes in the quantum numbers of the quantized states, and hence that the quantum numbers must be adiabatic invariants of the black body system. This not only explained why Wien’s displacement law continued to hold under quantum as well as classical considerations, but it also explained why Planck’s quantization of the energy of his simple oscillators was the only possible choice. For a classical harmonic oscillator, the ratio of the energy of oscillation to the frequency of oscillation is an adiabatic invariant, which is immediately recognized as Planck’s quantum condition.
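The adiabatic invariance of the energy-to-frequency ratio is easy to see numerically. The following is a minimal sketch, not anything from Ehrenfest's paper: it integrates a unit-mass oscillator whose angular frequency is ramped slowly and linearly from 1 to 2 (arbitrary units chosen for illustration) and compares E/ω before and after. The function name, the ramp and the step size are all assumptions made for the example.

```python
def simulate_slow_oscillator(omega_start=1.0, omega_end=2.0,
                             ramp_time=2000.0, dt=0.01):
    """Integrate x'' = -omega(t)^2 x while omega changes slowly.

    Returns (E_initial, E_final, E_initial/omega_start, E_final/omega_final).
    """
    def omega(t):
        # Hypothetical linear ramp, held constant once the ramp is complete.
        fraction = min(t / ramp_time, 1.0)
        return omega_start + (omega_end - omega_start) * fraction

    def energy(x, p, w):
        return 0.5 * p * p + 0.5 * w * w * x * x

    x, p, t = 1.0, 0.0, 0.0
    e_initial = energy(x, p, omega(t))
    steps = int(round(ramp_time / dt))
    for _ in range(steps):
        # Velocity-Verlet (leapfrog) step: half kick, drift, half kick.
        p -= 0.5 * dt * omega(t) ** 2 * x
        x += dt * p
        t += dt
        p -= 0.5 * dt * omega(t) ** 2 * x
    w_final = omega(t)
    e_final = energy(x, p, w_final)
    return e_initial, e_final, e_initial / omega_start, e_final / w_final


if __name__ == "__main__":
    e0, e1, j0, j1 = simulate_slow_oscillator()
    print("energy ratio E1/E0:", e1 / e0)
    print("invariant ratio (E1/w1)/(E0/w0):", j1 / j0)
```

The energy roughly doubles along with the frequency, while E/ω stays nearly fixed. If the ramp is made fast (say `ramp_time=5.0`) the ratio is no longer conserved, which is the sense in which the invariance is "adiabatic".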
Ehrenfest published his observations in 1913, the same year that Bohr published his theory of the hydrogen atom, so Ehrenfest immediately applied the theory of adiabatic invariants to Bohr’s model and discovered that the quantum condition for the quantized energy levels was again an adiabatic invariant of the electron orbits, and not merely a consequence of integer multiples of angular momentum, which had seemed somewhat ad hoc.
After eight exciting years at Göttingen, Schwarzschild was offered the position at the Potsdam Observatory in 1909 upon the retirement from that post of the famous German astronomer Carl Vogel who had made the first confirmed measurements of the optical Doppler effect. Schwarzschild accepted and moved to Potsdam with a new family. His son Martin Schwarzschild would follow him into his profession, becoming a famous astronomer at Princeton University and a theorist on stellar structure. At the outbreak of WWI, Schwarzschild joined the German army out of a sense of patriotism. Because of his advanced education he was made an officer of artillery with the job to calculate artillery trajectories, and after a short time on the Western Front in Belgium was transferred to the Eastern Front in Russia. Though he was not in the trenches, he was in the midst of the chaos to the rear of the front. Despite this situation, he found time to pursue his science through the year 1915.
Schwarzschild was intrigued by Ehrenfest’s paper on adiabatic invariants and their similarity to several of the invariant theorems of Poincaré that he had studied for his doctorate. Up until this time, mechanics had been mostly pursued through the Lagrangian formalism which could easily handle generalized forces associated with dissipation. But celestial mechanics are conservative systems for which the Hamiltonian formalism is a more natural approach. In particular, the Hamilton-Jacobi canonical transformations made it particularly easy to find pairs of generalized coordinates that had simple periodic behavior. In his published paper, Schwarzschild called these “Action-Angle” coordinates because one was the action integral that was well-known in the principle of “Least Action”, and the other was like an angle variable that changed steadily in time (see Fig. 5). Action-angle coordinates have come to form the foundation of many of the properties of Hamiltonian chaos, Hamiltonian maps, and Hamiltonian tapestries.
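As a worked illustration (using the common textbook convention J = (1/2π)∮p dq, which may differ by a factor of 2π from the normalization in Schwarzschild's paper), the action-angle pair for the simple harmonic oscillator can be written out explicitly:

```latex
\begin{align}
H &= \frac{p^2}{2m} + \frac{1}{2} m \omega^2 q^2 = E \\
J &= \frac{1}{2\pi} \oint p \, dq = \frac{E}{\omega}
  \qquad \text{(area of the phase-space ellipse divided by } 2\pi\text{)} \\
H(J) &= \omega J \\
\dot{\theta} &= \frac{\partial H}{\partial J} = \omega
  \quad \Rightarrow \quad \theta(t) = \omega t + \theta_0 \\
\dot{J} &= -\frac{\partial H}{\partial \theta} = 0
\end{align}
```

The action J is constant and the angle θ advances uniformly in time. Setting J = nħ reproduces Planck's condition E = nħω = nhν, the same energy-to-frequency ratio that Ehrenfest identified as an adiabatic invariant.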
During lulls in bombardments, Schwarzschild translated the Hamilton-Jacobi methods of celestial mechanics to apply them to the new quantum mechanics of the Bohr orbits. The phrase “quantum mechanics” had not yet been coined (that would come ten years later in a paper by Max Born), but it was clear that the Bohr quantization conditions were a new type of mechanics. The periodicities that were inherent in the quantum systems were natural properties that could be mapped onto the periodicities of the angle variables, while Ehrenfest’s adiabatic invariants could be mapped onto the slowly varying action integrals. Schwarzschild showed that action-angle coordinates were the only allowed choice of coordinates, because they enabled the separation of the Hamilton-Jacobi equations and hence provided the correct quantization conditions for the Bohr electron orbits. Later, when Sommerfeld published his quantized elliptical orbits in 1916, the multiplicity of quantum conditions and orbits caused concern, but Ehrenfest came to the rescue, showing that each of Sommerfeld’s quantum conditions was precisely one of Schwarzschild’s action-integral invariants of the classical electron dynamics.
The works by Schwarzschild, and a closely-related paper that amplified his ideas published by his friend Paul Epstein several months later, were the first to show the power of the Hamiltonian formulation of dynamics for quantum systems, foreshadowing the future importance of Hamiltonians for quantum theory. An essential part of the Hamiltonian formalism is the concept of phase space. In his paper, Schwarzschild showed that the phase space of quantum systems was divided into small but finite elementary regions whose areas were equal to Planck’s constant ħ (see Fig. 6). The areas were products of a small change in momentum coordinate Δp and a corresponding small change in position coordinate Δx. Therefore, the product ΔxΔp = ħ. This observation, made in 1915 by Schwarzschild, was only one step away from Heisenberg’s uncertainty relation, twelve years before Heisenberg discovered it. In 1915, however, Born’s probabilistic interpretation of quantum mechanics had not yet been made, nor had the idea of measurement uncertainty, so Schwarzschild did not have the appropriate context in which to make the leap to the uncertainty principle. Nonetheless, by introducing the action-angle coordinates as well as the Hamiltonian formalism applied to quantum systems, with the natural structure of phase space, Schwarzschild laid the foundation for the future developments in quantum theory made by the next generation.
Quiet on the Eastern Front
Near the end of his second stay in Munich in 1900, prior to joining the Göttingen faculty, Schwarzschild presented a paper at a meeting of the German Astronomical Society held in Heidelberg in August. The topic was unlike anything he had tackled before. It considered the highly theoretical question of whether the universe was non-Euclidean, and more specifically whether it had curvature. He concluded from observation that if the universe were curved, the radius of curvature must exceed a lower limit of between 50 light years and 2000 light years, depending on whether the geometry was hyperbolic or elliptical. Schwarzschild was working out ideas of differential geometry and applying them to the universe at large at a time when Einstein was just graduating from the ETH where he skipped his math classes and had his friend Marcel Grossmann take notes for him.
The topic of Schwarzschild’s talk tells an important story about the warping of historical perspective by the “great man” syndrome. In this case the great man is Einstein who is today given all the credit for discovering the warping of space. His development of General Relativity is often portrayed as the work of a lone genius in the wilderness performing a blazing act of creation out of the void. In fact, non-Euclidean geometry had been around for some time by 1900, five years before Einstein’s Special Theory and ten years before his first publications on the General Theory. Gauss had developed the idea of intrinsic curvature of a manifold fifty years earlier, amplified by Riemann. By the turn of the century alternative geometries were all the rage, and Schwarzschild considered whether there were sufficient astronomical observations to set limits on the size of curvature of the universe. But revisionist history is just as prevalent in physics as in any field, and when someone like Einstein becomes so big in the mind’s eye, his shadow makes it difficult to see all the people standing behind him.
This is not meant to take away from the feat that Einstein accomplished. The General Theory of Relativity, published by Einstein in its full form in 1915, was spectacular. Einstein had taken vague notions about curved spaces and had made them specific, mathematically rigorous and intimately connected with physics through the mass-energy source term in his field equations. His mathematics had gone beyond even what his mathematician friend and former collaborator Grossmann could achieve. Yet Einstein’s field equations were nonlinear tensor differential equations in which the warping of space depended on the strength of energy fields, but the configuration of those energy fields depended on the warping of space. This type of nonlinear equation is difficult to solve in general terms, and Einstein was not immediately aware of how to find the solutions to his own equations.
Therefore, it was no small surprise to him when he received a letter from the Eastern Front from an astronomer he barely knew who had found a solution, and a simple one at that (see Fig. 7), to his field equations. Einstein probably wondered how he could have missed it, but he was generous and forwarded the letter to the Reports of the Prussian Academy of Sciences where it was published in 1916.
In the same paper, Schwarzschild used his exact solution to find the exact equation that described the precession of the perihelion of Mercury that Einstein had only calculated approximately. The dynamical equations for Mercury are shown in Fig. 8.
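For a sense of scale, the size of the Mercury effect can be estimated from the standard first-order formula for the perihelion advance per orbit, Δφ = 6πGM / (c² a (1 − e²)). This is the approximate result, not Schwarzschild's exact equation, and the orbital constants below are rounded reference values supplied for illustration.

```python
import math

GM_SUN = 1.32712440018e20   # m^3/s^2, gravitational parameter of the Sun
C = 299_792_458.0           # m/s, speed of light


def perihelion_advance_per_orbit(semi_major_axis, eccentricity, gm=GM_SUN):
    """First-order relativistic perihelion advance per orbit, in radians."""
    return 6.0 * math.pi * gm / (
        C ** 2 * semi_major_axis * (1.0 - eccentricity ** 2))


def arcsec_per_century(semi_major_axis, eccentricity, period_days):
    """Accumulated advance over a Julian century, in arcseconds."""
    orbits = 36525.0 / period_days
    advance = perihelion_advance_per_orbit(semi_major_axis, eccentricity)
    return math.degrees(advance * orbits) * 3600.0


if __name__ == "__main__":
    # Approximate orbital elements of Mercury (illustrative inputs).
    mercury = arcsec_per_century(5.7909e10, 0.2056, 87.969)
    print(f"Mercury: about {mercury:.1f} arcseconds per century")
```

With these inputs the estimate lands close to the well-known figure of roughly 43 arcseconds per century.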
Schwarzschild’s solution to Einstein’s Field Equation of General Relativity was not a general solution, even for a point mass. He had constants of integration that could have arbitrary values, such as the characteristic length scale that Schwarzschild called “alpha”. It was David Hilbert who later expanded upon Schwarzschild’s work, giving the general solution and naming the characteristic length scale (where the metric diverges) after Schwarzschild. This is where the phrase “Schwarzschild Radius” got its name, and it stuck. In fact it stuck so well that Schwarzschild’s radius has now eclipsed much of the rest of Schwarzschild’s considerable accomplishments.
Unfortunately, Schwarzschild’s accomplishments were cut short when he contracted an autoimmune disease that may have been hereditary. It is ironic that in the carnage of the Eastern Front, it was a genetic disease that caused his death at the age of 42. He was already suffering from the effects of the disease as he worked on his last publications. He was sent home from the front to his family in Potsdam where he passed away several months later having shepherded his final two papers through the publication process. His last paper, on the action-angle variables in quantum systems, was published on the day that he died.
Schwarzschild’s legacy was assured when he solved Einstein’s field equations and Einstein communicated it to the world. But his hidden legacy is no less important.
Schwarzschild’s application of the Hamiltonian formalism of canonical transformations and phase space for quantum systems set the stage for the later adoption of Hamiltonian methods in quantum mechanics. He came dangerously close to stating the uncertainty principle that catapulted Heisenberg to later fame, although he could not express it in probabilistic terms because he came too early.
Schwarzschild is considered to be the greatest German astronomer of the last hundred years. This is in part based on his work at the birth of stellar interferometry and in part on his development of stellar photometry and the calibration of the Cepheid variable stars that went on to revolutionize our view of our place in the universe. Solving Einstein’s field equations was just a sideline for him, a hobby to occupy his active and curious mind.
Fizeau, H. L. (1849). “Sur une expérience relative à la vitesse de propagation de la lumière.” Comptes rendus de l’Académie des sciences 29: 90–92, 132.
Foucault, J. L. (1862). “Détermination expérimentale de la vitesse de la lumière: parallaxe du Soleil.” Comptes rendus de l’Académie des sciences 55: 501–503, 792–596.
Fizeau, H. (1859). “Sur les hypothèses relatives à l’éther lumineux.” Ann. Chim. Phys. Ser. 4 57: 385–404.
Fizeau, H. (1868). “Prix Bordin: Rapport sur le concours de l’année 1867.” C. R. Acad. Sci. 66:
Michelson, A. A. (1890). “I. On the application of interference methods to astronomical measurements.” The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 30(182):
Michelson, A. A. (1891). “Measurement of Jupiter’s Satellites by Interference.” Nature 45(1155): 160-161.
Schwarzschild, K. (1896). “Über messung von doppelsternen durch interferenzen.” Astron. Nachr. 3335: 139.
P. Ehrenfest, “Een mechanische theorema van Boltzmann en zijne betrekking tot de quanta theorie (A mechanical theorem of Boltzmann and its relation to the theory of energy quanta),” Verslag van de Gewone Vergaderingen der Wis-en Natuurkundige Afdeeling, vol. 22, pp. 586-593,
Schwarzschild, K. (1916). “Quantum hypothesis.” Sitzungsberichte Der Koniglich Preussischen Akademie Der Wissenschaften: 548-568.
P. Ehrenfest, “Adiabatic invariables and quantum theory,” Annalen Der Physik, vol. 51, pp. 327-352, Oct 1916.
Epstein, P. S. (1916). “The quantum theory.” Annalen Der Physik 51(18): 168-188.
Einstein, A. (1915). “On the general theory of relativity.” Sitzungsberichte Der Koniglich Preussischen Akademie Der
Schwarzschild, K. (1916). “Über das Gravitationsfeld eines Massenpunktes nach der Einstein’schen Theorie.” Sitzungsberichte der Königlich-Preussischen Akademie der Wissenschaften: 189.
The physics of a path of light passing a gravitating body is one of the hardest concepts to understand in General Relativity, but it is also one of the easiest. It is hard because there can be no force of gravity on light even though the path of a photon bends as it passes a gravitating body. It is easy because the photon is following the simplest possible path, a geodesic for force-free motion.
This blog picks up where my last blog left off, having there defined the geodesic equation and presenting the Schwarzschild metric. With those two equations in hand, we could simply solve for the null geodesics (a null geodesic is the path of a light beam through a manifold). But there turns out to be a simpler approach that Einstein came up with himself (he never did like doing things the hard way). He just had to sacrifice the fundamental postulate that he used to explain everything about Special Relativity.
Throwing Special Relativity Under the Bus
The fundamental postulate of Special Relativity states that the speed of light is the same for all observers. Einstein posed this postulate, then used it to derive some of the most astonishing consequences of Special Relativity, like E = mc². This postulate is at the rock core of his theory of relativity and can be viewed as one of the simplest “truths” of our reality, or at least of our spacetime.
Yet as soon as Einstein began thinking how to extend SR to a more general situation, he realized almost immediately that he would have to throw this postulate out. While the speed of light measured locally is always equal to c, the apparent speed of light observed by a distant observer (far from the gravitating body) is modified by gravitational time dilation and length contraction. This means that the apparent speed of light, as observed at a distance, varies as a function of position. From this simple conclusion Einstein derived a first estimate of the deflection of light by the Sun, though he initially was off by a factor of 2. (The full story of Einstein’s derivation of the deflection of light by the Sun and the confirmation by Eddington is in Chapter 7 of Galileo Unbound (Oxford University Press, 2018).)
The “Optics” of Gravity
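The invariant element for a light path moving radially in the Schwarzschild geometry is

$$ds^2 = g_{tt}\,c^2 dt^2 + g_{rr}\,dr^2 = 0, \qquad g_{tt} = -\left(1 - \frac{r_s}{r}\right), \quad g_{rr} = \left(1 - \frac{r_s}{r}\right)^{-1}, \quad r_s = \frac{2GM}{c^2}$$

The apparent speed of light is

$$c(r) = \frac{dr}{dt} = c\,\sqrt{-\frac{g_{tt}}{g_{rr}}}$$

where c(r) is always less than c, when observing it from flat space. The “refractive index” of space is defined, as for any optical material, as the ratio of the constant speed divided by the observed speed

$$n = \frac{c}{c(r)} = \sqrt{-\frac{g_{rr}}{g_{tt}}}$$

Because the Schwarzschild metric has the property

$$g_{rr} = -\frac{1}{g_{tt}}$$

the effective refractive index of warped space-time is

$$n(r) = \frac{1}{1 - r_s/r}$$

with a divergence at the Schwarzschild radius.

The refractive index of warped space-time in the limit of weak gravity, $n(r) \approx 1 + r_s/r$, can be used in the ray equation (also known as the Eikonal equation described in an earlier blog)

$$\frac{d}{ds}\left(n\,\frac{d\vec{r}}{ds}\right) = \nabla n$$

where s is now the path length along the ray, and where the gradient of the refractive index of space is

$$\nabla n = -\frac{r_s}{r^3}\,\vec{r}$$

The ray equation is then a four-variable flow

$$\frac{dx}{ds} = \frac{k_x}{n}, \qquad \frac{dy}{ds} = \frac{k_y}{n}, \qquad \frac{dk_x}{ds} = -\frac{r_s\,x}{r^3}, \qquad \frac{dk_y}{ds} = -\frac{r_s\,y}{r^3}$$

with $\vec{k} = n\,d\vec{r}/ds$ and $r = \sqrt{x^2 + y^2}$.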
These equations represent a 4-dimensional flow for a light ray confined to a plane. The trajectory of any light path is found by using an ODE solver subject to the initial conditions for the direction of the light ray. This is simple for us to do today with Python or Matlab, but it was also something that could be done long before the advent of computers by early theorists of relativity like Max von Laue (1879 – 1960).
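A minimal Python sketch of such a ray trace is given here. It is not the code behind any figure in this post. It assumes the weak-gravity index n(r) = 1 + r_s/r with r_s the Schwarzschild radius, writes the flow as dx/ds = k_x/n, dy/ds = k_y/n, dk_x/ds = ∂n/∂x, dk_y/ds = ∂n/∂y with k = n dr/ds, and steps it with a hand-written fourth-order Runge-Kutta integrator so that only the standard library is needed. Units are chosen so that r_s = 1, and all function and parameter names are illustrative.

```python
import math


def refractive_index(x, y, r_s):
    """Weak-field effective index of space, n = 1 + r_s / r (assumed form)."""
    return 1.0 + r_s / math.hypot(x, y)


def flow(state, r_s):
    """Right-hand side of the four-variable ray flow."""
    x, y, kx, ky = state
    r = math.hypot(x, y)
    n = 1.0 + r_s / r
    # The gradient of n points toward the mass at the origin.
    return (kx / n, ky / n, -r_s * x / r ** 3, -r_s * y / r ** 3)


def rk4_step(state, ds, r_s):
    """One classical fourth-order Runge-Kutta step of length ds."""
    k1 = flow(state, r_s)
    k2 = flow(tuple(s + 0.5 * ds * d for s, d in zip(state, k1)), r_s)
    k3 = flow(tuple(s + 0.5 * ds * d for s, d in zip(state, k2)), r_s)
    k4 = flow(tuple(s + ds * d for s, d in zip(state, k3)), r_s)
    return tuple(
        s + ds * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def deflection_angle(impact_parameter, r_s=1.0, start_distance=1.0e4, ds=2.0):
    """Trace a ray launched along +x and return its total bend in radians."""
    x0, y0 = -start_distance, impact_parameter
    n0 = refractive_index(x0, y0, r_s)
    state = (x0, y0, n0, 0.0)  # k = n * (unit direction), initially along +x
    while state[0] < start_distance:
        state = rk4_step(state, ds, r_s)
    _, _, kx, ky = state
    return -math.atan2(ky, kx)  # positive when the ray bends toward the mass


if __name__ == "__main__":
    b = 100.0  # impact parameter in units of r_s
    print("numerical deflection:", deflection_angle(b))
    print("weak-field estimate 2 r_s / b:", 2.0 / b)
```

For an impact parameter much larger than r_s the traced bend should agree closely with the weak-field estimate 2 r_s/b = 4GM/(c² b), the full general-relativistic value rather than the earlier estimate that was off by a factor of 2.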
The Relativity of Max von Laue
In the Fall of 1905 in Berlin, a young German physicist by the name of Max Laue was sitting in the physics colloquium at the University listening to another Max, his doctoral supervisor Max Planck, deliver a seminar on Einstein’s new theory of relativity. Laue was struck by the simplicity of the theory, which at first seemed almost “simplistic” and hence hard to believe, but the beauty of the theory stuck with him, and he began to think through the consequences for experiments like the Fizeau experiment on partial ether drag.
Armand Hippolyte Louis Fizeau (1819 – 1896) in 1851 built one of the world’s first optical interferometers and used it to measure the speed of light inside moving fluids. At that time the speed of light was believed to be a property of the luminiferous ether, and there were several opposing theories on how light would travel inside moving matter. One theory would have the ether fully stationary, unaffected by moving matter, and hence the speed of light would be unaffected by motion. An opposite theory would have the ether fully entrained by matter and hence the speed of light in moving matter would be a simple sum of speeds. A middle theory considered that only part of the ether was dragged along with the moving matter. This was Fresnel’s partial ether drag hypothesis that he had arrived at to explain why his friend Francois Arago had not observed any contribution to stellar aberration from the motion of the Earth through the ether. When Fizeau performed his experiment, the results agreed closely with Fresnel’s drag coefficient, which seemed to settle the matter. Yet when Michelson and Morley performed their experiments of 1887, there was no evidence for partial drag.
Even after the exposition by Einstein on relativity in 1905, the disagreement of the Michelson-Morley results with Fizeau’s results was not fully reconciled until Laue showed in 1907 that the velocity addition theorem of relativity gave complete agreement with the Fizeau experiment. The velocity observed in the lab frame is found using the velocity addition theorem of special relativity. For the Fizeau experiment, water with a refractive index of n is moving with a speed v and hence the speed in the lab frame is

$$u = \frac{\frac{c}{n} + v}{1 + \frac{v}{nc}} \approx \frac{c}{n} + v\left(1 - \frac{1}{n^2}\right)$$

The difference in the speed of light between the stationary and the moving water is the difference

$$\Delta u = u - \frac{c}{n} \approx v\left(1 - \frac{1}{n^2}\right)$$
where the last term is precisely the Fresnel drag coefficient. This was one of the first definitive “proofs” of the validity of Einstein’s theory of relativity, and it made Laue one of relativity’s staunchest proponents. Spurred on by his success with the Fresnel drag coefficient explanation, Laue wrote the first monograph on relativity theory, publishing it in 1910.
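Laue’s argument is easy to check numerically. The sketch below assumes the standard relativistic velocity-addition formula and compares the exact lab-frame shift with the first-order Fresnel drag expression v(1 – 1/n²). The refractive index and flow speed are illustrative values chosen here, not Fizeau’s measured parameters, and the function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def lab_speed(n, v, c=C):
    """Relativistic sum of c/n (light speed in the water frame) and v (water speed in the lab)."""
    u_rest = c / n
    return (u_rest + v) / (1.0 + u_rest * v / c**2)


def fresnel_drag_shift(n, v):
    """First-order shift predicted by Fresnel's partial-drag coefficient."""
    return v * (1.0 - 1.0 / n**2)


n_water = 1.333  # assumed refractive index of water
v_water = 7.0    # assumed flow speed in m/s (illustrative only)

exact_shift = lab_speed(n_water, v_water) - C / n_water
print(exact_shift, fresnel_drag_shift(n_water, v_water))
```

The two numbers differ only at order v²/c, which is why the Fresnel coefficient appeared to be an independent law of ether drag for half a century.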
A Nobel Prize for Crystal X-ray Diffraction
In 1909 Laue became a Privatdozent under Arnold Sommerfeld (1868 – 1951) at the university in Munich. In the Spring of 1912 he was walking in the Englischer Garten on the northern edge of the city talking with Paul Ewald (1888 – 1985) who was finishing his doctorate with Sommerfeld studying the structure of crystals. Ewald was considering the interaction of optical wavelengths with the periodic lattice when it struck Laue that x-rays would have the kind of short wavelengths that would allow the crystal to act as a diffraction grating to produce multiple diffraction orders. Within a few weeks of that discussion, two of Sommerfeld’s students (Friedrich and Knipping) used an x-ray source and photographic film to look for the predicted diffraction spots from a copper sulfate crystal. When the film was developed, it showed a constellation of dark spots for each of the diffraction orders of the x-rays scattered from the multiple periodicities of the crystal lattice. Two years later, in 1914, Laue was awarded the Nobel prize in physics for the discovery. That same year his father was elevated to the hereditary nobility in the Prussian empire and Max Laue became Max von Laue.
Von Laue was not one to take risks, and he remained conservative in many of his interests. He was immensely respected and played important roles in the administration of German science, but his scientific contributions after receiving the Nobel Prize were only modest. Yet as the Nazis came to power in the early 1930’s, he was one of the few physicists to stand up and resist the Nazi take-over of German physics. He was especially disturbed by the plight of the Jewish physicists. In 1933 he was invited to give the keynote address at the conference of the German Physical Society in Wurzburg where he spoke out against the Nazi rejection of relativity as they branded it “Jewish science”. In his speech he likened Einstein, the target of much of the propaganda, to Galileo. He said, “No matter how great the repression, the representative of science can stand erect in the triumphant certainty that is expressed in the simple phrase: And yet it moves.” Von Laue believed that truth would hold out in the face of the proscription against relativity theory by the Nazi regime. The quote “And yet it moves” is supposed to have been muttered by Galileo just after his abjuration before the Inquisition, referring to the Earth moving around the Sun. Although the quote is famous, it is believed to be a myth.
In an odd side-note of history, von Laue sent his gold Nobel prize medal to Denmark for its safe keeping with Niels Bohr so that it would not be paraded about by the Nazi regime. Yet when the Nazis invaded Denmark, to avoid having the medal fall into the hands of the Nazis, it was dissolved in aqua regia by a member of Bohr’s team, George de Hevesy. The gold completely dissolved into an orange liquid that was stored in a beaker high on a shelf through the war. When Denmark was finally freed, the dissolved gold was precipitated out and a new medal was struck by the Nobel committee and re-presented to von Laue in a ceremony in 1951.
The Orbits of Light Rays
Von Laue’s interests always stayed close to the properties of light and electromagnetic radiation ever since he was introduced to the field when he studied with Woldemar Voigt at Göttingen in 1899. This interest included the theory of relativity, and only a few years after Einstein published his theory of General Relativity and Gravitation, von Laue added to his earlier textbook on relativity by writing a second volume on the general theory. The new volume was published in 1920 and included the theory of the deflection of light by gravity.
One of the very few illustrations in his second volume is of light coming into interaction with a super massive gravitational field characterized by a Schwarzschild radius. (No one at the time called it a “black hole”, nor even mentioned Schwarzschild. That terminology came much later.) He shows in the drawing, how light, if incident at just the right impact parameter, would actually loop around the object. This is the first time such a diagram appeared in print, showing the trajectory of light so strongly affected by gravity.
# -*- coding: utf-8 -*-
"""
Created on Tue May 28 11:50:24 2019
"""
import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt


def create_circle():
    # Black disk of radius 10 marking the Schwarzschild radius (A = 10)
    circle = plt.Circle((0,0), radius= 10, color = 'black')
    return circle


def refindex(x,y):
    # Effective refractive index of space around the black hole, and its gradient
    A = 10        # Schwarzschild radius
    eps = 1e-6    # small offset that guards against division by zero
    rp0 = np.sqrt(x**2 + y**2)
    n = 1/(1 - A/(rp0+eps))
    fac = np.abs((1-9*(A/rp0)**2/8))   # approx correction to Eikonal
    nx = -fac*n**2*A*x/(rp0+eps)**3
    ny = -fac*n**2*A*y/(rp0+eps)**3
    return [n,nx,ny]


def flow_deriv(x_y_z,tspan):
    # 4-dimensional flow: ray position (x, y) and ray vector (z, w)
    x, y, z, w = x_y_z
    [n,nx,ny] = refindex(x,y)
    yp = np.zeros(shape=(4,))
    yp[0] = z/n
    yp[1] = w/n
    yp[2] = nx
    yp[3] = ny
    return yp


# Launch parallel rays from the left over a range of impact parameters
for loop in range(-5,30):
    xstart = -100
    ystart = -2.245 + 4*loop
    [n,nx,ny] = refindex(xstart,ystart)
    y0 = [xstart, ystart, n, 0]
    tspan = np.linspace(1,400,2000)
    y = integrate.odeint(flow_deriv, y0, tspan)
    xx = y[1:2000,0]
    yy = y[1:2000,1]
    lines = plt.plot(xx,yy)

# Draw the black hole and fix the plot window
c = create_circle()
axes = plt.gca()
axes.add_patch(c)
axes.set_aspect('equal')
axes.set_xlim([-100,100])
axes.set_ylim([-100,100])

# Now set up a circular photon orbit
xstart = 0
ystart = 15
[n,nx,ny] = refindex(xstart,ystart)
y0 = [xstart, ystart, n, 0]
tspan = np.linspace(1,94,1000)
y = integrate.odeint(flow_deriv, y0, tspan)
xx = y[1:1000,0]
yy = y[1:1000,1]
lines = plt.plot(xx,yy)
plt.setp(lines, linewidth=2, color = 'black')
plt.show()
One of the most striking effects of gravity on photon trajectories is the possibility for a photon to orbit a black hole in a circular orbit. This is shown in Fig. 3 as the black circular ring for a photon at a radius equal to 1.5 times the Schwarzschild radius. This radius defines what is known as the photon sphere. However, the orbit is not stable. Slight deviations will send the photon spiraling outward or inward.
The Eikonal approximation does not strictly hold under strong gravity, but the Eikonal equations with the effective refractive index of space still yield semi-quantitative behavior. In the Python code, a correction factor is used to match the theory to the circular photon orbits, while still agreeing with trajectories far from the black hole. The results of the calculation are shown in Fig. 3. For large impact parameters, the rays are deflected through a finite angle. At a critical impact parameter, near 3 times the Schwarzschild radius, the ray loops around the black hole. For smaller impact parameters, the rays are captured by the black hole.
Photons pile up around the black hole at the photon sphere. The first image ever of the photon sphere of a black hole was made earlier this year (announced April 10, 2019). The image shows the shadow of the supermassive black hole in the center of Messier 87 (M87), an elliptical galaxy 55 million light-years from Earth. This black hole is 6.5 billion times the mass of the Sun. Imaging the photon sphere required eight ground-based radio telescopes placed around the globe, operating together to form a single telescope with an optical aperture the size of our planet. The resolution of such a large telescope would allow one to image a half-dollar coin on the surface of the Moon, although this telescope operates in the radio frequency range rather than the optical.
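The coin comparison can be checked with a rough diffraction estimate, θ ≈ λ/D. The sketch below assumes an observing wavelength of 1.3 mm, an aperture equal to the Earth’s mean diameter, and approximate values for the diameter of a US half-dollar and the Earth-Moon distance. None of these numbers appear in the text above, so treat them as illustrative inputs rather than the collaboration’s own figures.

```python
import math

RAD_TO_MICROARCSEC = math.degrees(1.0) * 3600.0 * 1e6


def diffraction_limit(wavelength_m, aperture_m):
    """Order-of-magnitude angular resolution (radians) of an aperture: lambda / D."""
    return wavelength_m / aperture_m


def angular_size(diameter_m, distance_m):
    """Small-angle apparent size (radians) of an object seen from a distance."""
    return diameter_m / distance_m


# Assumed inputs (not taken from the text)
theta_telescope = diffraction_limit(1.3e-3, 1.2742e7)  # 1.3 mm waves, Earth-sized dish
theta_coin = angular_size(0.0306, 3.844e8)             # half-dollar at the Moon's distance

print(theta_telescope * RAD_TO_MICROARCSEC, theta_coin * RAD_TO_MICROARCSEC)
```

With these inputs both angles come out at a few tens of microarcseconds, so the coin analogy holds to within a factor of order one.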
When Newton developed his theory of universal gravitation, the first problem he tackled was Kepler’s elliptical orbits of the planets around the sun, and he succeeded beyond compare. The second problem he tackled was of more practical importance than the tracks of distant planets, namely the path of the Earth’s own moon, and he was never satisfied.
Newton’s Principia and the Problem of
Measuring the precise location of the moon at very exact times against the backdrop of the celestial sphere was a method for ships at sea to find their longitude. Yet the moon’s orbit around the Earth is irregular, and Newton recognized that because gravity was universal, every body exerted a force on every other, and the moon was being tugged upon by the sun as well as by the Earth.
In Propositions 65 and 66 of Book 1 of the Principia, Newton applied his new theory to attempt to pin down the moon’s trajectory, but was thwarted by the complexity of the three bodies of the Earth-Moon-Sun system. For instance, the force of the sun on the moon is greater than the force of the Earth on the moon, which raised the question of why the moon continued to circle the Earth rather than being pulled away to the sun. Newton correctly recognized that it was the Earth-moon system that was in orbit around the sun, and hence the sun caused only a perturbation on the Moon’s orbit around the Earth. However, because the Moon’s orbit is approximately elliptical, the Sun’s pull on the Moon is not constant as it swings around in its orbit, and Newton only succeeded in making estimates of
Unsatisfied with his results in the Principia, Newton tried again, beginning in the summer of 1694, but the problem was too great even for him. In 1702 he published his research, as far as he was able to take it, on the orbital trajectory of the Moon. He could pin down the motion to within 10 arc minutes, but this was not accurate enough for reliable navigation, representing an uncertainty of over 10 kilometers at sea, error enough to run aground at night on unseen shoals. Newton’s attempt with the Moon was his last significant scientific endeavor, and afterwards this great scientist withdrew into administrative activities and occult interests that consumed his remaining time.
Race for the Moon
The importance of the Moon for navigation was too pressing to ignore, and in the 1740’s a heated competition to be the first to pin down the Moon’s motion developed among three of the leading mathematicians of the day (Leonhard Euler, Jean Le Rond D’Alembert and Alexis Clairaut), who began attacking the lunar problem and each other. Euler in 1736 had published the first textbook on dynamics that used the calculus, and Clairaut had recently returned from Lapland with Maupertuis. D’Alembert, for his part, had placed dynamics on a firm physical foundation with his 1743 textbook. Euler was first to publish with a lunar table in 1746, but there remained problems in his theory that frustrated his attempt at attaining the required level of accuracy.
At nearly the same time Clairaut and D’Alembert revisited Newton’s foiled lunar theory and found additional terms in the perturbation expansion that Newton had neglected. They rushed to beat each other into print, but Clairaut was distracted by a prize competition for the most accurate lunar theory, announced by the Russian Academy of Sciences and refereed by Euler, while D’Alembert ignored the competition, certain that Euler would rule in favor of Clairaut. Clairaut won the prize, but D’Alembert beat him into print.
The rivalry over the moon did not end there. Clairaut continued to improve lunar tables by combining theory and observation, while D’Alembert remained more purely theoretical. A growing animosity between Clairaut and D’Alembert spilled out into the public eye and became a daily topic of conversation in the Paris salons. The difference in their approaches matched the difference in their personalities, with the more flamboyant and pragmatic Clairaut disdaining the purist approach and philosophy of D’Alembert. Clairaut succeeded in publishing improved lunar theory and tables in 1752, followed by Euler in 1753, while D’Alembert’s interests were drawn away towards his activities for Diderot’s Encyclopedia.
The battle over the Moon in the late 1740’s was carried out on the battlefield of perturbation theory. To lowest order, the orbit of the Moon around the Earth is a Keplerian ellipse, and the effect of the Sun, though creating problems for the use of the Moon for navigation, produces only a small modification (a perturbation) of its overall motion. Within a decade or two, the accuracy of perturbation theory calculations, combined with empirical observations, had improved to the point that lunar tables were accurate enough to allow ships to locate their longitude to within a kilometer at sea. The most accurate tables were made by Tobias Mayer, who was awarded posthumously a prize of 3000 pounds by the British Parliament in 1763 for the determination of longitude at sea. Euler received 300 pounds for helping Mayer with his calculations. This was the same prize that was coveted by the famous clockmaker John Harrison and depicted so brilliantly in Dava Sobel’s Longitude (1995).
Several years later in 1772 Lagrange discovered an interesting special solution to the planar three-body problem with three massive points each executing an elliptic orbit around the center of mass of the system, but configured such that their positions always coincided with the vertices of an equilateral triangle. He found a more important special solution in the restricted three-body problem that emerged when a massless third body was found to have two stable equilibrium points in the combined gravitational potentials of two massive bodies. These two stable equilibrium points are known as the L4 and L5 Lagrange points. Small objects can orbit these points, and in the Sun-Jupiter system these points are occupied by the Trojan asteroids. Similarly stable Lagrange points exist in the Earth-Moon system where space stations or satellites could be parked.
For the special case of circular orbits of constant angular frequency ω, the motion of the third mass is described by the Lagrangian
where the potential is time dependent because of the motion of the two larger masses. Lagrange approached the problem by adopting a rotating reference frame in which the two larger masses m1 and m2 move along the stationary line defined by their centers. The Lagrangian in the rotating frame is
where the effective potential is now time independent. The first term in the effective potential is the Coriolis effect and the second is the centrifugal term.
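The two Lagrangians referred to in this passage can be written out in one standard notation. This is a hedged reconstruction rather than the author’s own typesetting: the symbol m for the third (test) mass, the position vectors r₁(t) and r₂(t), the distances r₁ and r₂, and the choice to collect the velocity-dependent Coriolis term inside V_eff are all assumptions made here to match the description in the text.

```latex
% Inertial frame: test mass m at r = (x, y); the large masses m_1, m_2 move on circles
L = \tfrac{1}{2} m \left( \dot{x}^2 + \dot{y}^2 \right) - V(x, y, t),
\qquad
V(x, y, t) = -\frac{G m m_1}{\left| \mathbf{r} - \mathbf{r}_1(t) \right|}
             -\frac{G m m_2}{\left| \mathbf{r} - \mathbf{r}_2(t) \right|}

% Rotating frame: (x, y) are now co-rotating coordinates and m_1, m_2 sit at fixed points
L = \tfrac{1}{2} m \left( \dot{x}^2 + \dot{y}^2 \right) - V_{\mathrm{eff}},
\qquad
V_{\mathrm{eff}} = - m \omega \left( x \dot{y} - y \dot{x} \right)
                   - \tfrac{1}{2} m \omega^2 \left( x^2 + y^2 \right)
                   - \frac{G m m_1}{r_1} - \frac{G m m_2}{r_2}
```

The extra terms follow from writing the inertial velocity as the rotating-frame velocity plus ω × r and squaring: the cross term gives the Coriolis contribution and the ω² term gives the centrifugal contribution.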
The effective potential is shown in the figure for m3 = 10m2. There are five locations where the gradient of the effective potential equals zero. The point L1 is the equilibrium position between the two larger masses. The points L2 and L3 are at positions where the centrifugal force balances the gravitational attraction to the two larger masses. These are also the points that separate local orbits around a single mass from global orbits that orbit the two-body system. The last two Lagrange points at L4 and L5 are at one of the vertices of an equilateral triangle, with the other two vertices at the positions of the larger masses. The first three Lagrange points are saddle points. The last two are at maxima of the effective potential.
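The equilateral-triangle location of L4 can be verified numerically. The sketch below is a minimal check under stated assumptions: units with G, the total mass, the separation of the two large masses and ω all equal to 1; a 10:1 ratio between the two large masses; and only the position-dependent part of the effective potential (the velocity-dependent Coriolis term does not affect where the equilibria sit). The function names are hypothetical.

```python
import numpy as np


def effective_potential(x, y, mu):
    """Rotating-frame potential per unit test mass (gravity plus centrifugal term)."""
    r1 = np.hypot(x + mu, y)          # distance to the larger mass at (-mu, 0)
    r2 = np.hypot(x - (1.0 - mu), y)  # distance to the smaller mass at (1 - mu, 0)
    return -(1.0 - mu) / r1 - mu / r2 - 0.5 * (x**2 + y**2)


def potential_gradient(x, y, mu, h=1e-6):
    """Central-difference gradient of the effective potential."""
    dudx = (effective_potential(x + h, y, mu) - effective_potential(x - h, y, mu)) / (2 * h)
    dudy = (effective_potential(x, y + h, mu) - effective_potential(x, y - h, mu)) / (2 * h)
    return np.array([dudx, dudy])


mu = 1.0 / 11.0                                 # assumed 10:1 ratio of the two large masses
l4 = np.array([0.5 - mu, np.sqrt(3.0) / 2.0])   # third vertex of the equilateral triangle
grad_l4 = potential_gradient(l4[0], l4[1], mu)
print(grad_l4)
```

The gradient vanishes at that vertex to within the finite-difference error, and sampling nearby points shows the potential falling away in every direction, consistent with L4 sitting on a maximum.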
L1 lies between Earth and the sun at about 1 million miles from Earth. L1 gets an uninterrupted view of the sun, and is currently occupied by the Solar and Heliospheric Observatory (SOHO) and the Deep Space Climate Observatory. L2 also lies a million miles from Earth, but in the opposite direction of the sun. At this point, with the Earth, moon and sun behind it, a spacecraft can get a clear view of deep space. NASA’s Wilkinson Microwave Anisotropy Probe (WMAP) is currently at this spot measuring the cosmic background radiation left over from the Big Bang. The James Webb Space Telescope will move into this region in 2021.
Gutzwiller, M. C. (1998). “Moon-Earth-Sun: The oldest three-body problem.” Reviews of Modern Physics 70(2):
J. L. Lagrange, Essai sur le problème des trois corps, 1772, Oeuvres tome 6
The 1960’s are known as a time of cultural revolution, but perhaps less known was the revolution that occurred in the science of dynamics. Three towering figures of that revolution were Stephen Smale (1930 – ) at Berkeley, Andrey Kolmogorov (1903 – 1987) in Moscow and his student Vladimir Arnold (1937 – 2010). Arnold was only 20 years old in 1957 when he solved Hilbert’s thirteenth problem (that any continuous function of several variables can be constructed with a finite number of two-variable functions). Only a few years later his work on the problem of small denominators in dynamical systems provided the finishing touches on the long elusive explanation of the stability of the solar system (the problem for which Poincaré won the King Oscar Prize in mathematics in 1889 when he discovered chaotic dynamics). This theory is known as KAM-theory, using the first initials of the names of Kolmogorov, Arnold and Moser. Building on his breakthrough in celestial mechanics, Arnold’s work through the 1960’s remade the theory of Hamiltonian systems, creating a shift in perspective that has permanently altered how physicists look at dynamical systems.
Hamiltonian Physics on a Torus
Traditionally, Hamiltonian physics is associated with systems of inertial objects that conserve the sum of kinetic and potential energy, in other words, conservative non-dissipative systems. But a modern view (after Arnold) of Hamiltonian systems sees them as hyperdimensional mathematical mappings that conserve volume. The space that these mappings inhabit is phase space, and the conservation of phase-space volume is known as Liouville’s Theorem. The geometry of phase space is called symplectic geometry, and the universal position that symplectic geometry now holds in the physics of Hamiltonian mechanics is largely due to Arnold’s textbook Mathematical Methods of Classical Mechanics (1974, English translation 1978). Arnold’s famous quote from that text is “Hamiltonian mechanics is geometry in phase space”.
One of the striking aspects of this textbook is the reduction of phase-space geometry to the geometry of a hyperdimensional torus for a large number of Hamiltonian systems. If there are as many conserved quantities as there are degrees of freedom in a Hamiltonian system, then the system is called “integrable” (because you can integrate the equations of motion to find a constant of the motion). Then it is possible to map the physics onto a hyperdimensional torus through the transformation of dynamical coordinates into what are known as “action-angle” coordinates. Each independent angle has an associated action that is conserved during the motion of the system. The periodicity of the dynamical angle coordinate makes it possible to identify it with the angular coordinate of a multi-dimensional torus. Therefore, every integrable Hamiltonian system can be mapped to motion on a multi-dimensional torus (one dimension for each degree of freedom of the system).
Actually, integrable Hamiltonian systems are among the most boring dynamical systems you can imagine. They literally just go in circles (around the torus). But as soon as you add a small perturbation that cannot be integrated they produce some of the most complex and beautiful patterns of all dynamical systems. It was Arnold’s focus on motions on a torus, and perturbations that shift the dynamics off the torus, that led him to propose a simple mapping that captured the essence of Hamiltonian chaos.
The Arnold Cat Map
Motion on a two-dimensional torus is defined by two angles, and trajectories on a two-dimensional torus are simple helices. If the periodicities of the motion in the two angles have a rational ratio, the helix repeats itself. However, if the ratio of periods (also known as the winding number) is irrational, then the helix never repeats and passes arbitrarily closely to any point on the surface of the torus. This last case leads to an “ergodic” system, which is a term introduced by Boltzmann to describe a physical system whose trajectory fills phase space. The behavior of a helix for rational or irrational winding number is not terribly interesting. It’s just an orbit going in circles like an integrable Hamiltonian system. The helix can never even cross itself.
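The difference between a closed helix and one that fills the torus shows up in a simple section through the motion. The sketch below is a minimal illustration, with hypothetical function names: each time the first angle completes a full turn, it records the second angle (as a fraction of a full turn), which advances by the winding number on every pass. A winding number of 2/5 and the golden mean are used as assumed examples of the rational and irrational cases.

```python
from fractions import Fraction
import math


def section_points(winding, n_turns):
    """Second angle (in units of a full turn) after each complete turn of the first angle."""
    return [(k * winding) % 1 for k in range(n_turns)]


def largest_gap(points):
    """Largest empty arc left on the circle by the distinct section points."""
    pts = sorted(float(p) for p in set(points))
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(1.0 - pts[-1] + pts[0])  # wrap-around arc
    return max(gaps)


rational = section_points(Fraction(2, 5), 1000)          # closed orbit
golden = section_points((1 + math.sqrt(5)) / 2, 1000)    # quasi-periodic orbit

print(len(set(rational)), largest_gap(rational), largest_gap(golden))
```

The rational orbit keeps landing on the same five points no matter how long it runs, while the empty arcs left by the golden-mean orbit keep shrinking as more turns are added.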
However, if you could add a new dimension to the torus (or add a new degree of freedom to the dynamical system), then the helix could pass over or under itself by moving into the new dimension. By weaving around itself, a trajectory can become chaotic, and the set of many trajectories can become as mixed up as a bowl of spaghetti. This can be a little hard to visualize, especially in higher dimensions, but Arnold thought of a very simple mathematical mapping that captures the essential motion on a torus, preserving volume as required for a Hamiltonian system, but with the ability for regions to become all mixed up, just like trajectories in a nonintegrable Hamiltonian system.
A unit square is isomorphic to a two-dimensional torus. This means that there is a one-to-one mapping of each point on the unit square to each point on the surface of a torus. Imagine taking a sheet of paper and forming a tube out of it. One of the dimensions of the sheet of paper is now an angle coordinate that is cyclic, going around the circumference of the tube. Now if the sheet of paper is flexible (like it is made of thin rubber) you can bend the tube around and connect the top of the tube with the bottom, like a bicycle inner tube. The other dimension of the sheet of paper is now also an angle coordinate that is cyclic. In this way a flat sheet is converted (with some bending) into a torus.
Arnold’s key idea was to create a transformation that takes the torus into itself, preserving volume, yet including the ability for regions to pass around each other. Arnold accomplished this with the simple map
where the modulus 1 takes the unit square into itself. This transformation can also be expressed as a matrix
followed by taking modulus 1. The transformation matrix is called a Floquet matrix, and the determinant of the matrix is equal to unity, which ensures that volume is conserved.
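For reference, one commonly quoted form of the cat map and its matrix is written out below. The specific matrix entries are an assumption made here; equivalent variants that swap the roles of the two coordinates share the same unit determinant and the same eigenvalues.

```latex
% Cat map on the unit square (one common convention, assumed here)
x_{n+1} = x_n + y_n \pmod{1}, \qquad
y_{n+1} = x_n + 2 y_n \pmod{1}

% Matrix form, followed by taking modulus 1
\begin{pmatrix} x_{n+1} \\ y_{n+1} \end{pmatrix}
=
\begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}
\begin{pmatrix} x_n \\ y_n \end{pmatrix}
\pmod{1},
\qquad
\det \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} = (1)(2) - (1)(1) = 1
```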
Arnold decided to illustrate this mapping by using a crude image of the face of a cat (See Fig. 1). Successive applications of the transformation stretch and shear the cat, which is then folded back into the unit square. The stretching and folding preserve the volume, but the image becomes all mixed up, just like mixing in a chaotic Hamiltonian system, or like an immiscible dye in water that is stirred.
When the transformation matrix is applied to continuous values, it produces a continuous range of transformed values that become thinner and thinner until the unit square is uniformly mixed. However, if the unit square is discrete, made up of pixels, then something very different happens (see Fig. 3). The image of the cat in this case is composed of a 50×50 array of pixels. For early iterations, the image becomes stretched and mixed, but at iteration 50 there are 4 low-resolution upside-down versions of the cat, and at iteration 75 the cat fully reforms, but is upside-down. Continuing on, the cat eventually reappears fully reformed and upright at iteration 150. Therefore, the discrete case displays a recurrence and the mapping is periodic. Calculating the period of the cat map on lattices can lead to interesting patterns, especially if the lattice is composed of prime numbers.
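The recurrence on a pixel lattice can be explored with a few lines of Python. The sketch below assumes the commonly used cat-map matrix with rows (1, 1) and (1, 2), which is a choice made here; the function names are hypothetical. It finds the recurrence period by raising the matrix to successive powers modulo the lattice size, and applies the map to an arbitrary square array as a cross-check.

```python
import numpy as np

# Assumed form of the cat-map matrix (a common textbook choice); its determinant is 1.
CAT = np.array([[1, 1], [1, 2]], dtype=np.int64)


def cat_map_period(n_pixels, max_iter=10_000):
    """Smallest k >= 1 for which CAT**k equals the identity modulo n_pixels."""
    identity = np.eye(2, dtype=np.int64)
    power = identity.copy()
    for k in range(1, max_iter + 1):
        power = (power @ CAT) % n_pixels
        if np.array_equal(power, identity):
            return k
    raise ValueError("no recurrence found within max_iter")


def cat_map_step(image):
    """Apply one iteration of the map to a square pixel array."""
    n = image.shape[0]
    rows, cols = np.indices((n, n))
    new_rows = (rows + cols) % n
    new_cols = (rows + 2 * cols) % n
    out = np.empty_like(image)
    out[new_rows, new_cols] = image  # the map is a bijection, so every pixel is filled once
    return out


print(cat_map_period(50))
```

Because the matrix has unit determinant, the pixel map is a permutation of the lattice, so some power of it must return every pixel to its starting position; for a 50×50 lattice the computed period can be compared with the recurrence described above.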
The Cat Map and the Golden Mean
The golden mean, or the golden ratio, φ = 1.618033988749895, is never far away when working with Hamiltonian systems. Because the golden mean is the “most irrational” of all irrational numbers, it plays an essential role in KAM theory on the stability of the solar system. In the case of Arnold’s cat map, it pops up its head in several ways. For instance, the transformation matrix has eigenvalues

$$\lambda_\pm = \frac{3 \pm \sqrt{5}}{2}, \qquad \lambda_+ = 1 + \varphi = \varphi^2, \qquad \lambda_- = 2 - \varphi = \varphi^{-2}$$

with the remarkable property that

$$\lambda_+ \lambda_- = 1$$

which guarantees conservation of area.
Selected V. I. Arnold Publications
Arnold, V. I. “FUNCTIONS OF 3 VARIABLES.” Doklady Akademii Nauk SSSR 114(4):
Arnold, V. I. “GENERATION OF QUASI-PERIODIC MOTION FROM A FAMILY OF PERIODIC MOTIONS.” Doklady Akademii Nauk SSSR 138(1): 13-&.
Arnold, V. I. “STABILITY OF EQUILIBRIUM POSITION OF A HAMILTONIAN SYSTEM OF ORDINARY DIFFERENTIAL EQUATIONS IN GENERAL ELLIPTIC CASE.” Doklady Akademii Nauk SSSR 137(2): 255-&. (1961)
Arnold, V. I. “BEHAVIOUR OF AN ADIABATIC INVARIANT WHEN HAMILTONS FUNCTION IS UNDERGOING A SLOW PERIODIC VARIATION.” Doklady Akademii Nauk SSSR 142(4):
Arnold, V. I. “CLASSICAL THEORY OF PERTURBATIONS AND PROBLEM OF STABILITY OF PLANETARY SYSTEMS.” Doklady Akademii Nauk SSSR 145(3):
Arnold, V. I. and Y. G. Sinai. “SMALL PERTURBATIONS OF AUTHOMORPHISMS OF A TORE.” Doklady Akademii Nauk SSSR 144(4): 695-&. (1962)
Arnold, V. I. “Small denominators and problems of the stability of motion in classical and celestial mechanics (in Russian).” Usp. Mat. Nauk. 18:
Arnold, V. I. and A. L. Krylov. “UNIFORM DISTRIBUTION OF POINTS ON A SPHERE AND SOME ERGODIC PROPERTIES OF SOLUTIONS TO LINEAR ORDINARY DIFFERENTIAL EQUATIONS IN COMPLEX REGION.” Doklady Akademii Nauk SSSR 148(1):
Arnold, V. I. “INSTABILITY OF DYNAMICAL SYSTEMS WITH MANY DEGREES OF FREEDOM.” Doklady Akademii Nauk SSSR 156(1): 9-&. (1964)
Arnold, V. “SUR UNE PROPRIETE TOPOLOGIQUE DES APPLICATIONS GLOBALEMENT CANONIQUES DE LA MECANIQUE CLASSIQUE.” Comptes Rendus Hebdomadaires Des Seances De L Academie Des Sciences 261(19): 3719-&. (1965)
Arnold, V. I. “APPLICABILITY CONDITIONS AND ERROR ESTIMATION BY AVERAGING FOR SYSTEMS WHICH GO THROUGH RESONANCES IN COURSE OF EVOLUTION.” Doklady Akademii Nauk SSSR 161(1): 9-&. (1965)
 Dumas, H. S. The KAM Story: A friendly introduction to the content, history and significance of Classical Kolmogorov-Arnold-Moser Theory, World Scientific. (2014)
 See Chapter 6, “The Tangled Tale of Phase Space” in Galileo Unbound (D. D. Nolte, Oxford University Press, 2018)
 V. I. Arnold, Mathematical Methods of Classical Mechanics (Nauk 1974, English translation Springer 1978)
Nature loves the path of steepest descent. Place a ball on a smooth curved surface and release it, and it will instantaneously accelerate in the direction of steepest descent. Shoot a laser beam from an oblique angle onto a piece of glass to hit a target inside, and the path taken by the beam is the only path that decreases the distance to the target in the shortest time. Diffract a stream of electrons from the surface of a crystal, and quantum detection events are greatest at the positions where the troughs and peaks of the de Broglie waves converge the most. The first example is Newton’s second law. The second example is Fermat’s principle. The third example is Feynman’s path-integral formulation of quantum mechanics. They all share in common a minimization principle, the principle of least action: the path of a dynamical system is the one that minimizes a property known as “action”.
The principle of least action, first proposed by the French physicist Maupertuis through mechanical analogy, became a principle of Lagrangian mechanics in the hands of Lagrange, but was still restricted to mechanical systems of particles. The principle was generalized forty years later by Hamilton, who began by considering the propagation of light waves, and ended by transforming mechanics into a study of pure geometry divorced from forces and inertia. Optics played a key role in the development of mechanics, and mechanics returned the favor by giving optics the Eikonal Equation. The Eikonal Equation is the “F = ma” of ray optics. Its solutions describe the paths of light rays through complicated media.
Anyone who has taken a course in optics knows that Étienne-Louis Malus (1775-1812) discovered the polarization of light, but little else is taught about this French mathematician who was one of the savants Napoleon had taken along with himself when he invaded Egypt in 1798. After experiencing numerous horrors of war and plague, Malus returned to France damaged but wiser. He discovered the polarization of light in the Fall of 1808 as he was playing with crystals of Iceland spar at sunset and happened to view the last rays of the sun reflected from the windows of the Luxembourg palace. Iceland spar produces double images in natural light because it is birefringent. Malus discovered that he could extinguish one of the double images of the Luxembourg windows by rotating the crystal a certain way, demonstrating that light is polarized by reflection. The degree to which light is extinguished as a function of the angle of the polarizing crystal is known as Malus’ Law.
Malus had picked up an interest in the general properties of light and imaging during lulls in his ordeal in Egypt. He was an emissionist following his compatriot Laplace, rather than an undulationist following Thomas Young. It is ironic that the French scientists were staunchly supporting Newton on the nature of light, while the British scientist Thomas Young was trying to upend Newtonian optics. Almost all physicists at that time were emissionists, only a few years after Young’s double-slit experiment of 1804, and few serious scientists accepted Young’s theory of the wave nature of light until Fresnel and Arago supplied the rigorous theory and experimental proofs much later in 1819.
As a prelude to his later discovery of polarization, Malus had earlier proven a theorem about trajectories that particles of light take through an optical system. One of the key questions about the particles of light in an optical system was how they formed images. The physics of light particles moving through lenses was too complex to treat at that time, but reflection was relatively easy based on the simple reflection law. Malus proved a theorem mathematically that after a reflection from a curved mirror, a set of rays perpendicular to an initial nonplanar surface would remain perpendicular at a later surface after reflection (this property is closely related to the conservation of optical etendue). This is known as Malus’ Theorem, and he thought it only held true after a single reflection, but later mathematicians proved that it remains true even after an arbitrary number of reflections, even in cases when the rays intersect to form an optical effect known as a caustic. The mathematics of caustics would catch the interest of an Irish mathematician and physicist who helped launch a new field of mathematical physics.
Hamilton’s Characteristic Function
William Rowan Hamilton (1805 – 1865) was a child prodigy who taught himself thirteen languages by the time he was thirteen years old (with the help of his linguist uncle), but mathematics became his primary focus at Trinity College at the University in Dublin. His mathematical prowess was so great that he was made the Astronomer Royal of Ireland while still an undergraduate student. He also became fascinated by the theory of envelopes of curves, and in particular by the mathematics of caustic curves in optics.
In 1823 at the age of 18, he wrote a paper titled Caustics that was read to the Royal Irish Academy. In this paper, Hamilton gave an exceedingly simple proof of Malus’ Theorem, but that was perhaps the simplest part of the paper. Other aspects were mathematically obscure and reviewers requested further additions and refinements before publication. Over the next four years, as Hamilton expanded this work on optics, he developed a new theory of optics, the first part of which was published as Theory of Systems of Rays in 1827 with two following supplements completed by 1833 but never published.
Hamilton’s most important contribution to optical theory (and eventually to mechanics) he called his characteristic function. By applying the principle of Fermat’s least time, which he called his principle of stationary action, he sought to find a single unique function that characterized every path through an optical system. By first proving Malus’ Theorem and then applying the theorem to any system of rays using the principle of stationary action, he was able to construct two partial differential equations whose solution, if it could be found, defined every ray through the optical system. This result was completely general and could be extended to include curved rays passing through inhomogeneous media. Because it mapped input rays to output rays, it was the most general characterization of any defined optical system. The characteristic function defined surfaces of constant action whose normal vectors were the rays of the optical system. Today the function whose level sets are these surfaces of constant action is called the Eikonal function (but how it got its name is the next chapter of this story). Using his characteristic function, Hamilton predicted a phenomenon known as conical refraction in 1832, which was subsequently observed, launching him to a level of fame unusual for an
Once Hamilton had established his principle of stationary action for curved light rays, it was an easy step to extend it to mechanical systems of particles with curved trajectories. This step produced his most famous work, On a General Method in Dynamics, published in two parts in 1834 and 1835, in which he developed what became known as Hamiltonian dynamics. As his mechanical work was extended by others including Jacobi, Darboux and Poincaré, Hamilton’s work on optics was overshadowed, overlooked and eventually lost. It was rediscovered when Schrödinger, in his famous paper of 1926, invoked Hamilton’s optical work as a direct example of the wave-particle duality of quantum mechanics. Yet in the interim, a German mathematician tackled the same optical problems that Hamilton had seventy years earlier, and gave the Eikonal Equation its name.
The German mathematician Heinrich Bruns (1848-1919) was engaged chiefly with the measurement of the Earth, or geodesy. He was a professor of mathematics in Berlin and later Leipzig. One claim to fame was that one of his graduate students was Felix Hausdorff, who would go on to much greater fame in the fields of set theory and measure theory (the Hausdorff dimension was a precursor to the fractal dimension). Possibly motivated by his studies done with Hausdorff on refraction of light by the atmosphere, Bruns became interested in Malus’ Theorem for the same reasons and with the same goals as Hamilton, yet was unaware of Hamilton’s work in optics.
The mathematical process of creating “images”, in the sense of a mathematical mapping, made Bruns think of the Greek word εἰκών (eikon), which literally means “icon” or “image”, and he published a small book in 1895 with the title Das Eikonal in which he derived a general equation for the path of rays through an optical system. His approach was heavily geometrical and is not easily recognized as an equation arising from variational principles. It rediscovered most of the results of Hamilton’s paper on the Theory of Systems of Rays and was thus not groundbreaking in the sense of new discovery. But it did reintroduce the world to the problem of systems of rays, and his name of Eikonal for the equations of the ray paths stuck, and was used with increasing frequency in subsequent years. Arnold Sommerfeld (1868 – 1951) was one of the early proponents of the Eikonal equation and recognized its connection with action principles in mechanics. He discussed the Eikonal equation in a 1911 optics paper with Runge and in 1916 used action principles to extend Bohr’s model of the hydrogen atom. While the Eikonal approach was not used often, it became popular in the 1960’s when computational optics made numerical solutions possible.
Lagrangian Dynamics of Light Rays
In physical optics, one of the most important properties of a ray passing through an optical system is known as the optical path length (OPL). The OPL is the central quantity that is used in problems of interferometry, and it is the central property that appears in Fermat’s principle that leads to Snell’s Law. The OPL played an important role in the history of the calculus when Johann Bernoulli in 1697 used it to derive the path taken by a light ray as an analogy of a brachistochrone curve – the curve of least time taken by a particle between two points.
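The link between Fermat’s principle and Snell’s Law can be checked numerically. The following is a minimal sketch, not taken from the original article: it assumes a flat interface at y = 0 between two uniform media, and the indices n1, n2 and the geometry a, b, d are illustrative values chosen here. It minimizes the OPL of a two-segment path over the crossing point and compares the two sides of Snell’s Law.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def opl(x, n1, n2, a, b, d):
    """OPL of a two-segment path from (0, a) to (d, -b) that crosses
    the interface y = 0 at the point (x, 0)."""
    return n1 * np.hypot(x, a) + n2 * np.hypot(d - x, b)

# Illustrative values (assumptions, not from the article)
n1, n2 = 1.0, 1.5        # refractive indices above and below the interface
a, b, d = 1.0, 1.0, 2.0  # source height, target depth, horizontal separation

# Fermat's principle: the physical ray makes the OPL stationary (here a minimum)
res = minimize_scalar(opl, bounds=(0.0, d), args=(n1, n2, a, b, d),
                      method="bounded", options={"xatol": 1e-10})
x_min = res.x

# Sines of the angles of incidence and refraction, measured from the normal
sin1 = x_min / np.hypot(x_min, a)
sin2 = (d - x_min) / np.hypot(d - x_min, b)

snell_lhs = n1 * sin1
snell_rhs = n2 * sin2
print("n1 sin(theta1) =", snell_lhs)
print("n2 sin(theta2) =", snell_rhs)
```

The two printed values should agree to within the tolerance of the minimizer, which is Snell’s Law recovered from the least-OPL path.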
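The OPL between two points in a refractive medium is the sum of the piecewise product of the refractive index n with infinitesimal elements of the path length ds. In integral form, this is expressed as
$$\mathrm{OPL} = \int n(x^a)\, ds = \int n(x^a) \sqrt{\dot{x}^b \dot{x}^b}\, ds$$
where the “dot” is a derivative with respect to s and the repeated index b is summed. The optical Lagrangian is recognized as
$$L(x^a, \dot{x}^a) = n(x^a) \sqrt{\dot{x}^b \dot{x}^b}$$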
The Lagrangian is inserted into the Euler equations to yield (after some algebra, see Introduction to Modern Dynamics pg. 336)
$$\frac{d}{ds}\left( n \frac{dx^a}{ds} \right) = \frac{\partial n}{\partial x^a}$$
This is a second-order ordinary differential equation in the variables $x^a$ that define the ray path through the system. It is literally a “trajectory” of the ray, and the Eikonal equation becomes the F = ma of ray optics.
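The algebra behind this step can be sketched in a few lines. This is a condensed outline rather than the derivation in the cited textbook, and it assumes that s is the arc length along the ray, so that $\dot{x}^b \dot{x}^b = 1$ can be imposed after the partial derivatives are taken (the repeated index b is summed).

$$
\begin{aligned}
L &= n(x^a) \sqrt{\dot{x}^b \dot{x}^b} \\
\frac{\partial L}{\partial \dot{x}^a} &= \frac{n\, \dot{x}^a}{\sqrt{\dot{x}^b \dot{x}^b}} = n\, \dot{x}^a ,
\qquad
\frac{\partial L}{\partial x^a} = \frac{\partial n}{\partial x^a} \sqrt{\dot{x}^b \dot{x}^b} = \frac{\partial n}{\partial x^a} \\
\frac{d}{ds} \frac{\partial L}{\partial \dot{x}^a} - \frac{\partial L}{\partial x^a} = 0
\quad &\Longrightarrow \quad
\frac{d}{ds} \left( n \frac{dx^a}{ds} \right) = \frac{\partial n}{\partial x^a}
\end{aligned}
$$

In a uniform medium the right-hand side vanishes and the ray is a straight line. A gradient in n plays the role of a force that bends the ray toward regions of higher index.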
In a paraxial system (in which the rays never make large angles relative to the optic axis) it is common to select the position z as a single parameter to define the curve of the ray path so that the trajectory is parameterized as
$$x^a(z) = \left( x(z),\, y(z),\, z \right), \qquad ds = \sqrt{1 + x'^2 + y'^2}\; dz$$
where the derivatives are with respect to z, and the effective Lagrangian is recognized as
$$L(x, y, x', y'; z) = n(x, y, z) \sqrt{1 + x'^2 + y'^2}$$
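The Hamiltonian formulation is derived from the Lagrangian by defining an optical Hamiltonian as the Legendre transform of the Lagrangian. To start, the Lagrangian is expressed in terms of the generalized coordinates and momenta. The generalized optical momenta are defined as
$$p_x = \frac{\partial L}{\partial x'} = \frac{n\, x'}{\sqrt{1 + x'^2 + y'^2}}, \qquad p_y = \frac{\partial L}{\partial y'} = \frac{n\, y'}{\sqrt{1 + x'^2 + y'^2}}$$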
This relationship leads to an alternative expression for the Eikonal equation (also known as the scalar Eikonal equation) expressed as
$$\nabla S \cdot \nabla S = n^2(x, y, z), \qquad \vec{p} = \nabla S$$
where S(x,y,z) is the eikonal function. The momentum vectors are perpendicular to the surfaces S(x,y,z) = const., which are recognized as the wavefronts of a propagating wave.
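The scalar Eikonal equation can be checked numerically in the simplest case. The sketch below is an illustration added here, not part of the original article: it assumes a uniform medium of index n0 (an arbitrary illustrative value) with a point source at the origin, for which the eikonal function is S = n0 r and the wavefronts are circles. It uses finite differences to confirm that the squared gradient of S equals n0 squared and that the gradient points radially, perpendicular to the wavefronts.

```python
import numpy as np

n0 = 1.5  # illustrative uniform refractive index (assumption)

# Grid that stays away from the source at the origin, where S is not smooth
x = np.linspace(1.0, 3.0, 401)
y = np.linspace(1.0, 3.0, 401)
X, Y = np.meshgrid(x, y, indexing="ij")

# Eikonal function of a point source in a uniform medium: S = n0 * r
S = n0 * np.hypot(X, Y)

# Finite-difference gradient; axis 0 is x and axis 1 is y with indexing="ij"
dSdx, dSdy = np.gradient(S, x, y)

# Scalar Eikonal equation: |grad S|^2 = n^2 (edge points are skipped because
# np.gradient uses less accurate one-sided differences there)
grad_sq = dSdx**2 + dSdy**2
max_err = np.max(np.abs(grad_sq[1:-1, 1:-1] - n0**2))

# The momentum p = grad S should be radial, so its cross product with the
# position vector should vanish
cross = X * dSdy - Y * dSdx
max_cross = np.max(np.abs(cross[1:-1, 1:-1]))

print("max | |grad S|^2 - n0^2 | =", max_err)
print("max | r x grad S |        =", max_cross)
```

Both printed numbers should be small and should shrink as the grid is refined, since the only error is the finite-difference truncation.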
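The Lagrangian can be restated as a function of the generalized momenta as
$$L = \frac{n^2}{\sqrt{n^2 - p_x^2 - p_y^2}}$$
and the Legendre transform that takes the Lagrangian into the Hamiltonian is
$$H = x' p_x + y' p_y - L = -\sqrt{n^2 - p_x^2 - p_y^2}$$
The trajectory of the rays is the solution to Hamilton’s equations of motion applied to this Hamiltonian
$$\frac{dx}{dz} = \frac{\partial H}{\partial p_x}, \qquad \frac{dy}{dz} = \frac{\partial H}{\partial p_y}, \qquad \frac{dp_x}{dz} = -\frac{\partial H}{\partial x}, \qquad \frac{dp_y}{dz} = -\frac{\partial H}{\partial y}$$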
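If the optical rays are restricted to the x-y plane, then Hamilton’s equations of motion can be expressed relative to the path length ds, and the momenta are $p_a = n\, dx^a/ds$. The ray equations are (simply expressing the 2 second-order Eikonal equations as 4 first-order equations)
$$\dot{x} = \frac{p_x}{n}, \qquad \dot{y} = \frac{p_y}{n}, \qquad \dot{p}_x = \frac{\partial n}{\partial x}, \qquad \dot{p}_y = \frac{\partial n}{\partial y}$$
where the dot is a derivative with respect to the element ds.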
As an example, consider a radial refractive index profile in the x-y plane
$$n(r) = 1 + \exp\left( -\frac{r^2}{2\sigma^2} \right)$$
where r is the radius on the x-y plane and σ sets the width of the profile. Putting this refractive index profile into the Eikonal equations creates a two-dimensional orbit in the x-y plane. The following Python code solves for individual trajectories.
Python Code: raysimple.py
# -*- coding: utf-8 -*-
"""
Created on Tue May 28 11:50:24 2019
"""
import numpy as np
from scipy import integrate
from matplotlib import pyplot as plt
from matplotlib import cm

# selection 1 = Gaussian
# selection 2 = Donut
selection = 1

def refindex(x, y):
    # Return the refractive index n and its gradient (nx, ny) at the point (x, y)
    if selection == 1:
        sig = 10
        n = 1 + np.exp(-(x**2 + y**2)/2/sig**2)
        nx = (-2*x/2/sig**2)*np.exp(-(x**2 + y**2)/2/sig**2)
        ny = (-2*y/2/sig**2)*np.exp(-(x**2 + y**2)/2/sig**2)
    elif selection == 2:
        sig = 10
        r2 = (x**2 + y**2)
        r1 = np.sqrt(r2)
        expon = np.exp(-r2/2/sig**2)
        n = 1 + 0.3*r1*expon
        # d(r1)/dx = x/r1, so the second term carries no factor of 2
        nx = 0.3*r1*(-2*x/2/sig**2)*expon + 0.3*expon*x/r1
        ny = 0.3*r1*(-2*y/2/sig**2)*expon + 0.3*expon*y/r1
    return [n, nx, ny]

def flow_deriv(x_y_z, tspan):
    # State is (x, y, px, py); derivatives are with respect to the path length s
    x, y, z, w = x_y_z
    n, nx, ny = refindex(x, y)
    yp = np.zeros(shape=(4,))
    yp[0] = z/n    # dx/ds  = px/n
    yp[1] = w/n    # dy/ds  = py/n
    yp[2] = nx     # dpx/ds = dn/dx
    yp[3] = ny     # dpy/ds = dn/dy
    return yp

# Map the refractive index on a 100 x 100 grid covering -20 <= x, y < 20
V = np.zeros(shape=(100,100))
for xloop in range(100):
    xx = -20 + 40*xloop/100
    for yloop in range(100):
        yy = -20 + 40*yloop/100
        n, nx, ny = refindex(xx, yy)
        V[yloop, xloop] = n

fig = plt.figure(1)
contr = plt.contourf(V, 100, cmap=cm.coolwarm, vmin = 1, vmax = 3)
fig.colorbar(contr, shrink=0.5, aspect=5)
plt.show()

v1 = 0.707 # Change this initial condition
v2 = np.sqrt(1-v1**2)
y0 = [12, v1, 0, v2] # Change these initial conditions

tspan = np.linspace(1, 1700, 1700)
y = integrate.odeint(flow_deriv, y0, tspan)

# Plot the ray trajectory in the x-y plane
lines = plt.plot(y[1:1550,0], y[1:1550,1])
plt.show()
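One way to sanity-check a ray integration like the one above is to monitor a conserved quantity. The ray equations imply that the combination px² + py² − n² is constant along a ray, and it is zero when the initial momentum has magnitude n, as the definition of the momenta as n dx/ds requires. The following is a minimal sketch of that check, not part of the original raysimple.py: it re-implements the Gaussian profile of selection 1, uses scipy's solve_ivp instead of odeint, and the starting point, launch angle and tolerances are illustrative assumptions.

```python
import numpy as np
from scipy.integrate import solve_ivp

SIGMA = 10.0  # same Gaussian width as selection 1 in the listing above

def refindex(x, y):
    """Gaussian index profile n = 1 + exp(-r^2 / 2 sigma^2) and its gradient."""
    g = np.exp(-(x**2 + y**2) / (2 * SIGMA**2))
    return 1 + g, -x / SIGMA**2 * g, -y / SIGMA**2 * g

def ray_rhs(s, state):
    """First-order ray equations with the path length s as the parameter."""
    x, y, px, py = state
    n, nx, ny = refindex(x, y)
    return [px / n, py / n, nx, ny]

# Illustrative launch conditions (assumptions): start at (12, 0), aimed 60
# degrees from the x axis, with |p| = n so that s is the true arc length
x0, y0 = 12.0, 0.0
theta = np.deg2rad(60.0)
n0 = refindex(x0, y0)[0]
state0 = [x0, y0, n0 * np.cos(theta), n0 * np.sin(theta)]

sol = solve_ivp(ray_rhs, (0.0, 200.0), state0,
                method="DOP853", rtol=1e-10, atol=1e-12)

# Evaluate the invariant p^2 - n^2 along the computed ray
x, y, px, py = sol.y
n = refindex(x, y)[0]
drift = np.max(np.abs(px**2 + py**2 - n**2))
print("max |p^2 - n^2| along the ray:", drift)
```

A drift far above the integrator tolerance would point to an error in the gradient expressions or in the equations of motion, which makes this a cheap test to run after editing the index profile.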
An excellent textbook on geometric optics from Hamilton’s point of view is K. B. Wolf, Geometric Optics in Phase Space (Springer, 2004). Another is H. A. Buchdahl, An Introduction to Hamiltonian Optics (Dover, 1992).
A rather older textbook on geometrical optics is by J. L. Synge, Geometrical Optics: An Introduction to Hamilton’s Method (Cambridge University Press, 1962) showing the derivation of the ray equations in the final chapter using variational methods. Synge takes a dim view of Bruns’ term “Eikonal” since Hamilton got there first and Bruns was unaware of it.
A book that makes an especially strong case for the Optical-Mechanical analogy of Fermat’s principle, connecting the trajectories of mechanics to the paths of optical rays is Daryl Holm, Geometric Mechanics: Part I Dynamics and Symmetry (Imperial College Press 2008).
 Hamilton, W. R. “On a general method in dynamics I.” Mathematical Papers, I ,103-161: 247-308. (1834); Hamilton, W. R. “On a general method in dynamics II.” Mathematical Papers, I ,103-161: 95-144. (1835)
 Schrödinger, E. “Quantification of the eigen-value problem.” Annalen Der Physik 79(6): 489-527. (1926)